Tuesday, September 25, 2012

Web Document Modeling

If a web document came with metadata describing the core content of the page, it would be much easier to judge the relevance and similarity of documents: comparing metadata is cheaper computationally and potentially more precise. It surely would not replace either classic IR or AI-based document representations, but I hope that in the near future such metadata will help model web documents better. Will it?

Information Overload Problem

An overabundance of resources prevents users from retrieving the right information right away.

There are three information access paradigms that users rely on whenever they need to meet a particular information need in the web's hypertext environment: searching by surfing (or browsing), searching by query, and recommendation.

I was interested in the sentence “Studies show that users often start browsing from pages identified by less precise but more easily constructed queries, instead of spending time to fully specify their search goals”. That is undeniably true, especially for me. When I type into the search bar, it is not trivial to write out exactly what I originally meant, and I don’t think it is only a language issue, even though I am not a native English speaker. I believe it is a problem that usually affects users who are novices in some area. Further, I think the level of familiarity with a subject decides how precisely the search goal can be stated: the more familiar you are with it, the sooner you get the results you want. For example, when I was coding in R, I wanted to find a package for string concatenation. After typing “string concatenation in R” into the search bar, tons of results came back immediately, which was good. But it was hard to identify the very page I was looking for, possibly because the letter “R” is everywhere in the hypertext space. So the question for me was how we can avoid searching like novices even when we are novices. Is there a search engine that could refine the query and then search on the new query?

And the answer is yes. As far as I know, there is a patent titled “Systems and methods of refining a search query based on user-specified search keywords”. It is intended to refine the user’s search query based on one or more data sources and to repeat this process until the user is satisfied with the customized query.

User Profile for Personalized Information Access

 

Personalized Information Access addresses the information overload problem by building, managing, and presenting customized information for individual users. This customization may take the form of filtering irrelevant information and/or identifying additional information of likely interest for the user.

First of all, I apologize: I truly respect the privacy concerns raised by the term “tracking”, but that is actually how I started thinking about this topic.

In contrast to group profile modeling, we can model the communication within a social network. As the saying goes, “to know a person, get to know the people close to him”: if you want to know someone well, it helps to know his close friends and see how he interacts with them, in addition to interacting with him directly. For example, Joey posted his new Air Jordan shoes on Facebook, and Mike later showed he liked them by clicking “like” on the post. If we could model this activity as an interest of Mike’s, we could recommend this type of shoe to Mike in an online recommender. Maybe this example sounds naïve, but in my opinion it conveys a new approach to user profile modeling. With social networking growing explosively, I cannot picture exactly how individuals’ lives will change, but one thing is sure: people are becoming more and more connected to each other. Given the huge amount of communication data among them, user profile modeling will look quite different.
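To make the Joey and Mike example a bit more concrete, here is a minimal sketch in Python. The interaction log, the action weights, and the function names are all my own assumptions for illustration; this is not how Facebook or any real recommender models interests.

```python
from collections import Counter, defaultdict

# Hypothetical interaction log: (actor, action, post_author, post_topic).
interactions = [
    ("Mike", "like", "Joey", "sneakers"),
    ("Mike", "comment", "Joey", "sneakers"),
    ("Mike", "like", "Anna", "hiking"),
]

# Assumed weights: a comment is treated as a stronger signal than a like.
ACTION_WEIGHTS = {"like": 1.0, "comment": 2.0}


def build_interest_profiles(log):
    """Accumulate a weighted topic count per user from social interactions."""
    profiles = defaultdict(Counter)
    for actor, action, _author, topic in log:
        profiles[actor][topic] += ACTION_WEIGHTS.get(action, 0.0)
    return profiles


def top_interests(profiles, user, n=1):
    """Return the user's n highest-weighted topics."""
    return [topic for topic, _ in profiles[user].most_common(n)]


profiles = build_interest_profiles(interactions)
print(top_interests(profiles, "Mike"))
```

The profile here is just a weighted count of topics; a richer model could also use who the post author is and how close he is to the user.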

Interestingly, after I came up with this idea, I did a little research into the area. IBM had already started on it: in the paper “Unified Modeling of User Activities on Social Networking Sites” from IBM Research, the authors attempt a unified model of various activities on social networking sites in order to predict a user’s future posts, followships, and friendships. The paper shares my motivation of using the huge amount of available communication data for user modeling.

That’s an interesting topic and I will do further research on it.

 

Adaptive Information System


The last part of this chapter captured my interest. It envisions future work on adaptive navigation support for virtual environments, and I also expect the whole theory of adaptive navigation support to be applied in this domain. This reminds me of a video, recommended by Prof. Brusilovsky on Facebook, about an educational virtual reality project, Virtual Trillium Trail v2.0. The purpose of the Virtual Trillium Trail (www.virtualtrilliumtrail.com) is to let you freely explore and discover an infinitely beautiful, expansive natural world. It is a game that is both fun and beautiful, and ideal for young children: you can explore as long as you like, and you are free to go off-trail and into the woods to learn about any of the flowers, plants, and trees that spark your curiosity. However, the space in the virtual world is very large, with about 20 concepts, 20 plants, and about 1,000 facts, so how young children can quickly find what interests them is a crucial question. Since we are dealing with a very special group with unique habits and preferences, we need a way to collect their interests and provide personal navigation support based on them. As in other adaptive educational systems, for example QuizGuide, once the user has correctly grasped the concepts at a certain level, there is no need to go back to their prerequisites. Likewise, once a user in the virtual world has seen certain plants, he can skip that part and move on to the next interesting concept. This is only one application of adaptive systems in this domain; many more remain to be explored in virtual reality. It is a very promising area, and I look forward to participating and seeing how the research goes.

  • Social navigation


Might it be a problem if social navigation directs all users with similar profiles or interests to the same node or link? If every node in the hypermedia space corresponds to some piece of human knowledge, will this reduce the overall diversity of human knowledge, because knowledge that has already been acquired is accessed again and again while unfamiliar knowledge becomes more isolated? At the same time, each user keeps receiving the same type of information from the adaptive system. Will this be a problem if the user becomes somewhat addicted to that type of content, which might not be good for his own personal development? Some logical and philosophical questions remain to be answered for the further development of adaptive information systems.

  • Collaborative filtering


Collaborative Filtering (CF) is something of a luxury as a recommendation mechanism for most online communities, which suffer from the cold-start problem: there aren’t enough items or user ratings for the algorithm to use. But as the community grows, the system can soon benefit from it. This reminds me of Conference Navigator 3, which added a function that lets users rate each recommended presentation. However, I haven’t seen anything that goes beyond collecting the ratings. Using them might be an interesting way to improve the current recommendation approaches.
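As a rough illustration of how collected ratings could be used, and of where cold start bites, here is a minimal user-based CF sketch in Python. The users, talks, and ratings are made up, and this is only one simple variant (cosine similarity with a similarity-weighted average); I am not claiming it is what Conference Navigator does.

```python
import math

# Hypothetical ratings: user -> {presentation: rating on a 1-5 scale}.
ratings = {
    "ann": {"talk_a": 5, "talk_b": 4, "talk_c": 1},
    "bob": {"talk_a": 4, "talk_b": 5, "talk_d": 5},
    "cat": {"talk_c": 5, "talk_d": 2},
    "new": {},  # a cold-start user with no ratings yet
}


def cosine(u, v):
    """Cosine similarity between two users' rating vectors."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(r ** 2 for r in u.values()))
    norm_v = math.sqrt(sum(r ** 2 for r in v.values()))
    return dot / (norm_u * norm_v)


def predict(user, item, data):
    """Similarity-weighted average of other users' ratings for the item.

    Returns None when no similar user has rated it (the cold-start case).
    """
    num = den = 0.0
    for other, their in data.items():
        if other == user or item not in their:
            continue
        sim = cosine(data[user], their)
        num += sim * their[item]
        den += sim
    return num / den if den > 0 else None


print(predict("ann", "talk_d", ratings))  # leans toward bob's rating
print(predict("new", "talk_a", ratings))  # None: nothing to compare with
```

For the new user the function has nothing to compare, so a real system would have to fall back on something else, such as popularity or content-based recommendation.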

  • Case based recommendation


This chapter reminds me of a paper I read, “Visualization for the Masses: Learning from the Experts”, in which the authors present an innovative application of a case-based recommender system designed to suggest visualizations of complex datasets uploaded by users. The system described in the paper is ManyEyes, an online browser-based visualization tool developed by IBM Research and the IBM Cognos software group. This relates to the core idea of case-based recommendation: users will like items similar to the ones they liked before. The approach assumes structured item information with a well-defined set of features and feature values. Each item in the system is represented as a case, and the system recommends the cases most similar to the user’s preferences. It is a nice paper and worth reading.
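The core idea of case-based recommendation can be sketched in a few lines of Python. The case base, the feature names, and the simple matching measure below are my own assumptions for illustration and are not taken from the paper or from ManyEyes.

```python
# Hypothetical case base: each case is a visualization described by
# a fixed set of features with discrete values.
cases = {
    "bar_chart": {"data_type": "categorical", "dimensions": 1, "goal": "compare"},
    "scatter_plot": {"data_type": "numeric", "dimensions": 2, "goal": "correlate"},
    "line_chart": {"data_type": "numeric", "dimensions": 2, "goal": "trend"},
}


def similarity(query, case):
    """Fraction of the query's features whose values match the case."""
    if not query:
        return 0.0
    matches = sum(1 for key, value in query.items() if case.get(key) == value)
    return matches / len(query)


def recommend(query, case_base, k=2):
    """Return the names of the k cases most similar to the query."""
    ranked = sorted(
        case_base,
        key=lambda name: similarity(query, case_base[name]),
        reverse=True,
    )
    return ranked[:k]


# A case the user liked before, used as the query.
liked = {"data_type": "numeric", "dimensions": 2, "goal": "trend"}
print(recommend(liked, cases, k=2))
```

Real systems usually weight the features and use a per-feature similarity instead of exact matching, but the structure stays the same: represent items as cases, then rank them by similarity to what the user liked.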

The adaptive museum from Yi-ling Lin is a very cool project. I like the idea and how it was implemented as an adaptive recommendation system. Looking at the general recommendation targets, I thought there was one more point we could improve. In a real physical museum, you often come with friends or a group, and you talk about the masterpieces on the wall together; sharing ideas with friends on the spot is a great pleasure. In this way the museum becomes a good hang-out place for a certain type of people. These people also go online at times, and it would be great for them to enjoy the museum online with their friends, which could improve the user experience of the system. We could create an interface for inviting friends to tour the museum together, with online chat inside the system. We could also apply group recommendation for these users.

A hybrid web recommendation system combines multiple recommendation techniques in order to fully utilize the available data and achieve some synergy between the techniques, and thus to produce better recommendation sets for users.

Currently I am exploring the use of mixed and weighted hybridization methods in the Conference Navigator system. Hopefully we can work this approach out so that new users at the next conference get better presentations recommended.
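The two hybridization methods can be sketched as follows. This is only a minimal illustration under my own assumptions: the scores, the weights, and the function names are hypothetical, and the actual Conference Navigator implementation may look quite different. Weighted hybridization merges the scores of several recommenders into one score per item, while mixed hybridization presents the recommendations of several recommenders side by side.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-recommender scores into one ranking by weighted sum."""
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    return sorted(combined, key=combined.get, reverse=True)


def mixed_hybrid(ranked_lists, k):
    """Interleave ranked lists from several recommenders, skipping duplicates."""
    merged, seen = [], set()
    for position in range(max(len(r) for r in ranked_lists)):
        for ranking in ranked_lists:
            if position < len(ranking) and ranking[position] not in seen:
                seen.add(ranking[position])
                merged.append(ranking[position])
    return merged[:k]


# Hypothetical scores in [0, 1] from a content-based and a collaborative recommender.
content_scores = {"talk_a": 0.9, "talk_b": 0.4, "talk_c": 0.1}
collab_scores = {"talk_b": 0.8, "talk_c": 0.7}

print(weighted_hybrid([content_scores, collab_scores], weights=[0.6, 0.4]))
print(mixed_hybrid([["talk_a", "talk_b"], ["talk_c", "talk_b"]], k=3))
```

For new users the collaborative scores would be mostly empty, so in the weighted variant the content-based recommender effectively carries the ranking, which is one reason a hybrid is attractive for the cold-start case.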

Patterns of leadership

 

Prof. Pentland started the talk by discussing Kahneman’s work, which distinguishes two modes of human learning based on the learning process. One is attention learning, which is slow, serial, controlled, and rule-based. The other is habitual learning, which is fast, parallel, automatic, and associative. According to Kahneman’s work, the second is in use all the time.

 

After this discussion, Prof. Pentland introduced his research of the last couple of years. Recently, researchers, his team included, have been studying what they call Honest Signals. He described Honest Signals as a biological basis for understanding interactions: they are not affect or cognition but largely unconscious signals and responses. Some of their research has focused on how signals shape conversations. They have recorded 2300 hours of experiments with 800 people on how signaling changes, including real-time monitoring of signaling in hiring, dating, sales, and salary negotiation. They concluded that whether a presentation succeeds, such as a pitch of your vision or business plan, depends largely not on what you say but on how you say it (with 79% accuracy). That is to say, if the presenter seems excited and seems to know the subject, the presentation is very likely to be a good one. Further, he explained how signaling shapes communication patterns. Three types of signaling shape the conversation: energy (highly active signaling), engagement (the influence of each person in the communication), and exploration (variable prosody). By combining the three signals, they could predict the leader in a group with 80% accuracy.

 

What makes communication patterns important is their influence on group performance. They conducted a study on improving the productivity of a call center. Common sense says that the more minutes workers spend talking to each other, the fewer minutes they spend talking with customers, which lowers productivity. So it seems an obvious rule in such a company to allow workers to take coffee breaks only sequentially, one after another. But their experiment showed that increasing exploration and engagement among workers actually raises productivity. Based on this experiment, they suggested changing the coffee breaks to increase conversation between workers, which produced a saving of $15M/year for the call center. One point Prof. Pentland stressed throughout was that “we are not so smart”: mostly we learn by going around, looking at things that seem to work, and copying them. So one possible explanation for the call center result is that people were picking up collective practice, which ultimately contributed to group productivity.

 

By comparing creative groups with groups of bees, which show the same pattern of star-shaped networks (exploration) and cohesive networks (engagement), he proposed the hypothesis that group performance can be improved by shaping communication patterns. They ran a seven-day experiment with a group in an international company in which one Japanese and seven Germans worked together. On the first day, the recorded signaling showed a lot of exploration and engagement among the Germans, while the Japanese worker seemed isolated, with very little communication, and group productivity was low. After the group was informed and tried to change its communication patterns, it reached better performance by the seventh day, when the signaling showed communication distributed more equally across the group.

 

Prof. Pentland returned at the end to the point that “we are not so smart”, explaining that social learning accounts for 90% of what we learn, while individual trial contributes only 10%. In one study, his team became convinced that patterns of buying and patterns of health are shaped by the demographic distribution in the San Francisco Bay Area.

One interesting finding concerned the pattern of apps on people’s mobile phones: according to their data, being close friends does not mean having the same pattern of apps, but people who meet more often tend to have more similar patterns of apps. This reminds me of online recommendation mechanisms such as social recommendation, where one receives recommendations from one’s “friends”. I am really interested in how that compares with recommendations from people you work with or often meet.

 

Mapping the Genome of Collective Intelligence

 

In this talk Professor Malone presented his group’s recent work on understanding a wide variety of emerging examples of collective intelligence as more than just “cool” ideas. He introduced a new framework that supports this understanding by identifying the underlying building blocks (to use a biological metaphor, the “genes”) at the heart of collective intelligence systems, the conditions under which each gene is useful, and the possibilities for combining and recombining these genes to harness crowds effectively.

 

Professor Malone and his team gathered nearly 250 examples of Web-enabled collective intelligence, from which they identified a relatively small set of building blocks that are combined and recombined in various ways in different collective intelligence systems. They posed two pairs of related questions for each example:

--Who is performing the task? Why are they doing it?

--What is being accomplished? How is it being done?

 

Each gene answers one of the four questions. In total they defined 19 genes from which collective intelligence systems are built, likened to the genes from which individual organisms develop. The full combination of genes can be viewed as the “genome” of a collective intelligence system.

 

For the question “Who”, they identified two basic genes that answer who undertakes the activity in an organization. The first is Hierarchy: as in traditional hierarchical organizations, someone in authority assigns the task to a particular person or group. The second is Crowd, where the activity can be undertaken by anyone in a large group who chooses to do so.

 

For the question “Why”, they identified three basic genes that cover the high-level motivations. Money, the first, is the financial reward the organization promises individuals as a motivator. Love is also an important motivator, since some people work out of their own interest and enjoyment. Another important motivator is Glory: for example, in some open source software communities, programmers are motivated by gaining recognition from peers.

 

For the question “What”, they identified two basic genes for the high-level task being done in each collective intelligence system. The first is Create, the process of contributing something new, followed by Decide, in which the organization evaluates and selects alternatives.

 

For the question “How”, since they mainly focused on how crowds are used in collective intelligence systems, they pointed out four variations of the How gene for crowds, based on whether the members make their contributions and decisions independently of each other. The Collection gene occurs when the items contributed by the crowd are created independently of each other, while the Collaboration gene occurs when the items are strongly dependent on each other. The Group Decision gene occurs when members of a crowd reach a decision for the group as a whole; its method genes are Voting, Consensus, Averaging, and Prediction Markets. The Individual Decision gene occurs when each member of the crowd makes a choice of their own that need not hold for everyone; its method genes are Markets and Social Networks.

 

He then explained how these genes can be combined into the genome of a complete collective intelligence system, taking Linux as an example. Anyone in the crowd who wants to create new software modules can contribute. Only a few get paid, so Money is the reward for some, while the others take Love and Glory as their main motivations. The work is done by collaboration. For the Decide part, the decision on which modules are included in the next release is made by only a few participants, motivated by Love and Glory. This part is done by hierarchy.
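One way I found helpful to think about a genome is as a small data structure: one record per task, with a value for each of the four questions. The Python sketch below encodes the Linux example as I summarized it above. The class name, the field layout, and the helper function are my own assumptions for illustration, not the notation used in the talk or the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    what: str   # "Create" or "Decide"
    who: str    # "Crowd" or "Hierarchy"
    why: tuple  # any of "Money", "Love", "Glory"
    how: str    # e.g. "Collection", "Collaboration", "Hierarchy"


# The Linux example from the talk, one record per task.
linux_genome = [
    Gene(what="Create", who="Crowd", why=("Money", "Love", "Glory"), how="Collaboration"),
    Gene(what="Decide", who="Hierarchy", why=("Love", "Glory"), how="Hierarchy"),
]


def genes_for(genome, what):
    """Select the records of a genome that describe a given task."""
    return [gene for gene in genome if gene.what == what]


for gene in genes_for(linux_genome, "Decide"):
    print(gene.who, gene.why, gene.how)
```

With many systems encoded this way, one could count which gene combinations occur together most often, which is close to what recombining genes means in the framework.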

 

Professor Malone ended the talk by comparing his work to similar work by Quinn. He said there is still much work to be done to identify all the different genes of collective intelligence. I think this is a very useful start, but two questions need to be answered properly for the framework to become more usable. First, from the paper and the talk I could not see how the genes vary, or how the organization reacts, as the environment outside the organization changes, since each organization has its own situation. Second, will these conclusions hold across cultures? Do the same gene tags apply to crowds from different cultures, different geographic locations, or different times, looking back to when the web started and ahead to how organizations will work in the future?

 

"Filling in the H in CHI"

 

Terry Winograd is a professor of computer science at Stanford University, and co-director of the Stanford Human-Computer Interaction Group.

 

Prof. Winograd started the talk by briefly introducing the research he did early in his career in Artificial Intelligence, for example SHRDLU. At that time, the motivation driving his AI research was how to make computers more like humans. Later he reached this conclusion: "What I came to realize is that the success of the communication depends on the real intelligence on the part of the listener, and that there are many other ways of communicating with a computer that can be more effective, given that it doesn’t have the intelligence". He then shifted his research away from classical AI to Human-Computer Interaction, with a new motivation: “How to make better interaction”.

 

He redefined the word “human” along dimensions of humanness that have been added over the years. First, a human is a physical body; the related field is Human Factors, the scientific understanding of the properties of human capability. Second, a human is a language interpreter, which is the subject of Linguistics. Third, a human is an information processor, ultimately a problem solver and decision maker, which is the main study area of cognitive psychology. Fourth, a human is a worker in an organization, which relates to the study of Management and Business. Fifth, showing a photo by John Coates of a social network gathering in the San Francisco Bay Area in the 1950s, he described a human as an information seeker; the related fields are information science and design. For the I part of HCI, he crossed out the most frequent question in Interaction, “what do people do?”, and substituted a question that again comes from the human side: “who am I?” He concluded that every human being has two main features, personal identity and social identity. Personal identity is the unique numerical identity of a person through time, which most people are familiar with. He explained in detail the social identity of “I”, which is a combination of fame and fortune, family and friends, community and society.

 

In the final part, he listed the opportunities for research in HCI. The first is computer technology: since most research is technology-driven, we keep exploring new technologies. The second is human science, in which one important area is understanding what people do in social communities. The third is interaction mechanisms, where the research opportunity lies in new approaches to fixing problems. The last is the design of the research itself. He encouraged the audience to ask themselves “How do I design my research?” instead of “What research should I do?” He also gave three tips on designing research: 1. Be ontological. 2. Challenge your assumptions and goals. 3. Be open to opportunities.

 

GIS and social computing

I am quite interested in the paper ‘Toward critical spatial thinking in the social sciences and humanities’, since I have a GIS background from my undergraduate studies and did some interesting projects during that time. My advisor then always reminded me to try to combine spatial thinking with my research interests or with what interests industry. On his advice, I did several projects that I think relate to spatial thinking and the social sciences. One was urban planning using GIS: we first modeled the city of Zhuhai as a spatial map and then, given urban planning indicators such as population and economic conditions, re-planned the space allocation to achieve a better urban environment. The other was a 3D visualization of the campus. First I collected thousands of elevation points in the campus area and one high-resolution aerial photograph. After digitizing these data, I constructed a 3D landscape, as in Figure 1. Then I built models of every building on campus, as in Figure 2, and combined them with the landscape. Later I built a virtual campus linked to university information such as photographs, which turned into a university management information system with 3D visualization. I think spatial thinking is a fascinating and creative tool, or way of representing knowledge. For example, when a student is learning history, with spatial thinking he can actually see how a nation evolves, for instance how its territory changes. I don’t know how elementary-level education works, but in China, most of the time, if a teacher teaches history with great imagination, the class is enjoyable and really helpful. But we don’t have that many such teachers. It would be great to have such technology available for students to interact with and learn through having fun.

 

Although I now work mostly on research related to social computing, which has had so much influence on our social life and ways of networking and thus attracts so much attention, I have long wondered whether I could combine the two disciplines to see how that would further change social networking. I haven’t really figured out how to do that, but I will be very happy to continue doing research on it.

Figure 1

Figure 2

Figure 3


 

"Online Social Networks Does Not Improve Adolescents' Offline Friendships"



The effect of adolescents’ use of online social networking on their offline friendships has been a matter of intense debate ever since online social networking became prevalent, marked by the launch of Facebook in 2004. Studies have examined the positive or negative effects of SNS use on adolescents’ offline friendships; in other words, scholars are trying to find evidence for or against the claim "Online Social Networks Does Not Improve Adolescents' Offline Friendships". Although Szwedo, Mikami and Allen indicated in their study that, overall, maintaining a greater number of relationships online appears to have something akin to a leveling effect on adolescents' future levels of psychological adjustment, there is no evidence of direct causality between young adults' level of psychological adjustment and the intimacy or closeness of their offline friendships. Moreover, in Pollet, Roberts and Dunbar's study of the relationships between social media use, network size and emotional closeness, the statistical results suggest that more time spent on social media does not lead to a larger offline network, or to feeling emotionally closer to offline network members, compared with people who do not use social media. One explanation is Nie's "time displacement" hypothesis: since each relationship needs to be actively maintained to keep it from decaying, more time spent maintaining online relationships leaves less time for interacting with offline friends. In addition, computer-mediated communication (CMC) cannot reach the level of face-to-face (FTF) communication in terms of signaling affect, due to the lack of visual, auditory, and contextual cues.

However, as the technology behind social networking tools progresses tremendously, the gap between CMC and FTF is shrinking rapidly, and it has become much easier and more convenient to maintain relationships online. For instance, FaceTime, Skype, and Google Hangouts have bridged the gap between friends separated by distance or other reasons. The question now becomes whether the time displacement issue outweighs the benefits that online social networking brings to maintaining or improving offline friendships, and what functional purposes it serves.

One advantage is that SNSs provide a valuable source of information for offline relations, as Courtois and Vanwynsberghe found in their study of how SNSs relieve social anxiety when communicating with offline friends. By drawing on the context of their friends’ online profiles, adolescents gain more knowledge about their friends, whether they are distant acquaintances or geographically close friends whom they meet regularly. Especially for people of this age, who have a relatively high degree of social anxiety when interacting with others, removing this uncertainty is a remarkably positive step toward building solid relationships. As a by-product of the reduced social anxiety, a suitable context for conversation is more easily found within the relationship, since they know each other well and thus know what topics the other is willing to talk about.

Another advantage is the social support SNSs give to offline friendships, which follows from the primary purpose of SNS use and may also be evidence against the “time displacement hypothesis”. In their research, Reich, Subrahmanyam and Espinoza found that about three quarters of adolescents’ SNS friends were people their subjects interacted with most offline, and suggested that the use of SNSs probably strengthens adolescents’ offline relationships.

Personally, I tend to believe that the use of SNSs could improve adolescents’ offline friendships, even though the studies on this topic give no direct evidence of a positive correlation between the two. Probably I have fallen into a logic trap of my own: the more often and the better you communicate with your friends, the better the friendship will be.

Anyway, I am also inclined to divide offline friendships into two types: heterogeneous social networks, in which people socialize with distant peers, and homogeneous groups of local friends. It would be interesting to examine how the proportion of distant to local friends influences the intimacy, closeness, and satisfaction gained from friendship. As I have observed, most Chinese students use SNSs quite often, but mostly with distant friends, while they somehow exclude themselves from the local community or network, which might lead to a low level of self-identity in their new environment.