Home About us Contact | |||
User Sessions (user + session)
Selected AbstractsSubsessions: A granular approach to click path analysisINTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 7 2004Ernestina Menasalvas The fiercely competitive web-based electronic commerce (e-commerce) environment has made necessary the application of intelligent methods to gather and analyze information collected from consumer web sessions. Knowledge about user behavior and session goals can be discovered from the information gathered about user activities, as tracked by web clicks. Most current approaches to customer behavior analysis study the user session by examining each web page access. However, the abstraction of subsessions provides a more granular view of user activity. Here, we propose a method of increasing the granularity of the user session analysis by isolating useful subsessions within sessions. Each subsession represents a high-level user activity such as performing a purchase or searching for a particular type of information. Given a set of previously identified subsessions, we can determine at which point the user begins a preidentified subsession by tracking user clicks. With this information we can (1) optimize the user experience by precaching pages or (2) provide an adaptive user experience by presenting pages according to our estimation of the user's ultimate goal. To identify subsessions, we present an algorithm to compute frequent click paths from which subsessions then can be isolated. The algorithm functions by scanning all user sessions and extracting all frequent subpaths by using a distance function to determining subpath similarity. Each frequent subpath represents a subsession. An analysis of the pages represented by the subsession provides additional information about semantically related activities commonly performed by users. © 2004 Wiley Periodicals, Inc. [source] Cross-validation of neural network applications for automatic new topic identificationJOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 3 2008H. Cenk Ozmutlu The purpose of this study is to provide results from experiments designed to investigate the cross-validation of an artificial neural network application to automatically identify topic changes in Web search engine user sessions by using data logs of different Web search engines for training and testing the neural network. Sample data logs from the FAST and Excite search engines are used in this study. The results of the study show that identification of topic shifts and continuations on a particular Web search engine user session can be achieved with neural networks that are trained on a different Web search engine data log. Although FAST and Excite search engine users differ with respect to some user characteristics (e.g., number of queries per session, number of topics per session), the results of this study demonstrate that both search engine users display similar characteristics as they shift from one topic to another during a single search session. The key finding of this study is that a neural network that is trained on a selected data log could be universal; that is, it can be applicable on all Web search engine transaction logs regardless of the source of the training data log. [source] Using clustering techniques to detect usage patterns in a Web-based information systemJOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 11 2001Hui-Min Chen Different users of a Web-based information system will have different goals and different ways of performing their work. This article explores the possibility that we can automatically detect usage patterns without demographic information about the individuals. First, a set of 47 variables was defined that can be used to characterize a user session. The values of these variables were computed for approximately 257,000 sessions. Second, principal component analysis was employed to reduce the dimensions of the original data set. Third, a two-stage, hybrid clustering method was proposed to categorize sessions into groups. Finally, an external criteria-based test of cluster validity was performed to verify the validity of the resulting usage groups (clusters). The proposed methodology was demonstrated and tested for validity using two independent samples of user sessions drawn from the transaction logs of the University of California's MELVYL® on-line library catalog system (www.melvyl.ucop.edu). The results indicate that there were six distinct categories of use in the MELVYL system: knowledgeable and sophisticated use, unsophisticated use, highly interactive use with good search performance, known-item searching, help-intensive searching, and relatively unsuccessful use. Their characteristics were interpreted and compared qualitatively. The analysis shows that each group had distinct patterns of use of the system, which justifies the methodology employed in this study. [source] Markovian analysis for automatic new topic identification in search engine transaction logsAPPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 6 2009Huseyin C. Ozmutlu Abstract Topic analysis of search engine user queries is an important task, since successful exploitation of the topic of queries can result in the design of new information retrieval algorithms for more efficient search engines. Identification of topic changes within a user search session is a key issue in analysis of search engine user queries. This study presents an application of Markov chains in the area of search engine research to automatically identify topic changes in a user session by using statistical characteristics of queries, such as time intervals, query reformulation patterns and the continuation/shift status of the previous query. The findings show that Markov chains provide fairly successful results for automatic new topic identification with a high level of estimation for topic continuations and shifts. Copyright © 2009 John Wiley & Sons, Ltd. [source] Subsessions: A granular approach to click path analysisINTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 7 2004Ernestina Menasalvas The fiercely competitive web-based electronic commerce (e-commerce) environment has made necessary the application of intelligent methods to gather and analyze information collected from consumer web sessions. Knowledge about user behavior and session goals can be discovered from the information gathered about user activities, as tracked by web clicks. Most current approaches to customer behavior analysis study the user session by examining each web page access. However, the abstraction of subsessions provides a more granular view of user activity. Here, we propose a method of increasing the granularity of the user session analysis by isolating useful subsessions within sessions. Each subsession represents a high-level user activity such as performing a purchase or searching for a particular type of information. Given a set of previously identified subsessions, we can determine at which point the user begins a preidentified subsession by tracking user clicks. With this information we can (1) optimize the user experience by precaching pages or (2) provide an adaptive user experience by presenting pages according to our estimation of the user's ultimate goal. To identify subsessions, we present an algorithm to compute frequent click paths from which subsessions then can be isolated. The algorithm functions by scanning all user sessions and extracting all frequent subpaths by using a distance function to determining subpath similarity. Each frequent subpath represents a subsession. An analysis of the pages represented by the subsession provides additional information about semantically related activities commonly performed by users. © 2004 Wiley Periodicals, Inc. [source] Cross-validation of neural network applications for automatic new topic identificationJOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 3 2008H. Cenk Ozmutlu The purpose of this study is to provide results from experiments designed to investigate the cross-validation of an artificial neural network application to automatically identify topic changes in Web search engine user sessions by using data logs of different Web search engines for training and testing the neural network. Sample data logs from the FAST and Excite search engines are used in this study. The results of the study show that identification of topic shifts and continuations on a particular Web search engine user session can be achieved with neural networks that are trained on a different Web search engine data log. Although FAST and Excite search engine users differ with respect to some user characteristics (e.g., number of queries per session, number of topics per session), the results of this study demonstrate that both search engine users display similar characteristics as they shift from one topic to another during a single search session. The key finding of this study is that a neural network that is trained on a selected data log could be universal; that is, it can be applicable on all Web search engine transaction logs regardless of the source of the training data log. [source] Stochastic modeling of usage patterns in a web-based information systemJOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 7 2002Hui-Min Chen Users move from one state (or task) to another in an information system's labyrinth as they try to accomplish their work, and the amount of time they spend in each state varies. This article uses continuous-time stochastic models, mainly based on semi-Markov chains, to derive user state transition patterns (both in rates and in probabilities) in a Web-based information system. The methodology was demonstrated with 126,925 search sessions drawn from the transaction logs of the University of California's MELVYL® library catalog system (www.melvyl.ucop.edu). First, user sessions were categorized into six groups based on their similar use of the system. Second, by using a three-layer hierarchical taxonomy of the system Web pages, user sessions in each usage group were transformed into a sequence of states. All the usage groups but one have third-order sequential dependency in state transitions. The sole exception has fourth-order sequential dependency. The transition rates as well as transition probabilities of the semi-Markov model provide a background for interpreting user behavior probabilistically, at various levels of detail. Finally, the differences in derived usage patterns between usage groups were tested statistically. The test results showed that different groups have distinct patterns of system use. Knowledge of the extent of sequential dependency is beneficial because it allows one to predict a user's next move in a search space based on the past moves that have been made. It can also be used to help customize the design of the user interface to the system to facilitate interaction. The group CL6 labeled "knowledgeable and sophisticated usage" and the group CL7 labeled "unsophisticated usage" both had third-order sequential dependency and had the same most-frequently occurring search pattern: screen display, record display, screen display, and record display. The group CL8 called "highly interactive use with good search results" had fourth-order sequential dependency, and its most frequently occurring pattern was the same as CL6 and CL7 with one more screen display action added. The group CL13, called "known-item searching" had third-order sequential dependency, and its most frequently occurring pattern was index access, search with retrievals, screen display, and record display. Group CL14 called "help intensive searching," and CL18 called "relatively unsuccessful" both had third-order sequential dependency, and for both groups the most frequently occurring pattern was index access, search without retrievals, index access, and again, search without retrievals. [source] Using clustering techniques to detect usage patterns in a Web-based information systemJOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 11 2001Hui-Min Chen Different users of a Web-based information system will have different goals and different ways of performing their work. This article explores the possibility that we can automatically detect usage patterns without demographic information about the individuals. First, a set of 47 variables was defined that can be used to characterize a user session. The values of these variables were computed for approximately 257,000 sessions. Second, principal component analysis was employed to reduce the dimensions of the original data set. Third, a two-stage, hybrid clustering method was proposed to categorize sessions into groups. Finally, an external criteria-based test of cluster validity was performed to verify the validity of the resulting usage groups (clusters). The proposed methodology was demonstrated and tested for validity using two independent samples of user sessions drawn from the transaction logs of the University of California's MELVYL® on-line library catalog system (www.melvyl.ucop.edu). The results indicate that there were six distinct categories of use in the MELVYL system: knowledgeable and sophisticated use, unsophisticated use, highly interactive use with good search performance, known-item searching, help-intensive searching, and relatively unsuccessful use. Their characteristics were interpreted and compared qualitatively. The analysis shows that each group had distinct patterns of use of the system, which justifies the methodology employed in this study. [source] Using support vector machines for automatic new topic identificationPROCEEDINGS OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE & TECHNOLOGY (ELECTRONIC), Issue 1 2007Seda Ozmutlu Recent studies on automatic new topic identification in Web search engine user sessions demonstrated that learning algorithms such as neural networks and regression have been fairly successful in automatic new topic identification. In this study, we investigate whether another learning algorithm, Support Vector Machines (SVM) are successful in terms of identifying topic shifts and continuations. Sample data logs from the Norwegian search engine FAST (currently owned by Overture) and Excite are used in this study. Findings of this study suggest that support vector machines' performance depends on the characteristics of the dataset it is applied on. [source] |