Search Engines (search + engines)
Selected Abstracts

Quantitative comparisons of search engine results
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 11 2008
Mike Thelwall
Search engines are normally used to find information or Web sites, but Webometric investigations use them for quantitative data such as the number of pages matching a query and the international spread of those pages. For this type of application, the accuracy of the hit count estimates and range of URLs in the full results are important. Here, we compare the applications programming interfaces of Google, Yahoo!, and Live Search for 1,587 single-word searches. The hit count estimates were broadly consistent, but with Yahoo! and Google reporting 5–6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. Yahoo!'s result URLs included a significantly wider range of domains and sites than the other two, and there was little consistency between the three engines in the number of different domains. In contrast, the three engines were reasonably consistent in the number of different top-level domains represented in the result URLs, although Yahoo! tended to return the most. In conclusion, quantitative results from the three search engines are mostly consistent but with unexpected types of inconsistency that users should be aware of. Google is recommended for hit count estimates but Yahoo! is recommended for all other Webometric purposes. [source]

MEMORY ORGANIZATION AS THE MISSING LINK BETWEEN CASE-BASED REASONING AND INFORMATION RETRIEVAL IN BIOMEDICINE
COMPUTATIONAL INTELLIGENCE, Issue 3-4 2006
Isabelle Bichindaritz
Mémoire proposes a general framework for reasoning from cases in biology and medicine. Part of this project is to propose a memory organization capable of handling large cases and case bases as occur in biomedical domains.
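Mémoire's memory organization rests on inverted indexes borrowed from IR, as the abstract goes on to explain. A minimal sketch of that idea, in which each term maps to the cases containing it and a simple term-frequency score ranks matches (the case texts and the scoring scheme are illustrative assumptions, not the paper's design):

```python
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        # term -> {case_id: term frequency in that case}
        self.postings = defaultdict(dict)

    def add_case(self, case_id, text):
        for term in text.lower().split():
            self.postings[term][case_id] = self.postings[term].get(case_id, 0) + 1

    def search(self, query):
        """Score each case by summed term frequencies over the query terms."""
        scores = defaultdict(int)
        for term in query.lower().split():
            for case_id, tf in self.postings.get(term, {}).items():
                scores[case_id] += tf
        return sorted(scores.items(), key=lambda kv: -kv[1])

idx = InvertedIndex()
idx.add_case("case1", "chest pain and shortness of breath")
idx.add_case("case2", "abdominal pain")
results = idx.search("chest pain")  # case1 matches both query terms
```

The point of the structure is that a query touches only the postings lists for its own terms, which is what lets IR systems of this kind scale to very large case bases.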
This article presents the essential principles for an efficient memory organization based on pertinent work in information retrieval (IR). IR systems have been able to scale up to terabytes of data, taking advantage of large-database research to build Internet search engines. They search for pertinent documents to answer a query using term-based ranking and/or global ranking schemes. Similarly, case-based reasoning (CBR) systems search for pertinent cases using a scoring function for ranking the cases. Mémoire proposes a memory organization based on inverted indexes, which may be powered by databases to search and rank efficiently through large case bases. It can be seen as a first step toward large-scale CBR systems, and in addition provides a framework for tight cooperation between CBR and IR. [source]

An evidence-based iterative content trust algorithm for the credibility of online news
CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 15 2009
Guosun Zeng
Abstract: People encounter more information than they can possibly use every day. But not all information is necessarily of equal value. In many cases, certain information appears to be better, or more trustworthy, than other information. The challenge that most people then face is to judge which information is more credible. In this paper we propose a new problem called Corroboration Trust, which studies how to find credible news events by seeking more than one source to verify information on a given topic. We design an evidence-based corroboration trust algorithm called TrustNewsFinder, which utilizes the relationships between news articles and related evidence information (person, location, time and keywords about the news). A news article is trustworthy if it provides many pieces of trustworthy evidence, and a piece of evidence is likely to be true if it is provided by many trustworthy news articles.
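The mutually reinforcing definition in the last sentence of the TrustNewsFinder abstract can be sketched as a simple fixed-point iteration, in the spirit of HITS-style link analysis. This is an illustrative reconstruction, not the authors' published algorithm; the normalization scheme and the toy data are assumptions:

```python
def corroboration_trust(article_evidence, iterations=50):
    """article_evidence: dict mapping article id -> set of evidence ids."""
    evidence_ids = {e for ev in article_evidence.values() for e in ev}
    article_trust = {a: 1.0 for a in article_evidence}
    evidence_trust = {e: 1.0 for e in evidence_ids}
    for _ in range(iterations):
        # Evidence is likely true if provided by many trustworthy articles.
        for e in evidence_trust:
            evidence_trust[e] = sum(
                article_trust[a] for a, ev in article_evidence.items() if e in ev)
        # An article is trustworthy if it provides much trustworthy evidence.
        for a in article_trust:
            article_trust[a] = sum(evidence_trust[e] for e in article_evidence[a])
        # Normalize so scores stay comparable across iterations.
        for scores in (article_trust, evidence_trust):
            norm = max(scores.values()) or 1.0
            for k in scores:
                scores[k] /= norm
    return article_trust, evidence_trust
```

With three hypothetical articles, one providing two pieces of corroborated evidence, the iteration ranks that article highest, which is the qualitative behavior the abstract describes.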
Our experiments show that TrustNewsFinder successfully finds true events among conflicting information and identifies trustworthy news better than the popular search engines. Copyright © 2009 John Wiley & Sons, Ltd. [source]

Clinical practice recommendations for depression
ACTA PSYCHIATRICA SCANDINAVICA, Issue 2009
G. S. Malhi
Objective: To provide clinically relevant evidence-based recommendations for the management of depression in adults that are informative, easy to assimilate and facilitate clinical decision making.
Method: A comprehensive literature review of over 500 articles was undertaken using electronic database search engines (e.g. MEDLINE, PsychINFO and Cochrane reviews). In addition, articles, book chapters and other literature known to the authors were reviewed. The findings were then formulated into a set of recommendations that were developed by a multidisciplinary team of clinicians who routinely deal with mood disorders. The recommendations then underwent consultative review by a broader advisory panel that included experts in the field, clinical staff and patient representatives.
Results: The clinical practice recommendations for depression (Depression CPR) summarize evidence-based treatments and provide a synopsis of recommendations relating to each phase of the illness. They are designed for clinical use and have therefore been presented succinctly in an innovative and engaging manner that is clear and informative.
Conclusion: These up-to-date recommendations provide an evidence-based framework that incorporates clinical wisdom and consideration of individual factors in the management of depression. Further, the novel style and practical approach should promote uptake and implementation. [source]

Clinical practice recommendations for bipolar disorder
ACTA PSYCHIATRICA SCANDINAVICA, Issue 2009
G. S. Malhi
Objective: To provide clinically relevant evidence-based recommendations for the management of bipolar disorder in adults that are informative, easy to assimilate and facilitate clinical decision-making.
Method: A comprehensive literature review of over 500 articles was undertaken using electronic database search engines (e.g. MEDLINE, PsychINFO and Cochrane reviews). In addition, articles, book chapters and other literature known to the authors were reviewed. The findings were then formulated into a set of recommendations that were developed by a multidisciplinary team of clinicians who routinely deal with mood disorders. These preliminary recommendations underwent extensive consultative review by a broader advisory panel that included experts in the field, clinical staff and patient representatives.
Results: The clinical practice recommendations for bipolar disorder (bipolar CPR) summarise evidence-based treatments and provide a synopsis of recommendations relating to each phase of the illness. They are designed for clinical use and have therefore been presented succinctly in an innovative and engaging manner that is clear and informative.
Conclusion: These up-to-date recommendations provide an evidence-based framework that incorporates clinical wisdom and consideration of individual factors in the management of bipolar disorder. Further, the novel style and practical approach should promote their uptake and implementation. [source]

Human search engines & choice navigators
HEALTH INFORMATION & LIBRARIES JOURNAL, Issue 3 2004
Bob Gann
No abstract is available for this article.
[source]

CASRdb: calcium-sensing receptor locus-specific database for mutations causing familial (benign) hypocalciuric hypercalcemia, neonatal severe hyperparathyroidism, and autosomal dominant hypocalcemia
HUMAN MUTATION, Issue 2 2004
Svetlana Pidasheva
Abstract: Familial hypocalciuric hypercalcemia (FHH) is caused by heterozygous loss-of-function mutations in the calcium-sensing receptor (CASR), in which the lifelong hypercalcemia is generally asymptomatic. Homozygous loss-of-function CASR mutations manifest as neonatal severe hyperparathyroidism (NSHPT), a rare disorder characterized by extreme hypercalcemia and the bony changes of hyperparathyroidism, which occur in infancy. Activating mutations in the CASR gene have been identified in several families with autosomal dominant hypocalcemia (ADH), autosomal dominant hypoparathyroidism, or hypocalcemic hypercalciuria. Individuals with ADH may have mild hypocalcemia and relatively few symptoms. However, in some cases seizures can occur, especially in younger patients, and these often happen during febrile episodes due to intercurrent infection. Thus far, 112 naturally occurring mutations in the human CASR gene have been reported, of which 80 are unique and 32 are recurrent. To better understand the mutations causing defects in the CASR gene and to define specific regions relevant for ligand-receptor interaction and other receptor functions, the data on mutations were collected and the information was centralized in the CASRdb (www.casrdb.mcgill.ca), which is easily and quickly accessible by search engines for retrieval of specific information. The information can be searched by mutation, genotype–phenotype, clinical data, in vitro analyses, and authors of publications describing the mutations. CASRdb is regularly updated for new mutations and it also provides a mutation submission form to ensure up-to-date information.
The home page of this database provides links to different web pages that are relevant to the CASR, as well as disease clinical pages, the sequence of the CASR gene exons, and the position of mutations in the CASR. The CASRdb will help researchers to better understand and analyze the mutations, and aid in structure–function analyses. Hum Mutat 24:107–111, 2004. © 2004 Wiley-Liss, Inc. [source]

Principles of evidence-based management using stage I–II melanoma as a model
INTERNATIONAL JOURNAL OF DERMATOLOGY, Issue 11 2002
Tsu-Yi Chuang MD
Evidence-Based Medicine (EBM) is the practice of integrating best research evidence with clinical expertise and patient values. The term, Evidence-Based Medicine, was coined in 1992 by a group led by Gordon Guyatt at McMaster University in Canada. The practice of EBM arose from the awareness of: (1) the daily need for valid information pertinent to clinical practice; (2) the inadequacy of traditional sources, like textbooks, for such information; (3) the disparity between enhancing clinical skills and declining up-to-date knowledge and, eventually, clinical performance; and (4) the inability to spend more time finding and assimilating evidence pertinent to clinical practice. EBM simply emphasizes three As: Access, Appraisal and Application. Access requires refining a clinical question into a searchable term and an answerable question and using search engines to track down the information. Appraisal is using epidemiological principles and methods to critically review evidence for its validity and applicability. Application is integrating the critically appraised evidence with clinical expertise and each patient's unique situation. The outcomes following such practices are then assayed. The last step involves evaluating the effectiveness and efficiency in executing the first two As and seeking ways for improvement.
In this article, we describe the concept and steps of practising EBM and utilize melanoma as an example to illustrate how we integrate the best evidence to outline the management plan for stage I–II melanoma. [source]

Applying aggregation operators for information access systems: An application in digital libraries
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 12 2008
Enrique Herrera-Viedma
Nowadays, information access on the Web is a major problem in the computer science community. Any major advance in the field of information access on the Web requires the collaboration of different methodologies and research areas. In this paper, the role the concept of aggregation operator plays for information access on the Web is analyzed. We present some Web methodologies, such as search engines, recommender systems, and Web quality evaluation models, and analyze the way aggregation operators help toward the success of their activities. We also show an application of the aggregation operators in digital libraries. In particular, we introduce a Web information system to analyze the quality of digital libraries that implements an important panel of aggregation operators to obtain the quality assessments. © 2008 Wiley Periodicals, Inc. [source]

Buddy: Harnessing the power of the Internet
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 1 2008
Douglas Boulware
The Internet has become a way of life. In the past, when someone wanted to perform research he or she would go to the library and proceed to the card catalogue to locate a book, or to a set of periodicals for magazines. Today we sit in front of our computer, launch our Internet browser, bring up a search engine, and perform various searches by entering a simple keyword, phrase, or more complex Boolean expression. The Internet can be a great asset by significantly cutting the time-consuming burden of finding relevant documents/papers; however, how do we know where and what we are searching?
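To make the aggregation operators of the Herrera-Viedma abstract concrete: one widely used family for combining quality assessments is the ordered weighted averaging (OWA) operator, which sorts the criterion scores before weighting them. The weight vectors and ratings below are illustrative assumptions, not values from the paper:

```python
def owa(scores, weights):
    """Ordered weighted averaging: sort scores descending, then weight by position."""
    assert len(scores) == len(weights) and abs(sum(weights) - 1.0) < 1e-9
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))

# Hypothetical quality assessments of a digital-library page on three criteria:
ratings = [0.9, 0.4, 0.7]
optimistic = owa(ratings, [0.6, 0.3, 0.1])   # emphasizes the best scores
pessimistic = owa(ratings, [0.1, 0.3, 0.6])  # emphasizes the worst scores
```

Because the weights attach to rank positions rather than to particular criteria, the same operator can model anything from an optimistic max-like evaluation to a cautious min-like one just by reshaping the weight vector.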
How many times do we perform a query and get irrelevant documents? In this article we investigate today's search engines, what is meant by coverage, and what metasearch engines bring to the table. We also look at both their abilities and deficiencies and present a capability that attempts to put more power at the fingertips of the user. This capability is what we call "Buddy." © 2008 Wiley Periodicals, Inc. [source]

Interactive knowledge management for agent-assisted web navigation
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 10 2007
Vincenzo Loia
Web information may currently be acquired by activating search engines. However, our daily experience is not only that web pages are often either redundant or missing, but also that there is a mismatch between information needs and the web's responses. If we wish to satisfy more complex requests, we need to extract part of the information and transform it into new interactive knowledge. This transformation may either be performed by hand or automatically. In this article we describe an experimental agent-based framework skilled to help the user both in managing achieved information and in personalizing web searching activity. The first process is supported by a query-formulation facility and by a friendly structured representation of the searching results. On the other hand, the system provides proactive support to searching on the web by suggesting pages, which are selected according to the user's behavior shown in his navigation activity. A basic role is played by an extension of a classical fuzzy-clustering algorithm that provides a prototype-based representation of the knowledge extracted from the web. These prototypes lead both the proactive suggestion of new pages, mined through web spidering, and the structured representation of the searching results. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 1101–1122, 2007.
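The "classical fuzzy-clustering algorithm" the Loia abstract extends is presumably fuzzy c-means, which yields exactly the prototype-based representation described: each cluster center is a prototype, and every point belongs to every cluster to some degree. A minimal sketch of the classical algorithm (not the paper's extension; the data, parameters and initialization are illustrative assumptions):

```python
import random

def fuzzy_c_means(points, c=2, m=2.0, iterations=30, seed=0):
    """points: list of feature tuples. Returns (prototype centers, memberships)."""
    rng = random.Random(seed)
    # Random initial membership matrix u[i][j]: degree of point i in cluster j.
    u = []
    for _ in points:
        row = [rng.random() for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    dim = len(points[0])
    for _ in range(iterations):
        # Update prototypes as fuzzy-membership-weighted means of the points.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(len(points))]
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(len(points))) / sum(w)
                for d in range(dim)))
        # Update memberships from distances to the new prototypes.
        for i, p in enumerate(points):
            d = [max(1e-12, sum((p[k] - centers[j][k]) ** 2
                                for k in range(dim)) ** 0.5)
                 for j in range(c)]
            for j in range(c):
                u[i][j] = 1.0 / sum((d[j] / d[l]) ** (2.0 / (m - 1.0))
                                    for l in range(c))
    return centers, u
```

On well-separated data the memberships become sharply polarized, while points between prototypes keep genuinely fuzzy degrees, which is what makes the representation useful for suggesting "nearby" pages.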
[source]

UK interventions to control medicines wastage: a critical review
INTERNATIONAL JOURNAL OF PHARMACY PRACTICE, Issue 3 2010
Katherine Gwenda White
Abstract
Objective: The objective is to evaluate the scope of medicines wastage in the UK, assigning a value to the costs at both a national and individual patient level, to assess the cost-effectiveness of the pharmacy interventions that have been introduced to curb wastage.
Methods: Publicly available information was assessed in a desk-based systematic review using online search engines and publication databases. Data on community prescribing trends and costs in England from 1997 to 2008 from the Department of Health, and published reports from Primary Care Trusts (PCTs), comprise the core information that has been analysed.
Key findings: The commonly used upper wastage estimate of 10% is likely to be overstated, because it pre-dates major measures to curb wastage and over-prescribing. In pilot programmes, medicines use reviews have achieved cost savings of up to 20%. Awareness campaigns aimed at patients appear to be effective. Twenty-eight-day repeat prescribing has resulted in year-on-year reductions in the quantity of medication issued per prescription item, to reach an average prescription length of 40 days in 2008. The increasing availability of generic medications has seen significant reductions in net ingredient costs. Nearly two-thirds of prescriptions are now issued as generics, with an average net ingredient cost of £3.83. Pharmacy charges to dispense a prescription item in 2008 averaged £1.81, so that pharmacy charges make up around one-third of the cost of most prescription items dispensed. If all 842.5 million prescription items issued by the NHS in England in 2008 had been 28-day repeat-dispensing items, this would have added a projected £700 million to the actual pharmacy costs of around £1.5 billion.
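The projected figures in the abstract above can be sanity-checked with back-of-the-envelope arithmetic. Assuming (our assumption, not the authors' stated method) that moving from the reported 40-day average script to universal 28-day scripts scales the number of dispensed items, and hence per-item pharmacy charges, by 40/28:

```python
# Rough reconstruction of the projection, using figures quoted in the abstract.
items_2008 = 842.5e6      # prescription items issued by the NHS in England, 2008
charge_per_item = 1.81    # average pharmacy dispensing charge, GBP

current_cost = items_2008 * charge_per_item   # ≈ £1.5 billion, as quoted
extra_cost = current_cost * (40 / 28 - 1)     # ≈ £650 million of added charges
```

The result lands near the reported £700 million, so the projection is at least consistent with the quoted item count and per-item charge under this simple scaling model.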
Conclusions: Unnecessary spending on pharmacy charges has the potential to outstrip the estimated cost of medicines wastage in the UK. The cost-effectiveness of restricted prescription lengths for the cheaper, mostly generic medications merits an urgent re-examination. [source]

Genetic testing for cancer predisposition and implications for nursing practice: narrative review
JOURNAL OF ADVANCED NURSING, Issue 4 2010
Elizabeth Kathryn Bancroft
bancroft e.k. (2010) Genetic testing for cancer predisposition and implications for nursing practice: narrative review. Journal of Advanced Nursing 66(4), 710–737.
Abstract
Title: Genetic testing for cancer predisposition and implications for nursing practice: narrative review.
Aim: This paper is a report of a review of literature on the psychological and social implications of genetic testing for cancer predisposition and how recent developments in knowledge about genetics may affect clinical practice.
Background: Knowledge about the genetics of disease has grown since the completion of the Human Genome Project. Many common genetic changes that predispose to cancer have been found. Identifying genetically 'at risk' individuals is going to become a feature of healthcare and nursing practice over the next decade. The psychological and social effects of this knowledge on patients and their families are important considerations.
Data sources: A search of the British Nursing Index, CINAHL, EMBASE and PUBMED databases was conducted between June 2007 and December 2008 without date limits. Grey literature was sought using search engines and through searching relevant websites.
Review methods: A narrative review of studies published in English was conducted. The studies were reviewed for relevance and inclusion criteria; their methodological quality was not evaluated.
Results: Seventy-eight papers met the inclusion criteria and fell into three thematic categories: social impact, psychological impact, and interest in and uptake of genetic testing. To date, research has focussed on high-risk cancer genes.
Conclusion: Genetic testing raises social, ethical and psychological concerns. Further research is required to determine how healthcare professionals can support the integration of genetics into clinical practice. Nurses will become increasingly involved in genetic testing and will play a key role in providing information, support and follow-up for individuals identified as being at higher risk. [source]

Medication communication: a concept analysis
JOURNAL OF ADVANCED NURSING, Issue 4 2010
Elizabeth Manias
manias e. (2010) Medication communication: a concept analysis. Journal of Advanced Nursing 66(4), 933–943.
Abstract
Title: Medication communication: a concept analysis.
Aim: This paper is a report of a concept analysis of medication communication with a particular focus on how it applies to nursing.
Background: Medication communication is a vital component of patient safety, quality of care, and patient and family engagement. Nevertheless, this concept has been consistently taken for granted without adequate analysis, definition or clarification in the quality and patient safety literature.
Data sources: A literature search was undertaken using bibliographic databases, internet search engines, and hand searches. Literature published in English between January 1988 and June 2009 was reviewed. Walker and Avant's approach was used to guide the concept analysis.
Discussion: Medication communication is a dynamic and complex process. Defining attributes consider who speaks, who is silent, what is said, what aspects of medication care are prioritized, the use of body language in conversations, and actual words used. Open communication occurs if there is cooperation among individuals in implementing plans of care.
Antecedents involve environmental influences such as ward culture and geographical space, and sociocultural influences such as beliefs about the nature of interactions. Consequences involve patient and family engagement in communication, evidence of appropriate medication use, the frequency and type of medication-related adverse events, and the presence of medication adherence. Empirical referents typically do not reflect specific aspects of medication communication.
Conclusion: This concept analysis can be used by nurses to guide them in understanding the complexities surrounding medication communication, with the ultimate goal of improving patient safety, quality of care, and facilitating patient and family engagement. [source]

A concept analysis of malnutrition in the elderly
JOURNAL OF ADVANCED NURSING, Issue 1 2001
Cheryl Chia-Hui Chen RN MSN GNP
Purpose: Malnutrition is a frequent and serious problem in the elderly. Today there is no doubt that malnutrition contributes significantly to morbidity and mortality in the elderly. Unfortunately, the concept of malnutrition in the elderly is poorly defined. The purpose of this paper is to clarify the meaning of malnutrition in the elderly and to develop the theoretical underpinnings, thereby facilitating communication regarding the phenomenon and enhancing research efforts.
Scope, sources used: Critical review of literature is the approach used to systematically build and develop the theoretical propositions. Conventional search engines such as Medline, PsyINFO, and CINAHL were used. The bibliographies of obtained articles were also reviewed and additional articles identified. Key words used for searching included malnutrition, geriatric nutrition, nutritional status, nutrition assessment, elderly, ageing, and weight loss.
Conclusions: Malnutrition in the elderly is defined as follows: faulty or inadequate nutritional status; undernourishment characterized by insufficient dietary intake, poor appetite, muscle wasting and weight loss. In the elderly, malnutrition is an ominous sign. Without intervention, it presents as a downward trajectory leading to poor health and decreased quality of life. Malnutrition in the elderly is a multidimensional concept encompassing physical and psychological elements. It is precipitated by loss, dependency, loneliness and chronic illness and potentially impacts morbidity, mortality and quality of life. [source]

Current Approaches to the Assessment and Management of Anger and Aggression in Youth: A Review
JOURNAL OF CHILD AND ADOLESCENT PSYCHIATRIC NURSING, Issue 4 2007
Christie S. Blake RN, APRN-BC
BACKGROUND: Anger and its expression represent a major public health problem for children and adolescents today. Prevalence reports show that anger-related problems such as oppositional behavior, verbal and physical aggression, and violence are some of the more common reasons children are referred for mental health services.
METHODS: An extensive review of the literature was conducted using the following online search engines: Cochrane, MEDLINE, PsychINFO, and PubMed. Published and unpublished articles that met the following criteria were included in the review: (a) experimental or quasi-experimental research designs; (b) nonpharmacologic, therapy-based interventions; and (c) study participants between 5 and 17 years of age.
RESULTS: Cognitive-behavioral and skills-based approaches are the most widely studied and empirically validated treatments for anger and aggression in youth. Commonly used therapeutic techniques include affective education, relaxation training, cognitive restructuring, problem-solving skills, social skills training, and conflict resolution.
These techniques, tailored to the individual child's and/or family's needs, can foster the development of more adaptive and prosocial behavior. [source]

The quality of patient-orientated Internet information on oral lichen planus: a pilot study
JOURNAL OF EVALUATION IN CLINICAL PRACTICE, Issue 5 2010
Pía López-Jornet PhD MD DDS
Abstract
Objective: This study examines the accessibility and quality of Web pages related to oral lichen planus.
Methods: Sites were identified using two search engines (Google and Yahoo!) and the search terms 'oral lichen planus' and 'oral lesion lichenoid'. The first 100 sites in each search were visited and classified. The web sites were evaluated for content quality by using the validated DISCERN rating instrument, the JAMA benchmarks and the 'Health on the Net' (HON) seal.
Results: A total of 109 000 sites were recorded in Google using the search terms and 520 000 in Yahoo! A total of 19 Web pages considered relevant were examined on Google and 20 on Yahoo! As regards the JAMA benchmarks, only two pages satisfied the four criteria in Google (10%), and only three (15%) in Yahoo! As regards DISCERN, the overall quality of web site information was poor, with no site reaching the maximum score. In Google 78.94% of sites had important deficiencies, as did 50% in Yahoo!, the difference between the two search engines being statistically significant (P = 0.031). Only five pages (17.2%) on Google and eight (40%) on Yahoo! showed the HON code.
Conclusion: Based on our review, doctors must assume primary responsibility for educating and counselling their patients. [source]

Maintenance issues in the Web site development process
JOURNAL OF SOFTWARE MAINTENANCE AND EVOLUTION: RESEARCH AND PRACTICE, Issue 2 2002
M. Taylor
Abstract: There appear to be few actual case studies in academic or professional literature regarding the overall process of developing a company Web site, and even fewer regarding the maintenance of company Web sites.
In this paper, we examine the maintenance issues in the Web site development process based on detailed case studies in seven U.K. organizations from the engineering, financial services, retail, manufacturing and education sectors over a two-year period. This research indicated that there are numerous issues in Web site design and construction that impact upon future Web site maintenance activities. In particular, this research examined the impact of dynamic Web site data, Web site structure, specific coding for different user groups/Internet browsers/navigators/Internet search engines, Web site documentation, and Web site development and testing standards upon future Web site maintenance work. Copyright © 2002 John Wiley & Sons, Ltd. [source]

The use of the nicotine inhaler in smoking cessation
JOURNAL OF THE AMERICAN ACADEMY OF NURSE PRACTITIONERS, Issue 3 2006
Jenny Sigel Burkett RN, CCRN (Staff Nurse)
Abstract
Purpose: To raise awareness among nurse practitioners (NPs) about the nicotine inhaler by providing clinical and practical information about the use of the nicotine inhaler as a treatment option for smoking cessation.
Data sources: These included data-based and review articles in the medical literature, the tobacco use and dependence clinical practice guideline, and the Medline and Cinahl search engines. Criteria for search keywords were "nicotine inhaler" and "nicotine replacement therapy." The initial search was done in December 2004.
Conclusions: The nicotine inhaler has been tested as safe and efficacious in the treatment of tobacco cessation. Clinical trials show the nicotine inhaler to be useful alone or as an adjunct to other pharmacological therapies. Current national guidelines recommend that the nicotine inhaler be used in smoking cessation therapy.
Implications for practice: The nicotine inhaler is appropriate for many different smokers, including certain types of cardiac patients.
NPs can include the nicotine inhaler in a group of nicotine replacement therapies to ensure that smokers are successful in tobacco cessation. [source]

Scatter matters: Regularities and implications for the scatter of healthcare information on the Web
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 4 2010
Suresh K. Bhavnani
Abstract: Despite the development of huge healthcare Web sites and powerful search engines, many searchers end their searches prematurely with incomplete information. Recent studies suggest that users often retrieve incomplete information because of the complex scatter of relevant facts about a topic across Web pages. However, little is understood about the regularities underlying such information scatter. To probe regularities within the scatter of facts across Web pages, this article presents the results of two analyses: (a) a cluster analysis of Web pages that reveals the existence of three page clusters that vary in information density, and (b) a content analysis that suggests the role each of the above-mentioned page clusters plays in providing comprehensive information. These results provide implications for the design of Web sites, search tools, and training to help users find comprehensive information about a topic, and for a hypothesis describing the underlying mechanisms causing the scatter. We conclude by briefly discussing how the analysis of information scatter, at the granularity of facts, complements existing theories of information-seeking behavior. [source]

Mobile information retrieval with search results clustering: Prototypes and evaluations
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 5 2009
Claudio Carpineto
Web searches from mobile devices such as PDAs and cell phones are becoming increasingly popular. However, the traditional list-based search interface paradigm does not scale well to mobile devices due to their inherent limitations.
In this article, we investigate the application of search results clustering, used with some success for desktop computer searches, to the mobile scenario. Building on CREDO (Conceptual Reorganization of Documents), a Web clustering engine based on concept lattices, we present its mobile versions Credino and SmartCREDO, for PDAs and cell phones, respectively. Next, we evaluate the retrieval performance of the three prototype systems. We measure the effectiveness of their clustered results compared to a ranked list of results on a subtopic retrieval task, by means of the device-independent notion of subtopic reach time together with a reusable test collection built from Wikipedia ambiguous entries. Then, we make a cross-comparison of methods (i.e., clustering and ranked list) and devices (i.e., desktop, PDA, and cell phone), using an interactive information-finding task performed by external participants. The main finding is that clustering engines are a viable complementary approach to plain search engines both for desktop and mobile searches especially, but not only, for multitopic informational queries. [source] Quantitative comparisons of search engine resultsJOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 11 2008Mike Thelwall Search engines are normally used to find information or Web sites, but Webometric investigations use them for quantitative data such as the number of pages matching a query and the international spread of those pages. For this type of application, the accuracy of the hit count estimates and range of URLs in the full results are important. Here, we compare the applications programming interfaces of Google, Yahoo!, and Live Search for 1,587 single word searches. The hit count estimates were broadly consistent but with Yahoo! and Google, reporting 5,6 times more hits than Live Search. Yahoo! tended to return slightly more matching URLs than Google, with Live Search returning significantly fewer. 
Yahoo!'s result URLs included a significantly wider range of domains and sites than the other two, and there was little consistency between the three engines in the number of different domains. In contrast, the three engines were reasonably consistent in the number of different top-level domains represented in the result URLs, although Yahoo! tended to return the most. In conclusion, quantitative results from the three search engines are mostly consistent, but with unexpected types of inconsistency that users should be aware of. Google is recommended for hit count estimates, but Yahoo! is recommended for all other Webometric purposes. [source]

Cross-validation of neural network applications for automatic new topic identification
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 3 2008, H. Cenk Ozmutlu
The purpose of this study is to provide results from experiments designed to investigate the cross-validation of an artificial neural network application to automatically identify topic changes in Web search engine user sessions, using the data logs of different Web search engines for training and testing the neural network. Sample data logs from the FAST and Excite search engines are used in this study. The results of the study show that identification of topic shifts and continuations in a particular Web search engine user session can be achieved with neural networks that are trained on a different Web search engine's data log. Although FAST and Excite search engine users differ with respect to some user characteristics (e.g., number of queries per session, number of topics per session), the results of this study demonstrate that users of both search engines display similar characteristics as they shift from one topic to another during a single search session.
The key finding of this study is that a neural network trained on a selected data log could be universal; that is, it can be applied to all Web search engine transaction logs regardless of the source of the training data log. [source]

A classification of mental models of undergraduates seeking information for a course essay in history and psychology: Preliminary investigations into aligning their mental models with online thesauri
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 13 2007, Charles Cole
The article reports a field study which examined the mental models of 80 undergraduates seeking information for either a history or psychology course essay when they were in an early, exploratory stage of researching their essay. This group is presently at a disadvantage when using thesaurus-type schemes in indexes and online search engines because there is a disconnect between how domain-novice users of IR systems represent a topic space and how this space is represented in the standard IR system thesaurus. The study attempted to (a) ascertain the coding language used by the 80 undergraduates in the study to mentally represent their topic and then (b) align the mental models with the hierarchical structure found in many thesauri. The intervention focused the undergraduates' thinking about their topic from a topic statement to a thesis statement. The undergraduates were asked to produce three mental model diagrams for their real-life course essay at the beginning, middle, and end of the interview, for a total of 240 mental model diagrams, from which we created a 12-category mental model classification scheme. Findings indicate that at the end of the intervention, (a) the percentage of vertical mental models increased from 24% to 35% of all mental models, but (b) 3rd-year students had fewer vertical mental models than did 1st-year undergraduates in the study, which is counterintuitive.
The results indicate that there is justification for pursuing our research based on the hypothesis that rotating a domain novice's mental model into a vertical position would make it easier for him or her to cognitively connect with the thesaurus's hierarchical representation of the topic area. [source]

Mining related queries from Web search engine query logs using an improved association rule mining model
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 12 2007, Xiaodong Shi
With the overwhelming volume of information, the task of finding relevant information on a given topic on the Web is becoming increasingly difficult. Web search engines have hence become one of the most popular solutions available on the Web. However, it has never been easy for novice users to organize and represent their information needs using simple queries. Users have to keep modifying their input queries until they get the expected results. Therefore, it is often desirable for search engines to give suggestions on related queries to users. Besides, by identifying those related queries, search engines can potentially perform optimizations on their systems, such as query expansion and file indexing. In this work we propose a method that suggests a list of related queries given an initial input query. The related queries are based on the query log of queries previously submitted by human users, and can be identified using an enhanced model of association rules. Users can utilize the suggested related queries to tune or redirect the search process. Our method not only discovers the related queries, but also ranks them according to the degree of their relatedness. Unlike many other rival techniques, it also performs reasonably well on less frequent input queries.
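The general idea of suggesting related queries from a log, as described in this abstract, can be illustrated with a basic association-rule computation over query sessions. The sketch below is an illustration only, not the authors' enhanced model: the session grouping, the support threshold, and the confidence-based ranking are all assumptions.

```python
from collections import Counter
from itertools import combinations

def related_queries(sessions, query, min_support=2):
    """Suggest queries that co-occur with `query` in past sessions.

    Each session is treated as a transaction of distinct queries;
    candidate pairs are ranked by confidence = support(q, r) / support(q).
    """
    pair_support = Counter()
    query_support = Counter()
    for session in sessions:
        queries = set(session)
        query_support.update(queries)
        for a, b in combinations(sorted(queries), 2):
            pair_support[(a, b)] += 1

    suggestions = []
    for (a, b), support in pair_support.items():
        # Keep only sufficiently frequent pairs that involve the input query.
        if support < min_support or query not in (a, b):
            continue
        other = b if a == query else a
        confidence = support / query_support[query]
        suggestions.append((other, confidence))
    return sorted(suggestions, key=lambda x: -x[1])

# Hypothetical log: each inner list is one user session of submitted queries.
sessions = [
    ["jaguar", "jaguar car", "jaguar price"],
    ["jaguar", "jaguar car"],
    ["jaguar", "jaguar animal"],
    ["python", "python tutorial"],
]
suggestions = related_queries(sessions, "jaguar")
```

With this toy log, "jaguar car" is the only pair meeting the support threshold, so it is returned with confidence 2/3. The ranking by relatedness degree mirrors the abstract's description, although the paper's actual scoring is more elaborate.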
[source]

Web links and search engine ranking: The case of Google and the query "jew"
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 12 2006, Judit Bar-Ilan
The World Wide Web has become one of our most important information sources, and commercial search engines are the major tools for locating information; however, it is not enough for a Web page to be indexed by the search engines; it also must rank high on relevant queries. One of the parameters involved in ranking is the number and quality of links pointing to the page, based on the assumption that links convey appreciation for a page. This article presents the results of a content analysis of the links to two top pages retrieved by Google for the query "jew" as of July 2004: the "jew" entry in the free online encyclopedia Wikipedia, and the home page of "Jew Watch," a highly anti-Semitic site. The top results for the query "jew" gained public attention in April 2004, when it was noticed that the "Jew Watch" home page ranked number 1. From that point on, both sides engaged in "Googlebombing" (i.e., increasing the number of links pointing to these pages). The results of the study show that most of the links to these pages come from blogs and discussion lists, and that the number of links pointing to these pages in appreciation of their content is extremely small. These findings have implications for ranking algorithms based on link counts, and emphasize the huge difference between Web links and citations in the scientific community. [source]

Finding nuggets in documents: A machine learning approach
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 6 2006, Yi-fang Brook Wu
Document keyphrases provide a concise summary of a document's content, offering semantic metadata summarizing a document.
They can be used in many applications related to knowledge management and text mining, such as automatic text summarization, development of search engines, document clustering, document classification, thesaurus construction, and browsing interfaces. Because only a small portion of documents have keyphrases assigned by authors, and it is time-consuming and costly to manually assign keyphrases to documents, it is necessary to develop an algorithm to automatically generate keyphrases for documents. This paper describes a Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human-identified phrases to assign weights to the candidate keyphrases. The logic of our algorithm is: the more keywords a candidate keyphrase contains and the more significant these keywords are, the more likely this candidate phrase is a keyphrase. KIP's learning function can enrich the glossary database by automatically adding newly identified keyphrases to the database. KIP's personalization feature lets the user build a glossary database specifically suited to the area of his or her interest. The evaluation results show that KIP performs better than the systems we compared it with and that the learning function is effective. [source]

Strategy hubs: Domain portals to help find comprehensive information
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 1 2006, Suresh K. Bhavnani
Recent studies suggest that the wide variability in the type, detail, and reliability of online information motivates expert searchers to develop procedural search knowledge. In contrast to prior research that has focused on finding relevant sources, procedural search knowledge focuses on how to order multiple relevant sources with the goal of retrieving comprehensive information.
Because such procedural search knowledge is spontaneously inferred neither from the results of search engines nor from the categories provided by domain-specific portals, the lack of such knowledge leads most novice searchers to retrieve incomplete information. In domains like healthcare, such incomplete information can have dangerous consequences. To address the above problem, a new kind of domain portal called a Strategy Hub was developed and tested. Strategy Hubs provide critical search procedures and associated high-quality links to enable users to find comprehensive and accurate information. We begin by describing how we collaborated with physicians to systematically identify generalizable search procedures for finding comprehensive information about a disease, and how these search procedures were made available through the Strategy Hub. A controlled experiment suggests that this approach can improve the ability of novice searchers to find comprehensive and accurate information, when compared to general-purpose search engines and domain-specific portals. We conclude with insights on how to refine and automate the Strategy Hub design, with the ultimate goal of helping users find more comprehensive information when searching in unfamiliar domains. [source]

Analysis of the query logs of a Web site search engine
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 13 2005, Michael Chau
A large number of studies have investigated the transaction logs of general-purpose search engines such as Excite and AltaVista, but few studies have reported on the analysis of search logs for search engines that are limited to particular Web sites, namely, Web site search engines. In this article, we report our research on analyzing the search logs of the search engine of the Utah state government Web site.
Our results show that some statistics, such as the number of search terms per query, are the same for users of general-purpose search engines and Web site search engines, but others, such as the search topics and the terms used, are considerably different. Possible reasons for the differences include the focused domain of Web site search engines and users' different information needs. The findings are useful for Web site developers seeking to improve the performance of the services they provide on the Web and for researchers conducting further research in this area. The analysis also can be applied in e-government research by investigating how information should be delivered to users on government Web sites. [source]

A temporal comparison of AltaVista Web searching
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 6 2005, Bernard J. Jansen
Major Web search engines, such as AltaVista, are essential tools in the quest to locate online information. This article reports research that used transaction log analysis to examine the characteristics of and changes in AltaVista Web searching that occurred from 1998 to 2002. The research questions we examined are: (1) What are the changes in AltaVista Web searching from 1998 to 2002? (2) What are the current characteristics of AltaVista searching, including the duration and frequency of search sessions? (3) What changes in the information needs of AltaVista users occurred between 1998 and 2002? The results of our research show (1) a move toward more interactivity, with increases in session and query length; (2) that, with 70% of session durations at 5 minutes or less, the frequency of interaction is increasing, but it is happening very quickly; and (3) a broadening range of Web searchers' information needs, with the most frequent terms accounting for less than 1% of total term usage. We discuss the implications of these findings for the development of Web search engines. [source]
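Transaction log analyses like the two above typically derive session and term statistics from timestamped query records. The following minimal sketch computes three of the statistics these abstracts mention (session duration, terms per query, and the share of the most frequent term); the record format, field names, and 30-minute session cutoff are illustrative assumptions, not taken from either study.

```python
from collections import Counter

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that end a session (assumed cutoff)

def log_statistics(records):
    """Compute basic statistics from (user_id, timestamp, query) records."""
    term_counts = Counter()
    by_user = {}
    for user, ts, query in sorted(records, key=lambda r: (r[0], r[1])):
        term_counts.update(query.lower().split())
        sessions = by_user.setdefault(user, [])
        if sessions and ts - sessions[-1][-1] <= SESSION_TIMEOUT:
            sessions[-1].append(ts)  # continue the current session
        else:
            sessions.append([ts])    # inactivity gap exceeded: start a new session
    durations = [s[-1] - s[0] for sessions in by_user.values() for s in sessions]
    total_terms = sum(term_counts.values())
    return {
        "mean_session_duration": sum(durations) / len(durations),
        "terms_per_query": total_terms / len(records),
        "top_term_share": term_counts.most_common(1)[0][1] / total_terms,
    }

# Hypothetical log entries: (user_id, timestamp in seconds, query string).
records = [
    ("u1", 0, "search engines"),
    ("u1", 120, "web search engines"),
    ("u1", 7200, "altavista"),  # gap exceeds the timeout, so a new session starts
    ("u2", 50, "query log analysis"),
]
stats = log_statistics(records)
```

Against this toy log, the sketch reports a mean session duration of 40 seconds, 2.25 terms per query, and a top-term share of 2/9, the kind of aggregate figures these studies compare across engines and years.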