Retrieval Performance (retrieval + performance)

Distribution by Scientific Domains

Information Science and Computing	62%

Selected Abstracts

Does compression affect image retrieval performance?

INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, Issue 2-3 2008
Gerald Schaefer
Image retrieval and image compression are both fields of intensive research. As lossy image compression degrades the visual quality of images and hence changes the actual pixel values of an image, low-level image retrieval descriptors that are based on statistical properties of pixel values will change too. In this article we investigate how image compression affects the performance of low-level colour descriptors. Several image retrieval algorithms are evaluated on a speciated image database compressed at different image quality levels. Extensive experiments reveal that, while distribution-based colour descriptors are fairly stable with respect to image compression, a drop in retrieval performance can nevertheless be observed for JPEG-compressed images. On the other hand, after application of JPEG2000 compression only a negligible performance drop is observed, even at high compression ratios. © 2008 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 18, 101-112, 2008 [source]
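
As a rough illustration of what a distribution-based colour descriptor looks like, the sketch below builds a coarse RGB histogram and compares two images by histogram intersection. It is not the method evaluated in the article: the function names, the bin count, and the toy pixel lists are assumptions made for this example, and the small pixel changes only stand in for compression artefacts.

```python
def colour_histogram(pixels, bins_per_channel=4):
    """Normalised RGB histogram of a list of (r, g, b) tuples with values 0..255."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    step = 256 // bins_per_channel
    last = bins_per_channel - 1
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Clamp so that bin counts which do not divide 256 stay in range.
        ri, gi, bi = (min(v // step, last) for v in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the two histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical pixel data: the second list is the first with small value changes.
original = [(10, 10, 10), (200, 30, 30), (30, 200, 30), (250, 250, 250)]
perturbed = [(12, 9, 11), (198, 33, 28), (70, 200, 30), (250, 250, 250)]

similarity = histogram_intersection(
    colour_histogram(original), colour_histogram(perturbed)
)
```

Small pixel changes leave the descriptor unchanged as long as a pixel stays in its bin; only the third pixel crosses a bin boundary here, which is why the similarity drops below 1.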

Using clustering methods to improve ontology-based query term disambiguation

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 7 2006
Ernesto William De Luca
In this article we describe results of our research on the disambiguation of user queries using ontologies for categorization. We present an approach to cluster search results by using classes or "Sense Folders" (prototype categories) derived from the concepts of an assigned ontology, in our case WordNet. Using the semantic relations provided by such a resource, we can assign categories to previously unannotated documents. The disambiguation of query terms in documents with respect to a user-specific ontology is an important issue in order to improve the retrieval performance for the user. Furthermore, we show that a clustering process can enhance the semantic classification of documents, and we discuss how this clustering process can be further enhanced using only the most descriptive classes of the ontology. © 2006 Wiley Periodicals, Inc. Int J Int Syst 21: 693-709, 2006. [source]

Unified linear subspace approach to semantic analysis

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 1 2010
Dandan Li
The Basic Vector Space Model (BVSM) is well known in information retrieval. Unfortunately, its retrieval effectiveness is limited because it is based on literal term matching. The Generalized Vector Space Model (GVSM) and Latent Semantic Indexing (LSI) are two prominent semantic retrieval methods, both of which assume there is some underlying latent semantic structure in a dataset that can be used to improve retrieval performance. However, while this structure may be derived from both the term space and the document space, GVSM exploits only the former and LSI the latter. In this article, the latent semantic structure of a dataset is examined from a dual perspective; namely, we consider the term space and the document space simultaneously. This new viewpoint has a natural connection to the notion of kernels. Specifically, a unified kernel function can be derived for a class of vector space models. The dual perspective provides a deeper understanding of the semantic space and makes transparent the geometrical meaning of the unified kernel function. New semantic analysis methods based on the unified kernel function are developed, which combine the advantages of LSI and GVSM. We also prove that the new methods are stable: even when the selected rank of the truncated Singular Value Decomposition (SVD) is far from the optimum, the retrieval performance is not degraded significantly. Experiments performed on standard test collections show that our methods are promising. [source]
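
The limitation of literal term matching mentioned in this abstract can be seen in a minimal sketch of a basic vector space model. The code below uses raw term counts and cosine similarity; the function name, the whitespace tokenisation, and the example strings are assumptions made for illustration and are not taken from the article.

```python
import math
from collections import Counter


def cosine(query, doc):
    """Cosine similarity between raw term-count vectors of two texts."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    # Only terms that appear literally in both texts contribute to the dot product.
    dot = sum(q[term] * d[term] for term in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


literal_match = cosine("car repair", "car repair manual")
synonym_only = cosine("car", "automobile maintenance")
```

Because "car" and "automobile" share no literal term, the second score is exactly zero even though the texts are related. This is the kind of gap that semantic methods such as GVSM and LSI aim to close.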

Mobile information retrieval with search results clustering: Prototypes and evaluations

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 5 2009
Claudio Carpineto
Web searches from mobile devices such as PDAs and cell phones are becoming increasingly popular. However, the traditional list-based search interface paradigm does not scale well to mobile devices due to their inherent limitations. In this article, we investigate the application of search results clustering, used with some success for desktop computer searches, to the mobile scenario. Building on CREDO (Conceptual Reorganization of Documents), a Web clustering engine based on concept lattices, we present its mobile versions Credino and SmartCREDO, for PDAs and cell phones, respectively. Next, we evaluate the retrieval performance of the three prototype systems. We measure the effectiveness of their clustered results compared to a ranked list of results on a subtopic retrieval task, by means of the device-independent notion of subtopic reach time together with a reusable test collection built from Wikipedia ambiguous entries. Then, we make a cross-comparison of methods (i.e., clustering and ranked list) and devices (i.e., desktop, PDA, and cell phone), using an interactive information-finding task performed by external participants. The main finding is that clustering engines are a viable complementary approach to plain search engines for both desktop and mobile searches, especially, but not only, for multitopic informational queries. [source]

Data fusion according to the principle of polyrepresentation

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 4 2009
Birger Larsen
We report data fusion experiments carried out on the four best-performing retrieval models from TREC 5. Three were conceptually/algorithmically very different from one another; one was algorithmically similar to one of the former. The objective of the test was to observe the performance of the 11 logical data fusion combinations compared to the performance of the four individual models and their intermediate fusions when following the principle of polyrepresentation. This principle is based on the cognitive IR perspective (Ingwersen & Järvelin, 2005) and implies that each retrieval model is regarded as a representation of a unique interpretation of information retrieval (IR). It predicts that only fusions of very different, but equally good, IR models may outperform each constituent as well as their intermediate fusions. Two kinds of experiments were carried out. One tested restricted fusions, which entails that only the inner disjoint overlap documents between fused models are ranked. The second set of experiments was based on traditional data fusion methods. The experiments involved the 30 TREC 5 topics that contain more than 44 relevant documents. In all tests, the Borda and CombSUM scoring methods were used. Performance was measured by precision and recall, with document cutoff values (DCVs) at 100 and 15 documents, respectively. Results show that restricted fusions made of two, three, or four cognitively/algorithmically very different retrieval models perform significantly better than do the individual models at DCV100. At DCV15, however, the results of polyrepresentative fusion were less predictable. The traditional fusion method based on polyrepresentation principles demonstrates a clear picture of performance at both DCV levels and verifies the polyrepresentation predictions for data fusion in IR.
Data fusion improves retrieval performance over its constituent IR models only if the models are all conceptually/algorithmically quite dissimilar, perform equally, and perform well, in that order of importance. [source]
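
The two scoring methods named in this abstract, CombSUM and Borda, can be sketched as follows. This is a minimal illustration under stated assumptions, not the experimental setup of the article: the min-max score normalisation, the handling of unranked documents, the tie-breaking by document id, and all names and toy data are choices made for this example, and the restricted (overlap-only) fusion is not implemented.

```python
from collections import defaultdict


def comb_sum(runs):
    """Fuse runs (dicts doc_id -> score) by summing min-max normalised scores."""
    fused = defaultdict(float)
    for run in runs:
        low, high = min(run.values()), max(run.values())
        span = high - low
        for doc, score in run.items():
            fused[doc] += (score - low) / span if span else 0.0
    # Highest fused score first; ties broken by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


def borda(rankings):
    """Fuse rankings (lists of doc ids, best first) by Borda count."""
    pool = set().union(*rankings)
    n = len(pool)
    points = defaultdict(float)
    for ranking in rankings:
        for position, doc in enumerate(ranking):
            points[doc] += n - position
        # Assumption: documents a model did not rank share its leftover points.
        missing = pool - set(ranking)
        if missing:
            leftover = sum(n - p for p in range(len(ranking), n))
            for doc in missing:
                points[doc] += leftover / len(missing)
    return sorted(points, key=lambda doc: (-points[doc], doc))


# Hypothetical output of two retrieval models for one topic.
run_a = {"d1": 3.0, "d2": 1.0, "d3": 2.0}
run_b = {"d2": 10.0, "d3": 30.0, "d4": 20.0}

comb_sum_order = comb_sum([run_a, run_b])
borda_order = borda([["d1", "d3", "d2"], ["d3", "d4", "d2"]])
```

CombSUM works on scores and therefore needs some normalisation when models score on different scales, whereas Borda uses only rank positions. In this toy data both methods place d3 first because both models rank it highly.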

Context-based generic cross-lingual retrieval of documents and automated summaries

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 2 2005
Wai Lam
We develop a context-based generic cross-lingual retrieval model that can deal with different language pairs. Our model considers contexts in the query translation process. Contexts in the query as well as in the documents are exploited, based on co-occurrence statistics from passages of different granularity. We also investigate cross-lingual retrieval of automatic generic summaries. We have implemented our model for two different cross-lingual settings, namely, retrieving Chinese documents from English queries as well as retrieving English documents from Chinese queries. Extensive experiments have been conducted on a large-scale parallel corpus, enabling studies on retrieval performance for two different cross-lingual settings of full-length documents as well as automated summaries. [source]

Lanczos and the Riemannian SVD in information retrieval applications

NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, Issue 4 2005
Ricardo D. Fierro
Variations of the latent semantic indexing (LSI) method in information retrieval (IR) require the computation of singular subspaces associated with the k dominant singular values of a large m × n sparse matrix A, where k ≪ min(m, n). The Riemannian SVD was recently generalized to low-rank matrices arising in IR and shown to be an effective approach for formulating an enhanced semantic model that captures the latent term-document structure of the data. However, in terms of storage and computation requirements, its implementation can be much improved for large-scale applications. We discuss an efficient and reliable algorithm, called SPK-RSVD-LSI, as an alternative approach for deriving the enhanced semantic model. The algorithm combines the generalized Riemannian SVD and the Lanczos method with full reorthogonalization and explicit restart strategies. We demonstrate that our approach performs as well as the original low-rank Riemannian SVD method by comparing their retrieval performance on a well-known benchmark document collection. Copyright 2004 John Wiley & Sons, Ltd. [source]