Data Classification (data + classification)

Selected Abstracts

Fast Volume Rendering and Data Classification Using Multiresolution in Min-Max Octrees

Feng Dong
Large-sized volume datasets have recently become commonplace and users are now demanding that volume-rendering techniques to visualise such data provide acceptable results on relatively modest computing platforms. The widespread use of the Internet for the transmission and/or rendering of volume data is also exerting increasing demands on software providers. Multiresolution can address these issues in an elegant way. One of the fastest volume-rendering algorithms is that proposed by Lacroute & Levoy [1], which is based on shear-warp factorisation and min-max octrees (MMOs). Unfortunately, since an MMO captures only a single resolution of a volume dataset, this method is unsuitable for rendering datasets in a multiresolution form. This paper adapts the above algorithm to multiresolution volume rendering to enable near-real-time interaction to take place on a standard PC. It also permits the user to modify classification functions and/or resolution during rendering with no significant loss of rendering speed. A newly-developed data structure based on the MMO is employed, the multiresolution min-max octree, M³O, which captures the spatial coherence for datasets at all resolutions. Speed is enhanced by the use of multiresolution opacity transfer functions for rapidly determining and discarding transparent dataset regions. Some experimental results on sample volume datasets are presented.
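
The role of a min-max octree in discarding transparent regions can be illustrated with a minimal sketch. This is not the paper's M³O structure: it covers a single resolution only, and the names `MinMaxNode`, `build_min_max_octree`, `is_transparent` and `count_visited`, as well as the step-function opacity threshold, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed layout: a cubic volume indexed as volume[z][y][x],
# with a side length that is a power of two.
Volume = List[List[List[float]]]


@dataclass
class MinMaxNode:
    lo: float                      # minimum voxel value inside this block
    hi: float                      # maximum voxel value inside this block
    children: List["MinMaxNode"] = field(default_factory=list)


def build_min_max_octree(volume: Volume, x0: int, y0: int, z0: int,
                         size: int) -> MinMaxNode:
    """Recursively build a min-max octree over a cubic block."""
    if size == 1:
        v = volume[z0][y0][x0]
        return MinMaxNode(v, v)
    half = size // 2
    children = [
        build_min_max_octree(volume, x0 + dx, y0 + dy, z0 + dz, half)
        for dz in (0, half) for dy in (0, half) for dx in (0, half)
    ]
    return MinMaxNode(min(c.lo for c in children),
                      max(c.hi for c in children),
                      children)


def is_transparent(node: MinMaxNode, threshold: float) -> bool:
    """Assumed classification: values below `threshold` map to zero opacity,
    so a block whose maximum is below it contributes nothing to the image."""
    return node.hi < threshold


def count_visited(node: MinMaxNode, threshold: float) -> int:
    """Count the nodes a traversal touches when transparent blocks are skipped."""
    if is_transparent(node, threshold) or not node.children:
        return 1
    return 1 + sum(count_visited(c, threshold) for c in node.children)
```

For a 4×4×4 volume of zeros with a single voxel set to 1.0, a traversal with threshold 0.5 touches 17 nodes instead of the full 73, because seven of the eight top-level octants are rejected from their min-max range alone. Changing the threshold changes only the test, not the tree, which is why classification can be modified without rebuilding the structure.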

Towards closing the analysis gap: Visual generation of decision supporting schemes from raw data

T. May
Abstract The derivation, manipulation and verification of analytical models from raw data is a process which requires a transformation of information across different levels of abstraction. We introduce a concept for the coupling of data classification and interactive visualization in order to make this transformation visible and steerable for the human user. Data classification techniques generate mappings that formally group data items into categories. Interactive visualization involves the user in an iterative refinement process. The user identifies and selects interesting patterns to define these categories. The following step is the transformation of a visible pattern into the formal definition of a classifier. In the last step the classifier is transformed back into a pattern that is blended with the original data in the same visual display. Our approach allows an intuitive assessment of a formal classifier and its model, the detection of outliers and the handling of noisy data using visual pattern-matching. We instantiated the concept using decision trees for classification and KVMaps as the visualization technique. The generation of a classifier from visual patterns and its verification is transformed from a cognitive to a mostly pre-cognitive task.

Discovering hidden knowledge in data classification via multivariate analysis

EXPERT SYSTEMS, Issue 2 2010
Yisong Chen
Abstract: A new classification algorithm based on multivariate analysis is proposed to discover and simulate the grading policy on school transcript data sets. The framework comprises three major steps. First, factor analysis is adopted to separate the scores of several different subjects into grading-related ones and grading-unrelated ones. Second, multidimensional scaling is employed for dimensionality reduction to facilitate subsequent data visualization and interpretation. Finally, a support vector machine is trained to classify the filtered data into different grades. This work provides an attractive framework for intelligent data analysis and decision making. It also exhibits the advantages of high classification accuracy and supports intuitive data interpretation.

Augmentation of a nearest neighbour clustering algorithm with a partial supervision strategy for biomedical data classification

EXPERT SYSTEMS, Issue 1 2009
Sameh A. Salem
Abstract: In this paper, a partial supervision strategy for a recently developed clustering algorithm, the nearest neighbour clustering algorithm (NNCA), is proposed. The proposed method (NNCA-PS) offers classification capability with a smaller amount of a priori knowledge, where a small number of data objects from the entire data set are used as labelled objects to guide the clustering process towards a better search space. Experimental results show that NNCA-PS gives promising results of 89% sensitivity at 95% specificity when used to segment retinal blood vessels, and a maximum classification accuracy of 99.5% with 97.2% average accuracy when applied to a breast cancer data set. Comparisons with other methods indicate the robustness of the proposed method in classification. Additionally, experiments on parallel environments indicate the suitability and scalability of NNCA-PS in handling larger data sets.
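
Sensitivity and specificity figures such as those quoted above are derived from counts of true and false positives and negatives. The sketch below shows the standard definitions only; the function name and the label encoding (1 for the positive class, 0 for the negative class) are assumptions, and the example labels are invented for illustration rather than taken from the paper.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    # Sensitivity: fraction of actual positives that were detected.
    # Specificity: fraction of actual negatives that were correctly rejected.
    return tp / (tp + fn), tn / (tn + fp)


# Invented example labels, e.g. vessel (1) versus background (0) pixels.
truth = [1, 1, 1, 0, 0, 0, 0]
guess = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, guess)
```

Reporting sensitivity at a fixed specificity, as in the abstract, amounts to choosing the operating point at which the false-positive rate is held at a stated level and reading off the detection rate there.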

Microarray data classification using inductive logic programming and gene ontology background information

Einar Ryeng
Abstract There exist many databases containing information on genes that are useful for background information in machine learning analysis of microarray data. The gene ontology and gene ontology annotation projects are among the most comprehensive of these. We demonstrate how inductive logic programming (ILP) can be used to build classification rules for microarray data that naturally incorporate the gene ontology and annotations to it as background knowledge without removing the inherent graph structure of the ontology. The ILP rules generated are parsimonious and easy to interpret. Copyright 2010 John Wiley & Sons, Ltd.


ABSTRACT Soy-fortified paneer (SFP) samples prepared from blends containing different proportions of buffalo milk of varying fat content and soy milk (7.5 B) were evaluated organoleptically to assess quality attributes such as body and texture, flavor and taste, color and appearance, and overall acceptability. Sensory data were analyzed using a fuzzy logic approach, which addresses the problem of data classification in a unified qualitative and quantitative manner. Results of the study indicated that the fuzzy multiattribute decision making approach provides an adequate and reliable system for product formulation and comparison, based on sensory data. The developed fuzzy mathematical model performed remarkably well in the evaluation and ranking of various SFP samples. The SFP sample made from a blend of buffalo milk (4.5% fat) and soy milk (7.5 B) in the proportion of 90:10 was found to be the most acceptable one for different classes of consumers irrespective of their preferences for a particular sensory quality attribute.

A Bayesian Hierarchical Model for Classification with Selection of Functional Predictors

BIOMETRICS, Issue 2 2010
Hongxiao Zhu
Summary In functional data classification, functional observations are often contaminated by various systematic effects, such as random batch effects caused by device artifacts, or fixed effects caused by sample-related factors. These effects may lead to classification bias and thus should not be neglected. Another issue of concern is the selection of functions when predictors consist of multiple functions, some of which may be redundant. The above issues arise in a real data application where we use fluorescence spectroscopy to detect cervical precancer. In this article, we propose a Bayesian hierarchical model that takes into account random batch effects and selects effective functions among multiple functional predictors. Fixed effects or predictors in nonfunctional form are also included in the model. The dimension of the functional data is reduced through orthonormal basis expansion or functional principal components. For posterior sampling, we use a hybrid Metropolis-Hastings/Gibbs sampler, which suffers from slow mixing. An evolutionary Monte Carlo algorithm is applied to improve the mixing. Simulation and real data application show that the proposed model provides accurate selection of functional predictors as well as good classification.
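
For readers unfamiliar with the sampler family mentioned above, the following is a minimal sketch of a plain random-walk Metropolis-Hastings sampler for a one-dimensional target. It is not the paper's hybrid Metropolis-Hastings/Gibbs scheme or its evolutionary Monte Carlo extension; the function name, the Gaussian proposal, the step size, and the standard normal target are all assumptions chosen for illustration.

```python
import math
import random
from typing import Callable, List, Tuple


def metropolis_hastings(log_density: Callable[[float], float],
                        start: float,
                        n_samples: int,
                        step: float = 1.0,
                        seed: int = 0) -> Tuple[List[float], float]:
    """Random-walk Metropolis-Hastings for a 1-D unnormalised log density.

    Returns the chain and the fraction of proposals that were accepted.
    """
    rng = random.Random(seed)
    x = start
    log_p = log_density(x)
    samples: List[float] = []
    accepted = 0
    for _ in range(n_samples):
        # Symmetric Gaussian proposal, so the Hastings correction cancels.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


# Assumed target: a standard normal, given up to an additive constant.
chain, rate = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000)
```

The acceptance rate and the autocorrelation of the chain depend strongly on `step`: steps that are too small or too large both make successive samples highly correlated, which is the slow-mixing behaviour that motivates improved schemes such as the one described in the abstract.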