Classification Performance (classification + performance)


Selected Abstracts


Using feedforward neural networks and forward selection of input variables for an ergonomics data classification problem

HUMAN FACTORS AND ERGONOMICS IN MANUFACTURING & SERVICE INDUSTRIES, Issue 1 2004
Chuen-Lung Chen
A method was developed to accurately predict the risk of injuries in industrial jobs based on datasets not meeting the assumptions of parametric statistical tools, or being incomplete. Previous research used a backward-elimination process for feedforward neural network (FNN) input variable selection. Simulated annealing (SA) was used as a local search method in conjunction with a conjugate-gradient algorithm to develop an FNN. This article presents an incremental step in the use of FNNs for ergonomics analyses, specifically the use of forward selection of input variables. Advantages to this approach include enhancing the effectiveness of the use of neural networks when observations are missing from ergonomics datasets, and preventing overspecification or overfitting of an FNN to training data. Classification performance across two methods involving the use of SA combined with either forward selection or backward elimination of input variables was comparable for complete datasets, and the forward-selection approach produced results superior to previously used methods of FNN development, including the error back-propagation algorithm, when dealing with incomplete data. © 2004 Wiley Periodicals, Inc. Hum Factors Man 14: 31–49, 2004.
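
Forward selection of input variables, as named in the abstract above, can be sketched as a greedy loop. This is not the authors' procedure: the neural network and the simulated annealing step are left out entirely, `score` is a hypothetical callback standing in for a validation measure obtained by training a model on a candidate subset, and the variable names are invented for illustration.

```python
from typing import Callable, List, Sequence


def forward_select(candidates: Sequence[str],
                   score: Callable[[List[str]], float],
                   min_gain: float = 1e-9) -> List[str]:
    """Greedily add the variable that most improves `score` (higher is better)."""
    selected: List[str] = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Evaluate each remaining variable when added to the current subset.
        trial_scores = [(score(selected + [v]), v) for v in remaining]
        top_score, top_var = max(trial_scores)
        if top_score - best < min_gain:
            break  # no candidate improves the score enough; stop early
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected


# Toy score (an assumption for illustration only): two hypothetical variables
# are useful, and every selected variable costs a small penalty.
USEFUL = {"lift_rate": 0.30, "trunk_angle": 0.20}


def toy_score(subset: List[str]) -> float:
    return sum(USEFUL.get(v, 0.0) for v in subset) - 0.05 * len(subset)


chosen = forward_select(["age", "lift_rate", "trunk_angle", "shift"], toy_score)
print(chosen)  # ['lift_rate', 'trunk_angle']
```

Starting from an empty set and stopping once no addition helps is what guards against overspecification in this sketch; backward elimination would instead start from all variables and remove them one at a time.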


Differential impact of brain damage on the access mode to memory representations: an information theoretic approach

EUROPEAN JOURNAL OF NEUROSCIENCE, Issue 10 2007
Rosapia Lauro-Grotto
Abstract Different access modes to information stored in long-term memory can lead to different distributions of errors in classification tasks. We have designed a famous faces memory classification task that allows for the extraction of a measure of metric content, an index of the relevance of semantic cues for classification performance. High levels of metric content are indicative of a relatively preferred semantic access mode, while low levels, and similar correct performance, suggest a preferential episodic access mode. Compared with normal controls, the metric content index was increased in patients with Alzheimer's disease (AD), decreased in patients with herpes simplex encephalitis, and unvaried in patients with insult in the prefrontal cortex. Moreover, the metric content index was found to correlate with a measure of the severity of dementia in patients with AD, and to track the progression of the disease. These results underline the role of the medial-temporal lobes and of the temporal cortex, respectively, for the episodic and semantic routes to memory retrieval. Moreover, they confirm the reliability of information theoretic measures for characterizing the structure of the surviving memory representations in memory-impaired patient populations.


Assessing the predictive performance of artificial neural network-based classifiers based on different data preprocessing methods, distributions and training mechanisms

INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 4 2005
Adrian Costea
We analyse the implications of three different factors (preprocessing method, data distribution and training mechanism) on the classification performance of artificial neural networks (ANNs). We use three preprocessing approaches: no preprocessing, division by the maximum absolute values and normalization. We study the implications of input data distributions by using five datasets with different distributions: the real data, uniform, normal, logistic and Laplace distributions. We test two training mechanisms: one belonging to the gradient-descent techniques, improved by a retraining procedure, and the other is a genetic algorithm (GA), which is based on the principles of natural evolution. The results show statistically significant influences of all individual and combined factors on both training and testing performances. A major difference with other related studies is the fact that for both training mechanisms we train the network using as starting solution the one obtained when constructing the network architecture. In other words we use a hybrid approach by refining a previously obtained solution. We found that when the starting solution has relatively low accuracy rates (80–90%) the GA clearly outperformed the retraining procedure, whereas the difference was smaller to non-existent when the starting solution had relatively high accuracy rates (95–98%). As reported in other studies, we found little to no evidence of crossover operator influence on the GA performance. Copyright © 2005 John Wiley & Sons, Ltd.
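
The three preprocessing approaches listed in the abstract above can be sketched for a single input column as follows. The abstract does not say which form of "normalization" was used; the sketch assumes zero-mean, unit-variance standardization, and the function names and sample column are illustrative only.

```python
from statistics import mean, pstdev
from typing import List


def no_preprocessing(values: List[float]) -> List[float]:
    """Return the column unchanged."""
    return list(values)


def divide_by_max_abs(values: List[float]) -> List[float]:
    """Scale into [-1, 1] by dividing by the largest absolute value."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)


def standardize(values: List[float]) -> List[float]:
    """Zero mean, unit variance (one common reading of 'normalization')."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values] if sigma else [0.0] * len(values)


column = [2.0, -4.0, 6.0, 8.0]  # hypothetical input variable
scaled = divide_by_max_abs(column)
zscores = standardize(column)
print(scaled)  # [0.25, -0.5, 0.75, 1.0]
```

Division by the maximum absolute value preserves the sign and the relative spacing of the values, while standardization also removes the column's offset.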


Ranking and selecting terms for text categorization via SVM discriminate boundary

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 2 2010
Tien-Fang Kuo
The problem of natural language document categorization consists of classifying documents into predetermined categories based on their contents. Each distinct term, or word, in the documents is a feature for representing a document. In general, the number of terms may be extremely large and dozens of redundant terms may be included, which may reduce classification performance. In this paper, a support vector machine (SVM)-based feature ranking and selecting method for text categorization is proposed. The contribution of each term for classification is calculated based on the nonlinear discriminant boundary, which is generated by the SVM. The results of experiments on several real-world data sets show that the proposed method is powerful enough to extract a smaller number of important terms and achieves a higher classification performance than existing feature selecting methods based on latent semantic indexing and χ² statistic values. © 2009 Wiley Periodicals, Inc.
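
For context, the χ² statistic that the abstract above names as a baseline can be computed per term and category from a 2x2 contingency table of document counts. The sketch below shows only that baseline ranking, not the proposed SVM-based method; the terms and counts are hypothetical.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square score of a term for a category from a 2x2 contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category without the term
    d: documents outside the category without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or the category is constant
    return n * (a * d - c * b) ** 2 / denom


# Hypothetical counts for three terms in a 100-document corpus.
counts = {
    "goal": (40, 5, 10, 45),
    "the": (50, 50, 0, 0),
    "market": (20, 25, 30, 25),
}
ranking = sorted(counts, key=lambda t: chi_square(*counts[t]), reverse=True)
print(ranking)  # ['goal', 'market', 'the']
```

A term that appears in every document, such as "the" here, scores zero because its presence carries no information about the category.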


Corporate Failure Prediction Modeling: Distorted by Business Groups' Internal Capital Markets?

JOURNAL OF BUSINESS FINANCE & ACCOUNTING, Issue 5-6 2006
Nico Dewaelheyns
However, in view of the importance of business groups in Continental Europe, ignoring group ties may have a negative impact on predictive reliability. We find that models encompassing both bankruptcy variables defined at subsidiary level and at group level have a substantially better fit and classification performance. Furthermore we find that the group's support causes improved survival chances for subsidiaries, especially when these subsidiaries belong to the group's core business. Overall our results are consistent with existing theoretical and empirical findings from the internal capital markets literature.


OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification

JOURNAL OF CHEMOMETRICS, Issue 8-10 2006
Max Bylesjö
Abstract The characteristics of the OPLS method have been investigated for the purpose of discriminant analysis (OPLS-DA). We demonstrate how class-orthogonal variation can be exploited to augment classification performance in cases where the individual classes exhibit divergence in within-class variation, in analogy with soft independent modelling of class analogy (SIMCA) classification. The prediction results will be largely equivalent to traditional supervised classification using PLS-DA if no such variation is present in the classes. A discriminatory strategy is thus outlined, combining the strengths of PLS-DA and SIMCA classification within the framework of the OPLS-DA method. Furthermore, resampling methods have been employed to generate distributions of predicted classification results and subsequently assess classification belief. This enables utilisation of the class-orthogonal variation in a proper statistical context. The proposed decision rule is compared to common decision rules and is shown to produce comparable or less class-biased classification results. Copyright © 2007 John Wiley & Sons, Ltd.


Detection and classification of latent defects and diseases on raw French fries with multispectral imaging

JOURNAL OF THE SCIENCE OF FOOD AND AGRICULTURE, Issue 13 2005
Jacco C Noordam
Abstract This paper describes an application of both multispectral imaging and red/green/blue (RGB) colour imaging for the discrimination between different defects and diseases on raw French fries. Four different potato cultivars generally used for French fries production are selected from which fries are cut. Both multispectral images and RGB colour images are classified with parametric and non-parametric classifiers. The effect of applying different preprocessing techniques on the spectra was also investigated. The best classification results in terms of accuracy, yield and purity are obtained with a modified version of standard normal variate (snv_mod) preprocessing for different classifiers and potato cultivars. The classification results of the multispectral images are compared with RGB images. The results show that the support vector classifier gives the best classification performance for the snv_mod preprocessed multispectral images and k-nearest neighbours classifier gives the best classification performance for raw RGB images. The detection of the latent greening defect in French fries with the exploration of multispectral images shows the additional value of multispectral imaging for French fries. A comparison between the multispectral images and the RGB colour images confirms this since this type of defect is not visible in the colour images. Copyright © 2005 Society of Chemical Industry
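
The standard normal variate transform mentioned in the abstract above centres and scales each spectrum by its own mean and standard deviation. The sketch below shows only that standard form; the abstract does not describe the modification behind snv_mod, so none is attempted here, and the sample spectra are hypothetical.

```python
from statistics import mean, stdev
from typing import List


def snv(spectrum: List[float]) -> List[float]:
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return [0.0] * len(spectrum)
    return [(x - mu) / sd for x in spectrum]


# Two hypothetical spectra that differ only by an offset and a gain.
base = [0.10, 0.40, 0.90, 0.40, 0.20]
shifted = [2.0 * x + 0.5 for x in base]

same_shape = all(abs(p - q) < 1e-9 for p, q in zip(snv(base), snv(shifted)))
print(same_shape)  # True
```

Because the transform removes per-spectrum offset and gain, two measurements of the same material under different illumination intensity map to nearly the same vector, which is the usual motivation for applying it before classification.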


Do prior knowledge, personality and visual perceptual ability predict student performance in microscopic pathology?

MEDICAL EDUCATION, Issue 6 2010
Laura Helle
Medical Education 2010: 44: 621–629. Objectives: There has been long-standing controversy regarding aptitude testing and selection for medical education. Visual perception is considered particularly important for detecting signs of disease as part of diagnostic procedures in, for example, microscopic pathology, radiology and dermatology and as a component of perceptual motor skills in medical procedures such as surgery. In 1968 the Perceptual Ability Test (PAT) was introduced in dental education. The aim of the present pilot study was to explore possible predictors of performance in diagnostic classification based on microscopic observation in the context of an undergraduate pathology course. Methods: A pre- and post-test of diagnostic classification performance, test of visual perceptual skill (Test of Visual Perceptual Skills, 3rd edition [TVPS-3]) and a self-report instrument of personality (Big Five Personality Inventory) were administered. In addition, data on academic performance (performance in histology and cell biology, a compulsory course taken the previous year, in addition to performance on the microscopy examination and final examination) were collected. Results: The results indicated that one personality factor (Conscientiousness) and one element of visual perceptual ability (spatial relationship awareness) predicted performance on the pre-test. The only factor to predict performance on the post-test was performance on the pre-test. Similarly, the microscopy examination score was predicted by the pre-test score, in addition to the histology and cell biology grade. The course examination score was predicted by two personality factors (Conscientiousness and lack of Openness) and the histology and cell biology grade. Conclusions: Visual spatial ability may be related to performance in the initial phase of training in microscopic pathology. However, from a practical point of view, medical students are able to learn basic microscopic pathology using worked-out examples, independently of measures of personality or visual perceptual ability. This finding should reassure students about their abilities to improve with training independently of their scores on tests on basic abilities and personality.


Profiling MS proteomics data using smoothed non-linear energy operator and Bayesian additive regression trees

PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 17 2009
Shan He
Abstract This paper proposes a novel profiling method for SELDI-TOF and MALDI-TOF MS data that integrates a novel peak detection method based on modified smoothed non-linear energy operator, correlation-based peak selection and Bayesian additive regression trees. The peak detection and classification performance of the proposed approach is validated on two publicly available MS data sets, namely MALDI-TOF simulation data and high-resolution SELDI-TOF ovarian cancer data. The results compared favorably with three state-of-the-art peak detection algorithms and four machine-learning algorithms. For the high-resolution ovarian cancer data set, seven biomarkers (m/z windows) were found by our method, which achieved 97.30 and 99.10% accuracy at 25th and 75th percentiles, respectively, from 50 independent cross-validation samples, which is significantly better than other profiling and dimensional reduction methods. The results show that the method is capable of finding parsimonious sets of biologically meaningful biomarkers with better accuracy than existing methods. Supporting Information material and MATLAB/R scripts to implement the methods described in the article are available at: http://www.cs.bham.ac.uk/szh/SourceCode-for-Proteomics.zip


Ridge directional singular points for fingerprint recognition and matching

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 1 2006
Issam Dagher
Abstract In this paper, a new approach to extract singular points in a fingerprint image is presented. It is usually difficult to locate the exact position of a core or a delta due to the noisy nature of fingerprint images. These points are the most widely used for fingerprint classification and matching. Image enhancement, thinning, cropping, and alignment are used for minutiae extraction. Based on the Poincaré curve obtained from the directional image, our algorithm extracts the singular points in a fingerprint with high accuracy. It examines ridge directions when singular points are missing. The algorithm has been tested for classification performance on the NIST-4 fingerprint database and found to give better results than the neural networks algorithm. Copyright © 2005 John Wiley & Sons, Ltd.
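
The abstract above does not spell out how singular points are read off the directional image. A widely used textbook technique, shown here as an assumption and not as the author's algorithm, is the Poincaré index: the ridge orientation changes are summed along a closed path around a candidate point, giving about +1/2 at a core, -1/2 at a delta and 0 elsewhere. The orientation samples below are synthetic.

```python
import math
from typing import List


def poincare_index(orientations: List[float]) -> float:
    """Poincaré index of ridge orientations sampled along a closed path.

    `orientations` are angles in radians in [0, pi), listed in
    counter-clockwise path order; the path closes back to the first sample.
    """
    total = 0.0
    n = len(orientations)
    for k in range(n):
        delta = orientations[(k + 1) % n] - orientations[k]
        # Orientations are defined modulo pi, so wrap into (-pi/2, pi/2].
        if delta > math.pi / 2:
            delta -= math.pi
        elif delta <= -math.pi / 2:
            delta += math.pi
        total += delta
    return total / (2 * math.pi)


# Synthetic orientation samples at 8 points around a centre (assumed patterns).
angles = [2 * math.pi * k / 8 for k in range(8)]
core_like = [(phi / 2) % math.pi for phi in angles]
delta_like = [(-phi / 2) % math.pi for phi in angles]
plain = [0.3] * 8

print(round(poincare_index(core_like), 3))   # 0.5
print(round(poincare_index(delta_like), 3))  # -0.5
print(round(poincare_index(plain), 3))       # 0.0
```

On real images the orientation field is noisy, which is consistent with the abstract's remark that exact core and delta positions are hard to locate; smoothing the field before computing the index is a common precaution.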


Supervised classification and tunnel vision

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 2 2005
David J. Hand
Abstract In recent decades many highly sophisticated methods have been developed for supervised classification. These developments involve complex models requiring complicated iterative parameter estimation schemes, and can achieve unprecedented performance in terms of misclassification rate. However, in focusing efforts on the single performance criterion of misclassification rate, researchers have abstracted the problem beyond the bounds of practical usefulness, to the extent that the supposed performance improvements are irrelevant in comparison with other factors influencing performance. Examples of such factors are given. An illustration is provided of a new method which, for the particular problem of credit scoring, improves a relevant measure of classification performance while maintaining interpretability. Copyright © 2005 John Wiley & Sons, Ltd.


SELECTING EFFECTIVE FEATURES AND RELATIONS FOR EFFICIENT MULTI-RELATIONAL CLASSIFICATION

COMPUTATIONAL INTELLIGENCE, Issue 3 2010
Jun He
Feature selection is an essential data processing step to remove irrelevant and redundant attributes for shorter learning time, better accuracy, and better comprehensibility. A number of algorithms have been proposed in both data mining and machine learning areas. These algorithms are usually used in a single table environment, where data are stored in one relational table or one flat file. They are not suitable for a multi-relational environment, where data are stored in multiple tables joined to one another by semantic relationships. To address this problem, in this article, we propose a novel approach called FARS to conduct both Feature And Relation Selection for efficient multi-relational classification. Through this approach, we not only extend the traditional feature selection method to select relevant features from multi-relations, but also develop a new method to reconstruct the multi-relational database schema and eliminate irrelevant tables to improve classification performance further. The results of the experiments conducted on both real and synthetic databases show that FARS can effectively choose a small set of relevant features, thereby enhancing classification efficiency and prediction accuracy significantly.