Classifier

Kinds of Classifier

  • Bayes classifier
  • multiple classifier
  • SVM classifier


  • Selected Abstracts


    Fault detection and classification technique in EHV transmission lines based on artificial neural networks

    EUROPEAN TRANSACTIONS ON ELECTRICAL POWER, Issue 5 2005
    Tahar Bouthiba
    Abstract This paper investigates a new approach based on Artificial Neural Networks (ANNs) for real-time fault detection and classification in power transmission lines which can be used in digital power system protection. The Fault Detector and Classifier (FDC) consists of four independent ANNs. The technique uses consecutive magnitude current and voltage data at one terminal as inputs to the corresponding ANN. The ANN outputs are used to indicate simultaneously the presence and the type of the fault. The FDC is tested under different fault types, fault locations, fault resistances and fault inception angles. All test results show that the proposed FDC can be used for very high speed digital relaying. Copyright 2005 John Wiley & Sons, Ltd. [source]


    A classifying procedure for signalling turning points

    JOURNAL OF FORECASTING, Issue 3 2004
    Lasse Koskinen
    Abstract A Hidden Markov Model (HMM) is used to classify an out-of-sample observation vector into either of two regimes. This leads to a procedure for making probability forecasts for changes of regimes in a time series, i.e. for turning points. Instead of estimating past turning points using maximum likelihood, the model is estimated with respect to known past regimes. This makes it possible to perform feature extraction and estimation for different forecasting horizons. The inference aspect is emphasized by including a penalty for a wrong decision in the cost function. The method, here called a 'Markov Bayesian Classifier (MBC)', is tested by forecasting turning points in the Swedish and US economies, using leading data. Clear and early turning point signals are obtained, contrasting favourably with earlier HMM studies. Some theoretical arguments for this are given. Copyright 2004 John Wiley & Sons, Ltd. [source]


    Glass analysis for forensic purposes - a comparison of classification methods

    JOURNAL OF CHEMOMETRICS, Issue 5-6 2007
    Grzegorz Zadora
    Abstract One of the purposes of the chemical analysis of glass fragments (pieces of glass of linear dimension ca. 0.5 mm) for forensic purposes is a classification of those fragments into use categories, for example windows, car headlights and containers. The object of this research was to check the efficiency of Naïve Bayes Classifiers (NBCs) and Support Vector Machines (SVMs) in classifying glass objects when those objects are described by the major and minor elemental concentrations obtained by Scanning Electron Microscopy coupled with an Energy Dispersive X-ray spectrometer, which is routinely used in many forensic institutes. Copyright 2007 John Wiley & Sons, Ltd. [source]


    Learning-based 3D face detection using geometric context

    COMPUTER ANIMATION AND VIRTUAL WORLDS (PREV: JNL OF VISUALISATION & COMPUTER ANIMATION), Issue 4-5 2007
    Yanwen Guo
    Abstract In computer graphics community, face model is one of the most useful entities. The automatic detection of 3D face model has special significance to computer graphics, vision, and human-computer interaction. However, few methods have been dedicated to this task. This paper proposes a machine learning approach for fully automatic 3D face detection. To exploit the facial features, we introduce geometric context, a novel shape descriptor which can compactly encode the distribution of local geometry and can be evaluated efficiently by using a new volume encoding form, named integral volume. Geometric contexts over 3D face offer the rich and discriminative representation of facial shapes and hence are quite suitable to classification. We adopt an AdaBoost learning algorithm to select the most effective geometric context-based classifiers and to combine them into a strong classifier. Given an arbitrary 3D model, our method first identifies the symmetric parts as candidates with a new reflective symmetry detection algorithm. Then uses the learned classifier to judge whether the face part exists. Experiments are performed on a large set of 3D face and non-face models and the results demonstrate high performance of our method. Copyright 2007 John Wiley & Sons, Ltd. [source]


    Towards closing the analysis gap: Visual generation of decision supporting schemes from raw data

    COMPUTER GRAPHICS FORUM, Issue 3 2008
    T. May
    Abstract The derivation, manipulation and verification of analytical models from raw data is a process which requires a transformation of information across different levels of abstraction. We introduce a concept for the coupling of data classification and interactive visualization in order to make this transformation visible and steerable for the human user. Data classification techniques generate mappings that formally group data items into categories. Interactive visualization includes the user into an iterative refinement process. The user identifies and selects interesting patterns to define these categories. The following step is the transformation of a visible pattern into the formal definition of a classifier. In the last step the classifier is transformed back into a pattern that is blended with the original data in the same visual display. Our approach allows an intuitive assessment of a formal classifier and its model, the detection of outliers and the handling of noisy data using visual pattern-matching. We instantiated the concept using decision trees for classification and KVMaps as the visualization technique. The generation of a classifier from visual patterns and its verification is transformed from a cognitive to a mostly pre-cognitive task. [source]


    An Adaptive Conjugate Gradient Neural Network-Wavelet Model for Traffic Incident Detection

    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, Issue 4 2000
    H. Adeli
    Artificial neural networks are known to be effective in solving problems involving pattern recognition and classification. The traffic incident-detection problem can be viewed as recognizing incident patterns from incident-free patterns. A neural network classifier has to be trained first using incident and incident-free traffic data. The dimensionality of the training input data is high, and the embedded incident characteristics are not easily detectable. In this article we present a computational model for automatic traffic incident detection using discrete wavelet transform, linear discriminant analysis, and neural networks. Wavelet transform and linear discriminant analysis are used for feature extraction, denoising, and effective preprocessing of data before an adaptive neural network model is used to make the traffic incident detection. Simulated as well as actual traffic data are used to test the model. For incidents with a duration of more than 5 minutes, the incident-detection model yields a detection rate of nearly 100 percent and a false-alarm rate of about 1 percent for two- or three-lane freeways. [source]


    Medical association rule mining using genetic network programming

    ELECTRONICS & COMMUNICATIONS IN JAPAN, Issue 2 2008
    Kaoru Shimada
    Abstract An efficient algorithm for building a classifier is proposed based on an important association rule mining using genetic network programming (GNP). The proposed method measures the significance of the association via the chi-squared test. Users can define the conditions of important association rules for building a classifier flexibly. The definition can include not only the minimum threshold chi-squared value, but also the number of attributes in the association rules. Therefore, all the extracted important rules can be used for classification directly. GNP is one of the evolutionary optimization techniques, which uses the directed graph structure as genes. Instead of generating a large number of candidate rules, our method can obtain a sufficient number of important association rules for classification. In addition, our method suits association rule mining from dense databases such as medical datasets, where many frequently occurring items are found in each tuple. In this paper, we describe an algorithm for classification using important association rules extracted by GNP with acquisition mechanisms and present some experimental results of medical datasets. © 2008 Wiley Periodicals, Inc. Electron Comm Jpn, 91(2): 46-54, 2008; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.10022 [source]


    Expansion of cumulant-based classifier to frequency shift keying modulations and to the use of support vector machines

    EUROPEAN TRANSACTIONS ON TELECOMMUNICATIONS, Issue 1 2008
    H. Mustafa
    This paper proposes an expansion of the cumulant-based classifier of digital modulations to frequency shift keying (FSK) modulations. Cumulant estimates are calculated when the FSK modulation is present. The features obtained from the cumulant estimators are used in a support vector machine (SVM) classifier. The performance of the SVM classifier is compared to other classifiers. Among these other classifiers is the cumulant-based tree classifier which uses thresholds defined by the asymptotic values of the cumulant estimators. The simulation results show that using SVM classifier improves the performance. Copyright 2007 John Wiley & Sons, Ltd. [source]


    Detection and delineation of P and T waves in 12-lead electrocardiograms

    EXPERT SYSTEMS, Issue 1 2009
    Sarabjeet Mehta
    Abstract: This paper presents an efficient method for the detection and delineation of P and T waves in 12-lead electrocardiograms (ECGs) using a support vector machine (SVM). Digital filtering techniques are used to remove power line interference and baseline wander. An SVM is used as a classifier for the detection and delineation of P and T waves. The performance of the algorithm is validated using original simultaneously recorded 12-lead ECG recordings from the standard CSE (Common Standards for Quantitative Electrocardiography) ECG multi-lead measurement library. A significant detection rate of 95.43% is achieved for P wave detection and 96.89% for T wave detection. Delineation performance of the algorithm is validated by calculating the mean and standard deviation of the differences between automatic and manual annotations by the referee cardiologists. The proposed method not only detects all kinds of morphologies of QRS complexes, P and T waves but also delineates them accurately. The onsets and offsets of the detected P and T waves are found to be within the tolerance limits given in the CSE library. [source]


    Financial decision support using neural networks and support vector machines

    EXPERT SYSTEMS, Issue 4 2008
    Chih-Fong Tsai
    Abstract: Bankruptcy prediction and credit scoring are the two important problems facing financial decision support. The multilayer perceptron (MLP) network has shown its applicability to these problems and its performance is usually superior to those of other traditional statistical models. Support vector machines (SVMs) are the core machine learning techniques and have been used to compare with MLP as the benchmark. However, the performance of SVMs is not fully understood in the literature because an insufficient number of data sets is considered and different kernel functions are used to train the SVMs. In this paper, four public data sets are used. In particular, three different sizes of training and testing data in each of the four data sets are considered (i.e. 3:7, 1:1 and 7:3) in order to examine and fully understand the performance of SVMs. For SVM model construction, the linear, radial basis function and polynomial kernel functions are used to construct the SVMs. Using MLP as the benchmark, the SVM classifier only performs better in one of the four data sets. On the other hand, the prediction results of the MLP and SVM classifiers are not significantly different for the three different sizes of training and testing data. [source]


    An early warning system for detection of financial crisis using financial market volatility

    EXPERT SYSTEMS, Issue 2 2006
    Kyong Joo Oh
    Abstract: This study proposes an early warning system (EWS) for detection of financial crisis with a daily financial condition indicator (DFCI) designed to monitor the financial markets and provide warning signals. The proposed EWS differs from other commonly used EWSs in two aspects: (i) it is based on dynamic daily movements of the financial markets; and (ii) it is established as a pattern classifier, which identifies predefined unstable states in terms of financial market volatility. Indeed it issues warning signals on a daily basis by judging whether the financial market has entered a predefined unstable state or not. The major strength of a DFCI is that it can issue timely warning signals while other conventional EWSs must wait for the next round input of monthly or quarterly information. Construction of a DFCI consists of two steps where machine learning algorithms are expected to play a significant role, i.e. (i) establishing sub-DFCIs on various daily financial variables by an artificial neural network, and (ii) integrating the sub-DFCIs into an integrated DFCI by a genetic algorithm. The DFCI for the Korean financial market is built as an empirical case study. [source]


    Neural network ensembles: combining multiple models for enhanced performance using a multistage approach

    EXPERT SYSTEMS, Issue 5 2004
    Shuang Yang
    Abstract: Neural network ensembles (sometimes referred to as committees or classifier ensembles) are effective techniques to improve the generalization of a neural network system. Combining a set of neural network classifiers whose error distributions are diverse can generate better results than any single classifier. In this paper, some methods for creating ensembles are reviewed, including the following approaches: methods of selecting diverse training data from the original source data set, constructing different neural network models, selecting ensemble nets from ensemble candidates and combining ensemble members' results. In addition, new results on ensemble combination methods are reported. [source]
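
    The claim that members with diverse error distributions can outperform any single classifier can be illustrated with a minimal sketch. It assumes plain majority voting as the combination rule, which is only one of the combination methods such a review might cover; the labels and member outputs below are invented for illustration and do not come from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most ensemble members for one sample.

    Ties are broken in favour of the label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(member_outputs):
    """Combine per-member label lists into one list of voted labels.

    member_outputs[m][i] is the label that member m assigns to sample i.
    """
    return [majority_vote(list(votes)) for votes in zip(*member_outputs)]


# Hypothetical data: each member makes exactly one error, on a different sample.
truth = ["a", "b", "a", "b"]
members = [
    ["a", "b", "a", "a"],  # wrong on sample 3
    ["a", "a", "a", "b"],  # wrong on sample 1
    ["b", "b", "a", "b"],  # wrong on sample 0
]
combined = ensemble_predict(members)
```

    Because no two members err on the same sample, the vote recovers every true label even though each member alone is only 75% accurate.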


    Spatial prediction of nitrate pollution in groundwaters using neural networks and GIS: an application to South Rhodope aquifer (Thrace, Greece)

    HYDROLOGICAL PROCESSES, Issue 3 2009
    Dr A. Gemitzi
    Abstract Neural network techniques combined with Geographical Information Systems (GIS), are used in the spatial prediction of nitrate pollution in groundwaters. Initially, the most important parameters controlling groundwater pollution by nitrates are determined. These include hydraulic conductivity of the aquifer, depth to the aquifer, land uses, soil permeability, and fine to coarse grain ratio in the unsaturated zone. All these parameters were quantified in a GIS environment, and were standardized in a common scale. Subsequently, a neural network classification was applied, using a multi-layer perceptron classifier with the back propagation (BP) algorithm, in order to categorize the examined area into categories of groundwater nitrate pollution potential. The methodology was applied to South Rhodope aquifer (Thrace, Greece). The calculation was based on information from 214 training sites, which correspond to monitored nitrate concentrations in groundwaters in the area. The predictive accuracy of the model developed reached 86% in the training samples, 74% in the overall sample and 71% in the test samples. This indicates that this methodology is promising to describe the spatial pattern of nitrate pollution. Copyright 2008 John Wiley & Sons, Ltd. [source]


    Improved GMM with parameter initialization for unsupervised adaptation of Brain-Computer interface

    INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING, Issue 6 2010
    Guangquan Liu
    Abstract An important property of brain signals is their nonstationarity. How to adapt a brain-computer interface (BCI) to the changing brain states is one of the challenges faced by BCI researchers, especially in real application where the subject's real intent is unknown to the system. Gaussian mixture model (GMM) has been used for the unsupervised adaptation of the classifier in BCI. In this paper, a method of initializing the model parameters is proposed for expectation maximization-based GMM parameter estimation. This improved GMM method and other two existing unsupervised adaptation methods are applied to groups of constructed artificial data with different data properties. Performances of these methods in different situations are analyzed. Compared with the other two unsupervised adaptation methods, this method shows a better ability of adapting to changes and discovering class information from unlabelled data. The methods are also applied to real EEG data recorded in 19 experiments. For real data, the proposed method achieves an error rate significantly lower than the other two unsupervised methods. Results of the real data agree with the analysis based on the artificial data, which confirms not only the effectiveness of our method but also the validity of the constructed data. Copyright 2009 John Wiley & Sons, Ltd. [source]


    Elucidation of a protein signature discriminating six common types of adenocarcinoma

    INTERNATIONAL JOURNAL OF CANCER, Issue 4 2007
    Gregory C. Bloom
    Abstract Pathologists are commonly facing the problem of attempting to identify the site of origin of a metastatic cancer when no primary tumor has been identified, yet few markers have been identified to date. Multitumor classifiers based on microarray based RNA expression have recently been described. Here we describe the first approximation of a tumor classifier based entirely on protein expression quantified by two-dimensional gel electrophoresis (2DE). The 2DE was used to analyze the proteomic expression pattern of 77 similarly appearing (using histomorphology) adenocarcinomas encompassing 6 types or sites of origin: ovary, colon, kidney, breast, lung and stomach. Discriminating sets of proteins were identified and used to train an artificial neural network (ANN). A leave-one-out cross validation (LOOCV) method was used to test the ability of the constructed network to predict the single held out sample from each iteration with a maximum predictive accuracy of 87% and an average predictive accuracy of 82% over the range of proteins chosen for its construction. These findings demonstrate the use of proteomics to construct a highly accurate ANN-based classifier for the detection of an individual tumor type, as well as distinguishing between 6 common tumor types in an unknown primary diagnosis setting. © 2006 Wiley-Liss, Inc. [source]
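
    The leave-one-out procedure mentioned above can be sketched as follows. This is an illustration only: it swaps in a nearest-centroid classifier for the paper's neural network to stay dependency-free, and the two-feature samples and tissue labels are made up rather than taken from the study.

```python
def nearest_centroid_fit(X, y):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(train_X, train_y)
        hits += nearest_centroid_predict(model, X[i]) == y[i]
    return hits / len(X)


# Hypothetical, well-separated toy data (two features per sample).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y = ["colon", "colon", "colon", "lung", "lung", "lung"]
accuracy = loocv_accuracy(X, y)
```

    Each sample is scored by a model that never saw it, which is why the procedure suits small sample sets such as the 77 tumors described above.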


    Urinary biomarker profiling in transitional cell carcinoma

    INTERNATIONAL JOURNAL OF CANCER, Issue 11 2006
    Nicholas P. Munro
    Abstract Urinary biomarkers or profiles that allow noninvasive detection of recurrent transitional cell carcinoma (TCC) of the bladder are urgently needed. We obtained duplicate proteomic (SELDI) profiles from 227 subjects (118 TCC, 77 healthy controls and 32 controls with benign urological conditions) and used linear mixed effects models to identify peaks that are differentially expressed between TCC and controls and within TCC subgroups. A Random Forest classifier was trained on 130 profiles to develop an algorithm to predict the presence of TCC in a randomly selected initial test set (n = 54) and an independent validation set (n = 43) several months later. Twenty two peaks were differentially expressed between all TCC and controls (p < 10^-7). However potential confounding effects of age, sex and analytical run were identified. In an age-matched sub-set, 23 peaks were differentially expressed between TCC and combined benign and healthy controls at the 0.005 significance level. Using the Random Forest classifier, TCC was predicted with 71.7% sensitivity and 62.5% specificity in the initial set and with 78.3% sensitivity and 65.0% specificity in the validation set after 6 months, compared with controls. Several peaks of importance were also identified in the linear mixed effects model. We conclude that SELDI profiling of urine samples can identify patients with TCC with comparable sensitivities and specificities to current tumor marker tests. This is the first time that reproducibility has been demonstrated on an independent test set analyzed several months later. Identification of the relevant peaks may facilitate multiplex marker assay development for detection of recurrent disease. © 2006 Wiley-Liss, Inc. [source]
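
    Sensitivity and specificity, the two figures reported above, follow directly from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the label vectors are invented for illustration and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): fraction of true positives that are detected
    specificity = TN / (TN + FP): fraction of true negatives that are cleared
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = disease present, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

    Here three of four positives are detected and three of five controls are cleared, so the toy classifier has 75% sensitivity and 60% specificity.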


    Efficient packet classification on network processors

    INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, Issue 1 2008
    Koert Vlaeminck
    Abstract Always-on networking and a growing interest in multimedia- and conversational-IP services offer an opportunity to network providers to participate in the service layer, if they increase functional intelligence in their networks. An important prerequisite to providing advanced services in IP access networks is the availability of a high-speed packet classification module in the network nodes, necessary for supporting any IP service imaginable. Often, access nodes are installed in remote offices, where they terminate a large number of subscriber lines. As such, technology adding processing power in this environment should be energy-efficient, whilst maintaining the flexibility to cope with changing service requirements. Network processor units (NPUs) are designed to overcome these operational restrictions, and in this context this paper investigates their suitability for wireline and robust packet classification in a firewalling application. State-of-the-art packet classification algorithms are examined, whereafter the performance and memory requirements are compared for a Binary Decision Diagram (BDD) and sequential search approach. Several space optimizations for implementing BDD classifiers on NPU hardware are discussed and it is shown that the optimized BDD classifier is able to operate at gigabit wirespeed, independent of the ruleset size, which is a major advantage over a sequential search classifier. Copyright 2007 John Wiley & Sons, Ltd. [source]
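
    For contrast with the BDD approach, a sequential search classifier can be sketched in a few lines. The rule format, field choice (source network and destination port) and actions below are assumptions made for illustration, not the paper's ruleset or NPU implementation.

```python
from ipaddress import ip_address, ip_network

# Hypothetical firewall rules: (source network, destination port or None for any, action).
RULES = [
    (ip_network("10.0.0.0/8"), 22, "deny"),
    (ip_network("10.1.0.0/16"), None, "allow"),
    (ip_network("0.0.0.0/0"), 80, "allow"),
]


def classify(src, dst_port, rules=RULES, default="deny"):
    """Return the action of the first rule that matches the packet.

    Rules are tested one by one in order, so the worst-case cost grows
    linearly with the number of rules.
    """
    addr = ip_address(src)
    for network, port, action in rules:
        if addr in network and (port is None or port == dst_port):
            return action
    return default
```

    A packet that matches no rule walks the whole list before falling through to the default, which is the ruleset-size dependence that the abstract says the optimized BDD classifier avoids.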


    An evolutionary algorithm for constructing a decision forest: Combining the classification of disjoints decision trees

    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 4 2008
    Lior Rokach
    Decision forest is an ensemble classification method that combines multiple decision trees in a manner that results in more accurate classifications. By combining multiple heterogeneous decision trees, decision forest is effective in mitigating noise that is often prevalent in real-world classification tasks. This paper presents a new genetic algorithm for constructing a decision forest. Each decision tree classifier is trained using a disjoint set of attributes. Moreover, we examine the effectiveness of using a Vapnik-Chervonenkis dimension bound for evaluating the fitness function of decision forest. The new algorithm was tested on various datasets. The obtained results have been compared to other methods, indicating the superiority of the proposed algorithm. © 2008 Wiley Periodicals, Inc. [source]


    Comparing a genetic fuzzy and a neurofuzzy classifier for credit scoring

    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 11 2002
    F. Hoffmann
    In this paper, we evaluate and contrast two types of fuzzy classifiers for credit scoring. The first classifier uses evolutionary optimization and boosting for learning fuzzy classification rules. The second classifier is a fuzzy neural network that employs a fuzzy variant of the classic backpropagation learning algorithm. The experiments are carried out on a real life credit scoring data set. It is shown that, for the case at hand, the boosted genetic fuzzy classifier performs better than both the neurofuzzy classifier and the well-known C4.5(rules) decision tree(rules) induction algorithm. However, the better performance of the genetic fuzzy classifier is offset by the fact that it infers approximate fuzzy rules which are less comprehensible for humans than the descriptive fuzzy rules inferred by the neurofuzzy classifier. © 2002 Wiley Periodicals, Inc. [source]


    New computational algorithm for the prediction of protein folding types

    INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY, Issue 1 2001
    Nikola Štambuk
    Abstract We present a new computational algorithm for the prediction of a secondary protein structure. The method enables the evaluation of α- and β-protein folding types from the nucleotide sequences. The procedure is based on the reflected Gray code algorithm of nucleotide-amino acid relationships, and represents the extension of Swanson's procedure in Ref. 4. It is shown that six-digit binary notation of each codon enables the prediction of α- and β-protein folds by means of the error-correcting linear block triple-check code. We tested the validity of the method on the test set of 140 proteins (70 α- and 70 β-folds). The test set consisted of standard α- and β-protein classes from Jpred and SCOP databases, with nucleotide sequence available in the GenBank database. 100% accurate classification of α- and β-protein folds, based on 39 dipeptide addresses derived by the error-correcting coding procedure was obtained by means of the logistic regression analysis (p<0.00000001). Classification tree and machine learning sequential minimal optimization (SMO) classifier confirmed the results by means of 97.1% and 90% accurate classification, respectively. Protein fold prediction quality tested by means of leave-one-out cross-validation was a satisfactory 82.1% for the logistic regression and 81.4% for the SMO classifier. The presented procedure of computational analysis can be helpful in detecting the type of protein folding from the newly sequenced exon regions. The method enables quick, simple, and accurate prediction of α- and β-protein folds from the nucleotide sequence on a personal computer. © 2001 John Wiley & Sons, Inc. Int J Quant Chem 84: 13-22, 2001 [source]


    Automatic classification of protein crystallization images using a curve-tracking algorithm

    JOURNAL OF APPLIED CRYSTALLOGRAPHY, Issue 2 2004
    Marshall Bern
    An algorithm for automatic classification of protein crystallization images acquired from a high-throughput vapor-diffusion system is described. The classifier uses edge detection followed by dynamic-programming curve tracking to determine the drop boundary; this technique optimizes a scoring function that incorporates roundness, smoothness and gradient intensity. The classifier focuses on the most promising region in the drop and computes a number of statistical features, including some derived from the Hough transform and from curve tracking. The five classes of images are 'Empty', 'Clear', 'Precipitate', 'Microcrystal Hit' and 'Crystal'. On test data, the classifier gives about 12% false negatives (true crystals called 'Empty', 'Clear' or 'Precipitate') and about 14% false positives (true clears or precipitates called 'Crystal' or 'Microcrystal Hit'). [source]


    Erratum: ST-PLS: a multi-directional nearest shrunken centroid type classifier via PLS

    JOURNAL OF CHEMOMETRICS, Issue 6 2008
    Solve Sæbø
    J. Chemometrics 22(1): 54-62. DOI: 10.1002/cem.1101 It has come to our attention that errors arose in the printing of this article. The article title and the authors' names were printed incorrectly and should appear as above. Throughout this article, the characters 'fi' in, for example, the word 'classifiers' were not present in the printed version of the paper. We apologise for any inconvenience caused. [source]


    Active learning support vector machines for optimal sample selection in classification

    JOURNAL OF CHEMOMETRICS, Issue 6 2004
    Simeone Zomer
    Abstract Labelling samples is a procedure that may result in significant delays particularly when dealing with larger datasets and/or when labelling implies prolonged analysis. In such cases a strategy that allows the construction of a reliable classifier on the basis of a minimal sized training set by labelling a minor fraction of samples can be of advantage. Support vector machines (SVMs) are ideal for such an approach because the classifier relies on only a small subset of samples, namely the support vectors, while being independent from the remaining ones that typically form the majority of the dataset. This paper describes a procedure where a SVM classifier is constructed with support vectors systematically retrieved from the pool of unlabelled samples. The procedure is termed 'active' because the algorithm interacts with the samples prior to their labelling rather than waiting passively for the input. The learning behaviour on simulated datasets is analysed and a practical application for the detection of hydrocarbons in soils using mass spectrometry is described. Results on simulations show that the active learning SVM performs optimally on datasets where the classes display an intermediate level of separation. On the real case study the classifier correctly assesses the membership of all samples in the original dataset by requiring for labelling around 14% of the data. Its subsequent application on a second dataset of analogous nature also provides perfect classification without further labelling, giving the same outcome as most classical techniques based on the entirely labelled original dataset. Copyright 2004 John Wiley & Sons, Ltd. [source]


    Hybrid Bayesian networks: making the hybrid Bayesian classifier robust to missing training data

    JOURNAL OF CHEMOMETRICS, Issue 5 2003
    Nathaniel A. Woody
    Abstract Many standard classification methods are incapable of handling missing values in a sample. Instead, these methods must rely on external filling methods in order to estimate the missing values. The hybrid network proposed in this paper is an extension of the hybrid classifier that is robust to missing values. The hybrid network is produced by performing empirical Bayesian network structure learning to create a Bayesian network that retains its classification ability in the presence of missing data in both training and test cases. The performance of the hybrid network is measured by calculating a misclassification rate when data are removed from a dataset. These misclassification curves are then compared against similar curves produced from the hybrid classifier and from a classification tree. Copyright 2003 John Wiley & Sons, Ltd. [source]


    Multiple classifier integration for the prediction of protein structural classes

    JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 14 2009
    Lei Chen
    Abstract Supervised classifiers, such as artificial neural networks, partition trees, and support vector machines, are often used for the prediction and analysis of biological data. However, choosing an appropriate classifier is not straightforward because each classifier has its own strengths and weaknesses, and each biological dataset has its own characteristics. By integrating many classifiers together, one can avoid the dilemma of choosing an individual classifier out of many to achieve optimized classification results (Rahman et al., Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and Its Variation, Springer, Berlin, 2002, 167–178). The classification algorithms come from Weka (Witten and Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, 2005), a collection of software tools for machine learning algorithms. By integrating many predictors (classifiers) together through simple voting, the correct prediction (classification) rates are 65.21% and 65.63% for a basic training dataset and an independent test set, respectively. These results are better than those of any single machine learning algorithm collected in Weka when exactly the same data are used. Furthermore, we introduce an integration strategy which takes care of both classifier weightings and classifier redundancy. A feature selection strategy, called minimum redundancy maximum relevance (mRMR), is transferred into algorithm selection to deal with classifier redundancy in this research, and the weightings are based on the performance of each classifier. The best classification results are obtained when 11 algorithms are selected by the mRMR method and integrated together through majority votes with weightings. As a result, the correct prediction rates are 68.56% and 69.29% for the basic training dataset and the independent test dataset, respectively. The web-server is available at http://chemdata.shu.edu.cn/protein_st/. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009 [source]
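
    The contrast between simple voting and majority voting with weightings can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the toy labels, and the use of per-classifier accuracy as the weight are assumptions made for illustration, and the mRMR-based selection of classifiers is not shown.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (assumed here to be
             something like each classifier's validation accuracy).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties go to the label that sorts first, purely for determinism.
    return min(totals, key=lambda label: (-totals[label], label))


# Hypothetical outputs of five classifiers for a single sample.
votes = ["alpha", "beta", "beta", "alpha", "alpha"]
accuracies = [0.55, 0.90, 0.85, 0.50, 0.52]

simple = weighted_majority_vote(votes, [1.0] * len(votes))  # every vote counts equally
weighted = weighted_majority_vote(votes, accuracies)        # stronger classifiers count more
print(simple, weighted)
```

    With equal weights the three weaker classifiers win the vote; with performance-based weights the two stronger classifiers outweigh them, which is the effect a weighting scheme is meant to capture.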


    Prediction of integral membrane protein type by collocated hydrophobic amino acid pairs

    JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 1 2009
    Ke Chen
    Abstract A computational model, IMP-TYPE, is proposed for the classification of five types of integral membrane proteins from protein sequence. The proposed model aims not only at providing accurate predictions but, most importantly, it incorporates interesting and transparent biological patterns. When contrasted with the best-performing existing models, IMP-TYPE reduces the error rates of these methods by 19 and 34% for two out-of-sample tests performed on benchmark datasets. Our empirical evaluations also show that the proposed method provides even bigger improvements, i.e., 29 and 45% error rate reductions, when predictions are performed for sequences that share low (40%) identity with sequences from the training dataset. We also show that IMP-TYPE can be used in a standalone mode, i.e., it duplicates a significant majority of the correct predictions provided by other leading methods, while providing additional correct predictions for sequences that are incorrectly classified by the other methods. Our method computes predictions using a Support Vector Machine classifier that takes a feature-based encoded sequence as its input. The input feature set includes hydrophobic AA pairs, which were selected by utilizing a consensus of three feature selection algorithms. The hydrophobic residues that build up the AA pairs used by our method are shown to be associated with the formation of transmembrane helices in a few recent studies concerning integral membrane proteins. Our study also indicates that Met and Phe display a certain degree of hydrophobicity, which may be more crucial than their polarity or aromaticity when they occur in the transmembrane segments. This conclusion is supported by a recent study on the potential of mean force for membrane protein folding and a study of scales for membrane propensity of amino acids. © 2008 Wiley Periodicals, Inc. J Comput Chem, 2009 [source]
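
    One way to picture a pair-based sequence encoding is the Python sketch below. The abstract does not define "collocated" pairs precisely, so everything here is an assumption for illustration only: the hydrophobic alphabet, the gap-based pairing rule, and the function name are hypothetical and are not taken from IMP-TYPE.

```python
from collections import Counter

# Assumed hydrophobic alphabet, chosen only for illustration.
HYDROPHOBIC = set("AVLIMFW")


def hydrophobic_pair_counts(sequence, max_gap=2):
    """Count ordered pairs of hydrophobic residues separated by 0..max_gap residues.

    Keys are (first_residue, second_residue, gap); values are occurrence counts.
    """
    counts = Counter()
    seq = sequence.upper()
    for i, first in enumerate(seq):
        if first not in HYDROPHOBIC:
            continue
        for gap in range(max_gap + 1):
            j = i + gap + 1
            if j >= len(seq):
                break  # no residue left at this or any larger gap
            second = seq[j]
            if second in HYDROPHOBIC:
                counts[(first, second, gap)] += 1
    return counts


# Toy sequence; K is the only residue outside the assumed alphabet.
features = hydrophobic_pair_counts("MLKFAV", max_gap=1)
for key in sorted(features):
    print(key, features[key])
```

    Counts of this kind, arranged in a fixed key order, would form the numeric vector that a Support Vector Machine takes as input.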


    LIDAR and vision-based pedestrian detection system

    JOURNAL OF FIELD ROBOTICS (FORMERLY JOURNAL OF ROBOTIC SYSTEMS), Issue 9 2009
    Cristiano Premebida
    A perception system for pedestrian detection in urban scenarios using information from a LIDAR and a single camera is presented. Two sensor fusion architectures are described, a centralized and a decentralized one. In the former, the fusion process occurs at the feature level, i.e., features from LIDAR and vision spaces are combined in a single vector for posterior classification using a single classifier. In the latter, two classifiers are employed, one per sensor-feature space, which were selected offline based on information theory and fused by a trainable fusion method applied over the likelihoods provided by the component classifiers. The proposed schemes for sensor combination, and more specifically the trainable fusion method, lead to enhanced detection performance and, in addition, keep false alarms under tolerable values in comparison with single-based classifiers. Experimental results highlight the performance and effectiveness of the proposed pedestrian detection system and the related sensor data combination strategies. © 2009 Wiley Periodicals, Inc. [source]
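
    The structural difference between the two architectures can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the logistic linear classifiers, the feature values, and all weights are hypothetical stand-ins, and the fusion stage shown is only one simple example of a trainable rule applied to likelihoods.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear_score(features, weights, bias=0.0):
    """Likelihood-like output in (0, 1) of a toy logistic linear classifier."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)


def centralized(lidar_feats, vision_feats, weights, bias=0.0):
    """Feature-level fusion: concatenate both feature vectors, use one classifier."""
    return linear_score(lidar_feats + vision_feats, weights, bias)


def decentralized(lidar_feats, vision_feats, lidar_w, vision_w, fusion_w, fusion_bias=0.0):
    """Likelihood-level fusion: one classifier per sensor, then a fusion stage
    that takes the two likelihoods as its inputs."""
    p_lidar = linear_score(lidar_feats, lidar_w)
    p_vision = linear_score(vision_feats, vision_w)
    return linear_score([p_lidar, p_vision], fusion_w, fusion_bias)


# Hypothetical feature vectors and weights (all values are made up).
lidar = [0.8, 0.1]
vision = [0.3, 0.9, 0.5]

p_central = centralized(lidar, vision, weights=[1.2, -0.4, 0.7, 1.5, -0.2])
p_decentral = decentralized(lidar, vision,
                            lidar_w=[1.2, -0.4],
                            vision_w=[0.7, 1.5, -0.2],
                            fusion_w=[2.0, 3.0], fusion_bias=-2.5)
print(p_central, p_decentral)
```

    In the centralized case a single set of weights spans both sensors; in the decentralized case each sensor keeps its own classifier and only the fusion weights see both outputs.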


    Autonomous off-road navigation with end-to-end learning for the LAGR program

    JOURNAL OF FIELD ROBOTICS (FORMERLY JOURNAL OF ROBOTIC SYSTEMS), Issue 1 2009
    Max Bajracharya
    We describe a fully integrated real-time system for autonomous off-road navigation that uses end-to-end learning from onboard proprioceptive sensors, operator input, and stereo cameras to adapt to local terrain and extend terrain classification into the far field to avoid myopic behavior. The system consists of two learning algorithms: a short-range, geometry-based local terrain classifier that learns from very few proprioceptive examples and is robust in many off-road environments; and a long-range, image-based classifier that learns from geometry-based classification and continuously generalizes geometry to appearance, making it effective even in complex terrain and varying lighting conditions. In addition to presenting the learning algorithms, we describe the system architecture and results from the Learning Applied to Ground Robots (LAGR) program's field tests. © 2008 Wiley Periodicals, Inc. [source]


    Pharmacokinetic mapping for lesion classification in dynamic breast MRI

    JOURNAL OF MAGNETIC RESONANCE IMAGING, Issue 6 2010
    Matthias C. Schabel PhD
    Abstract Purpose: To prospectively investigate whether a rapid dynamic MRI protocol, in conjunction with pharmacokinetic modeling, could provide diagnostically useful information for discriminating biopsy-proven benign lesions from malignancies. Materials and Methods: Patients referred to breast biopsy based on suspicious screening findings were eligible. After anatomic imaging, patients were scanned using a dynamic protocol with complete bilateral breast coverage. Maps of pharmacokinetic parameters representing transfer constant (Ktrans), efflux rate constant (kep), blood plasma volume fraction (vp), and extracellular extravascular volume fraction (ve) were averaged over lesions and used, with biopsy results, to generate receiver operating characteristic curves for linear classifiers using one, two, or three parameters. Results: Biopsy and imaging results were obtained from 93 lesions in 74 of 78 study patients. Classification based on Ktrans and kep gave the greatest accuracy, with an area under the receiver operating characteristic curve of 0.915, sensitivity of 91%, and specificity of 85%, compared with values of 88% and 68%, respectively, obtained in a recent study of clinical breast MRI in a similar patient population. Conclusion: Pharmacokinetic classification of breast lesions is practical on modern MRI hardware and provides significant accuracy for identification of malignancies. Sensitivity of a two-parameter linear classifier is comparable to that reported in a recent multicenter study of clinical breast MRI, while specificity is significantly higher. J. Magn. Reson. Imaging 2010;31:1371–1378. © 2010 Wiley-Liss, Inc. [source]
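
    The reported quantities (area under the receiver operating characteristic curve, sensitivity, specificity) can be computed from classifier scores as in the Python sketch below. The scores are invented toy values, not study data; in the setting described above they would come from a linear combination of lesion-averaged parameters such as Ktrans and kep, and the threshold of 0.5 is an arbitrary assumption.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve in its rank (Mann-Whitney) form: the fraction of
    (positive, negative) pairs ranked correctly, with ties counting one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    true_pos = sum(s >= threshold for s in scores_pos)
    true_neg = sum(s < threshold for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical classifier scores for biopsy-proven lesions (made-up values).
malignant = [0.9, 0.8, 0.6, 0.4]
benign = [0.7, 0.3, 0.2, 0.1]

area = auc(malignant, benign)
sens, spec = sensitivity_specificity(malignant, benign, threshold=0.5)
print(area, sens, spec)
```

    Sweeping the threshold over all observed scores and plotting sensitivity against one minus specificity traces the full curve; the area does not depend on any single threshold.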


    Computer-aided detection of brain tumor invasion using multiparametric MRI

    JOURNAL OF MAGNETIC RESONANCE IMAGING, Issue 3 2009
    Todd R. Jensen PhD
    Abstract Purpose: To determine the potential of using a computer-aided detection method to intelligently distinguish peritumoral edema alone from peritumoral edema consisting of tumor using a combination of high-resolution morphological and physiological magnetic resonance imaging (MRI) techniques available on most clinical MRI scanners. Materials and Methods: This retrospective study consisted of patients with two types of primary brain tumors: meningiomas (n = 7) and glioblastomas (n = 11). Meningiomas are typically benign and have a clear delineation of tumor and edema. Glioblastomas are known to invade outside the contrast-enhancing area. Four classifiers of differing designs were trained using morphological, diffusion-weighted, and perfusion-weighted features derived from MRI to discriminate tumor and edema, tested on edematous regions surrounding tumors, and assessed for their ability to detect nonenhancing tumor invasion. Results: The four classifiers provided similar measures of accuracy when applied to the training and testing data. Each classifier was able to identify areas of nonenhancing tumor invasion supported with adjunct images or follow-up studies. Conclusion: The combination of features derived from morphological and physiological imaging techniques contains the information necessary for computer-aided detection of tumor invasion and allows for the identification of tumor invasion not previously visualized on morphological, diffusion-weighted, and perfusion-weighted images and maps. Further validation of this approach requires obtaining spatially coregistered tissue samples in a study with a larger sample size. J. Magn. Reson. Imaging 2009;30:481–489. © 2009 Wiley-Liss, Inc. [source]