Machine Learning Algorithms

Selected Abstracts


An early warning system for detection of financial crisis using financial market volatility

EXPERT SYSTEMS, Issue 2 2006
Kyong Joo Oh
Abstract: This study proposes an early warning system (EWS) for detecting financial crises, built on a daily financial condition indicator (DFCI) designed to monitor the financial markets and provide warning signals. The proposed EWS differs from other commonly used EWSs in two respects: (i) it is based on the dynamic daily movements of the financial markets; and (ii) it is established as a pattern classifier that identifies predefined unstable states in terms of financial market volatility. It issues warning signals on a daily basis by judging whether the financial market has entered a predefined unstable state. The major strength of a DFCI is that it can issue timely warning signals, whereas conventional EWSs must wait for the next round of monthly or quarterly input. Construction of a DFCI consists of two steps in which machine learning algorithms are expected to play a significant role: (i) establishing sub-DFCIs on various daily financial variables with an artificial neural network, and (ii) combining the sub-DFCIs into an integrated DFCI with a genetic algorithm. The DFCI for the Korean financial market is built as an empirical case study.


An introduction of the condition class space with continuous value discretization and rough set theory

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 2 2006
Malcolm J. Beynon
Abstract: The granularity of an information system affects the efficacy of the analysis produced by many machine learning algorithms. An information system contains a universe of objects characterized and categorized by condition and decision attributes. To manage the concomitant granularity, a level of continuous value discretization (CVD) is often undertaken. In the rough set theory (RST) methodology for object classification, the granularity contributes to the grouping of objects into condition classes, each containing objects with the same condition attribute values. This article examines the effect of a level of CVD on the condition classes subsequently constructed, introducing the condition class space, the domain within which the condition classes exist. This domain elucidates the association of the condition classes with the related decision outcomes, reflecting the inexactness inherent when a level of CVD is undertaken. A series of measures is defined that quantify this association. Throughout this study, and without loss of generality, the findings are made through the RST methodology. This further offers a novel exposition of the relationship between all the condition attributes and the RST-related reducts (subsets of condition attributes). © 2006 Wiley Periodicals, Inc. Int J Int Syst 21: 173-191, 2006.
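To make the link between discretization level and condition classes concrete, the following is a minimal Python sketch, not the article's method. The cut points, attribute values, decision labels, and function names are all hypothetical. It groups objects by their discretized condition values and checks whether each resulting condition class maps to a single decision outcome.

```python
from collections import defaultdict

def discretize(value, cut_points):
    """Return the index of the interval that `value` falls into."""
    return sum(value >= c for c in sorted(cut_points))

def condition_classes(objects, cut_points):
    """Group objects whose discretized condition attributes are identical.

    objects:    list of (condition_values, decision) pairs
    cut_points: one list of cut points per condition attribute
    """
    classes = defaultdict(list)
    for values, decision in objects:
        key = tuple(discretize(v, cuts) for v, cuts in zip(values, cut_points))
        classes[key].append(decision)
    return dict(classes)

def is_consistent(decisions):
    """A condition class is consistent if all its objects share one decision."""
    return len(set(decisions)) == 1

# Hypothetical toy information system: two condition attributes, one decision.
objects = [
    ((0.20, 5.1), "low"),
    ((0.30, 4.8), "low"),
    ((0.70, 5.0), "high"),
    ((0.80, 9.2), "high"),
    ((0.25, 5.5), "high"),
]

# Coarse CVD: one cut point per attribute.
coarse = condition_classes(objects, [[0.5], [6.0]])
# Finer CVD: an extra cut point on the second attribute.
fine = condition_classes(objects, [[0.5], [5.3, 6.0]])

for name, classes in (("coarse", coarse), ("fine", fine)):
    inconsistent = [k for k, d in classes.items() if not is_consistent(d)]
    print(name, len(classes), "condition classes; inconsistent:", inconsistent)
```

In this toy data the coarser discretization merges objects with different decisions into one condition class, while the finer one separates them at the cost of more classes.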


Multiple classifier integration for the prediction of protein structural classes

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 14 2009
Lei Chen
Abstract: Supervised classifiers, such as artificial neural networks, partition trees, and support vector machines, are often used for the prediction and analysis of biological data. However, choosing an appropriate classifier is not straightforward because each classifier has its own strengths and weaknesses, and each biological dataset has its own characteristics. By integrating many classifiers, one can avoid the dilemma of choosing a single classifier out of many and achieve optimized classification results (Rahman et al., Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and Its Variation, Springer, Berlin, 2002, 167-178). The classification algorithms come from Weka (Witten and Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, 2005), a collection of software tools for machine learning. By integrating many predictors (classifiers) through simple voting, the correct prediction (classification) rates are 65.21% and 65.63% for a basic training dataset and an independent test set, respectively. These results are better than those of any single machine learning algorithm collected in Weka when exactly the same data are used. Furthermore, we introduce an integration strategy that takes account of both classifier weightings and classifier redundancy. A feature selection strategy called minimum redundancy maximum relevance (mRMR) is transferred to algorithm selection to deal with classifier redundancy, and the weightings are based on the performance of each classifier. The best classification results are obtained when 11 algorithms are selected by the mRMR method and integrated through majority votes with weightings. As a result, the correct prediction rates are 68.56% and 69.29% for the basic training dataset and the independent test dataset, respectively. The web server is available at http://chemdata.shu.edu.cn/protein_st/. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009
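The difference between simple voting and majority voting with weightings can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the authors' implementation: the class labels, the weight values, and the function name `weighted_vote` are hypothetical, and the mRMR-based selection of classifiers is not shown.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier into a single label.

    predictions: labels predicted by each classifier for one sample
    weights:     one non-negative weight per classifier (assumed here to
                 reflect that classifier's measured performance)
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Iterating over sorted labels makes tie-breaking deterministic.
    return max(sorted(scores), key=lambda label: scores[label])

# Hypothetical outputs of three classifiers for one protein.
predictions = ["alpha", "beta", "beta"]

# Simple voting: every classifier counts equally.
simple = weighted_vote(predictions, [1.0, 1.0, 1.0])

# Weighted voting: the first classifier is trusted more than the other two combined.
weighted = weighted_vote(predictions, [0.9, 0.4, 0.4])

print("simple:", simple, "| weighted:", weighted)
```

With equal weights the two agreeing classifiers win; with performance-based weights a single strong classifier can outvote them.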


Machine learning approaches for prediction of linear B-cell epitopes on proteins

JOURNAL OF MOLECULAR RECOGNITION, Issue 3 2006
Johannes Söllner
Abstract: The identification and characterization of antigenic determinants on proteins have received considerable attention, using both experimental and computational methods. Computational routines have mostly used structural and physicochemical parameters to predict the antigenic propensity of protein sites. However, the performance of computational routines has been low compared with experimental alternatives. Here we describe the construction of machine learning based classifiers to improve the prediction quality for identifying linear B-cell epitopes on proteins. Our approach combines several parameters previously associated with antigenicity and includes novel parameters based on frequencies of amino acids and amino acid neighborhood propensities. We used machine learning algorithms to derive antigenicity classification functions that assign an antigenic propensity to each amino acid of a given protein sequence. We compared the prediction quality of the novel classifiers with that of established routines for epitope scoring, and tested prediction accuracy on experimental data available for HIV proteins. The major finding is that machine learning classifiers clearly outperform the reference classification systems on the HIV epitope validation set. Copyright © 2006 John Wiley & Sons, Ltd.
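As a rough illustration of a frequency-based parameter and per-residue scoring, the Python sketch below derives a log-odds scale from amino acid frequencies and averages it over a sliding window. It is not the classifier described in the abstract. The training sequences, window size, pseudocount, and function names are all hypothetical assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def log_odds_scale(epitopes, background, pseudocount=1.0):
    """Log-odds of each amino acid's frequency in epitopes vs. background."""
    pos = Counter("".join(epitopes))
    neg = Counter("".join(background))
    pos_total = sum(pos[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    neg_total = sum(neg[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {
        a: math.log(((pos[a] + pseudocount) / pos_total)
                    / ((neg[a] + pseudocount) / neg_total))
        for a in AMINO_ACIDS
    }

def residue_scores(sequence, scale, window=3):
    """Assign each residue the mean scale value over its local neighborhood."""
    half = window // 2
    scores = []
    for i in range(len(sequence)):
        neighbours = sequence[max(0, i - half): i + half + 1]
        scores.append(sum(scale.get(a, 0.0) for a in neighbours) / len(neighbours))
    return scores

# Hypothetical toy training data (not real epitope sequences).
scale = log_odds_scale(epitopes=["KKDE"], background=["LLVA"])
scores = residue_scores("KKLL", scale)
print([round(s, 2) for s in scores])
```

A real classifier would combine many such parameters and learn the decision function from experimentally validated epitopes.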


Discovering robust protein biomarkers for disease from relative expression reversals in 2-D DIGE data

PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 8 2007
Troy J. Anderson
Abstract: This study assesses the ability of a novel family of machine learning algorithms to identify changes in relative protein expression levels, measured using 2-D DIGE data, that support accurate class prediction. The analysis used a training set of 36 total cellular lysates comprising six normal and three cancer biological replicates (the remainder are technical replicates) and a validation set of four normal and two cancer samples. Protein samples were separated by 2-D DIGE, and expression was quantified using DeCyder-2D Differential Analysis Software. The relative expression reversal (RER) classifier correctly classified 9/9 training biological samples (p < 0.022), as estimated using a modified version of leave-one-out cross-validation, and 6/6 validation samples. The classification rule involved comparison of expression levels for a single pair of protein spots, tropomyosin isoforms and α-enolase, both of which have prior association as potential biomarkers in cancer. The data were also analyzed using algorithms similar to those found in the extended data analysis package of the DeCyder software. We propose that, by accounting for sources of within- and between-gel variation, RER classifiers applied to 2-D DIGE data provide a useful approach for identifying biomarkers that discriminate among protein samples of interest.
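A classification rule based on comparing the expression levels of a single pair of protein spots can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published RER procedure: the spot names, intensity values, and function names are hypothetical, and neither the gel-variation handling nor the cross-validation described in the abstract is modeled.

```python
from itertools import combinations

def rer_predict(sample, pair, label_if_greater, label_otherwise):
    """Classify a sample by the ordering of two spot expression levels."""
    a, b = pair
    return label_if_greater if sample[a] > sample[b] else label_otherwise

def fit_rer(samples, labels):
    """Exhaustively pick the spot pair and orientation with the most
    correct training predictions. Returns (correct, pair, hi, lo)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    best = None
    for a, b in combinations(sorted(samples[0]), 2):
        for hi, lo in (classes, classes[::-1]):
            correct = sum(
                rer_predict(s, (a, b), hi, lo) == y
                for s, y in zip(samples, labels)
            )
            if best is None or correct > best[0]:
                best = (correct, (a, b), hi, lo)
    return best

# Hypothetical expression levels for three spots in four samples.
samples = [
    {"spot_1": 2.0, "spot_2": 1.0, "spot_3": 1.5},
    {"spot_1": 3.0, "spot_2": 1.2, "spot_3": 3.5},
    {"spot_1": 0.8, "spot_2": 1.9, "spot_3": 1.0},
    {"spot_1": 1.1, "spot_2": 2.4, "spot_3": 0.5},
]
labels = ["normal", "normal", "cancer", "cancer"]

correct, pair, hi, lo = fit_rer(samples, labels)
print(f"rule: {hi} if {pair[0]} > {pair[1]} else {lo}; "
      f"{correct}/{len(samples)} training samples correct")
```

Because the rule depends only on which of two spots is more abundant within the same sample, it is insensitive to any scaling that affects both spots equally.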