Test Datasets (test + dataset)

Selected Abstracts


Multiple classifier integration for the prediction of protein structural classes

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 14 2009
Lei Chen
Abstract Supervised classifiers, such as artificial neural networks, partition trees, and support vector machines, are often used for the prediction and analysis of biological data. However, choosing an appropriate classifier is not straightforward because each classifier has its own strengths and weaknesses, and each biological dataset has its own characteristics. By integrating many classifiers together, one can avoid the dilemma of choosing an individual classifier out of many to achieve optimized classification results (Rahman et al., Multiple Classifier Combination for Character Recognition: Revisiting the Majority Voting System and Its Variation, Springer, Berlin, 2002, 167–178). The classification algorithms come from Weka (Witten and Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, San Francisco, 2005), a collection of software tools for machine learning algorithms. By integrating many predictors (classifiers) together through simple voting, the correct prediction (classification) rates are 65.21% and 65.63% for a basic training dataset and an independent test set, respectively. These results are better than those of any single machine learning algorithm collected in Weka when exactly the same data are used. Furthermore, we introduce an integration strategy that takes care of both classifier weightings and classifier redundancy. A feature selection strategy, called minimum redundancy maximum relevance (mRMR), is transferred into algorithm selection to deal with classifier redundancy in this research, and the weightings are based on the performance of each classifier. The best classification results are obtained when 11 algorithms are selected by the mRMR method and integrated together through majority votes with weightings. As a result, the correct prediction rates are 68.56% and 69.29% for the basic training dataset and the independent test dataset, respectively.
The web-server is available at http://chemdata.shu.edu.cn/protein_st/. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009
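
The difference between simple voting and majority voting with weightings can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the class labels, and the weights below are hypothetical, and the weights merely stand in for some per-classifier performance score such as accuracy on held-out data.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical performance scores).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties go to the label that sorts first; this is an arbitrary choice for the sketch.
    return min(totals, key=lambda label: (-totals[label], label))


# Hypothetical votes from five classifiers for a single protein.
votes = ["alpha", "alpha", "beta", "beta", "beta"]
scores = [0.9, 0.8, 0.5, 0.5, 0.5]

simple = weighted_majority_vote(votes, [1.0] * len(votes))  # every vote counts equally
weighted = weighted_majority_vote(votes, scores)            # stronger classifiers count more
print(simple, weighted)
```

With equal weights the three weaker classifiers win the vote; once the weights are applied, the two stronger classifiers outweigh them.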


Population pharmacokinetics of darbepoetin alfa in healthy subjects

BRITISH JOURNAL OF CLINICAL PHARMACOLOGY, Issue 1 2007
Balaji Agoram
Aim To develop and evaluate a population pharmacokinetic (PK) model of the long-acting erythropoiesis-stimulating protein, darbepoetin alfa, in healthy subjects. Methods PK profiles were obtained from 140 healthy subjects receiving single intravenous and/or single or multiple subcutaneous doses of darbepoetin alfa (0.75–8.0 µg kg⁻¹, or either 80 or 500 µg). Data were analysed by a nonlinear mixed-effects modelling approach using NONMEM software. Influential covariates were identified by covariate analysis emphasizing parameter estimates and their confidence intervals, rather than stepwise hypothesis testing. The model was evaluated by comparing simulated profiles (obtained using the covariate model) to the observed profiles in a test dataset. Results The population PK model, including first-order absorption, two-compartment disposition and first-order elimination, provided a good description of data. Modelling indicated that for a 70-kg human, the observed nearly twofold disproportionate dose–exposure relationship at the 8.0 µg kg⁻¹ dose relative to the 0.75 µg kg⁻¹ dose may reflect changing relative bioavailability, which increased from approximately 48% at 0.75 µg kg⁻¹ to 78% at 8.0 µg kg⁻¹. The covariate analysis showed that increasing body weight may be related to increasing clearance and central compartment volume, and that the absorption rate constant decreased with increasing age. The full covariate model performed adequately in a fixed-effects prediction test against an external dataset. Conclusion The developed population PK model describes the inter- and intraindividual variability in darbepoetin alfa PK. The model is a suitable tool for predicting the PK response of darbepoetin alfa using clinically untested dosing regimens.
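
The structural model named in the abstract (first-order absorption, two-compartment disposition, first-order elimination) can be sketched as a small simulation. This is only an illustration of that model class for one subject after a single subcutaneous dose: the function name, the forward-Euler integration, and every parameter value are assumptions made here, not estimates from the study, and the population (mixed-effects) and covariate parts of the analysis are not represented.

```python
def simulate_sc_dose(dose, f_bio, ka, cl, vc, q, vp, t_end, dt=0.1):
    """Forward-Euler simulation of a two-compartment model with first-order absorption.

    dose: administered amount; f_bio: bioavailable fraction;
    ka: absorption rate constant (1/h); cl: clearance (L/h);
    vc, vp: central and peripheral volumes (L); q: intercompartmental clearance (L/h).
    Returns the concentration profile and the final amounts in each state.
    """
    depot, central, periph, eliminated = f_bio * dose, 0.0, 0.0, 0.0
    profile = [(0.0, 0.0)]  # (time in h, central concentration)
    for step in range(1, int(round(t_end / dt)) + 1):
        absorbed = ka * depot * dt              # depot -> central
        cleared = (cl / vc) * central * dt      # central -> eliminated
        to_periph = (q / vc) * central * dt     # central -> peripheral
        to_central = (q / vp) * periph * dt     # peripheral -> central
        depot -= absorbed
        central += absorbed - cleared - to_periph + to_central
        periph += to_periph - to_central
        eliminated += cleared
        profile.append((step * dt, central / vc))
    return profile, (depot, central, periph, eliminated)


# All values below are hypothetical and chosen only to make the sketch run.
profile, amounts = simulate_sc_dose(
    dose=100.0, f_bio=0.5, ka=0.05, cl=0.1, vc=3.0, q=0.2, vp=2.0, t_end=500.0
)
t_peak, c_peak = max(profile, key=lambda point: point[1])
print(f"peak concentration {c_peak:.3f} at t = {t_peak:.1f} h")
```

Because every transfer is subtracted from one state and added to another, the four amounts always sum to the bioavailable dose, which gives a quick sanity check on the bookkeeping.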


A comparison of active set method and genetic algorithm approaches for learning weighting vectors in some aggregation operators

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 9 2001
David Nettleton
In this article we compare two contrasting methods, active set method (ASM) and genetic algorithms, for learning the weights in aggregation operators, such as weighted mean (WM), ordered weighted average (OWA), and weighted ordered weighted average (WOWA). We give the formal definitions for each of the aggregation operators, explain the two learning methods, give results of processing for each of the methods and operators with simple test datasets, and contrast the approaches and results. © 2001 John Wiley & Sons, Inc.
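
For readers unfamiliar with the operators, the following minimal sketch shows the standard definitions of the weighted mean and the ordered weighted average, under the usual assumption that the weights are non-negative and sum to 1. The function names and example numbers are illustrative, WOWA is omitted for brevity, and the weight-learning step compared in the article is not shown.

```python
import math


def _check_weights(values, weights):
    """Validate the usual constraints on an aggregation weighting vector."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")


def weighted_mean(values, weights):
    """WM: weight i is attached to information source i."""
    _check_weights(values, weights)
    return sum(w * x for w, x in zip(weights, values))


def owa(values, weights):
    """OWA: weight i is attached to the i-th largest value, whichever source gave it."""
    _check_weights(values, weights)
    ordered = sorted(values, reverse=True)
    return sum(w * x for w, x in zip(weights, ordered))


values = [0.2, 0.9, 0.5]   # hypothetical readings from three sources
weights = [0.5, 0.3, 0.2]  # hypothetical weighting vector

print(weighted_mean(values, weights))  # weights follow source order
print(owa(values, weights))            # weights follow rank order
```

The two operators use the same arithmetic and differ only in what a weight refers to: a source in WM, a rank position in OWA. For example, the OWA weighting vector (1, 0, 0) returns the maximum and (0, 0, 1) returns the minimum.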


Bibliomining for automated collection development in a digital library setting: Using data mining to discover Web-based scholarly research works

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, Issue 12 2003
Scott Nicholson
This research creates an intelligent agent for automated collection development in a digital library setting. It uses a predictive model based on facets of each Web page to select scholarly works. The criteria came from the academic library selection literature, and a Delphi study was used to refine the list to 41 criteria. A Perl program was designed to analyze a Web page for each criterion and applied to a large collection of scholarly and nonscholarly Web pages. Bibliomining, or data mining for libraries, was then used to create different classification models. Four techniques were used: logistic regression, nonparametric discriminant analysis, classification trees, and neural networks. Accuracy and return were used to judge the effectiveness of each model on test datasets. In addition, a set of problematic pages that were difficult to classify because of their similarity to scholarly research was gathered and classified using the models. The resulting models could be used in the selection process to automatically create a digital library of Web-based scholarly research works. In addition, the technique can be extended to create a digital library of any type of structured electronic information.


Textural analysis of contrast-enhanced MR images of the breast

MAGNETIC RESONANCE IN MEDICINE, Issue 1 2003
Peter Gibbs
Abstract Texture analysis was applied to high-resolution, contrast-enhanced (CE) images of the breast to provide a method of lesion discrimination. Significant differences were seen between benign and malignant lesions for a number of textural features, including entropy and sum entropy. Using logistic regression analysis (LRA), a diagnostic accuracy of Az = 0.80 ± 0.07 was obtained with a model requiring only three parameters. By initially dividing the patient data into training and test datasets, reasonable model robustness was also established. On combining features obtained using textural analysis with lesion size, time to maximum enhancement, and patient age, a diagnostic accuracy of Az = 0.92 ± 0.05 was demonstrated. Magn Reson Med 50:92–98, 2003. © 2003 Wiley-Liss, Inc.
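
As a rough illustration of what an entropy texture feature measures, the sketch below computes the Shannon entropy of grey-level pair frequencies between neighbouring pixels. The abstract does not state how its features were computed, so the co-occurrence formulation, the function name, the pixel offset, and the toy images are all assumptions made for this example.

```python
import math
from collections import Counter


def cooccurrence_entropy(image, offset=(0, 1)):
    """Shannon entropy (in bits) of grey-level pairs separated by `offset`.

    image: list of rows, each a list of integer grey levels.
    offset: (row step, column step) between the two pixels of a pair.
    """
    d_row, d_col = offset
    pairs = Counter()
    for r, row in enumerate(image):
        for c, level in enumerate(row):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < len(image) and 0 <= c2 < len(image[r2]):
                pairs[(level, image[r2][c2])] += 1
    total = sum(pairs.values())
    if total == 0:
        raise ValueError("image has no pixel pairs at this offset")
    return -sum((n / total) * math.log2(n / total) for n in pairs.values())


flat = [[1, 1, 1], [1, 1, 1]]  # uniform patch: a single pair type
checker = [[0, 1], [1, 0]]     # alternating patch: two equally likely pair types

print(cooccurrence_entropy(flat))
print(cooccurrence_entropy(checker))
```

A uniform patch yields zero entropy, and the value grows as the grey-level pairs become more varied, which is why such a feature can separate smooth from heterogeneous regions.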