Prediction Accuracy (prediction + accuracy)

Selected Abstracts


SOIL EROSION AND SEDIMENT YIELD PREDICTION ACCURACY USING WEPP

JOURNAL OF THE AMERICAN WATER RESOURCES ASSOCIATION, Issue 2 2004
John M. Laflen
ABSTRACT: The objectives of this paper are to discuss expectations for the Water Erosion Prediction Project (WEPP) accuracy, to review published studies related to WEPP goodness of fit, and to evaluate these in the context of expectations for WEPP's goodness of fit. WEPP model erosion predictions have been compared in numerous studies to observed values for soil loss and sediment delivery from cropland plots, forest roads, irrigated lands and small watersheds. A number of different techniques for evaluating WEPP have been used, including one recently developed where the ability of WEPP to accurately predict soil erosion can be compared to the accuracy of replicated plots to predict soil erosion. In one study involving 1,594 years of data from runoff plots, WEPP performed similarly to the Universal Soil Loss Equation (USLE) technology, indicating that WEPP has met the criteria of results being "at least as good with respect to observed data and known relationships as those from the USLE," particularly when the USLE technology was developed using relationships derived from that data set, and using soil erodibility values measured on those plots using data sets from the same period of record. In many cases, WEPP performed as well as could be expected, based on comparisons with the variability in replicate data sets. One major finding has been that soil erodibility values calculated using the technology in WEPP for rainfall conditions may not be suitable for furrow irrigated conditions. WEPP was found to represent the major storms that account for high percentages of soil loss quite well (a single-storm application for which the USLE technology is unsuitable), and WEPP has performed well for disturbed forests and forest roads. WEPP has been able to reflect the extremes of soil loss, being quite responsive to the wide differences in cropping, tillage, and other forms of management, one of the requirements for WEPP validation.
WEPP was also found to perform well on a wide range of small watersheds, an area where USLE technology cannot be used. [source]
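The replicate-based evaluation technique mentioned in the abstract can be sketched in a few lines: if the model's error against observed soil loss is no larger than the disagreement between paired replicate plots, the model predicts as well as the data allow. A minimal illustration with invented soil-loss values (not from the study):

```python
# Hedged sketch: compare model error against the error made when one
# replicate plot "predicts" its twin. All values are invented.

def rmse(a, b):
    """Root mean square error between two equal-length series."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

observed  = [2.1, 5.4, 0.9, 7.2, 3.3]   # t/ha, plot replicate 1
replicate = [2.6, 4.8, 1.3, 6.5, 3.9]   # t/ha, plot replicate 2
wepp_pred = [2.4, 5.0, 1.1, 6.9, 3.6]   # model predictions (toy values)

model_error = rmse(observed, wepp_pred)
replicate_error = rmse(observed, replicate)
# model performs "as well as could be expected" when its error is no
# larger than the natural plot-to-plot variability
as_good_as_replicates = model_error <= replicate_error
```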


The Violence Proneness Scale of the DUSI-R Predicts Adverse Outcomes Associated with Substance Abuse

THE AMERICAN JOURNAL ON ADDICTIONS, Issue 2 2009
Levent Kirisci PhD
Accuracy of the Violence Proneness Scale (VPS) of the Drug Use Screening Inventory (DUSI-R) was evaluated in 328 boys for predicting use of illegal drugs, DUI, selling drugs, sexually transmitted disease, car accident while under acute effects of drugs/alcohol, trading drugs for sex, injuries from a fight, and traumatic head injury. Boys were prospectively tracked from age 16 to 19, at which time these outcomes were documented for the interim period. The results demonstrated that the VPS score is a significant predictor of all outcomes. Prediction accuracy ranged between 62% and 83%. These findings suggest that the VPS may be useful for identifying youths who are at high risk for using illicit drugs and commonly associated adverse outcomes. [source]


In silico prediction and screening of γ-secretase inhibitors by molecular descriptors and machine learning methods

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 6 2010
Xue-Gang Yang
Abstract γ-Secretase inhibitors have been explored for the prevention and treatment of Alzheimer's disease (AD). Methods for prediction and screening of γ-secretase inhibitors are highly desired for facilitating the design of novel therapeutic agents against AD, especially given incomplete knowledge of the mechanism and three-dimensional structure of γ-secretase. We explored two machine learning methods, support vector machine (SVM) and random forest (RF), to develop models for predicting γ-secretase inhibitors of diverse structures. Quantitative analysis of the receiver operating characteristic (ROC) curve was performed to further examine and optimize the models. In particular, the Youden index (YI) was introduced into the ROC analysis of RF to obtain an optimal threshold of probability for prediction. The developed models were validated by an external testing set, with prediction accuracies for SVM and RF of 96.48 and 98.83% for γ-secretase inhibitors and 98.18 and 99.27% for noninhibitors, respectively. Different feature selection methods were used to extract the physicochemical features most relevant to γ-secretase inhibition. To the best of our knowledge, the RF model developed in this work is the first model with a broad applicability domain, based on which the virtual screening of γ-secretase inhibitors against the ZINC database was performed, resulting in 368 potential hit candidates. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010 [source]
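The Youden-index step described above is simple to state: sweep the classifier's probability threshold along the ROC curve and keep the threshold that maximizes J = sensitivity + specificity − 1. A minimal sketch with invented scores and labels (not the paper's data):

```python
# Hedged sketch of Youden-index threshold selection on a ROC sweep.
# Scores and labels below are illustrative, not from the paper.

def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = TPR - FPR over all cutoffs."""
    best_j, best_t = -1.0, None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return best_t, best_j

# toy predicted probabilities: 1 = inhibitor, 0 = noninhibitor
scores = [0.95, 0.90, 0.80, 0.70, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
t, j = youden_threshold(scores, labels)
```

Classifying a compound as an inhibitor when its predicted probability is at least `t` then gives the sensitivity/specificity trade-off that J deems optimal.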


Incorporating structural characteristics for identification of protein methylation sites

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 9 2009
Dray-Ming Shien
Abstract Studies over the last few years have identified protein methylation on histones and other proteins that are involved in the regulation of gene transcription. Several works have developed approaches to identify computationally the potential methylation sites on lysine and arginine. Studies of protein tertiary structure have demonstrated that the sites of protein methylation are preferentially in regions that are easily accessible. However, previous studies have not taken into account the solvent-accessible surface area (ASA) that surrounds the methylation sites. This work presents a method named MASA that combines the support vector machine with the sequence and structural characteristics of proteins to identify methylation sites on lysine, arginine, glutamate, and asparagine. Since most experimental methylation sites are not associated with corresponding protein tertiary structures in the Protein Data Bank, effective solvent-accessibility prediction tools were adopted to estimate the ASA values of amino acids in proteins. Evaluation of predictive performance by cross-validation indicates that the ASA values around the methylation sites can improve the accuracy of prediction. Additionally, an independent test reveals that the prediction accuracies for methylated lysine and arginine are 80.8 and 85.0%, respectively. Finally, the proposed method is implemented as an effective system for identifying protein methylation sites. The developed web server is freely available at http://MASA.mbc.nctu.edu.tw/. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009 [source]
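The feature construction implied above can be sketched as follows: for each candidate site, the residue identities in a sequence window around the site are combined with the (predicted) ASA values of those positions into one vector for the classifier. The encoding, window size, and ASA values here are invented for illustration; the paper's actual feature set may differ.

```python
# Hedged sketch: window features = one-hot residue identity + ASA value
# per position. Sequence, ASA values, and window size are illustrative.

AA = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids

def site_features(seq, asa, pos, w=2):
    """Feature vector for positions pos-w .. pos+w: 20-dim one-hot
    residue encoding plus one ASA value per position (21 per slot)."""
    feats = []
    for i in range(pos - w, pos + w + 1):
        onehot = [0.0] * len(AA)
        if 0 <= i < len(seq):
            onehot[AA.index(seq[i])] = 1.0
            feats.extend(onehot + [asa[i]])
        else:
            feats.extend(onehot + [0.0])   # zero padding beyond termini
    return feats

seq = "MKRKSAG"                               # toy peptide
asa = [0.9, 0.6, 0.8, 0.7, 0.4, 0.5, 0.9]     # toy relative ASA values
x = site_features(seq, asa, pos=3)            # candidate K at index 3
```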


Configurational-bias sampling technique for predicting side-chain conformations in proteins

PROTEIN SCIENCE, Issue 9 2006
Tushar Jain
Abstract Prediction of side-chain conformations is an important component of several biological modeling applications. In this work, we have developed and tested an advanced Monte Carlo sampling strategy for predicting side-chain conformations. Our method is based on a cooperative rearrangement of atoms that belong to a group of neighboring side-chains. This rearrangement is accomplished by deleting groups of atoms from the side-chains in a particular region, and regrowing them with the generation of trial positions that depends on both a rotamer library and a molecular mechanics potential function. This method allows us to incorporate flexibility about the rotamers in the library and explore phase space in a continuous fashion about the primary rotamers. We have tested our algorithm on a set of 76 proteins using the all-atom AMBER99 force field and electrostatics that are governed by a distance-dependent dielectric function. When the tolerance for correct prediction of the dihedral angles is a <20° deviation from the native state, our prediction accuracies for χ1 are 83.3% and for χ1 and χ2 are 65.4%. The accuracies of our predictions are comparable to the best results in the literature that often used Hamiltonians that have been specifically optimized for side-chain packing. We believe that the continuous exploration of phase space enables our method to overcome limitations inherent with using discrete rotamers as trials. [source]
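The accuracy criterion used above (a predicted dihedral is "correct" when it deviates from the native angle by less than 20°) has one subtlety worth making explicit: angle differences must respect the 360° periodicity. A minimal sketch with invented angles:

```python
# Hedged sketch of the <20-degree accuracy criterion with periodic
# wrap-around. The predicted/native chi angles below are invented.

def angle_dev(pred, native):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(pred - native) % 360.0
    return min(d, 360.0 - d)

def chi_accuracy(pred_angles, native_angles, tol=20.0):
    """Fraction of residues whose listed chi angles are all within tol."""
    hits = sum(
        1 for p, n in zip(pred_angles, native_angles)
        if all(angle_dev(pi, ni) < tol for pi, ni in zip(p, n))
    )
    return hits / len(pred_angles)

# toy (chi1,) predictions vs. native values for five residues
pred = [(-65.0,), (175.0,), (62.0,), (-170.0,), (55.0,)]
native = [(-60.0,), (-179.0,), (65.0,), (160.0,), (-60.0,)]
acc = chi_accuracy(pred, native)
```

Note that 175° and −179° count as a hit (they differ by only 6° once the wrap is applied), which a naive subtraction would miss.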


Short-Term Traffic Volume Forecasting Using Kalman Filter with Discrete Wavelet Decomposition

COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, Issue 5 2007
Yuanchang Xie
Short-term traffic volume data are often corrupted by local noises, which may significantly affect the prediction accuracy of short-term traffic volumes. Discrete wavelet decomposition analysis is used to divide the original data into approximation and detail components so that the Kalman filter model can be applied to the denoised data and the prediction accuracy improved. Two types of wavelet Kalman filter models based on Daubechies 4 and Haar mother wavelets are investigated. Traffic volume data collected from four different locations are used for comparison in this study. The test results show that both proposed wavelet Kalman filter models outperform the direct Kalman filter model in terms of mean absolute percentage error and root mean square error. [source]
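The two-stage idea above can be sketched compactly for the Haar case: a one-level Haar decomposition keeps the smooth approximation band and drops the detail band (the local noise), and a scalar Kalman filter with a random-walk state model then tracks the denoised series. All parameter values here are illustrative, not from the paper.

```python
# Hedged sketch: Haar wavelet denoising followed by a scalar Kalman
# filter. Process/measurement variances q, r and the toy counts are
# illustrative only; the paper also uses Daubechies 4 wavelets.

def haar_denoise(x):
    """One-level Haar DWT (even-length input assumed): zero the detail
    coefficients and reconstruct, i.e. replace each pair by its mean."""
    approx = [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]
    out = []
    for a in approx:            # inverse transform with details set to 0
        out += [a, a]
    return out

def kalman_filter(z, q=1.0, r=4.0):
    """Scalar random-walk Kalman filter; returns filtered estimates."""
    x, p, est = z[0], 1.0, []
    for zk in z:
        p = p + q                   # predict step (state variance grows)
        k = p / (p + r)             # Kalman gain
        x = x + k * (zk - x)        # update with measurement zk
        p = (1.0 - k) * p
        est.append(x)
    return est

volumes = [100, 104, 98, 102, 140, 96, 101, 99]   # toy 5-min counts
smooth = haar_denoise(volumes)
filtered = kalman_filter(smooth)
```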


Job completion prediction using case-based reasoning for Grid computing environments

CONCURRENCY AND COMPUTATION: PRACTICE & EXPERIENCE, Issue 9 2007
Lilian Noronha Nassif
Abstract One of the main focuses of Grid computing is solving resource-sharing problems in multi-institutional virtual organizations. In such heterogeneous and distributed environments, selecting the best resource to run a job is a complex task. The solutions currently employed still present numerous challenges, and one of them is how to let users know when a job will finish. Consequently, advance reservation remains unavailable. This article presents a new approach, which makes predictions for job execution time in Grids by applying the case-based reasoning paradigm. The work includes the development of a new case retrieval algorithm involving relevance sequence and similarity degree calculations. The prediction model is part of a multi-agent system that selects the best resource of a computational Grid to run a job. Agents representing candidate resources for job execution make predictions in a distributed and parallel manner. The technique presented here can be used in Grid environments at operation time to assist users with batch job submissions. Experimental results validate the prediction accuracy of the proposed mechanisms, and the performance of our case retrieval algorithm. Copyright © 2006 John Wiley & Sons, Ltd. [source]
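The case-based core of this approach can be sketched simply: past job executions are stored as cases, a new job is compared to them by a weighted similarity degree over normalized features, and the runtimes of the most similar cases yield the prediction. The features, weights, and cases below are invented for illustration, and the paper's relevance-sequence step is omitted.

```python
# Hedged sketch of case-based runtime prediction. Feature names,
# weights, and case values are hypothetical; features are assumed
# normalized to [0, 1].

def similarity(case, query, weights):
    """Weighted similarity degree in [0, 1]."""
    s = sum(w * (1.0 - abs(case[f] - query[f])) for f, w in weights.items())
    return s / sum(weights.values())

def predict_runtime(cases, query, weights, k=2):
    """Mean runtime of the k most similar past cases."""
    ranked = sorted(cases, key=lambda c: similarity(c, query, weights),
                    reverse=True)
    top = ranked[:k]
    return sum(c["runtime"] for c in top) / len(top)

weights = {"cpu_load": 0.5, "input_size": 0.3, "node_speed": 0.2}
cases = [
    {"cpu_load": 0.2, "input_size": 0.1, "node_speed": 0.9, "runtime": 120.0},
    {"cpu_load": 0.8, "input_size": 0.9, "node_speed": 0.3, "runtime": 610.0},
    {"cpu_load": 0.3, "input_size": 0.2, "node_speed": 0.8, "runtime": 150.0},
]
query = {"cpu_load": 0.25, "input_size": 0.15, "node_speed": 0.85}
eta = predict_runtime(cases, query, weights)   # seconds, toy estimate
```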


Gene–gene interactions between HNF4A and KCNJ11 in predicting Type 2 diabetes in women

DIABETIC MEDICINE, Issue 11 2007
L. Qi
Abstract Aims Recent studies indicate transcription factor hepatocyte nuclear factor 4α (HNF-4α, HNF4A) modulates the transcription of the pancreatic B-cell ATP-sensitive K+ (KATP) channel subunit Kir6.2 gene (KCNJ11). Both HNF4A and KCNJ11 have previously been associated with diabetes risk, but little is known about whether the variations in these genes interact with each other. Methods We conducted a prospective, nested case–control study of 714 incident cases of Type 2 diabetes and 1120 control subjects from the Nurses' Health Study. Results KCNJ11 E23K was significantly associated with an increased diabetes risk (odds ratio 1.26, 95% CI 1.03–1.53) while HNF4A P2 promoter polymorphisms were associated with a moderately increased risk at borderline significance. By using a logistic regression model, we found significant interactions between HNF4A rs2144908, rs4810424 and rs1884613 and KCNJ11 E23K (P for interaction = 0.017, 0.012 and 0.004, respectively). Carrying the minor alleles of the three HNF4A polymorphisms was associated with significantly greater diabetes risk in women carrying the KCNJ11 allele 23K, but not in those who did not carry this allele. Analyses using the multifactor dimensionality reduction (MDR) method confirmed the gene–gene interaction. We identified that the best interaction model included HNF4A rs2144908 and KCNJ11 E23K. Such a two-locus model showed the maximum cross-validation consistency of 10 out of 10 and a significant prediction accuracy of 54.2% (P = 0.01) on the basis of 1000-fold permutation testing. Conclusions Our data indicate that HNF4A P2 promoter polymorphisms may interact with KCNJ11 E23K in predicting Type 2 diabetes in women. [source]
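The core MDR step used above can be sketched in a few lines: two-locus genotype combinations are pooled into "high-risk" and "low-risk" cells according to whether each cell's case:control ratio exceeds the overall ratio, and the resulting two-level attribute is scored as a classifier. The genotype counts below are invented for illustration; they are not the study's data.

```python
# Hedged sketch of the two-locus MDR pooling step. Counts are toy
# values, and the cross-validation/permutation machinery is omitted.

def mdr_classify(counts):
    """counts: {(locus1, locus2): (n_cases, n_controls)}.
    A cell is high-risk when its case:control ratio exceeds the overall
    ratio. Returns (high-risk cells, classification accuracy of the
    resulting two-level rule)."""
    total_cases = sum(c for c, _ in counts.values())
    total_ctrls = sum(k for _, k in counts.values())
    overall = total_cases / total_ctrls
    high = {cell for cell, (c, k) in counts.items()
            if k == 0 or c / k > overall}
    # high-risk cells classify their cases correctly; low-risk cells
    # classify their controls correctly
    correct = sum(c if cell in high else k
                  for cell, (c, k) in counts.items())
    return high, correct / (total_cases + total_ctrls)

# toy counts: KCNJ11 E23K carrier status x HNF4A rs2144908 allele
counts = {
    ("K", "minor"): (30, 10),
    ("K", "major"): (20, 20),
    ("E", "minor"): (10, 20),
    ("E", "major"): (15, 35),
}
high_risk, acc = mdr_classify(counts)
```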


Determinants for the successful establishment of exotic ants in New Zealand

DIVERSITY AND DISTRIBUTIONS, Issue 4 2005
Philip J. Lester
ABSTRACT Biological invasions can dramatically alter ecosystems. An ability to predict the establishment success for exotic species is important for biosecurity and conservation purposes. I examine the exotic New Zealand ant fauna for characteristics that predict or determine an exotic species' ability to establish. Quarantine records show interceptions of 66 ant species: 17 of which have established, 43 have failed to establish, whereas nests of another six are periodically observed but have failed to establish permanently (called 'ephemeral' establishment). Mean temperature at the highest latitude and interception variables were the only factors significantly different between established, failed or ephemeral groups. Aspects of life history, such as competitive behaviour and morphology, were not different between groups. However, in a stepwise discriminant analysis, small size was a key factor influencing establishment success. Interception rate and climate were also secondarily important. The resulting classification table predicted establishment success with 71% accuracy. Because not all exotic species are represented in quarantine records, a further discriminant model is described without interception data. Though with less accuracy (65%) than the full model, it still correctly predicted the success or failure of four species not used in the previous analysis. Techniques for improving the prediction accuracy are discussed. Predicting which species will establish in a new area appears an achievable goal, which will be a valuable tool for conservation biology. [source]


Evaluation of the PESERA model in two contrasting environments

EARTH SURFACE PROCESSES AND LANDFORMS, Issue 5 2009
F. Licciardello
Abstract The performance of the Pan-European Soil Erosion Risk Assessment (PESERA) model was evaluated by comparison with existing soil erosion data collected in plots under different land uses and climate conditions in Europe. In order to identify the most important sources of error, the PESERA model was evaluated by comparing model output with measured values as well as by assessing the effect of the various model components on prediction accuracy through a multistep approach. First, the performance of the hydrological and erosion components of PESERA was evaluated separately by comparing both runoff and soil loss predictions with measured values. In order to assess the performance of the vegetation growth component of PESERA, the predictions of the model based on observed values of vegetation ground cover were also compared with predictions based on the simulated vegetation cover values. Finally, in order to evaluate the sediment transport model, predicted monthly erosion rates were also calculated using observed values of runoff and vegetation cover instead of simulated values. Moreover, in order to investigate the capability of PESERA to reproduce seasonal trends, the observed and simulated monthly runoff and erosion values were aggregated at different temporal scales and we investigated to what extent the model prediction error could be reduced by output aggregation. PESERA showed promise, predicting annual average spatial variability quite well. In its present form, short-term temporal variations are not well captured, probably for several reasons. The multistep approach showed that this is not only due to unrealistic simulation of cover and runoff; the erosion component itself is also an important source of error. Although variability between the investigated land uses and climate conditions is well captured, absolute rates are strongly underestimated.
A calibration procedure, focused on a soil erodibility factor, is proposed to reduce the significant underestimation of soil erosion rates. Copyright © 2009 John Wiley & Sons, Ltd. [source]
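The temporal-aggregation check described above amounts to computing relative prediction error at the monthly scale and again after summing to annual totals: monthly over- and under-predictions can cancel, so the aggregated error is often smaller. A minimal illustration with invented erosion values:

```python
# Hedged sketch of error reduction by output aggregation. Monthly
# observed/simulated values are invented; units are arbitrary (t/ha).

def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

obs = [5, 8, 3, 2, 1, 1, 2, 4, 9, 12, 7, 4]     # observed monthly values
sim = [6, 7, 4, 1, 2, 0, 3, 3, 10, 11, 8, 3]    # simulated monthly values

# relative error at monthly scale vs. after annual aggregation
monthly_rel = rmse(obs, sim) / (sum(obs) / len(obs))
annual_rel = abs(sum(obs) - sum(sim)) / sum(obs)
```

Here the alternating monthly errors cancel completely at the annual scale, the extreme case of the effect the authors investigate.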


Nonlinear determinism in river flow: prediction as a possible indicator

EARTH SURFACE PROCESSES AND LANDFORMS, Issue 7 2007
Bellie Sivakumar
Abstract Whether or not river flow exhibits nonlinear determinism remains an unresolved question. While studies on the use of nonlinear deterministic methods for modeling and prediction of river flow series are on the rise and the outcomes are encouraging, suspicions and criticisms of such studies continue to exist as well. An important reason for this situation is that the correlation dimension method, used as a nonlinear determinism identification tool in most of those studies, may possess certain limitations when applied to real river flow series, which are always finite and often short and also contaminated with noise (e.g. measurement error). In view of this, the present study addresses the issue of nonlinear determinism in river flow series using prediction as a possible indicator. This is done by (1) reviewing studies that have employed nonlinear deterministic methods (coupling phase-space reconstruction and local approximation techniques) for river flow predictions and (2) identifying nonlinear determinism (or linear stochasticity) based on the level of prediction accuracy in general, and on the prediction accuracy against the phase-space reconstruction parameters in particular (termed the 'inverse approach'). The results not only provide possible indications of the presence of nonlinear determinism in the river flow series studied, but also support, both qualitatively and quantitatively, the low correlation dimensions reported for such series. Therefore, nonlinear deterministic methods are a viable complement to linear stochastic ones for studying river flow dynamics, if sufficient caution is exercised in their applications and in interpreting the outcomes. Copyright © 2006 John Wiley & Sons, Ltd. [source]
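The "phase-space reconstruction plus local approximation" machinery reviewed above can be sketched in miniature: embed the series with time delays, find the embedded neighbours of the current state, and predict the next value from those neighbours' successors. The series and parameters below are illustrative only.

```python
# Hedged sketch of time-delay embedding with nearest-neighbour (local
# approximation) prediction. Embedding dimension m, delay tau, and the
# toy periodic "flow" series are illustrative.

def embed(series, m, tau):
    """Delay vectors (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    n = len(series) - (m - 1) * tau
    return [tuple(series[i + j * tau] for j in range(m)) for i in range(n)]

def local_predict(series, m=2, tau=1, k=2):
    """Predict the value following the last embedded state."""
    vecs = embed(series, m, tau)
    target = vecs[-1]
    # candidate neighbours are earlier states with a known successor
    candidates = list(enumerate(vecs[:-1]))
    candidates.sort(key=lambda iv: sum((a - b) ** 2
                                       for a, b in zip(iv[1], target)))
    nearest = candidates[:k]
    # successor of delay vector i is series[i + (m-1)*tau + 1]
    return sum(series[i + (m - 1) * tau + 1] for i, _ in nearest) / k

flow = [1.0, 3.0, 2.0, 1.0, 3.0, 2.0, 1.0, 3.0]   # toy period-3 series
nxt = local_predict(flow)
```

For this perfectly periodic toy series the neighbours' successors agree and the prediction is exact; varying m and tau and watching how the prediction error responds is the essence of the "inverse approach".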


DWEPP: a dynamic soil erosion model based on WEPP source terms

EARTH SURFACE PROCESSES AND LANDFORMS, Issue 7 2007
N. S. Bulygina
Abstract A new rangeland overland-flow erosion model was developed based on Water Erosion Prediction Project (WEPP) sediment source and sink terms. Total sediment yield was estimated for rainfall simulation plots from the WEPP field experiments as well as for a small watershed without a well developed channel network. Both WEPP and DWEPP gave a similar level of prediction accuracy for total event soil losses measured from both rainfall simulation and small watershed experiments. Predictions for plot and hillslope scale erosion simulations were in the range of expected natural variability. Sediment yield dynamics were plotted and compared with experimental results for plots and hillslope, and the results were satisfactory. Effects of cover and canopy on the predicted sediment yields were well represented by the model. DWEPP provides a new tool for assessing erosion rates and dynamics, has physically based erosion mechanics descriptions, is sensitive to treatment differences on the experimental plots and has a well developed parameter database inherited from WEPP. Copyright © 2006 John Wiley & Sons, Ltd. [source]


A simple procedure to approximate slip displacement of freestanding rigid body subjected to earthquake motions

EARTHQUAKE ENGINEERING AND STRUCTURAL DYNAMICS, Issue 4 2007
Tomoyo Taniguchi
Abstract A simple calculation procedure for estimating the absolute maximum slip displacement of a freestanding rigid body placed on the ground or on a floor of a linear/nonlinear multi-storey building during an earthquake is developed. The proposed procedure uses the displacement induced by a horizontal sinusoidal acceleration to approximate the absolute maximum slip displacement, i.e. the basic slip displacement. The amplitude of this horizontal sinusoidal acceleration is identical to either the peak horizontal ground acceleration or the peak horizontal floor response acceleration. Its period meets the predominant period of the horizontal acceleration employed. The effects of vertical acceleration are considered to reduce the friction force monotonously. The root mean square value of the vertical acceleration at the peak horizontal acceleration is used. A mathematical solution of the basic slip displacement is presented. Employing over one hundred accelerograms, the absolute maximum slip displacements are computed and compared with the corresponding basic slip displacements. Their discrepancies are modelled by the logarithmic normal distribution regardless of the analytical conditions. The modification factor to the basic slip displacement is quantified based on the probability of non-exceedence of a certain threshold. Therefore, the product of the modification factor and the basic slip displacement gives the design slip displacement of the body as the maximum expected value. Since the location of the body and the linear/nonlinear state of the building cause the modification factor to vary slightly, matching it to the problem at hand is essential to secure prediction accuracy. Copyright © 2006 John Wiley & Sons, Ltd. [source]
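The basic slip displacement under a horizontal sinusoidal acceleration can also be obtained numerically, which makes the mechanics concrete: the block sticks while the base acceleration stays below the friction capacity μg and slides (with Coulomb friction opposing the relative motion) once it is exceeded. This is a simplified sketch, not the paper's closed-form solution; friction coefficient, amplitude, and period are invented, and vertical-acceleration effects are omitted.

```python
# Hedged sketch: Coulomb-friction sliding of a rigid block under base
# acceleration a_g(t) = amp*sin(2*pi*t/period). Relative motion obeys
# u'' = -a_g - mu*g*sign(u') while sliding, u'' = 0 while stuck.
import math

def slip_displacement(amp, period, mu=0.3, g=9.81, dt=1e-4, cycles=5):
    """Explicit-Euler estimate of the absolute maximum slip (m)."""
    w = 2.0 * math.pi / period
    v = x = max_x = t = 0.0           # relative velocity / displacement
    for _ in range(int(cycles * period / dt)):
        a_g = amp * math.sin(w * t)
        if v == 0.0 and abs(a_g) <= mu * g:
            a_rel = 0.0                              # stuck: friction holds
        else:
            direction = v if v != 0.0 else -a_g      # incipient slip sense
            a_rel = -a_g - mu * g * math.copysign(1.0, direction)
        v_new = v + a_rel * dt
        if v != 0.0 and v * v_new < 0.0:             # friction stops block
            v_new = 0.0
        v = v_new
        x += v * dt
        max_x = max(max_x, abs(x))
        t += dt
    return max_x

d_max = slip_displacement(6.0, 1.0)    # slips: 6.0 m/s^2 > mu*g (~2.94)
d_none = slip_displacement(2.0, 1.0)   # never slips: 2.0 < mu*g
```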


Differences in spatial predictions among species distribution modeling methods vary with species traits and environmental predictors

ECOGRAPHY, Issue 6 2009
Alexandra D. Syphard
Prediction maps produced by species distribution models (SDMs) influence decision-making in resource management or designation of land in conservation planning. Many studies have compared the prediction accuracy of different SDM modeling methods, but few have quantified the similarity among prediction maps. There has also been little systematic exploration of how the relative importance of different predictor variables varies among model types and affects map similarity. Our objective was to expand the evaluation of SDM performance for 45 plant species in southern California to better understand how map predictions vary among model types, and to explain what factors may affect spatial correspondence, including the selection and relative importance of different environmental variables. Four types of models were tested. Correlation among maps was highest between generalized linear models (GLMs) and generalized additive models (GAMs) and lowest between classification trees and GAMs or GLMs. Correlation between Random Forests (RFs) and GAMs was the same as between RFs and classification trees. Spatial correspondence among maps was influenced the most by model prediction accuracy (AUC) and species prevalence; map correspondence was highest when accuracy was high and prevalence was intermediate (average prevalence for all species was 0.124). Species functional type and the selection of climate variables also influenced map correspondence. For most (but not all) species, climate variables were more important than terrain or soil in predicting their distributions. Environmental variable selection varied according to modeling method, but the largest differences were between RFs and GLMs or GAMs. Although prediction accuracy was equal for GLMs, GAMs, and RFs, the differences in spatial predictions suggest that it may be important to evaluate the results of more than one model to estimate the range of spatial uncertainty before making planning decisions based on map outputs. 
This may be particularly important if models have low accuracy or if species prevalence is not intermediate. [source]
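The map-similarity measure at the heart of the comparison above is straightforward once the prediction maps are flattened to cell values: a Pearson correlation between the two vectors. The two small "maps" below are invented for illustration.

```python
# Hedged sketch: spatial correspondence of two suitability maps as the
# Pearson correlation over their cell values. Toy 6-cell maps.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

glm_map = [0.1, 0.4, 0.8, 0.9, 0.2, 0.3]   # e.g. GLM suitability values
gam_map = [0.2, 0.5, 0.7, 0.8, 0.1, 0.4]   # e.g. GAM suitability values
r = pearson(glm_map, gam_map)
```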


Simulating the spatial distribution of clay layer occurrence depth in alluvial soils with a Markov chain geostatistical approach

ENVIRONMETRICS, Issue 1 2010
Weidong Li
Abstract The spatial distribution information of clay layer occurrence depth (CLOD), particularly the spatial distribution maps of occurrence of clay layers at depths less than a certain threshold, in alluvial soils is crucial to designing appropriate plans and measures for precision agriculture and environmental management in alluvial plains. Markov chain geostatistics (MCG), which was proposed recently for simulating categorical spatial variables, can objectively decrease spatial uncertainty and consequently increase prediction accuracy in simulated results by using nonlinear estimators and incorporating various interclass relationships. In this paper, a MCG method was suggested to simulate the CLOD in a meso-scale alluvial soil area by encoding the continuous variable with several threshold values into binary variables (for single thresholds) or a multi-class variable (for all thresholds being considered together). Related optimal prediction maps, realization maps, and occurrence probability maps for all of these indicator-coded variables were generated. The simulated results displayed the spatial distribution characteristics of CLOD within different soil depths in the study area, which are not only helpful for understanding the spatial heterogeneity of clay layers in alluvial soils but also provide valuable quantitative information for precision agricultural management and environmental study. The study indicated that MCG could be a powerful method for simulating discretized continuous spatial variables. Copyright © 2009 John Wiley & Sons, Ltd. [source]
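A heavily simplified, one-dimensional sketch of the Markov-chain idea: simulate the binary indicator ("clay layer shallower than the threshold" vs. not) along a transect from one-step transition probabilities. Real Markov chain geostatistics works in two dimensions with transiograms and conditioning data; the transition matrix and seed below are illustrative only.

```python
# Hedged, much-simplified 1-D sketch of Markov-chain simulation of a
# binary clay-occurrence indicator. Probabilities are invented.
import random

def simulate_chain(P, start, steps, rng):
    """P: {state: {next_state: probability}}; returns a state sequence."""
    seq = [start]
    for _ in range(steps):
        states = list(P[seq[-1]])
        weights = [P[seq[-1]][s] for s in states]
        seq.append(rng.choices(states, weights=weights)[0])
    return seq

P = {
    "clay":   {"clay": 0.7, "noclay": 0.3},    # clay patches persist
    "noclay": {"noclay": 0.8, "clay": 0.2},
}
rng = random.Random(42)                         # fixed seed, reproducible
transect = simulate_chain(P, "clay", 50, rng)
```

Averaging many such realizations cell by cell is one way to approximate the occurrence probability maps mentioned in the abstract.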


Pedometric mapping of soil organic matter using a soil map with quantified uncertainty

EUROPEAN JOURNAL OF SOIL SCIENCE, Issue 3 2010
B. Kempen
This paper compares three models that use soil type information from point observations and a soil map to map the topsoil organic matter content for the province of Drenthe in the Netherlands. The models differ in how the information on soil type is obtained: model 1 uses soil type as depicted on the soil map for calibration and prediction; model 2 uses soil type as observed in the field for calibration and soil type as depicted on the map for prediction; and model 3 uses observed soil type for calibration and a pedometric soil map with quantified uncertainty for prediction. Calibration of the trend on observed soil type resulted in a much stronger predictive relationship between soil organic matter content and soil type than calibration on mapped soil type. Validation with an independent probability sample showed that model 3 outperformed models 1 and 2 in terms of the mean squared error. However, model 3 overestimated the prediction error variance and so was too pessimistic about prediction accuracy. Model 2 performed the worst: it had the largest mean squared error and the prediction error variance was strongly underestimated. Thus validation confirmed that calibration on observed soil type is only valid when the uncertainty about soil type at prediction sites is explicitly accounted for by the model. We conclude that whenever information about the uncertainty of the soil map is available and both soil property and soil type are observed at sampling sites, model 3 can be an improvement over the conventional model 1. [source]
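The validation criterion used above, ranking competing models by mean squared error against an independent validation sample, is easy to make concrete. The observed and predicted organic matter values below are invented for illustration; only the ranking logic mirrors the paper.

```python
# Hedged sketch: rank models by MSE on an independent validation sample.
# All numbers are toy values.

def mse(observed, predicted):
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

observed = [5.2, 8.1, 3.4, 12.0, 6.5]            # validation sample (% OM)
model_preds = {
    "model1 (mapped soil type)":    [4.0, 7.0, 4.5, 10.0, 7.5],
    "model2 (obs. calib., mapped)": [3.5, 6.0, 5.5, 9.0, 8.5],
    "model3 (pedometric map)":      [5.0, 7.8, 3.9, 11.2, 6.9],
}
ranking = sorted(model_preds, key=lambda m: mse(observed, model_preds[m]))
best = ranking[0]   # smallest mean squared error wins
```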


Strategies for preventing defection based on the mean time to defection and their implementations on a self-organizing map

EXPERT SYSTEMS, Issue 5 2005
Young Ae Kim
Abstract: Customer retention is a critical issue for the survival of any business in today's competitive marketplace. In this paper, we propose a dynamic procedure utilizing self-organizing maps and a Markov process for detecting and preventing customer defection that uses data of past and current customer behavior. The basic concept originates from empirical observations that identified that a customer has a tendency to change behavior (i.e. trim-out usage volumes) before eventual withdrawal and defection. Our explanatory model predicts when potential defectors are likely to withdraw. Two strategies are suggested to respond to the question of where to lead potential defectors for the next stage, based on anticipating when the potential defector will leave. Our model predicts potential defectors with little deterioration of prediction accuracy compared with that of the multilayer perceptron neural network and decision trees. Moreover, it performs reasonably well in a controlled experiment using an online game. [source]
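The "mean time to defection" quantity underlying the procedure above has a clean Markov-chain reading: with customer behaviour states forming a chain and "defected" absorbing, the expected number of steps to absorption from each transient state s satisfies t(s) = 1 + Σ P(s, s′)·t(s′). The transition probabilities below are invented for illustration; the paper couples this with behaviour clusters learned by a self-organizing map, which is omitted here.

```python
# Hedged sketch: expected steps to the absorbing "defected" state via
# fixed-point iteration on t(s) = 1 + sum_s' P(s,s') * t(s').
# Transition probabilities are toy values.

def mean_time_to_defection(P, absorbing="defected", iters=2000):
    t = {s: 0.0 for s in P}
    for _ in range(iters):
        t = {s: 0.0 if s == absorbing else
                1.0 + sum(p * t[s2] for s2, p in P[s].items())
             for s in P}
    return t

P = {
    "active":    {"active": 0.8, "declining": 0.2},
    "declining": {"active": 0.3, "declining": 0.4, "defected": 0.3},
    "defected":  {"defected": 1.0},
}
mtd = mean_time_to_defection(P)   # e.g. mtd["declining"] = 25/3 periods
```

Customers whose expected time to defection falls below some intervention horizon would be the "potential defectors" targeted by the retention strategies.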


A new index of habitat alteration and a comparison of approaches to predict stream habitat conditions

FRESHWATER BIOLOGY, Issue 10 2007
BRIAN FRAPPIER
Summary 1. Stream habitat quality assessment complements biological assessment by providing a mechanism for ruling out habitat degradation as a potential stressor and provides reference targets for the physical aspects of stream restoration projects. This study analysed five approaches for predicting habitat conditions based on discriminant function, linear regressions, ordination and nearest neighbour analyses. 2. Quantitative physical and chemical habitat and riparian conditions in minimally-impacted streams in New Hampshire were estimated using United States Environmental Protection Agency's Environmental Monitoring and Assessment Program protocols. Catchment-scale descriptors were used to predict segment-scale stream channel and riparian habitat, and the accuracy and precision of the different modelling approaches were compared. 3. A new assessment index comparing and summarizing the degree of correspondence between predicted and observed habitat based on Euclidean distance between the standardized habitat factors is described. Higher index scores (i.e. greater Euclidean distance) would suggest a greater deviation in habitat between observed conditions and expected reference conditions. As in most biotic indices, the range in index scores in reference sites would constitute a situation equivalent to reference conditions. This new index avoids the erroneous prediction of multiple, mutually exclusive habitat conditions that have confounded previous habitat assessment approaches. 4. Separate linear regression models for each habitat descriptor yielded the most accurate and precise prediction of reference conditions, with a coefficient of variation (CV) between predictions and observations for all reference sites of 0.269. 
However, for a unified implementation in regions where a classification-based approach has already been taken for biological assessment, a discriminant analysis approach, that predicted membership in biotic communities and compared the mean habitat features in the biotic communities with the observed habitat features, was similar in prediction accuracy and precision (CV = 0.293). 5. The best model had an error of 27% of the mean index value for the reference sites, indicating substantial room for improvement. Additional catchment characteristics not readily available for this analysis, such as average rainfall or winter snow-pack, surficial geological characteristics or past land-use history, may improve the precision of the predicted habitat features in the reference streams. Land-use history in New Hampshire and regional environmental impacts have greatly impacted stream habitat conditions even in streams considered minimally-impacted today; thus as regional environmental impacts change and riparian forests mature, reference habitat conditions should be re-evaluated. [source]
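The index described in point 3 above reduces to a short computation: standardize each habitat factor, then take the Euclidean distance between a site's predicted and observed factor vectors, with larger scores indicating greater departure from expected reference conditions. The factors and values below are invented for illustration.

```python
# Hedged sketch of the Euclidean-distance habitat index. Factor names,
# means, standard deviations, and site values are toy numbers.

def standardize(values, means, sds):
    return [(v - m) / s for v, m, s in zip(values, means, sds)]

def habitat_index(observed, predicted, factor_means, factor_sds):
    """Euclidean distance between standardized factor vectors."""
    zo = standardize(observed, factor_means, factor_sds)
    zp = standardize(predicted, factor_means, factor_sds)
    return sum((a - b) ** 2 for a, b in zip(zo, zp)) ** 0.5

# three hypothetical habitat factors, e.g. % canopy, substrate size, w/d
means = [60.0, 45.0, 12.0]
sds = [15.0, 20.0, 4.0]
score = habitat_index([55.0, 50.0, 10.0],    # observed at the site
                      [70.0, 40.0, 13.0],    # predicted reference
                      means, sds)
```

A site would be flagged only when its score exceeds the range of scores seen among reference sites, mirroring how biotic indices define reference condition.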


Neurofuzzy Modeling of Context-Contingent Proximity Relations

GEOGRAPHICAL ANALYSIS, Issue 2 2007
Xiaobai Yao
The notion of proximity is one of the foundational elements in humans' understanding of and reasoning about geographical environments. The perception and cognition of distances play a significant role in many daily human activities. Yet, few studies have thus far provided context-contingent translation mechanisms between linguistic proximity descriptors (e.g. "near," "far") and metric distance measures. One problem with previous fuzzy logic proximity modeling studies is that they presume the form of the fuzzy membership functions of proximity relations. Another problem is that previous studies have fundamental weaknesses in considering context factors in proximity models. We argue that statistical approaches are ill suited to proximity modeling because of the inherently fuzzy nature of the relations between linguistic and metric distance measures. In this study, we propose a neurofuzzy system approach to solve this problem. The approach allows for the dynamic construction of context-contingent proximity models based on sample data. An empirical case study with human subject survey data is carried out to test the validity of the approach and to compare it with the previous statistical approach. Interpretation and prediction accuracy of the empirical study are discussed. [source]
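The fuzzy-membership view argued for above maps a descriptor such as "near" to a degree of membership in [0, 1] over metric distance rather than to a crisp cutoff. The trapezoidal shape and breakpoints below are illustrative only; in the paper such functions are learned from survey data by the neurofuzzy system, not fixed in advance.

```python
# Hedged sketch: a fixed trapezoidal membership function for "near".
# Breakpoints (1 km fully near, 5 km not near) are invented.

def near_membership(distance_km, full=1.0, zero=5.0):
    """Degree to which a distance counts as 'near', in [0, 1]."""
    if distance_km <= full:
        return 1.0
    if distance_km >= zero:
        return 0.0
    return (zero - distance_km) / (zero - full)   # linear ramp between

m1 = near_membership(0.5)   # clearly near
m2 = near_membership(3.0)   # partially near
m3 = near_membership(8.0)   # not near
```

A context-contingent model would let the breakpoints (or the whole curve) shift with context, e.g. travel mode or urban versus rural setting, which is what the neurofuzzy construction provides.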


Quantifying uncertainty in estimates of C emissions from above-ground biomass due to historic land-use change to cropping in Australia

GLOBAL CHANGE BIOLOGY, Issue 8 2001
Damian J. Barrett
Abstract Quantifying continental-scale carbon emissions from the oxidation of above-ground plant biomass following land-use change (LUC) is made difficult by the lack of information on how much biomass was present prior to vegetation clearing and on the timing and location of historical LUC. The considerable spatial variability of vegetation, and the uncertainty of this variability, makes it difficult to predict biomass C density (tC ha−1) prior to LUC. Quantifying the uncertainties in estimates of land-based sources and sinks of CO2, and assessing the feasibility of reducing these uncertainties by further sampling, is critical information required by governments world-wide for public policy development on climate change issues. A quantitative statistical approach is required to calculate confidence intervals (the level of certainty) of estimated cleared above-ground biomass. In this study, a set of high-quality observations of steady-state above-ground biomass from relatively undisturbed ecological sites across the Australian continent was combined with vegetation, topographic, climatic and edaphic data sets within a Geographical Information System. A statistical model was developed from the data set of observations to predict potential biomass, and the standard error of potential biomass, for all 0.05° (approximately 5 × 5 km) land grid cells of the continent. In addition, the spatial autocorrelation of observations and residuals from the statistical model was examined. Finally, total C emissions due to historic LUC to cultivation and cropping were estimated by combining the statistical model with a data set of fractional cropland area per land grid cell, fAc (Ramankutty & Foley 1998). Total C emissions from loss of above-ground biomass due to cropping since European colonization of Australia were estimated to be 757 MtC.
These estimates are an upper limit because the predicted steady-state biomass may exceed the above-ground biomass present immediately prior to LUC, owing to disturbance. The estimated standard error of total C emissions was calculated from the standard error of predicted biomass, the standard error of fAc and the spatial autocorrelation of biomass. However, quantitative estimates of the standard error of fAc were unavailable. Thus, two scenarios were developed to examine the effect of error in fAc on the error in total C emissions. In the first scenario, in which fAc was regarded as accurate (i.e. a coefficient of variation, CV, of fAc = 0.0), the 95% confidence interval of the continental C emissions was 379-1135 MtC. In the second scenario, a 50% error in estimated cropland area was assumed (a CV of fAc = 0.50) and the estimated confidence interval widened to between 350 and 1294 MtC. The CV of C emissions for these two scenarios was 25% and 29%, respectively. Thus, while accurate maps of land-use change contribute to decreasing uncertainty in C emissions from LUC, the major source of this uncertainty arises from the prediction accuracy of biomass C density. It is argued that, even with large sample numbers, the high cost of sampling biomass carbon may limit the uncertainty of above-ground biomass to about a CV of 25%. [source]
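The relationship between the quoted CV and the confidence interval can be sanity-checked with a normal approximation (a simplification; the paper's interval of 379-1135 MtC is slightly wider and asymmetric, reflecting its fuller error model):

```python
# Reproduce the scale of the first scenario's confidence interval from
# the abstract's numbers, assuming an approximately normal error in
# total emissions (a simplification of the paper's full calculation).
total_mtc = 757.0          # estimated emissions, MtC
cv = 0.25                  # coefficient of variation, scenario 1
se = cv * total_mtc        # standard error, MtC
lo = total_mtc - 1.96 * se
hi = total_mtc + 1.96 * se
print(f"95% CI ~ {lo:.0f}-{hi:.0f} MtC")  # same order as the quoted 379-1135 MtC
```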


Interpreting missense variants: comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR)

HUMAN MUTATION, Issue 7 2007
Philip A. Chan
Abstract The human genome contains frequent single-basepair variants that may or may not cause genetic disease. To characterize benign vs. pathogenic missense variants, numerous computational algorithms have been developed based on comparative sequence and/or protein structure analysis. We compared computational methods that use evolutionary conservation alone, amino acid (AA) change alone, and a combination of conservation and AA change in predicting the consequences of 254 missense variants in the CDKN2A (n = 92), MLH1 (n = 28), MSH2 (n = 14), MECP2 (n = 30), and tyrosinase (TYR) (n = 90) genes. Variants were validated as either neutral or deleterious by curated locus-specific mutation databases and published functional data. All methods that use evolutionary sequence analysis have comparable overall prediction accuracy (72.9-82.0%). Mutations at codons where the AA is absolutely conserved over a sufficient evolutionary distance (about one-third of variants) had a 91.6 to 96.8% likelihood of being deleterious. Three algorithms (SIFT, PolyPhen, and A-GVGD) that differentiate one variant from another at a given codon did not significantly improve predictive value over conservation score alone using the BLOSUM62 matrix. However, when all four methods were in agreement (62.7% of variants), predictive value improved to 88.1%. These results confirm a high predictive value for methods that use evolutionary sequence conservation, with or without considering protein structural change, to predict the clinical consequences of missense variants. The methods can be generalized across genes that cause different types of genetic disease. The results support the clinical use of computational methods as one tool to help interpret missense variants in genes associated with human genetic disease. Hum Mutat 28(7), 683-693, 2007. Published 2007 Wiley-Liss, Inc. [source]
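The agreement rule described above (a confident call only when all methods concur) is straightforward to express; the function below is a hypothetical illustration, not code from the study:

```python
def consensus_call(calls):
    """Combine per-algorithm calls ('deleterious'/'neutral') the way the
    abstract suggests: a confident call only when all methods agree."""
    if all(c == "deleterious" for c in calls):
        return "deleterious"
    if all(c == "neutral" for c in calls):
        return "neutral"
    return "uncertain"

# Hypothetical variant with SIFT, PolyPhen, A-GVGD and a conservation
# score all in agreement:
print(consensus_call(["deleterious"] * 4))  # -> deleterious
# Disagreement among methods leaves the variant unresolved:
print(consensus_call(["deleterious", "neutral", "deleterious", "deleterious"]))
```

Restricting calls to the all-agree subset trades coverage (62.7% of variants in the study) for the higher 88.1% predictive value reported above.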


A data mining approach to financial time series modelling and forecasting

INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 4 2001
Zoran Vojinovic
This paper describes one of the relatively new data mining techniques that can be used to forecast the foreign exchange time series process. The research aims to contribute to the development and application of such techniques by exposing them to difficult real-world (non-toy) data sets. The results reveal that the predictions of a Radial Basis Function Neural Network model for forecasting the daily US$/NZ$ closing exchange rates are significantly better than those of a traditional linear autoregressive model, both for directional change and for the exchange rate itself. We have also investigated the impact of the number of model inputs (model order), the number of hidden layer neurons and the size of the training data set on prediction accuracy. In addition, we have explored how three different methods for placement of Gaussian radial basis functions affect predictive quality and singled out the best one. Copyright © 2001 John Wiley & Sons, Ltd. [source]
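A minimal sketch of an RBF-network one-step-ahead forecaster in the spirit described above. The series, the centre placement, and the basis width below are invented for illustration; the paper compares three centre-placement methods, none of which is reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an exchange-rate series (the paper used
# daily US$/NZ$ closes): a noisy autoregressive signal.
n = 300
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + 0.05 * rng.standard_normal()

order = 3  # model order: number of lagged inputs
X = np.column_stack([x[i:n - order + i] for i in range(order)])
y = x[order:]

# Radial basis function layer: Gaussian activations around fixed
# centres taken from the training inputs, then a linear readout
# fitted by least squares (one common way to train an RBF network).
centres = X[:: len(X) // 20][:20]
width = 0.2
act = np.exp(-((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1) / (2 * width**2))
w, *_ = np.linalg.lstsq(act, y, rcond=None)

pred = act @ w
rmse = np.sqrt(np.mean((pred - y) ** 2))
print(f"in-sample RMSE: {rmse:.4f}")
```

Varying `order`, the number of centres, and the training-set size is the analogue of the sensitivity study the abstract describes.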


Numerical analysis of turbulent flow separation in a rectangular duct with a sharp 180-degree turn by algebraic Reynolds stress model

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 12 2008
Hitoshi Sugiyama
Abstract Turbulent flow in a rectangular duct with a sharp 180-degree turn is difficult to predict numerically because the flow behavior is influenced by several types of forces, including centrifugal force, pressure-driven force, and shear stress generated by anisotropic turbulence. In particular, this type of flow is characterized by a large-scale separated flow, and it is difficult to predict the reattachment point of a separated flow. Numerical analysis has been performed for a turbulent flow in a rectangular duct with a sharp 180-degree turn using the algebraic Reynolds stress model. A boundary-fitted coordinate system is introduced as a method for coordinate transformation to set the boundary conditions next to complicated shapes. The calculated results are compared with the experimental data, as measured by a laser-Doppler anemometer, in order to examine the validity of the proposed numerical method and turbulent model. In addition, the possibility of improving the wall function method in the separated flow region is examined by replacing the log-law velocity profile for a smooth wall with that for a rough wall. The analysis results indicated that the proposed algebraic Reynolds stress model can be used to reasonably predict the turbulent flow in a rectangular duct with a sharp 180-degree turn. In particular, the calculated reattachment point of a separated flow, which is difficult to predict in a turbulent flow, agrees well with the experimental results. In addition, the calculation results suggest that the wall function method using the log-law velocity profile for a rough wall over a separated flow region has some potential for improving the prediction accuracy. Copyright © 2007 John Wiley & Sons, Ltd. [source]


Development of a convection-diffusion-reaction magnetohydrodynamic solver on non-staggered grids

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 11 2004
Tony W. H. Sheu
Abstract This paper presents a convection-diffusion-reaction (CDR) model for solving magnetic induction equations and incompressible Navier-Stokes equations. To increase prediction accuracy, the general solution to the one-dimensional constant-coefficient CDR equation is employed. To extend this discrete formulation to two-dimensional analysis, the alternating direction implicit solution algorithm is applied. Numerical tests amenable to analytic solutions were performed in order to validate the proposed scheme. Results show good agreement with the analytic solutions and a high rate of convergence. As in many magnetohydrodynamic studies, the Hartmann-Poiseuille problem is considered as a benchmark test to validate the code. Copyright © 2004 John Wiley & Sons, Ltd. [source]
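The one-dimensional constant-coefficient CDR equation mentioned above has a closed-form general solution, which is the ingredient such schemes exploit. A sketch with illustrative coefficients (not taken from the paper):

```python
import numpy as np

# Steady one-dimensional constant-coefficient CDR equation
#   nu * phi'' - u * phi' - k * phi = 0
# has the general solution phi = A exp(l1 x) + B exp(l2 x) with
#   l = (u +/- sqrt(u^2 + 4 nu k)) / (2 nu).
# Coefficients below are illustrative only.
nu, u, k = 0.1, 1.0, 0.5

disc = np.sqrt(u**2 + 4 * nu * k)
l1 = (u + disc) / (2 * nu)
l2 = (u - disc) / (2 * nu)

# Solve the boundary-value problem phi(0) = 0, phi(1) = 1 for A, B.
M = np.array([[1.0, 1.0], [np.exp(l1), np.exp(l2)]])
A, B = np.linalg.solve(M, np.array([0.0, 1.0]))

x = np.linspace(0.0, 1.0, 5)
phi = A * np.exp(l1 * x) + B * np.exp(l2 * x)
print(phi)  # boundary values 0 and 1 recovered exactly
```

Because this solution captures the boundary-layer-like exponential behaviour exactly, a discrete scheme built on it can remain accurate even on coarse grids.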


Evaluation of one- and two-equation low-Re turbulence models.

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 12 2003
Part I: Axisymmetric separating and swirling flows
Abstract This first segment of the two-part paper systematically examines several turbulence models in the context of three flows, namely a simple flat-plate turbulent boundary layer, an axisymmetric separating flow, and a swirling flow. The test cases are chosen on the basis of availability of high-quality and detailed experimental data. The tested turbulence models are integrated to solid surfaces and consist of: Rodi's two-layer k-ε model, Chien's low-Reynolds-number k-ε model, Wilcox's k-ω model, Menter's two-equation shear-stress-transport model, and the one-equation model of Spalart and Allmaras. The objective of the study is to establish the prediction accuracy of these turbulence models with respect to axisymmetric separating flows, and flows of high streamline curvature. At the same time, the study establishes the minimum spatial resolution requirements for each of these turbulence closures, and identifies the proper low-Mach-number preconditioning and artificial diffusion settings of a Reynolds-averaged Navier-Stokes algorithm for optimum rate of convergence and minimum adverse impact on prediction accuracy. Copyright © 2003 John Wiley & Sons, Ltd. [source]


Evaluation of one- and two-equation low-Re turbulence models.

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, Issue 12 2003
Part II: Vortex-generator jet and diffusing S-duct flows
Abstract This second segment of the two-part paper systematically examines several turbulence models in the context of two flows, namely a vortex flow created by an inclined jet in crossflow, and the flow field in a diffusing S-shaped duct. The test cases are chosen on the basis of availability of high-quality and detailed experimental data. The tested turbulence models are integrated to solid surfaces and consist of: Rodi's two-layer k-ε model, Wilcox's k-ω model, Menter's two-equation shear-stress-transport model, and the one-equation model of Spalart and Allmaras. The objective of the study is to establish the prediction accuracy of these turbulence models with respect to three-dimensional separated flows with streamline curvature. At the same time, the study establishes the minimum spatial resolution requirements for each of these turbulence closures, and identifies the proper low-Mach-number preconditioning and artificial diffusion settings of a Reynolds-averaged Navier-Stokes algorithm for optimum rate of convergence and minimum adverse impact on prediction accuracy. Copyright © 2003 John Wiley & Sons, Ltd. [source]


On parameter estimation of a simple real-time flow aggregation model

INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, Issue 7 2006
Huirong Fu
Abstract There exists a clear need for a comprehensive framework for accurately analysing and realistically modelling the key traffic statistics that determine network performance. Recently, a novel traffic model, sinusoid with uniform noise (SUN), has been proposed, which outperforms other models in that it can simultaneously achieve tractability, parsimony, accuracy (in predicting network performance), and efficiency (in real-time capability). In this paper, we design, evaluate and compare several estimation approaches, including variance-based estimation (Var), minimum mean-square-error-based estimation (MMSE), MMSE with the constraint of variance (Var+MMSE), MMSE of the autocorrelation function with the constraint of variance (Var+AutoCor+MMSE), and variance of secondary demand-based estimation (Secondary Variance), to determine the key parameters in the SUN model. Integrated with the SUN model, all the proposed methods are able to capture the basic behaviour of the aggregation reservation system and closely approximate the system performance. In addition, we find that: (1) Var is very simple to operate and provides both upper and lower performance bounds; it can be integrated into other methods to provide a very accurate approximation to the aggregation's performance and thus obtain an accurate solution; (2) Var+AutoCor+MMSE is superior to the other proposed methods in the accuracy with which it determines system performance; and (3) Var+MMSE and Var+AutoCor+MMSE differ from the other three methods in that both adopt an experimental analysis method, which helps to improve prediction accuracy while reducing computational complexity. Copyright © 2005 John Wiley & Sons, Ltd. [source]
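A sketch of the variance-based (Var) idea for a sinusoid-plus-uniform-noise model: for amplitude A and noise width w, the process variance is A²/2 + w²/12, so a sample variance pins down one parameter given the other. The synthetic trace and the assumption of a known noise width are illustrative only, not the paper's estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic trace in the spirit of the SUN model: a sinusoid plus
# uniform noise; amplitude A and noise width w are the parameters.
A_true, w_true = 2.0, 1.0
t = np.arange(10000)
trace = (A_true * np.sin(2 * np.pi * t / 50.0)
         + rng.uniform(-w_true / 2, w_true / 2, t.size))

# Variance-based estimation sketch: Var = A^2/2 + w^2/12 for this model.
# Treating the noise width as known (an assumption made here purely for
# illustration), the amplitude follows directly from the sample variance.
var = trace.var()
A_est = np.sqrt(2 * (var - w_true**2 / 12))
print(f"estimated amplitude: {A_est:.3f}")  # close to 2.0
```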


Effective database processing for classification and regression with continuous variables

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 12 2007
E. Di Tomaso
This article proposes a method for manipulating a database of instances relative to discrete and continuous variables. A fuzzy partition is used to discretize continuous domains. A reorganized form of representing a relational database is proposed, called an effective database. The effective database is tested on classification and regression problems using general Bayesian networks and Naïve Bayes classifiers. The structures and the parameters of the classifiers are estimated from the effective database. An algorithm for updating with soft evidence is used to test the induced models when continuous variables are present. The experiments show that the effective database procedure produces a selection of relevant information from data, which in some cases improves the prediction accuracy of the classifiers. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 1271-1285, 2007. [source]
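A fuzzy partition of a continuous domain, as used here to discretize variables, assigns each value graded membership in overlapping sets rather than a single crisp bin. A minimal sketch using triangular memberships (the centres and the temperature example are invented for illustration):

```python
import numpy as np

def fuzzy_partition(x, centres):
    """Triangular fuzzy partition: each value gets a membership degree in
    every fuzzy set, and the degrees sum to 1 (a Ruspini partition).
    A sketch of one common way to discretize a continuous variable."""
    centres = np.asarray(centres, dtype=float)
    m = np.zeros(len(centres))
    x = float(np.clip(x, centres[0], centres[-1]))
    i = int(np.searchsorted(centres, x, side="right")) - 1
    if i >= len(centres) - 1:
        m[-1] = 1.0
        return m
    span = centres[i + 1] - centres[i]
    m[i + 1] = (x - centres[i]) / span
    m[i] = 1.0 - m[i + 1]
    return m

# With (hypothetical) set centres at 10, 20, 25, 35, a value of 22 is
# 0.6 in the third set's left neighbour and 0.4 in the third set:
print(fuzzy_partition(22.0, [10, 20, 25, 35]))
```

Because each instance contributes fractional counts to several bins, the discretized table retains more of the continuous information than hard binning would.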


Predicting avian patch occupancy in a fragmented landscape: do we know more than we think?

JOURNAL OF APPLIED ECOLOGY, Issue 5 2009
Danielle F. Shanahan
Summary 1. A recent and controversial topic in landscape ecology is whether populations of species respond to habitat fragmentation in a general fashion. Empirical research has provided mixed support, resulting in controversy about the use of general rules in landscape management. Rather than simply assessing post hoc whether individual species follow such rules, a priori testing could shed light on their accuracy and utility for predicting species response to landscape change. 2. We aim to create an a priori model that predicts the presence or absence of multiple species in habitat patches. Our goal is to balance general theory with relevant species life-history traits to obtain high prediction accuracy. To increase the utility of this work, we aim to use accessible methods that can be applied using readily available, inexpensive resources. 3. The classification tree patch-occupancy model we create for birds is based on habitat suitability, minimum area requirements, dispersal potential of each species and overall landscape connectivity. 4. To test our model we apply it to the South East Queensland region, Australia, for 17 bird species with varying dispersal potential and habitat specialization. We test the accuracy of our predictions using presence-absence information for 55 vegetation patches. 5. Overall we achieve a Cohen's kappa of 0.33, or 'fair' agreement between the model predictions and test data sets, and generally a very high level of absence prediction accuracy. Habitat specialization appeared to influence the accuracy of the model for different species. 6. We also compare the a priori model to the statistically derived model for each species. Although this 'optimal model' generally differed from our original predictive model, the process revealed ways in which it could be improved for future attempts. 7. Synthesis and applications.
Our study demonstrates that ecological generalizations alongside basic resources (a vegetation map and some species-specific information) can provide conservative accuracy for predicting species occupancy in remnant vegetation patches. We show that the process of testing and developing models based on general rules could provide basic tools for conservation managers to understand the impact of current or planned landscape change on wildlife populations. [source]
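Cohen's kappa, the agreement statistic quoted in point 5, can be computed from presence-absence vectors as follows (the patch data here are hypothetical, not the study's 55 patches):

```python
def cohens_kappa(y_true, y_pred):
    """Cohen's kappa for binary presence/absence predictions: observed
    agreement corrected for the agreement the two marginal rates would
    produce by chance."""
    n = len(y_true)
    agree = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_true = sum(y_true) / n
    p_pred = sum(y_pred) / n
    chance = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    return (agree - chance) / (1 - chance)

# Hypothetical occupancy records for eight patches (1 = present):
obs = [1, 0, 0, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 0, 0, 0, 1, 1]
print(round(cohens_kappa(obs, pred), 2))  # -> 0.47
```

On the usual verbal scale, values around 0.2-0.4 count as 'fair' agreement, which is how the study's kappa of 0.33 is described.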


Forecasting migration of cereal aphids (Hemiptera: Aphididae) in autumn and spring

JOURNAL OF APPLIED ENTOMOLOGY, Issue 5 2009
A. M. Klueken
Abstract The migration of cereal aphids and the time of their arrival on winter cereal crops in autumn and spring are of particular importance for plant disease (e.g. barley yellow dwarf virus infection) and related yield losses. In order to identify days with migration potential in autumn and spring, suction trap data from 29 and 45 case studies (locations and years), respectively, were set off against meteorological parameters, focusing on the early immigration periods in autumn (22 September to 1 November) and spring (1 May to 9 June). The number of cereal aphids caught in a suction trap increased with increasing temperature, global radiation and duration of sunshine and decreased with increasing precipitation, relative humidity and wind speed. According to linear regression analyses, the temperature, global radiation and wind speed were most frequently and significantly associated with migration, suggesting that they have a major impact on flight activity. For subsequent model development, suction trap catches from different case studies were pooled and classified in binary fashion as days with or without migration, as defined by a certain number of migrating cereal aphids. Linear discriminant analyses of several predictor variables (assessed during light hours of a given day) were then performed based on the binary response variables. Three models were used to predict days with suction trap catches of ,1, ,4 or ,10 migrating cereal aphids in autumn. Due to the predominance of Rhopalosiphum padi individuals (99.3% of the total cereal aphid catch), no distinction between species (R. padi and Sitobion avenae) was made in autumn. As the suction trap catches were lower and species dominance changed in spring, three further models were developed for analysis of all cereal aphid species, R. padi only, and Metopolophium dirhodum and S. avenae combined in spring.
The empirical, cross-classification and receiver operating characteristic analyses performed for model validation showed different levels of prediction accuracy. Additional datasets selected at random before model construction and parameterization showed that predictions by the six migration models were 33-81% correct. The models are useful for determining when to start field evaluations. Furthermore, they provide information on the size of the migrating aphid population and, thus, on the importance of immigration for early aphid population development in cereal crops in a given season. [source]
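A minimal sketch of the kind of linear discriminant analysis described above, using Fisher's discriminant on synthetic weather features; the feature values, class means, and labels are all invented for illustration and are not the study's data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical daily weather features [temperature, wind speed] with
# synthetic "migration" (1) vs "no migration" (0) labels: migration days
# tend to be warmer and calmer, as the regression results above suggest.
mig = rng.normal([16.0, 2.0], [2.0, 1.0], size=(60, 2))
no_mig = rng.normal([10.0, 5.0], [2.0, 1.5], size=(60, 2))

# Fisher's linear discriminant: project onto w = Sw^-1 (m1 - m0) and
# threshold halfway between the projected class means (a minimal sketch
# of a linear discriminant classifier, not the paper's fitted models).
m1, m0 = mig.mean(0), no_mig.mean(0)
Sw = np.cov(mig.T) * (len(mig) - 1) + np.cov(no_mig.T) * (len(no_mig) - 1)
w = np.linalg.solve(Sw, m1 - m0)
thresh = 0.5 * (mig @ w).mean() + 0.5 * (no_mig @ w).mean()

X = np.vstack([mig, no_mig])
y = np.array([1] * len(mig) + [0] * len(no_mig))
pred = (X @ w > thresh).astype(int)
accuracy = (pred == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Raising the catch threshold that defines a "migration day" (,1, ,4 or ,10 aphids) changes the labels, which is why the paper fits a separate discriminant model per threshold.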