Cross Validation (cross + validation)


Kinds of Cross Validation

  • leave-one-out cross validation


  • Selected Abstracts


    CROSS VALIDATION OF A SENSORY LANGUAGE FOR CHEDDAR CHEESE

    JOURNAL OF SENSORY STUDIES, Issue 3 2002
    M.A. DRAKE
    ABSTRACT Communication and replication of sensory data from different sites are important to track progress on fundamental research issues and to ensure that research efforts are not duplicated. A uniform anchored Cheddar cheese sensory language has previously been identified and refined. The objective of this study was to demonstrate application of the defined sensory language for Cheddar cheese for communication between sensory panels at three different sites. The defined and referenced sensory language for Cheddar cheese was disseminated to panel leaders at the three sites and sensory panels (n ≥ 8) were trained for 40 to 80 h at each site. Ten forty-pound blocks of Cheddar cheese representing different ages were collected and evaluated by the panels. Cheeses were differentiated by the three panels by univariate and multivariate analysis (P<0.05). Cheeses were differentiated by the three panels in a similar manner. Results indicate that it is possible to calibrate panels using a standardized defined sensory language. [source]


    Elucidation of a protein signature discriminating six common types of adenocarcinoma

    INTERNATIONAL JOURNAL OF CANCER, Issue 4 2007
    Gregory C. Bloom
    Abstract Pathologists are commonly facing the problem of attempting to identify the site of origin of a metastatic cancer when no primary tumor has been identified, yet few markers have been identified to date. Multitumor classifiers based on microarray based RNA expression have recently been described. Here we describe the first approximation of a tumor classifier based entirely on protein expression quantified by two-dimensional gel electrophoresis (2DE). The 2DE was used to analyze the proteomic expression pattern of 77 similarly appearing (using histomorphology) adenocarcinomas encompassing 6 types or sites of origin: ovary, colon, kidney, breast, lung and stomach. Discriminating sets of proteins were identified and used to train an artificial neural network (ANN). A leave-one-out cross validation (LOOCV) method was used to test the ability of the constructed network to predict the single held out sample from each iteration with a maximum predictive accuracy of 87% and an average predictive accuracy of 82% over the range of proteins chosen for its construction. These findings demonstrate the use of proteomics to construct a highly accurate ANN-based classifier for the detection of an individual tumor type, as well as distinguishing between 6 common tumor types in an unknown primary diagnosis setting. © 2006 Wiley-Liss, Inc. [source]
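
    The leave-one-out procedure mentioned in the abstract above can be sketched as follows. This is a minimal illustration only: it swaps the study's artificial neural network for a simple nearest-centroid classifier, and the function names and toy data are hypothetical, not taken from the paper.

```python
from math import dist


def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid lies closest to x."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
    return min(centroids, key=lambda lab: dist(centroids[lab], x))


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, score the held-out prediction."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


# Toy data invented for illustration; not from the study.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [2.0, 2.1], [2.2, 1.9], [1.9, 2.3]]
y = ["a", "a", "a", "b", "b", "b"]
accuracy = loocv_accuracy(X, y)
```

    With n samples the model is refitted n times, so every sample serves as the single test case exactly once.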


    Gene expression profiling of 30 cancer cell lines predicts resistance towards 11 anticancer drugs at clinically achieved concentrations

    INTERNATIONAL JOURNAL OF CANCER, Issue 7 2006
    Balazs Györffy
    Abstract Cancer patients with tumors of similar grading, staging and histogenesis can have markedly different treatment responses to different chemotherapy agents. So far, individual markers have failed to correctly predict resistance against anticancer agents. We tested 30 cancer cell lines for sensitivity to 5-fluorouracil, cisplatin, cyclophosphamide, doxorubicin, etoposide, methotrexate, mitomycin C, mitoxantrone, paclitaxel, topotecan and vinblastine at drug concentrations that can be systemically achieved in patients. The resistance index was determined to designate the cell lines as sensitive or resistant, and then, the subset of resistant vs. sensitive cell lines for each drug was compared. Gene expression signatures for all cell lines were obtained by interrogating Affymetrix U133A arrays. Prediction Analysis of Microarrays was applied for feature selection. An individual prediction profile for the resistance against each chemotherapy agent was constructed, containing 42–297 genes. The overall accuracy of the predictions in a leave-one-out cross validation was 86%. A list of the top 67 multidrug resistance candidate genes that were associated with the resistance against at least 4 anticancer agents was identified. Moreover, the differential expressions of 46 selected genes were also measured by quantitative RT-PCR using a TaqMan micro fluidic card system. As a single gene can be correlated with resistance against several agents, associations with resistance were detected all together for 76 genes and resistance phenotypes, respectively. This study focuses on the resistance at the in vivo concentrations, making future clinical cancer response prediction feasible. The TaqMan-validated gene expression patterns provide new gene candidates for multidrug resistance. Supplementary material for this article can be found on the International Journal of Cancer website at http://www.interscience.wiley.com/jpages/0020-7136/suppmat. © 2005 Wiley-Liss, Inc. [source]


    Very high resolution interpolated climate surfaces for global land areas

    INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 15 2005
    Robert J. Hijmans
    Abstract We developed interpolated climate surfaces for global land areas (excluding Antarctica) at a spatial resolution of 30 arc s (often referred to as 1-km spatial resolution). The climate elements considered were monthly precipitation and mean, minimum, and maximum temperature. Input data were gathered from a variety of sources and, where possible, were restricted to records from the 1950–2000 period. We used the thin-plate smoothing spline algorithm implemented in the ANUSPLIN package for interpolation, using latitude, longitude, and elevation as independent variables. We quantified uncertainty arising from the input data and the interpolation by mapping weather station density, elevation bias in the weather stations, and elevation variation within grid cells and through data partitioning and cross validation. Elevation bias tended to be negative (stations lower than expected) at high latitudes but positive in the tropics. Uncertainty is highest in mountainous and in poorly sampled areas. Data partitioning showed high uncertainty of the surfaces on isolated islands, e.g. in the Pacific. Aggregating the elevation and climate data to 10 arc min resolution showed an enormous variation within grid cells, illustrating the value of high-resolution surfaces. A comparison with an existing data set at 10 arc min resolution showed overall agreement, but with significant variation in some regions. A comparison with two high-resolution data sets for the United States also identified areas with large local differences, particularly in mountainous areas. Compared to previous global climatologies, ours has the following advantages: the data are at a higher spatial resolution (400 times greater or more); more weather station records were used; improved elevation data were used; and more information about spatial patterns of uncertainty in the data is available.
Owing to the overall low density of available climate stations, our surfaces do not capture all of the variation that may occur at a resolution of 1 km, particularly of precipitation in mountainous areas. In future work, such variation might be captured through knowledge-based methods and inclusion of additional co-variates, particularly layers obtained through remote sensing. Copyright © 2005 Royal Meteorological Society. [source]


    A self-adaptive genetic algorithm-artificial neural network algorithm with leave-one-out cross validation for descriptor selection in QSAR study

    JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 10 2010
    Jingheng Wu
    Abstract Based on the quantitative structure-activity relationships (QSARs) models developed by artificial neural networks (ANNs), genetic algorithm (GA) was used in the variable-selection approach with molecule descriptors and helped to improve the back-propagation training algorithm as well. The cross validation techniques of leave-one-out investigated the validity of the generated ANN model and preferable variable combinations derived in the GAs. A self-adaptive GA-ANN model was successfully established by using a new estimate function for avoiding over-fitting phenomenon in ANN training. Compared with the variables selected in two recent QSAR studies that were based on stepwise multiple linear regression (MLR) models, the variables selected in self-adaptive GA-ANN model are superior in constructing ANN model, as they revealed a higher cross validation (CV) coefficient (Q2) and a lower root mean square deviation both in the established model and biological activity prediction. The introduced methods for validation, including leave-multiple-out, Y-randomization, and external validation, proved the superiority of the established GA-ANN models over MLR models in both stability and predictive power. Self-adaptive GA-ANN showed us a prospect of improving QSAR model. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2010 [source]
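
    For readers unfamiliar with the cross validation coefficient named in the abstract above, a commonly used leave-one-out definition is given below. The abstract does not state which variant the authors used, so this should be read as an assumed, textbook form: the symbols y_i (observed activity), \hat{y}_{(i)} (prediction for sample i from a model fitted without it) and \bar{y} (mean observed activity) are introduced here for illustration.

```latex
Q^2 = 1 - \frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_{(i)}\bigr)^2}
               {\sum_{i=1}^{n} \bigl(y_i - \bar{y}\bigr)^2}
```

    Under this definition a value close to 1 indicates that held-out predictions track the observations closely, while a value at or below 0 indicates predictions no better than the mean.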


    Clinical prediction rules for bacteremia and in-hospital death based on clinical data at the time of blood withdrawal for culture: an evaluation of their development and use

    JOURNAL OF EVALUATION IN CLINICAL PRACTICE, Issue 6 2006
    Tsukasa Nakamura MD (Research Fellow)
    Abstract Rationale, aims and objectives: To develop clinical prediction rules for true bacteremia, blood culture positive for gram-negative rods, and in-hospital death using the data at the time of blood withdrawal for culture. Methods: Data on all hospitalized adults who underwent blood cultures at a tertiary care hospital in Japan were collected from an integrated medical computing system. Logistic regression was used for developing prediction rules followed by the jackknife cross validation. Results: Among 739 patients, 144 (19.5%) developed true bacteremia, 66 (8.9%) were positive for gram-negative rods, and 203 (27.5%) died during hospitalization. Prediction rule based on the data at the time of blood withdrawal for culture stratified them into five groups with probabilities of true bacteremia 6.5, 9.6, 21.9, 30.1, and 59.6%. For blood culture positive for gram-negative rods, the probabilities were 0.6, 4.7, 8.6, and 31.7%, and for in-hospital death, those were 6.7, 15.5, 26.0, 35.5, and 56.1%. The areas under the receiver operating characteristic curve for true bacteremia, blood culture positive for gram-negative rods, and in-hospital death were 0.73, 0.64, and 0.64, respectively, in the original cohort and 0.72, 0.64, and 0.64, respectively, in validation. Conclusions: The clinical prediction rules are helpful for improved clinical decision making for bacteremia patients. [source]


    PRELIMINARY EVALUATION OF THE APPLICATION OF THE FTIR SPECTROSCOPY TO CONTROL THE GEOGRAPHIC ORIGIN AND QUALITY OF VIRGIN OLIVE OILS

    JOURNAL OF FOOD QUALITY, Issue 4 2007
    ALESSANDRA BENDINI
    ABSTRACT A rapid Fourier transform infrared (FTIR) attenuated total reflectance spectroscopic method was applied to determine qualitative parameters such as free fatty acid (FFA) content and the peroxide value (POV) in virgin olive oils. Calibration models were constructed using partial least squares regression on a large number of virgin olive oil samples. The best results (R2 = 0.955, root mean square error in cross validation [RMSECV] = 0.15) to evaluate FFA content expressed in oleic acid % (w/w) were obtained considering a calibration range from 0.2 to 9.2% of FFA relative to 190 samples. For POV determination, the result obtained, built on 80 olive oil samples with a calibration range from 11.1 to 49.7 meq O2/kg of oil, was not satisfactory (R2 = 0.855, RMSECV = 3.96). We also investigated the capability of FTIR spectroscopy, in combination with multivariate analysis, to distinguish virgin olive oils based on geographic origin. The spectra of 84 monovarietal virgin olive oil samples from eight Italian regions were collected and elaborated by principal component analysis (PCA), considering the fingerprint region. The results were satisfactory and could successfully discriminate the majority of samples coming from the Emilia Romagna, Sardinian and Sicilian regions. Moreover, the explained variance from this PCA was higher than 96%. PRACTICAL APPLICATIONS The verification of the declared origin or the determination of the origin of an unidentified virgin olive oil is a challenging problem. In this work, we have studied the applicability of Fourier transform infrared coupled with multivariate statistical analysis to discriminate the geographic origin of virgin olive oil samples from different Italian regions. [source]


    PREDICTION OF TEXTURE IN GREEN ASPARAGUS BY NEAR INFRARED SPECTROSCOPY (NIRS)

    JOURNAL OF FOOD QUALITY, Issue 4 2002
    D. PEREZ
    NIR spectroscopy was used to estimate three textural parameters of green asparagus: maximum cutting force, energy and toughness. An Instron 1140 Texturometer provided reference data. A total of 199 samples from two asparagus varieties (Taxara and UC-157) were used to obtain the calibration models between the reference data and the NIR spectral data. Standard errors of cross validation (SECV) and r2 were (5.73, 0.84) for maximum cutting force, (0.58, 0.66) for toughness, and (0.04, 0.85) for cutting energy. The mathematical models developed as calibration models were tested using independent validation samples (n = 20); the resulting standard errors of prediction (SEP) and r2 for the same parameters were (6.73, 0.82), (0.61, 0.57) and (0.04, 0.89), respectively. For toughness, r2 (0.85) and SEP (0.36) improved substantially when four samples exhibiting large residual values were removed. The results indicated that NIRS could accurately predict texture parameters of green asparagus. [source]


    Multivariate analysis approach to the plasma protein profile of patients with advanced colorectal cancer

    JOURNAL OF MASS SPECTROMETRY (INCORP BIOLOGICAL MASS SPECTROMETRY), Issue 12 2006
    Eugenio Ragazzi
    Abstract The aim of the present study was to identify the pattern of plasma protein species of interest as markers of colorectal cancer (CRC). Using matrix-assisted laser desorption/ionization-mass spectrometry (MALDI-MS), the plasma protein profile was determined in nine stage IV CRC patients (study group) and nine clean-colon healthy subjects (control group). Multivariate analysis methods were employed to identify distinctive disease patterns at protein spectrum. In the study and control groups, cluster analysis (CA) on the complete MALDI-MS spectra plasma protein profile showed a distinction between CRC patients and healthy subjects, thus allowing the identification of the most discriminating ionic species. Principal component analysis (PCA) and linear discriminant analysis (LDA) yielded similar grouping results. LDA with leave-one-out cross validation achieved a correct classification rate of 89% in both the patients and the healthy subjects. Copyright © 2006 John Wiley & Sons, Ltd. [source]


    Multivariate calibration of covalent aggregate fraction to the raman spectrum of regular human insulin

    JOURNAL OF PHARMACEUTICAL SCIENCES, Issue 9 2008
    Connie M. Gryniewicz
    Abstract Insulin aggregates were prepared by exposing samples of formulated regular human insulin to agitation at 60°C. Aliquots were drawn from the samples periodically over a time range spanning 192 h, and their aggregate compositions were determined with size exclusion chromatography. The complete data set was composed of 39 separate aliquots. The Raman spectra of three separate 10 µL volumes from each aliquot were measured using the drop-coat deposition Raman (DCDR) method. The spectra were calibrated to aggregate composition by partial least squares regression (PLS), resulting in linear calibration (R² = 0.997) with a root mean squared error of calibration (RMSEC) of 1.3% and a root mean squared error of cross validation (RMSECV) of 5.1% in aggregate composition. Though the time required for aggregates to form under stressed conditions showed substantial sample-to-sample variation, the correlation between aggregate composition and Raman spectrum was remarkably consistent, indicating that Raman spectroscopy may be a viable screening method for aggregation of protein drugs. © 2008 Wiley-Liss, Inc. and the American Pharmacists Association J Pharm Sci 97:3727–3734, 2008 [source]
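
    The gap between RMSEC and RMSECV reported in the abstract above can be illustrated with a small sketch. It assumes a one-variable least-squares line in place of the study's PLS model, divides by n with no degrees-of-freedom correction, and uses leave-one-out folds; the function names and the toy data are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsec(xs, ys):
    """Error of calibration: residuals of the model on its own training data."""
    slope, intercept = fit_line(xs, ys)
    squared = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return sqrt(squared / len(xs))


def rmsecv(xs, ys):
    """Error of cross validation: each sample is predicted by a model fitted without it."""
    squared = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return sqrt(squared / len(xs))


# Toy calibration data invented for illustration; not from the study.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
calibration_error = rmsec(xs, ys)
cross_validation_error = rmsecv(xs, ys)
```

    Because each held-out sample plays no part in fitting the model that predicts it, the cross validation error is the more conservative of the two figures.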


    In-line measurement of a drug substance via near infrared spectroscopy to ensure a robust crystallization process

    JOURNAL OF PHARMACEUTICAL SCIENCES, Issue 11 2006
    George X. Zhou
    Abstract The crystallization of Etoricoxib, a polymorphic compound, has been optimized and controlled by seeding with the desired polymorph at a moderate supersaturation condition. To enhance the process robustness, near infrared spectroscopy (NIRS) has been evaluated as an inline measurement method for the concentration of Etoricoxib prior to seeding in the crystallization process. In this NIRS method, a spectral discriminant analysis based on principal component analysis (PCA) was established to detect the presence of solids produced by premature crystallization, or bubbles in the path of light. Once a spectrum was qualified as that of clear solution, concentration of Etoricoxib was calculated by a NIRS calibration model built with partial least squares (PLS) regression and with offline HPLC analysis as the reference method. This model was accurate with a standard error of cross validation (SECV) less than 1.2 mg/g Etoricoxib and a standard error of prediction (SEP) less than 1.7 mg/g over the concentration range from 50 to 170 mg/g, temperature range from 49 to 65°C, and different sources of materials. In addition, all aspects of the offline HPLC method, especially the sampling procedure, were optimized to provide an accurate reference for NIRS calibration models. The application of this method at a pilot plant has demonstrated its capability of accurately measuring the process concentration of Etoricoxib as well as detecting the presence of solids produced by premature crystallization before seeding. © 2006 Wiley-Liss, Inc. and the American Pharmacists Association J Pharm Sci 95:2337–2347, 2006 [source]


    Using terahertz pulsed spectroscopy to quantify pharmaceutical polymorphism and crystallinity

    JOURNAL OF PHARMACEUTICAL SCIENCES, Issue 4 2005
    Clare J. Strachan
    Abstract Terahertz pulsed spectroscopy (TPS) is a new technique that is capable of eliciting rich information when investigating pharmaceutical materials. In solids, it probes long-range crystalline lattice vibrations and low energy torsion and hydrogen bonding vibrations. These properties make TPS potentially an ideal tool to investigate crystallinity and polymorphism. In this study four drugs with different solid-state properties were analyzed using TPS and levels of polymorphism and crystallinity were quantified. Carbamazepine and enalapril maleate polymorphs, amorphous, and crystalline indomethacin, and thermotropic liquid crystalline and crystalline fenoprofen calcium mixtures were quantified using partial least-squares analysis. Root-mean-squared errors of cross validation as low as 0.349% and limits of detection as low as approximately 1% were obtained, demonstrating that TPS is an analytical technique of potential in quantifying solid-state properties of pharmaceutical compounds. © 2005 Wiley-Liss, Inc. and the American Pharmacists Association J Pharm Sci 94:837–846, 2005 [source]


    Discriminant analysis of autofluorescence spectra for classification of oral lesions in vivo

    LASERS IN SURGERY AND MEDICINE, Issue 5 2009
    J.L. Jayanthi MSc, MPhil
    Abstract Background and Objectives: The low survival rate of individuals with oral cancer emphasizes the significance of early detection and treatment. Optical spectroscopic techniques are under various stages of development for diagnosis of epithelial neoplasm. This study evaluates the potential of a multivariate statistical algorithm to classify oral mucosa from autofluorescence spectral features recorded in vivo. Study Design/Methods: Autofluorescence spectra were recorded in a clinical trial from 15 healthy volunteers and 34 patients with diode laser excitation (404 nm) and pre-processed by normalization, mean-scaling and their combination. Linear discriminant analysis (LDA) based on the leave-one-out (LOO) method of cross validation was performed on spectral data for tissue characterization. The sensitivity and specificity were determined for different lesion pairs from the scatter plot of discriminant function scores. Results: Autofluorescence spectra of healthy volunteers consist of a broad emission at 500 nm that is characteristic of endogenous fluorophores, whereas in malignant lesions three additional peaks are observed at 635, 685, and 705 nm due to the accumulation of porphyrins in oral lesions. It was observed that classification design based on discriminant function scores obtained by the LDA-LOO method was able to differentiate pre-malignant dysplasia from squamous cell carcinoma (SCC), benign hyperplasia from dysplasia and hyperplasia from normal with overall sensitivities of 86%, 78%, and 92%, and specificities of 90%, 100%, and 100%, respectively. Conclusions: The application of the LDA-LOO method on the autofluorescence spectra recorded during a clinical trial in patients was found suitable to discriminate oral mucosal alterations during tissue transformation towards malignancy with improved diagnostic accuracies. Lasers Surg. Med. 41:345–352, 2009. © 2009 Wiley-Liss, Inc. [source]


    Three-dimensional spatial interpolation of surface meteorological observations from high-resolution local networks

    METEOROLOGICAL APPLICATIONS, Issue 3 2008
    Francesco Uboldi
    Abstract An objective analysis technique is applied to a local, high-resolution meteorological observation network in the presence of complex topography. The choice of optimal interpolation (OI) makes it possible to implement a standard spatial interpolation algorithm efficiently. At the same time OI constitutes a basis to develop, in perspective, a full multivariate data assimilation scheme. In the absence of a background model field, a simple and effective de-trending procedure is implemented. Three-dimensional correlation functions are used to account for the orographic distribution of observing stations. Minimum-scale correlation parameters are estimated by means of the integral data influence (IDI) field. Hourly analysis fields of temperature and relative humidity are routinely produced at the Regional Weather Service of Lombardia. The analysis maps show significant informational content even in the presence of strong gradients and infrequent meteorological situations. Quantitative evaluation of the analysis fields is performed by systematically computing their cross validation (CV) scores and by estimating the analysis bias. Further developments concern the implementation of an automatic quality control procedure and the improvement of error covariance estimation. Copyright © 2008 Royal Meteorological Society [source]


    Discovering robust protein biomarkers for disease from relative expression reversals in 2-D DIGE data.

    PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 8 2007
    Troy J. Anderson
    Abstract This study assesses the ability of a novel family of machine learning algorithms to identify changes in relative protein expression levels, measured using 2-D DIGE data, which support accurate class prediction. The analysis was done using a training set of 36 total cellular lysates comprised of six normal and three cancer biological replicates (the remaining are technical replicates) and a validation set of four normal and two cancer samples. Protein samples were separated by 2-D DIGE and expression was quantified using DeCyder-2D Differential Analysis Software. The relative expression reversal (RER) classifier correctly classified 9/9 training biological samples (p<0.022) as estimated using a modified version of leave-one-out cross validation and 6/6 validation samples. The classification rule involved comparison of expression levels for a single pair of protein spots, tropomyosin isoforms and α-enolase, both of which have prior association as potential biomarkers in cancer. The data was also analyzed using algorithms similar to those found in the extended data analysis package of DeCyder software. We propose that by accounting for sources of within- and between-gel variation, RER classifiers applied to 2-D DIGE data provide a useful approach for identifying biomarkers that discriminate among protein samples of interest. [source]


    Brief communication: A probabilistic approach to age estimation from infracranial sequences of maturation

    AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY, Issue 4 2010
    Hélène Coqueugniot
    Abstract Infracranial sequences of maturation are commonly used to estimate the age at death of nonadult specimens found in archaeological, paleoanthropological, or forensic contexts. Typically, an age assessment is made by comparing the degree of long-bone epiphyseal fusion in the target specimen to the age ranges for different stages of fusion in a reference skeletal collection. While useful as a first approximation, this approach has a number of shortcomings, including the potential for "age mimicry," being highly dependent on the sample size of the reference sample and outliers, not using the entire fusion distribution, and lacking a straightforward quantitative way of combining age estimates from multiple sites of fusion. Here we present an alternative probabilistic approach based on data collected on 137 individuals, ranging in age from 7 to 29 years old, from a documented skeletal collection from Coimbra, Portugal. We then use cross validation to evaluate the accuracy of age estimation from epiphyseal fusion. While point estimates of age can, at least in some circumstances, be both accurate and precise based on the entire skeleton, or many sites of fusion, there will often be substantial error in these estimates when they derive from one or only a few sites. Because a probabilistic approach to age estimation from epiphyseal fusion is computationally intensive, we make available a series of spreadsheets or computer programs that implement the approach presented here. Am J Phys Anthropol 142:655–664, 2010. © 2010 Wiley-Liss, Inc. [source]


    A serum metabolomic investigation on hepatocellular carcinoma patients by chemical derivatization followed by gas chromatography/mass spectrometry

    RAPID COMMUNICATIONS IN MASS SPECTROMETRY, Issue 19 2008
    Ruyi Xue
    The purpose of this study was to investigate the serum metabolic difference between hepatocellular carcinoma (HCC, n = 20) male patients and normal male subjects (n = 20). Serum metabolome was detected through chemical derivatization followed by gas chromatography/mass spectrometry (GC/MS). The acquired GC/MS data was analyzed by stepwise discriminant analysis (SDA) and support vector machine (SVM). The metabolites including butanoic acid, ethanimidic acid, glycerol, L-isoleucine, L-valine, aminomalonic acid, D-erythrose, hexadecanoic acid, octadecanoic acid, and 9,12-octadecadienoic acid in combination with each other gave the strongest segregation between the two groups. By applying these variables, our method provided a diagnostic model that could well discriminate between HCC patients and normal subjects. More importantly, the error count estimate for each group was 0%. The total classifying accuracy of the discriminant function tested by SVM 20-fold cross validation was 75%. This technique is different from traditional ones and appears to be a useful tool in the area of HCC diagnosis. Copyright © 2008 John Wiley & Sons, Ltd. [source]
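
    The 20-fold cross validation mentioned in the abstract above partitions the samples into 20 disjoint test folds. The sketch below shows one way to generate such a partition for 40 samples. The abstract does not say how the folds were formed, so the contiguous, unshuffled, unstratified split and the name kfold_indices are assumptions made for illustration.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# 40 samples (20 patients plus 20 controls) split into 20 folds of 2.
folds = list(kfold_indices(40, 20))
```

    In practice the sample order would usually be shuffled, and often stratified by class, before splitting so that each fold reflects both groups.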


    Hedonic price index estimation under mean-independence of time dummies from quality characteristics

    THE ECONOMETRICS JOURNAL, Issue 1 2003
    Yasushi Kondo
    Summary. We estimate hedonic price indices (HPI) for rental offices in Tokyo for the period 1985–1991. We take a partially linear regression (PLR) model, linear in x (year dummies) and nonparametric in z (office quality characteristics), as our main model; the usual linear model is used as well. Since x consists of year dummies, the linearity in x is not a restriction in the PLR model; the only restriction is that of no interaction between x and z. For the PLR model, the HPI are estimated √N-consistently with a two-stage procedure. For our data, x turns out to be (almost) mean-independent of z. This implies that least squares estimation (LSE) for models with a misspecified function for z is still consistent. The mean-independence also leads to an efficiency result that, under heteroskedasticity of unknown form, the two-stage PLR model estimator is at least as efficient as any LSE for models specifying (rightly or wrongly) the part for z. In addition to these, several interesting practical lessons are noted in doing the two-stage PLR model estimation. First, the cross validation (CV) used in the PLR model literature can fail if the mean-independence is ignored. Second, high order kernels can make the CV criterion function ill behaved. Third, product kernels work as well as spherically symmetric kernels. Fourth, nonparametric specification tests may work poorly due to a sample splitting problem with outliers in the data or due to choosing more than one bandwidth; in this regard, a test suggested by Stute (1997) and Stute et al. (1998) is recommended. [source]


    Development of a high resolution daily gridded temperature data set (1969–2005) for the Indian region

    ATMOSPHERIC SCIENCE LETTERS, Issue 4 2009
    A. K. Srivastava
    Abstract A high resolution daily gridded temperature data set for the Indian region was developed using temperature data of 395 quality controlled stations for the period 1969–2005. A modified version of Shepard's angular distance weighting algorithm was used for interpolating the station temperature data into 1° latitude × 1° longitude grids. Using cross validation, errors were estimated and found to be less than 0.5 °C. The data set was also compared with another high resolution data set and found to be comparable. Mean frequency of cold and heat waves and temperature anomalies associated with the monsoon breaks have been presented. Copyright © 2009 Royal Meteorological Society [source]


    Perturbation signal design for neural network based identification of multivariable nonlinear systems

    THE CANADIAN JOURNAL OF CHEMICAL ENGINEERING, Issue 1 2002
    Pankaj S. Kulkarni
    Abstract The paper focuses on issues in experimental design for identification of nonlinear multivariable systems. Perturbation signal design is analyzed for a hybrid model structure consisting of linear and neural network structures. Input signals, designed to minimize the effects of nonlinearities during the linear model identification for the multivariable case, have been proposed and their properties have been theoretically established. The superiority of the proposed perturbation signal and the hybrid model has been demonstrated through extensive cross validations. The utility of the obtained models for control has also been proved through a case study involving MPC of a nonlinear multivariable neutralization plant. [source]