Dataset Consisting (dataset + consisting)

Selected Abstracts


Hourly and daily clearness index and diffuse fraction at a tropical station, Ile-Ife, Nigeria

INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 8 2009
E. C. Okogbue
Abstract A dataset consisting of hourly global and diffuse solar radiation measured over the period February 1992 to December 2002 has been utilized to investigate the diurnal and seasonal variations of hourly and daily clearness index together with the diffuse fraction at a tropical station, Ile-Ife (7.5°N, 4.57°E), Nigeria. Statistical analysis (the frequency and cumulative frequency distribution of the hourly and daily clearness index) and subsequent characterization of the sky conditions over the station based on these were also done, and their implications for solar energy utilization in the area discussed. Daytime (11:00–15:00 LST) monthly mean hourly diffuse fraction, M,d (explained in a separate 'List of Symbols' provided, along with other symbols used in this article), has values that are most of the time less than 0.52, 0.54 and 0.60 respectively for January, February and March in the dry season. However, during the months of July and August (which are typical of the wet season), the values range between 0.61 and 0.85 (being generally greater than 0.65), with the corresponding values of the monthly mean hourly clearness index, M,T, ranging between 0.23 and 0.45. Statistical analysis of hourly and daily clearness index showed that the local sky conditions at the station were almost devoid of clear skies and overcast skies (clear skies and overcast skies occurred for only about 3.5% and 4.8% of the time respectively). The sky conditions were instead predominantly cloudy (cloudy skies occurred for about 88% of the time) all year round. Copyright © 2009 Royal Meteorological Society [source]
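
The two quantities studied in this abstract are simple ratios, and a minimal sketch may help fix the definitions. It assumes the usual conventions: the clearness index is global horizontal radiation divided by the extraterrestrial radiation on a horizontal surface, and the diffuse fraction is diffuse radiation divided by global radiation. The function and parameter names are illustrative and do not come from the paper.

```python
def clearness_index(global_rad, extraterrestrial_rad):
    """Ratio of global horizontal radiation to extraterrestrial horizontal radiation.

    Both inputs must cover the same period (hour or day) and use the same units.
    """
    if extraterrestrial_rad <= 0:
        raise ValueError("extraterrestrial radiation must be positive (daytime only)")
    return global_rad / extraterrestrial_rad


def diffuse_fraction(diffuse_rad, global_rad):
    """Ratio of diffuse radiation to global radiation for the same period."""
    if global_rad <= 0:
        raise ValueError("global radiation must be positive")
    return diffuse_rad / global_rad
```

With hypothetical hourly values of 450 for global radiation and 1000 for extraterrestrial radiation, `clearness_index(450.0, 1000.0)` gives 0.45, the upper end of the wet-season range quoted above.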


Estimation of the soil–water partition coefficient normalized to organic carbon for ionizable organic chemicals

ENVIRONMENTAL TOXICOLOGY & CHEMISTRY, Issue 10 2008
Antonio Franco
Abstract The sorption of organic electrolytes to soil was investigated. A dataset consisting of 164 electrolytes, composed of 93 acids, 65 bases, and six amphoters, was collected from literature and databases. The partition coefficient log KOW of the neutral molecule and the dissociation constant pKa were calculated by the software ACD/Labs®. The Henderson-Hasselbalch equation was applied to calculate dissociation. Regressions were developed to predict separately for the neutral and the ionic molecule species the distribution coefficient (Kd) normalized to organic carbon (KOC) from log KOW and pKa. The log KOC of strong acids (pKa < 4) was not correlated to these parameters. The regressions derived for weak acids and bases (undissociated at environmental pH) were similar. The highest sorption was found for strong bases (pKa > 7.5), probably due to electrical interactions. Nonetheless, their log KOC was highly correlated to log KOW. For bases, a nonlinear regression was developed, too. The new regression equations are applicable in the whole pKa range of acids, bases, and amphoters and are useful in particular for relatively strong bases and amphoters, for which no predictive methods specifically have been developed so far. [source]
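
As an illustration of the dissociation step mentioned in this abstract, the sketch below applies the Henderson-Hasselbalch equation to obtain the neutral fraction of a monoprotic acid or base at a given pH. It is a minimal example under that single-pKa assumption; the function name and the `kind` argument are hypothetical, and the paper's regressions for KOC are not reproduced here.

```python
def neutral_fraction(pka, ph, kind):
    """Fraction of a monoprotic compound present as the neutral species.

    Henderson-Hasselbalch:
      acid: neutral fraction = 1 / (1 + 10**(pH - pKa))
      base: neutral fraction = 1 / (1 + 10**(pKa - pH))
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return 1.0 / (1.0 + 10.0 ** exponent)


def ionic_fraction(pka, ph, kind):
    """Complement of the neutral fraction."""
    return 1.0 - neutral_fraction(pka, ph, kind)
```

At pH equal to pKa the two species are present in equal amounts, so `neutral_fraction(4.0, 4.0, "acid")` is 0.5.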


What you match does matter: the effects of data on DSGE estimation

JOURNAL OF APPLIED ECONOMETRICS, Issue 5 2010
Pablo A. Guerron-Quintana
This paper explores the effects of using alternative combinations of observables for the estimation of Dynamic Stochastic General Equilibrium (DSGE) models. I find that the estimation of structural parameters describing the Taylor rule and sticky contracts in prices and wages is particularly sensitive to the set of observables. In terms of the model's predictions, the exclusion of some observables may lead to estimated parameters with unexpected outcomes, such as recessions following a positive technology shock. More importantly, two ways to assess different sets of observables are proposed. These measures favor a dataset consisting of seven observables. Copyright © 2009 John Wiley & Sons, Ltd. [source]


Prediction of interactiveness between small molecules and enzymes by combining gene ontology and compound similarity

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 8 2010
Lei Chen
Abstract Determining whether a small organic molecule interacts with an enzyme can help in understanding the molecular and cellular functions of organisms and their metabolic pathways. In this research, we present a prediction model that combines compound similarity and enzyme similarity to predict the interactiveness between small molecules and enzymes. A dataset consisting of 2859 positive couples of small molecule and enzyme and 286,056 negative couples was employed. Compound similarity is a measurement of how similar two small molecules are, proposed by Hattori et al., J Am Chem Soc 2003, 125, 11853, and available at http://www.genome.jp/ligand-bin/search_compound, while enzyme similarity was obtained in three ways: the BLAST method, gene ontology items, and functional domain composition. A new distance between a pair of couples was then established, and the nearest neighbor algorithm (NNA) was employed to predict the interactiveness of enzymes and small molecules. A data distribution strategy was adopted to obtain a better balance between the positive and negative samples when training the prediction model: one-fourth of the couples were singled out as testing samples and the remaining data were divided into seven training datasets; the remaining positive samples were added to each training dataset, while only the negative samples were divided. In this way, seven NNAs were built. Finally, a simple majority voting system was applied to integrate these seven models to predict the testing dataset, which was demonstrated to give better prediction results than any single prediction model. As a result, the highest overall prediction accuracy reached 97.30%. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010 [source]
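
The data distribution strategy described in this abstract can be sketched roughly as follows: the negatives are divided into seven parts, every part is paired with all training positives, a nearest-neighbor predictor is run on each part, and the seven votes are combined by simple majority. This is only a sketch under stated assumptions. The `distance` argument stands in for the paper's couple-to-couple distance, which is not given here, and all function names are hypothetical.

```python
from collections import Counter


def split_negatives(negatives, n_parts=7):
    """Divide the negative samples into n_parts roughly equal groups."""
    return [negatives[i::n_parts] for i in range(n_parts)]


def nearest_neighbor_label(query, training, distance):
    """Return the label of the training sample closest to the query.

    training is a list of (sample, label) pairs.
    """
    return min(training, key=lambda item: distance(query, item[0]))[1]


def majority_vote_predict(query, positives, negatives, distance, n_parts=7):
    """Predict 1 (interacting) or 0 (not interacting) by majority vote.

    Each of the n_parts models sees all positives plus one share of the
    negatives, which keeps the classes better balanced than one big model.
    """
    votes = []
    for part in split_negatives(negatives, n_parts):
        training = [(x, 1) for x in positives] + [(x, 0) for x in part]
        votes.append(nearest_neighbor_label(query, training, distance))
    return Counter(votes).most_common(1)[0][0]
```

With an odd number of models and two classes, the vote cannot tie, which is one practical reason to prefer seven models over six or eight.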


DOES ELECTRICITY RESTRUCTURING WORK? EVIDENCE FROM THE U.S. NUCLEAR ENERGY INDUSTRY

THE JOURNAL OF INDUSTRIAL ECONOMICS, Issue 3 2007
This paper examines whether electricity restructuring improves the efficiency of U.S. nuclear power generation. Using a panel dataset consisting of the full sample of 73 investor-owned nuclear plants in the United States from 1992 to 1998, I estimate the plant-level cross-sectional and longitudinal efficiency changes associated with restructuring. Special attention is given to the potential policy endogeneity bias and different modeling strategies are presented to cope with the issue. Overall, I find a striking positive relationship between restructuring and cost reduction, and increased plant utilization. [source]


Molecular phylogenetic evidence for paraphyly of the genus Sooglossus, with the description of a new genus of Seychellean frogs

BIOLOGICAL JOURNAL OF THE LINNEAN SOCIETY, Issue 3 2007
ARIE VAN DER MEIJDEN
The Seychelles harbour an endemic frog family, the Sooglossidae, currently containing two genera: Sooglossus, with three species, and Nesomantis, with one species. These unique frogs are generally considered to be basal neobatrachians, although their relationships to other neobatrachian taxa, except the Nasikabatrachidae, remain unresolved. Our molecular phylogeny based on a dataset consisting of fragments of the nuclear rag-1 and rag-2 genes, as well as mitochondrial 16S rRNA in representatives of the major neobatrachian lineages, confirmed the previously postulated Sooglossidae + Nasikabatrachidae clade and the placement of the South American Caudiverbera with the Australian Myobatrachidae, but did not further resolve the position of sooglossids. Our results do, however, unambiguously show sooglossids to be monophyletic but the genus Sooglossus to be paraphyletic, with the type species Sooglossus sechellensis being more closely related to Nesomantis thomasseti than to Sooglossus gardineri and Sooglossus pipilodryas, in agreement with morphological, karyological, and bioacoustic data. As a taxonomic consequence, we propose to consider the genus name Nesomantis as a junior synonym of Sooglossus, and to transfer the species thomasseti to Sooglossus. For the clade composed of the species gardineri and pipilodryas, we propose the new generic name Leptosooglossus. A significant genetic differentiation of 3% was found between specimens of Sooglossus thomasseti from the Mahé and Silhouette Islands, highlighting the need for further studies on their possible taxonomic distinctness. © 2007 The Linnean Society of London, Biological Journal of the Linnean Society, 2007, 91, 347–359. [source]


Joint Modelling of Repeated Transitions in Follow-up Data – A Case Study on Breast Cancer Data

BIOMETRICAL JOURNAL, Issue 3 2005
B. Genser
Abstract In longitudinal studies where time to a final event is the ultimate outcome, information is often available about intermediate events the individuals may experience during the observation period. Even though many extensions of the Cox proportional hazards model have been proposed to model such multivariate time-to-event data, these approaches are still very rarely applied to real datasets. The aim of this paper is to illustrate the application of extended Cox models for multiple time-to-event data and to show their implementation in popular statistical software packages. We demonstrate a systematic way of jointly modelling similar or repeated transitions in follow-up data by analysing an event-history dataset consisting of 270 breast cancer patients who were followed up for different clinical events during treatment in metastatic disease. First, we show how this methodology can also be applied to non-Markovian stochastic processes by representing these processes as "conditional" Markov processes. Secondly, we compare the application of different Cox-related approaches to the breast cancer data by varying their key model components (i.e. analysis time scale, risk set and baseline hazard function). Our study showed that extended Cox models are a powerful tool for analysing complex event history datasets, since the approach can address many dynamic data features such as multiple time scales, dynamic risk sets, time-varying covariates, transition-by-covariate interactions, autoregressive dependence or intra-subject correlation. (© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]


Understanding on the residue contact network using the log-normal cluster model and the multilevel wheel diagram

BIOPOLYMERS, Issue 10 2010
Weitao Sun
Abstract Residue clusters play an essential role in stabilizing protein structures in the form of complex networks. We show that the cluster sizes in a native protein follow the log-normal distribution for a dataset consisting of 424 proteins. To our knowledge, this is the first time such a fit has been made for native structures. Based on the log-normal model, the asymptotically increasing mean cluster sizes produce a critical protein chain length of about 200 amino acids, beyond which most globular proteins have nearly the same mean cluster sizes. This suggests that the larger proteins use a different packing mechanism than the smaller proteins. We confirmed the scale-free property of the residue contact network for most of the protein structures in the dataset, although violations were observed for the tightly packed proteins. The residue cluster network wheel (RCNW) is proposed to visualize the relationship between the multiple properties of the residue network, such as the cluster size, the residue types and contacts, and the flexibility of the residue. We noticed that the residues with large cluster size have smaller Cα displacement measured using normal mode analysis. © 2010 Wiley Periodicals, Inc. Biopolymers 93: 904–916, 2010. [source]
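
To make the log-normal claim concrete, the sketch below estimates log-normal parameters from a list of cluster sizes by taking the mean and standard deviation of their logarithms (the maximum-likelihood estimates) and then derives the implied mean size. The abstract does not say how the authors performed their fit, so this estimator is an assumption, and the function names are illustrative.

```python
import math


def fit_lognormal(sizes):
    """Maximum-likelihood log-normal fit: mean and std of log(size)."""
    if not sizes or any(s <= 0 for s in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    logs = [math.log(s) for s in sizes]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


def lognormal_mean(mu, sigma):
    """Mean of a log-normal distribution with parameters mu and sigma."""
    return math.exp(mu + sigma ** 2 / 2.0)
```

Comparing `lognormal_mean` across proteins of different chain lengths is one simple way to look at how mean cluster size levels off for longer chains.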


Generalisability in unbalanced, uncrossed and fully nested studies

MEDICAL EDUCATION, Issue 4 2010
Ajit Narayanan
Medical Education 2010: 44: 367–378
Objectives: There is growing interest in multi-source, multi-level feedback for measuring the performance of health care professionals. However, data are often unbalanced (e.g. there are different numbers of raters for each doctor), uncrossed (e.g. raters rate the doctor on only one occasion) and fully nested (e.g. raters for a doctor are unique to that doctor). Estimating the true score variance among doctors under these circumstances is proving a challenge.
Methods: Extensions to reliability and generalisability (G) formulae are introduced to handle unbalanced, uncrossed and fully nested data to produce coefficients that take into account variances among raters, ratees and questionnaire items at different levels of analysis. Decision (D) formulae are developed to handle predictions of minimum numbers of raters for unbalanced studies. An artificial dataset and two real-world datasets consisting of colleague and patient evaluations of doctors are analysed to demonstrate the feasibility and relevance of the formulae. Another independent dataset is used for validating D predictions of G coefficients for varying numbers of raters against actual G coefficients. A combined G coefficient formula is introduced for estimating multi-sourced reliability.
Results: The results from the formulae indicate that it is possible to estimate reliability and generalisability in unbalanced, fully nested and uncrossed studies, and to identify extraneous variance that can be removed to estimate true score variance among doctors. The validation results show that it is possible to predict the minimum numbers of raters even if the study is unbalanced.
Discussion: Calculating G and D coefficients for psychometric data based on feedback on doctor performance is possible even when the data are unbalanced, uncrossed and fully nested, provided that: (i) variances are separated at the rater and ratee levels, and (ii) the average number of raters per ratee is used in calculations for deriving these coefficients. [source]
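
The two conditions in the discussion can be illustrated with the textbook G coefficient for raters nested within ratees, using the arithmetic mean number of raters per ratee for an unbalanced study. This is a simplified sketch, not the extended formulae introduced in the paper. The variance components are assumed to have been estimated already, and all names are hypothetical.

```python
import math


def average_raters(rater_counts):
    """Arithmetic mean number of raters per ratee in an unbalanced study."""
    return sum(rater_counts) / len(rater_counts)


def g_coefficient(var_ratee, var_rater, n_raters):
    """Textbook G coefficient for raters nested within ratees.

    G = var_ratee / (var_ratee + var_rater / n_raters)
    """
    return var_ratee / (var_ratee + var_rater / n_raters)


def min_raters(var_ratee, var_rater, target_g):
    """Smallest number of raters whose G coefficient reaches target_g.

    Solving the G formula for n gives n >= G / (1 - G) * var_rater / var_ratee.
    The small tolerance guards against floating-point round-up.
    """
    needed = (target_g / (1.0 - target_g)) * (var_rater / var_ratee)
    return max(1, math.ceil(needed - 1e-9))
```

For example, with an assumed ratee variance of 0.5 and rater variance of 1.0, doctors rated by 3, 5 and 4 raters have an average of 4 raters each, and `g_coefficient(0.5, 1.0, 4)` is about 0.67.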