Unbiased Estimates


Selected Abstracts

Genome-wide amplification and allelotyping of sporadic pituitary adenomas identify novel regions of genetic loss

D. J. Simpson
Through the use of a candidate gene approach, several previous studies have identified loss of heterozygosity (LOH) at putative tumor-suppressor gene (TSG) loci in sporadic pituitary tumors. This study reports a genome-wide allelotyping by use of 122 microsatellite markers in a large cohort of tumors, consisting of somatotrophinomas and non-functioning adenomas. Samples were first subjected to whole-genome amplification by primer extension pre-amplification (PEP) to circumvent limitations imposed by insufficient DNA for whole-genome analysis with this number of microsatellite markers. The overall mean frequency of loss in invasive tumors was significantly higher than that in their non-invasive counterparts (7 vs. 3% in somatotrophinomas and 6 vs. 3% in non-functioning adenomas, respectively). Analysis of the mean frequency of LOH across all markers on individual chromosomal arms identified 13 chromosomal arms in somatotrophinomas and 10 in non-functioning tumors with LOH greater than the 99% upper confidence interval calculated for the rate of overall random allelic loss. In the majority of cases, these losses were more frequent in invasive tumors than in their non-invasive counterparts, suggesting that they are markers of tumor progression. Other regions showed similar frequencies of LOH in both invasive and non-invasive tumors, implying that these are early changes in pituitary tumorigenesis. This genome-wide study also revealed chromosomal regions where losses were frequently associated with an individual marker, for example, chromosome arm 1q (LOH > 30%). In some cases, these losses were subtype-specific and were found at a higher frequency in invasive tumors than in their non-invasive counterparts. Identification of these regions of loss provides the first preliminary evidence for the location of novel putative TSGs involved in pituitary tumorigenesis that are, in some cases, subtype-specific. 
This investigation provides an unbiased estimate of global aberrations in sporadic pituitary tumors as assessed by LOH analysis. The identification of multiple "hotspots" throughout the genome may reflect an unstable chromatin structure that is susceptible to deletion or epigenetically mediated gene-silencing events. © 2003 Wiley-Liss, Inc. [source]

Bill harnesses on nestling Tufted Puffins influence adult provisioning behavior

Carina Gjerdrum
ABSTRACT For burrow-nesting seabirds, investigators have examined nestling diet by attaching harnesses to the bills of nestlings to intercept food delivered by the parent. To determine whether this method provides an unbiased estimate of nestling diet, we evaluated its effect on the provisioning behavior of Tufted Puffins (Fratercula cirrhata) nesting on Triangle Island, British Columbia. Adults delivering food to nestlings with bill harnesses always hesitated before entering a burrow with food, increasing their susceptibility to kleptoparasitism by gulls, and did not always leave the food intended for the nestling. These responses by adult puffins could lead to underestimates of energy intake rates of nestlings and unreliable comparisons with other species if prey left by adults in nest burrows were the only source of data. We also compared estimates of the species, number, and size of prey delivered by adult puffins as determined by direct observation from blinds to samples of prey collected directly from nest burrows and found that the two sampling techniques produced similar results. However, identifying rare prey species and gathering precise information about prey length, mass, and condition require collection of prey, and we recommend using a combination of techniques to obtain the most reliable estimates of nestling diet. SINOPSIS Investigators of seabirds that nest in cavities or burrows have examined nestling diet by placing devices on the bills of chicks to intercept the food brought by adults. To determine whether the method provides an unbiased estimate of nestling diet, we evaluated its effect on the behavior of Tufted Puffins (Fratercula cirrhata) nesting on Triangle Island, British Columbia. Adults bringing food to chicks fitted with bill devices hesitated to enter the burrow, exposing themselves to kleptoparasitism by gulls. Moreover, they did not always leave food for the chicks. These adult responses can lead to underestimates of the energy requirements of chicks and to unreliable comparisons with other species if the prey left in burrows by adults are the only source of data. We also compared the species, number, and size of prey, contrasting direct observations of deliveries to the nests with prey left in the burrows, and found that both methods yield similar results. However, identifying rare prey and obtaining precise information on prey length, mass, and condition require examining the prey themselves. We recommend using a combination of both methods to obtain reliable estimates of nestling diet. [source]

Estimating the variance of estimated trends in proportions when there is no unique subject identifier

William K. Mountford
Summary. Longitudinal population-based surveys are widely used in the health sciences to study patterns of change over time. In many of these data sets unique patient identifiers are not publicly available, making it impossible to link the repeated measures from the same individual directly. This poses a statistical challenge for making inferences about time trends because repeated measures from the same individual are likely to be positively correlated, i.e., although the time trend that is estimated under the naïve assumption of independence is unbiased, an unbiased estimate of the variance cannot be obtained without knowledge of the subject identifiers linking repeated measures over time. We propose a simple method for obtaining a conservative estimate of variability for making inferences about trends in proportions over time, ensuring that the type I error is no greater than the specified level. The method proposed is illustrated by using longitudinal data on diabetes hospitalization proportions in South Carolina. [source]
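A small simulation sketches the covariance term behind this problem (a toy shared-subject probit process, not the paper's data or proposed estimator): when the same subjects contribute to both waves, the sampling variance of the estimated change in proportions contains a -2·Cov term that the independence-based formula ignores.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 1000, 2000

diffs, naive_vars = [], []
for _ in range(reps):
    u = rng.normal(size=n)                                 # shared subject effect
    y1 = (u + rng.normal(size=n) > 0).astype(float)        # wave-1 binary outcome
    y2 = (0.2 + u + rng.normal(size=n) > 0).astype(float)  # wave-2 binary outcome
    p1, p2 = y1.mean(), y2.mean()
    diffs.append(p2 - p1)
    # variance estimate that (wrongly) treats the two waves as independent samples
    naive_vars.append(p1 * (1 - p1) / n + p2 * (1 - p2) / n)

empirical_var = float(np.var(diffs))    # true sampling variance of the change
naive_var = float(np.mean(naive_vars))  # independence-based estimate
```

Here the independence-based formula overstates the variance of a simple two-wave difference by roughly 50%; in general the size and direction of the error depend on the contrast and on how much the samples overlap, which is exactly what is unknown without subject identifiers.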

Are QST–FST comparisons for natural populations meaningful?

Abstract Comparisons between putatively neutral genetic differentiation amongst populations, FST, and quantitative genetic variation, QST, are increasingly being used to test for natural selection. However, we find that approximately half of the comparisons that use only data from wild populations confound phenotypic and genetic variation. We urge the use of a clear distinction between narrow-sense QST, which can be meaningfully compared with FST, and phenotypic divergence measured between populations, PST, which is inadequate for comparisons in the wild. We also point out that an unbiased estimate of QST can be found using the so-called 'animal model' of quantitative genetics. [source]
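The distinction can be made concrete with the standard variance-component formulas from this literature (the c and h2 notation follows common convention, not necessarily this paper's):

```python
def q_st(va_between, va_within):
    """Q_ST from between-population and within-population ADDITIVE GENETIC
    variance components."""
    return va_between / (va_between + 2.0 * va_within)


def p_st(vp_between, vp_within, c=1.0, h2=1.0):
    """P_ST: phenotypic stand-in for Q_ST, built from PHENOTYPIC variances.
    c scales the additive share of between-population variance and h2 is
    heritability; P_ST equals Q_ST only when c/h2 = 1, which is rarely
    verifiable in the wild."""
    return c * vp_between / (c * vp_between + 2.0 * h2 * vp_within)
```

With equal variance components, Q_ST = 1/3; substituting phenotypic variances with a low heritability inflates the statistic, which is one way wild-population comparisons can mislead.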

Z-scores and the birthweight paradox

Enrique F. Schisterman
Summary Investigators have long puzzled over the observation that low-birthweight babies of smokers tend to fare better than low-birthweight babies of non-smokers. Similar observations have been made with regard to factors other than smoking status, including socio-economic status, race and parity. Use of standardised birthweights, or birthweight z-scores, has been proposed as an approach to resolve the crossing of the curves that is the hallmark of the so-called birthweight paradox. In this paper, we utilise directed acyclic graphs, analytical proofs and an extensive simulation study to consider the use of z-scores of birthweight and their effect on statistical analysis. We illustrate the causal questions implied by inclusion of birthweight in statistical models, and illustrate the utility of models that include birthweight or z-scores to address those questions. Both analytically and through a simulation study we show that neither birthweight nor z-score adjustment may be used for effect decomposition. The z-score approach yields an unbiased estimate of the total effect, even when collider-stratification would adversely impact estimates from birthweight-adjusted models; however, the total effect could have been estimated more directly with an unadjusted model. The use of z-scores does not add additional information beyond the use of unadjusted models. Thus, the ability of z-scores to successfully resolve the paradoxical crossing of mortality curves is due to an alteration in the causal parameter being estimated (total effect), rather than adjustment for confounding or effect decomposition or other factors. [source]
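The paradoxical crossing is easy to reproduce in a toy collider simulation (all coefficients below are invented for illustration): smoking modestly lowers birthweight and modestly raises mortality, while a rarer unmeasured condition lowers birthweight and raises mortality far more. Stratifying on low birthweight, a collider, then makes smokers appear protective.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

s = rng.random(n) < 0.3                                  # smoking
u = rng.random(n) < 0.05                                 # unmeasured condition
bw = 3400 - 250 * s - 800 * u + rng.normal(0, 450, n)    # birthweight (g)
low = bw < 2500                                          # low-birthweight stratum
death = rng.random(n) < (0.002 + 0.002 * s + 0.08 * u)   # mortality risk

overall_rr = death[s].mean() / death[~s].mean()          # > 1: smoking harmful
lbw_rr = death[s & low].mean() / death[~s & low].mean()  # < 1: the "paradox"
```

Among low-birthweight babies, non-smokers are disproportionately low-weight because of the severe unmeasured condition, so their mortality is higher; the crossing reflects selection into the stratum, not a protective effect of smoking.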

Optimal Valuation of Noisy Real Assets

Paul D. Childs
We study the optimal valuation of real assets when true asset values are unobservable. In our model, the observed value cointegrates with the unobserved true asset value to cause serial correlation in the time series of observed values. Autocorrelation as well as total variance in the observed value are used to calculate an efficient unbiased estimate of the true asset value (the time-filtered value). The optimal value estimate is shown to have three time-weighted terms: a deterministic forward value, a comparison of observed values with previously determined time-filtered values, and a convexity correction for incomplete information. The residual variance measures the precision of the value estimate, which can increase or decrease monotonically over time as well as display a linear or nonlinear time trend. We also show how to revise time-filtered estimates based on the arrival of new information. Our results relate to work on illiquid asset markets, including appraisal smoothing, tests of market efficiency, and the valuation of options on real assets. [source]
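The filtering logic described here resembles standard state-space estimation; the following is a minimal Kalman-filter sketch under an assumed random-walk-with-drift true value and noisy appraisals (a stand-in for the idea, not the paper's exact model):

```python
import numpy as np

rng = np.random.default_rng(5)
T, drift = 200, 0.01
q, r = 0.02 ** 2, 0.10 ** 2          # state noise vs. observation noise

true = np.cumsum(drift + rng.normal(0.0, np.sqrt(q), T))   # log true value
obs = true + rng.normal(0.0, np.sqrt(r), T)                # noisy appraisals

m, p = obs[0], r                     # filtered mean and variance
filtered = []
for y in obs[1:]:
    m_pred, p_pred = m + drift, p + q          # deterministic forward value
    k = p_pred / (p_pred + r)                  # gain on the new observation
    m = m_pred + k * (y - m_pred)              # revised value estimate
    p = (1.0 - k) * p_pred                     # residual variance (precision)
    filtered.append(m)

rmse_filtered = float(np.sqrt(np.mean((np.array(filtered) - true[1:]) ** 2)))
rmse_raw = float(np.sqrt(np.mean((obs[1:] - true[1:]) ** 2)))
```

The filtered variance p plays the role of the residual variance that measures the precision of the value estimate, and each new observation revises the estimate through the gain k.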

Evaluation of a large-eddy model simulation of a mixed-phase altocumulus cloud using microwave radiometer, lidar and Doppler radar data

J. H. Marsham
Abstract Using the Met Office large-eddy model (LEM) we simulate a mixed-phase altocumulus cloud that was observed from Chilbolton in southern England by a 94 GHz Doppler radar, a 905 nm lidar, a dual-wavelength microwave radiometer and also by four radiosondes. It is important to test and evaluate such simulations with observations, since there are significant differences between results from different cloud-resolving models for ice clouds. Simulating the Doppler radar and lidar data within the LEM allows us to compare observed and modelled quantities directly, and allows us to explore the relationships between observed and unobserved variables. For general-circulation models, which currently tend to give poor representations of mixed-phase clouds, the case shows the importance of using: (i) separate prognostic ice and liquid water, (ii) a vertical resolution that captures the thin layers of liquid water, and (iii) an accurate representation of the subgrid vertical velocities that allow liquid water to form. It is shown that large-scale ascents and descents are significant for this case, and so the horizontally averaged LEM profiles are relaxed towards observed profiles to account for these. The LEM simulation then gives a reasonable cloud, with an ice-water path approximately two-thirds of that observed, with liquid water at the cloud top, as observed. However, the liquid-water cells that form in the updraughts at cloud top in the LEM have liquid-water paths (LWPs) up to half those observed, and there are too few cells, giving a mean LWP five to ten times smaller than observed. In reality, ice nucleation and fallout may deplete ice-nuclei concentrations at the cloud top, allowing more liquid water to form there, but this process is not represented in the model. Decreasing the heterogeneous nucleation rate in the LEM increased the LWP, which supports this hypothesis. 
The LEM captures the increase in the standard deviation in Doppler velocities (and so vertical winds) with height, but values are 1.5 to 4 times smaller than observed (although values are larger in an unforced model run, this only increases the modelled LWP by a factor of approximately two). The LEM data show that, for values larger than approximately 12 cm s⁻¹, the standard deviation in Doppler velocities provides an almost unbiased estimate of the standard deviation in vertical winds, but provides an overestimate for smaller values. Time-smoothing the observed Doppler velocities and modelled mass-squared-weighted fallspeeds shows that observed fallspeeds are approximately two-thirds of the modelled values. Decreasing the modelled fallspeeds to those observed increases the modelled IWC, giving an IWP 1.6 times that observed. Copyright © 2006 Royal Meteorological Society [source]

On the Role of Baseline Measurements for Crossover Designs under the Self and Mixed Carryover Effects Model

BIOMETRICS, Issue 1 2010
Yuanyuan Liang
Summary It is well known that optimal designs are strongly model dependent. In this article, we apply the Lagrange multiplier approach to the optimal design problem, using a recently proposed model for carryover effects. Generally, crossover designs are not recommended when carryover effects are present and when the primary goal is to obtain an unbiased estimate of the treatment effect. In some cases, baseline measurements are believed to improve design efficiency. This article examines the impact of baselines on optimal designs using two different assumptions about carryover effects during baseline periods and employing a nontraditional crossover design model. As anticipated, baseline observations improve design efficiency considerably for two-period designs, which use the data in the first period only to obtain unbiased estimates of treatment effects, while the improvement is rather modest for three- or four-period designs. Further, we find little additional benefit for measuring baselines at each treatment period as compared to measuring baselines only in the first period. Although our study of baselines did not change the results on optimal designs that are reported in the literature, the problem of strong model dependency is generally recognized. The advantage of using multiperiod designs is rather evident, as we found that extending two-period designs to three- or four-period designs significantly reduced variability in estimating the direct treatment effect contrast. [source]

Efficient sampling and data reduction techniques for probabilistic seismic lifeline risk assessment

Nirmal Jayaram
Abstract Probabilistic seismic risk assessment for spatially distributed lifelines is less straightforward than for individual structures. While procedures such as the 'PEER framework' have been developed for risk assessment of individual structures, these are not easily applicable to distributed lifeline systems, due to difficulties in describing ground-motion intensity (e.g. spectral acceleration) over a region (in contrast to ground-motion intensity at a single site, which is easily quantified using Probabilistic Seismic Hazard Analysis), and since the link between the ground-motion intensities and lifeline performance is usually not available in closed form. As a result, Monte Carlo simulation (MCS) and its variants are well suited for characterizing ground motions and computing resulting losses to lifelines. This paper proposes a simulation-based framework for developing a small but stochastically representative catalog of earthquake ground-motion intensity maps that can be used for lifeline risk assessment. In this framework, Importance Sampling is used to preferentially sample 'important' ground-motion intensity maps, and K-Means Clustering is used to identify and combine redundant maps in order to obtain a small catalog. The effects of sampling and clustering are accounted for through a weighting on each remaining map, so that the resulting catalog is still a probabilistically correct representation. The feasibility of the proposed simulation framework is illustrated by using it to assess the seismic risk of a simplified model of the San Francisco Bay Area transportation network. A catalog of just 150 intensity maps is generated to represent hazard at 1038 sites from 10 regional fault segments causing earthquakes with magnitudes between five and eight. The risk estimates obtained using these maps are consistent with those obtained using conventional MCS utilizing many orders of magnitude more ground-motion intensity maps. 
Therefore, the proposed technique can be used to drastically reduce the computational expense of a simulation-based risk assessment, without compromising the accuracy of the risk estimates. This will facilitate computationally intensive risk analysis of systems such as transportation networks. Finally, the study shows that the uncertainties in the ground-motion intensities and the spatial correlations between ground-motion intensities at various sites must be modeled in order to obtain unbiased estimates of lifeline risk. Copyright © 2010 John Wiley & Sons, Ltd. [source]
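The reduce-and-reweight step can be sketched with a toy catalog and a numpy-only k-means (hypothetical intensity values; the paper's framework additionally uses Importance Sampling and a network loss model):

```python
import numpy as np

rng = np.random.default_rng(7)
n_maps, n_sites, k = 2000, 20, 25

source = rng.integers(0, 3, n_maps)                    # three toy fault sources
level = np.array([0.1, 0.3, 0.6])[source][:, None]     # source-dependent level
maps = level * np.exp(rng.normal(0.0, 0.5, (n_maps, n_sites)))

def kmeans(x, k, iters=50):
    """Plain Lloyd's algorithm; each center ends up as its cluster's mean."""
    centers = x[rng.choice(len(x), k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=np.intp)
    for _ in range(iters):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = x[labels == j].mean(0)
    return centers, labels

centers, labels = kmeans(maps, k)
weights = np.bincount(labels, minlength=k) / n_maps    # weight per retained map

site_loss = lambda m: m.sum(-1)                        # toy linear loss model
full_risk = float(site_loss(maps).mean())              # all 2000 maps
reduced_risk = float((weights * site_loss(centers)).sum())  # 25 weighted maps
```

For this linear loss functional the 25 weighted maps reproduce the full-catalog mean exactly, because each representative map is its cluster's mean; for nonlinear network losses the match is only approximate, which is why clustering maps that produce similar losses matters.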

Tillage affects the activity-density, absolute density, and feeding damage of the pea leaf weevil in spring pea

Timothy D. Hatten
Abstract Conversion from conventional-tillage (CT) to no-tillage (NT) agriculture can affect pests and beneficial organisms in various ways. NT has been shown to reduce the relative abundance and feeding damage of pea leaf weevil (PLW), Sitona lineatus L. (Coleoptera: Curculionidae) in spring pea, especially during the early-season colonization period in the Palouse region of northwest Idaho. Pitfall traps were used to quantify tillage effects on activity-density of PLW in field experiments conducted during 2001 and 2002. As the capture rate of pitfall traps for PLW might be influenced by tillage treatment, two mark-recapture studies were employed to compare trapping rates in NT and CT spring pea during 2003. Also in 2003, direct sampling was used to estimate PLW densities during the colonization period, and to assess PLW feeding damage on pea. PLW activity-density was significantly lower in NT relative to CT during the early colonization period (May) of 2001 and 2002, and during the late colonization period (June) of 2002. Activity-density was not different between treatments during the early emergence (July) or late emergence (August) periods in either year of the study. Trap capture rates did not differ between tillage systems in the mark-recapture studies, suggesting that pitfall trapping provided unbiased estimates of PLW relative abundances. PLW absolute densities and feeding damage were significantly lower in NT than in CT. These results indicate that NT provides a pest suppression benefit in spring pea. [source]

Using Expectations to Test Asset Pricing Models

Alon Brav
Asset pricing models generate predictions relating assets' expected rates of return and their risk attributes. Most tests of these models have employed realized rates of return as a proxy for expected return. We use analysts' expected rates of return to examine the relation between these expectations and firm attributes. By assuming that analysts' expectations are unbiased estimates of market-wide expected rates of return, we can circumvent the use of realized rates of return and provide evidence on the predictions emanating from traditional asset pricing models. We find a positive, robust relation between expected return and market beta and a negative relation between expected return and firm size, consistent with the notion that these are risk factors. We do not find that high book-to-market firms are expected to earn higher returns than low book-to-market firms, inconsistent with the notion that book-to-market is a risk factor. [source]

PEL: an unbiased method for estimating age-dependent genetic disease risk from pedigree data unselected for family history

F. Alarcon
Abstract Providing valid risk estimates of a genetic disease with variable age of onset is a major challenge for prevention strategies. When data are obtained from pedigrees ascertained through affected individuals, an adjustment for ascertainment bias is necessary. This article focuses on ascertainment through at least one affected individual and presents an estimation method based on maximum likelihood, called the Proband's phenotype exclusion likelihood, or PEL, for estimating age-dependent penetrance using disease status and genotypic information of family members in pedigrees unselected for family history. We studied the properties of the PEL and compared it with another method, the prospective likelihood, in terms of bias and efficiency of risk estimates. For that purpose, family samples were simulated under various disease risk models and under various ascertainment patterns. We showed that, whatever the genetic model and the ascertainment scheme, the PEL provided unbiased estimates, whereas the prospective likelihood exhibited some bias in a number of situations. As an illustration, we estimated the disease risk for transthyretin amyloid neuropathy from a French sample and a Portuguese sample and for BRCA1/2 associated breast cancer from a sample ascertained on early-onset breast cancer cases. Genet. Epidemiol. 33:379–385, 2009. © 2008 Wiley-Liss, Inc. [source]

Uncertainties in interpretation of isotope signals for estimation of fine root longevity: theoretical considerations

Yiqi Luo
Article first published online: 25 JUN 200
Abstract This paper examines uncertainties in the interpretation of isotope signals when estimating fine root longevity, particularly in forests. The isotope signals are depleted δ13C values from elevated CO2 experiments and enriched δ14C values from bomb 14C in atmospheric CO2. For the CO2 experiments, I explored the effects of six root mortality patterns (on–off, proportional, constant, normal, left skew, and right skew distributions), five levels of nonstructural carbohydrate (NSC) reserves, and increased root growth on root δ13C values after CO2 fumigation. My analysis indicates that fitting a linear equation to δ13C data provides unbiased estimates of longevity only if root mortality follows an on–off model, without dilution of isotope signals by pretreatment NSC reserves, and under a steady state between growth and death. If root mortality follows the other patterns, the linear extrapolation considerably overestimates root longevity. In contrast, fitting an exponential equation to δ13C data underestimates longevity with all the mortality patterns except the proportional one. With either linear or exponential extrapolation, dilution of isotope signals by pretreatment NSC reserves could result in overestimation of root longevity by several-fold. Root longevity is underestimated if elevated CO2 stimulates fine root growth. For the bomb 14C approach, I examined the effects of four mortality patterns (on–off, proportional, constant, and normal distribution) on root δ14C values. For a given δ14C value, the proportional pattern usually provides a shorter estimate of root longevity than the other patterns. Overall, we have to improve our understanding of root growth and mortality patterns and to measure NSC reserves in order to reduce uncertainties in estimated fine root longevity from isotope data. [source]
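The linear-versus-exponential extrapolation issue can be reproduced with a two-pool sketch (idealized decay curves, not data from any experiment): under proportional mortality the fraction of pre-treatment carbon decays exponentially, and extrapolating a fitted straight line to its x-intercept roughly doubles the apparent longevity.

```python
import numpy as np

t = np.linspace(0.0, 2.0, 50)       # years since the isotope label switched
tau = 1.0                           # true mean root longevity (years)

# Proportional mortality: the old-carbon fraction decays exponentially
frac_old = np.exp(-t / tau)

# Linear extrapolation: longevity taken as the fitted line's x-intercept
slope, intercept = np.polyfit(t, frac_old, 1)
longevity_linear = -intercept / slope          # about 2x the true value

# Exponential fit recovers tau exactly under proportional mortality
s, _ = np.polyfit(t, np.log(frac_old), 1)
longevity_exponential = -1.0 / s               # equals tau
```

Run the other way (fitting an exponential to the straight-line decay of an on–off cohort), the mismatch underestimates longevity, matching the asymmetry described in the abstract.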

A discrete random effects probit model with application to the demand for preventive care

Partha Deb
Abstract I have developed a random effects probit model in which the distribution of the random intercept is approximated by a discrete density. Monte Carlo results show that only three to four points of support are required for the discrete density to closely mimic normal and chi-squared densities and provide unbiased estimates of the structural parameters and the variance of the random intercept. The empirical application shows that both observed family characteristics and unobserved family-level heterogeneity are important determinants of the demand for preventive care. Copyright © 2001 John Wiley & Sons, Ltd. [source]
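Why three or four support points suffice can be seen from the 3-point Gauss–Hermite rule (a fixed illustration; the model in the paper estimates its mass points and weights freely rather than fixing them): it reproduces the standard normal's moments up to order five, and it turns the subject-level marginal likelihood integral into a short weighted sum.

```python
import numpy as np
from math import erf, sqrt

# 3-point Gauss-Hermite rule for a standard normal random intercept
points = np.array([-np.sqrt(3.0), 0.0, np.sqrt(3.0)])
weights = np.array([1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0])

# Matches the N(0,1) moments of order 0..5, which are 1, 0, 1, 0, 3, 0
moments = [float((weights * points ** r).sum()) for r in range(6)]

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def marginal_likelihood(y, xb, sigma):
    """Likelihood of one subject's binary outcomes y (0/1) given linear
    predictors xb and random-intercept s.d. sigma: the integral over the
    intercept is replaced by a weighted sum over the support points."""
    total = 0.0
    for w, q in zip(weights, points):
        prob = 1.0
        for yi, xbi in zip(y, xb):
            p1 = Phi(xbi + sigma * q)
            prob *= p1 if yi else 1.0 - p1
        total += w * prob
    return total
```

With sigma = 0 the sum collapses to an ordinary probit likelihood; with sigma > 0 it induces the positive within-family correlation that the model is designed to capture.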

The Relationship between Hospital Volume and Mortality in Mechanical Ventilation: An Instrumental Variable Analysis

Jeremy M. Kahn
Objective. To examine the relationship between hospital volume and mortality for nonsurgical patients receiving mechanical ventilation. Data Sources. Pennsylvania state discharge records from July 1, 2004, to June 30, 2006, linked to the Pennsylvania Department of Health death records and the 2000 United States Census. Study Design. We categorized all general acute care hospitals in Pennsylvania (n=169) by the annual number of nonsurgical, mechanically ventilated discharges according to previous criteria. To estimate the relationship between annual volume and 30-day mortality, we fit linear probability models using administrative risk adjustment, clinical risk adjustment, and an instrumental variable approach. Principal Findings. Using a clinical measure of risk adjustment, we observed a significant reduction in the probability of 30-day mortality at higher volume hospitals (≥300 admissions per year) compared with lower volume hospitals (<300 patients per year; absolute risk reduction: 3.4%, p=.04). No significant volume–outcome relationship was observed using only administrative risk adjustment. Using the distance from the patient's home to the nearest higher volume hospital as an instrument, the volume–outcome relationship was greater than observed using clinical risk adjustment (absolute risk reduction: 7.0%, p=.01). Conclusions. Care in higher volume hospitals is independently associated with a reduction in mortality for patients receiving mechanical ventilation. Adequate risk adjustment is essential in order to obtain unbiased estimates of the volume–outcome relationship. [source]
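The logic of the instrument can be sketched in a toy simulation (all coefficients invented; z stands in for differential distance): unmeasured severity pushes patients toward high-volume hospitals and raises mortality, so the naive regression hides the benefit, while two-stage least squares recovers it.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

u = rng.normal(size=n)                     # unmeasured severity (confounder)
z = rng.normal(size=n)                     # distance instrument: affects choice only
high_vol = (0.8 * z + 0.5 * u + rng.normal(size=n) > 0).astype(float)
death = 0.15 + 0.10 * u - 0.05 * high_vol + rng.normal(0.0, 0.05, n)

# Naive regression of mortality on volume: confounded by u
X = np.column_stack([np.ones(n), high_vol])
naive = np.linalg.lstsq(X, death, rcond=None)[0][1]

# 2SLS: first stage predicts hospital choice from the instrument,
# second stage regresses mortality on the prediction
Z = np.column_stack([np.ones(n), z])
vol_hat = Z @ np.linalg.lstsq(Z, high_vol, rcond=None)[0]
X2 = np.column_stack([np.ones(n), vol_hat])
two_sls = np.linalg.lstsq(X2, death, rcond=None)[0][1]   # close to -0.05
```

Here the naive coefficient is pulled toward zero (or past it) by the selection of sicker patients into high-volume hospitals, mirroring the abstract's finding that the instrumented estimate of benefit exceeds the naively adjusted one.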

Evidence for density-dependent survival in adult cormorants from a combined analysis of recoveries and resightings

Morten Frederiksen
Summary 1. The increasing population of cormorants (Phalacrocorax carbo sinensis) in Europe since 1970 has led to conflicts with fishery interests. Control of cormorant populations is a management issue in many countries and a predictive population model is needed. However, reliable estimates of survival are lacking as input for such a model. 2. Capture–recapture estimates of survival of dispersive species like cormorants suffer from an unknown bias due to permanent emigration from the study area. However, a combined analysis of resightings and recovery of dead birds allows unbiased estimates of survival and emigration. 3. We use data on 11 000 cormorants colour-ringed as chicks in the Danish colony Vorsø in 1977–97 to estimate adult survival and colony fidelity. Recent statistical models allowing simultaneous use of recovery and resighting data are employed. We compensate for variation in colour-ring quality, and study the effect of population size and winter severity on survival, as well as of breeding success on fidelity, by including these factors as covariates in statistical models. 4. Annual adult survival fluctuated from year to year (0·74–0·95), with a mean of 0·88. A combination of population size in Europe and winter temperatures explained 52–64% of the year-to-year variation in survival. Differences in survival between the sexes were less than 1%. Cormorants older than c. 12 years experienced lower survival, whereas second-year birds had survival similar to adults. Colony fidelity declined after 1990 from nearly 1 to c. 0·90, implying 10% permanent emigration per year. This change coincided with a decline in food availability. 5. Apparently, survival was more severely affected by winter severity when population size was high. This could be caused by saturation of high-quality wintering habitat, forcing some birds to winter in less good habitat where they would be more vulnerable to cold winters. 
There was thus evidence for density dependence in adult survival, at least in cold winters. 6. The high population growth rate sustained by European Ph. c. sinensis in the 1970s and 1980s can partly be accounted for by unusually high survival of immature and adult birds, probably caused by absence of hunting, low population density and high food availability. [source]

The forests of presettlement New England, USA: spatial and compositional patterns based on town proprietor surveys

Charles V. Cogbill
Abstract Aim: This study uses the combination of presettlement tree surveys and spatial analysis to produce an empirical reconstruction of tree species abundance and vegetation units at different scales in the original landscape. Location: The New England study area extends across eight physiographic sections, from the Appalachian Mountains to the Atlantic Coastal Plain. The data are drawn from 389 original towns in what are now seven states in the north-eastern United States. These towns have early land division records which document the witness trees growing in the town before European settlement (c. seventeenth to eighteenth century AD). Methods: Records of witness trees from presettlement surveys were collated from towns throughout the study area (1.3 × 10⁵ km²). Tree abundance was averaged over town-wide samples of multiple forest types, integrating proportions of taxa at a local scale (10² km²). These data were summarized into genus groups over the sample towns, which were then mapped [geographical information system (GIS)], classified (Cluster Analysis) and ordinated [detrended correspondence analysis (DCA)]. Modern climatic and topographic variables were also derived from GIS analyses for each town and all town attributes were quantitatively compared. Distributions of both individual species and vegetation units were analysed and displayed for spatial analysis of vegetation structure. Results: The tally of 153,932 individual tree citations shows a dominant latitudinal trend in the vegetation. Spatial patterns are concisely displayed as pie charts of genus composition arrayed on sampled towns. Detailed interpolated frequency surfaces show spatial patterns of range and abundance of the dominant taxa. Oak, spruce, hickory and chestnut reach distinctive range limits within the study area. Eight vegetation clusters are distinguished. 
The northern vegetation is a continuous geographical sequence typified by beech while the southern vegetation is an amorphous group typified by oak. Main conclusions: The wealth of information recorded in the New England town presettlement surveys is an ideal database to elucidate the natural patterns of vegetation over an extensive spatial area. The timing, town-wide scale, expansive coverage, quantitative enumeration and unbiased estimates are critical advantages of proprietor lotting surveys in determining original tree distributions. This historical–geographical approach produces a vivid reconstruction of the natural vegetation and species distributions as portrayed on maps. The spatial, vegetational and environmental patterns all demonstrate a distinct 'tension zone' separating 'northern hardwood' and 'central hardwood' towns. The presettlement northern hardwood forests, absolutely dominated by beech, form a continuum responding to a complex climatic gradient of altitude and latitude. The oak forests to the south are distinguished by non-zonal units, probably affected by fire. Although at the continental scale, the forests seem to be a broad transition, at a finer scale they respond to topography such as the major valleys or the northern mountains. This study resets some preconceptions about the original forest, such as the overestimation of the role of pine, hemlock and chestnut and the underestimation of the distinctiveness of the tension zone. Most importantly, the forests of the past and their empirical description provide a basis for many ecological, educational and management applications today. [source]

Modeling Randomness in Judging Rating Scales with a Random-Effects Rating Scale Model

Wen-Chung Wang
This study presents the random-effects rating scale model (RE-RSM), which takes into account randomness in the thresholds over persons by treating them as random effects and adding a random variable for each threshold in the rating scale model (RSM) (Andrich, 1978). The RE-RSM turns out to be a special case of the multidimensional random coefficients multinomial logit model (MRCMLM) (Adams, Wilson, & Wang, 1997), so that the estimation procedures for the MRCMLM can be applied directly. The results of the simulation indicated that when the data were generated from the RSM, using the RSM and the RE-RSM to fit the data made little difference: both resulted in accurate parameter recovery. When the data were generated from the RE-RSM, using the RE-RSM to fit the data resulted in unbiased estimates, whereas using the RSM resulted in biased estimates, large fit statistics for the thresholds, and inflated test reliability. An empirical example of 10 items with four-point rating scales is presented, in which four models were compared: the RSM, the RE-RSM, the partial credit model (Masters, 1982), and the constrained random-effects partial credit model. In this real data set, the need for a random-effects formulation becomes clear. [source]
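The threshold-randomness idea can be sketched as a small simulation. This illustrates the model structure only, not the authors' estimation code; item parameters and the threshold-noise standard deviation are invented.

```python
# Minimal sketch: simulating one response under a rating scale model (RSM)
# in which the thresholds are perturbed by a person-level normal random
# effect, as in the RE-RSM idea described above.
import numpy as np

rng = np.random.default_rng(1)

def rsm_probs(theta, delta, taus):
    """Adjacent-category (RSM) probabilities for categories 0..K."""
    # P(X = k) is proportional to exp(sum_{j<=k} (theta - delta - tau_j))
    steps = theta - delta - np.asarray(taus)
    logits = np.concatenate([[0.0], np.cumsum(steps)])
    p = np.exp(logits - logits.max())
    return p / p.sum()

taus = np.array([-1.0, 0.0, 1.0])      # item thresholds (K = 3 steps)
sigma_tau = 0.5                        # SD of person-level threshold noise

def simulate_response(theta, delta):
    noisy = taus + rng.normal(0.0, sigma_tau, size=taus.size)
    p = rsm_probs(theta, delta, noisy)
    return rng.choice(len(p), p=p)

x = simulate_response(theta=0.3, delta=-0.2)
print(x)   # a category in {0, 1, 2, 3}
```

Setting `sigma_tau = 0` recovers the ordinary RSM, which is why fitting the RSM to RSM-generated data loses nothing, while ignoring a nonzero `sigma_tau` biases the threshold estimates.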

Testing assumptions of mark-recapture theory in the coral reef fish Lutjanus apodus

C. L. Wormald
This study tested assumptions of the Cormack-Jolly-Seber capture-mark-recapture (CMR) model in a population of the tropical snapper Lutjanus apodus in the central Bahamas using a combination of laboratory and field studies. The suitability of three different tag types [passive integrated transponder (PIT) tag, T-anchor tag and fluorescent dye jet-injected into the fins] was assessed. PIT tags were retained well, whereas T-anchor tags and jet-injected dye were not. PIT tags had no detectable effect on the rates of growth or survival of individuals. The capture method (fish trapping) was found to provide a representative sample of the population; however, a positive trap response was identified and therefore the assumption of equal capture probability was violated. This study illustrates an approach that can be used to test some of the critical assumptions of CMR theory and it demonstrates that CMR methods can provide unbiased estimates of growth and mortality of L. apodus provided that trap response is explicitly modelled when estimating survival probability. [source]

Joint generalized estimating equations for multivariate longitudinal binary outcomes with missing data: an application to acquired immune deficiency syndrome data

Stuart R. Lipsitz
Summary. In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias. [source]

Estimation of pairwise relatedness between individuals and characterization of isolation-by-distance processes using dominant genetic markers

Olivier J. Hardy
Abstract A new estimator of the pairwise relatedness coefficient between individuals adapted to dominant genetic markers is developed. This estimator does not assume genotypes to be in Hardy-Weinberg proportions but requires knowledge of the departure from these proportions (i.e. the inbreeding coefficient). Simulations show that the estimator provides accurate estimates, except for some particular types of individual pairs such as full-sibs, and performs better than a previously developed estimator. When comparing marker-based relatedness estimates with pedigree expectations, a new approach to account for the change of the reference population is developed and shown to perform satisfactorily. Simulations also illustrate that this new relatedness estimator can be used to characterize isolation by distance within populations, leading to essentially unbiased estimates of the neighbourhood size. In this context, the estimator appears fairly robust to moderate errors made on the assumed inbreeding coefficient. The analysis of real data sets suggests that dominant markers (random amplified polymorphic DNA, amplified fragment length polymorphism) may be as valuable as co-dominant markers (microsatellites) in studying microgeographic isolation-by-distance processes. It is argued that the estimators developed should find major applications, notably for conservation biology. [source]
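Hardy's estimator itself is not reproduced here. As a much simpler illustration of working with dominant presence/absence data, the following computes a band-sharing (Jaccard) similarity between two individuals, the kind of raw quantity that relatedness estimators for RAPD/AFLP data build on; the band vectors are invented.

```python
# Minimal sketch (not Hardy's estimator): band-sharing similarity
# between two individuals scored at dominant presence/absence markers
# such as RAPD or AFLP bands.
import numpy as np

def band_sharing(a, b):
    """Jaccard similarity: shared bands / bands present in either."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    shared = np.logical_and(a, b).sum()
    either = np.logical_or(a, b).sum()
    return shared / either if either else 0.0

ind1 = [1, 0, 1, 1, 0, 1]
ind2 = [1, 0, 0, 1, 0, 1]
print(band_sharing(ind1, ind2))  # 0.75
```

The estimator in the article goes further by converting such marker data into relatedness coefficients while correcting for allele frequencies and the inbreeding coefficient.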

Movement patterns and study area boundaries: influences on survival estimation in capture-mark-recapture studies

OIKOS, Issue 8 2008
Gregg E. Horton
The inability to account for the availability of individuals in the study area during capture-mark-recapture (CMR) studies and the resultant confounding of parameter estimates can make correct interpretation of CMR model parameter estimates difficult. Although important advances based on the Cormack-Jolly-Seber (CJS) model have resulted in estimators of true survival that work by unconfounding either death or recapture probability from availability for capture in the study area, these methods rely on the researcher's ability to select a method that is correctly matched to emigration patterns in the population. If incorrect assumptions regarding site fidelity (non-movement) are made, it may be difficult or impossible as well as costly to change the study design once the incorrect assumption is discovered. Subtleties in characteristics of movement (e.g. life history-dependent emigration, nomads vs territory holders) can lead to mixtures in the probability of being available for capture among members of the same population. The result of these mixtures may be only a partial unconfounding of emigration from other CMR model parameters. Biologically based differences in individual movement can combine with constraints on study design to further complicate the problem. Because of the intricacies of movement and its interaction with other parameters in CMR models, quantification of and solutions to these problems are needed. Based on our work with stream-dwelling populations of Atlantic salmon Salmo salar, we used a simulation approach to evaluate existing CMR models under various mixtures of movement probabilities. The Barker joint data model provided unbiased estimates of true survival under all conditions tested. The CJS and robust design models provided similarly unbiased estimates of true survival but only when emigration information could be incorporated directly into individual encounter histories.
For the robust design model, Markovian emigration (future availability for capture depends on an individual's current location) was a difficult emigration pattern to detect unless survival and especially recapture probability were high. Additionally, when local movement was high relative to study area boundaries and movement became more diffuse (e.g. a random walk), local movement and permanent emigration were difficult to distinguish and had consequences for correctly interpreting the survival parameter being estimated (apparent survival vs true survival). [source]
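A CJS-style data-generating process, the kind of simulation such evaluations start from, can be sketched as follows. The parameter values are illustrative, not those of the Atlantic salmon study, and no emigration mixture is included.

```python
# Minimal sketch: simulating capture histories under a CJS-type model
# with apparent survival phi and recapture probability p.
import numpy as np

rng = np.random.default_rng(3)
phi, p = 0.8, 0.6        # apparent survival, recapture probability
n_fish, n_occasions = 200, 5

histories = np.zeros((n_fish, n_occasions), dtype=int)
histories[:, 0] = 1                      # all individuals marked at occasion 1
alive = np.ones(n_fish, dtype=bool)
for t in range(1, n_occasions):
    alive &= rng.random(n_fish) < phi    # survive AND remain available
    seen = alive & (rng.random(n_fish) < p)
    histories[seen, t] = 1

print(histories[:3])
```

Because `phi` here bundles death with permanent emigration, this is apparent survival; the point of the article is precisely how movement patterns and study-area boundaries complicate separating the two.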

A Method of Automated Nonparametric Content Analysis for Social Science

Daniel J. Hopkins
The increasing availability of digitized text presents enormous opportunities for social scientists. Yet hand coding many blogs, speeches, government records, newspapers, or other sources of unstructured text is infeasible. Although computer scientists have methods for automated content analysis, most are optimized to classify individual documents, whereas social scientists instead want generalizations about the population of documents, such as the proportion in a given category. Unfortunately, even a method that correctly classifies a high percentage of individual documents can be hugely biased when estimating category proportions. By directly optimizing for this social science goal, we develop a method that gives approximately unbiased estimates of category proportions even when the optimal classifier performs poorly. We illustrate with diverse data sets, including the daily expressed opinions of thousands of people about the U.S. presidency. We also make available software that implements our methods and large corpora of text for further analysis. [source]
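The core intuition, that a classifier's aggregate bias can be corrected if its misclassification rates are known, can be sketched with a simple confusion-matrix inversion. This is a toy illustration of why raw classified proportions are biased, not the authors' actual estimator.

```python
# Minimal sketch: a mediocre two-category classifier badly distorts the
# category proportions, but knowing its misclassification matrix lets
# us invert the distortion exactly.
import numpy as np

# M[i, j] = P(classified as i | true category j); columns sum to 1.
M = np.array([[0.7, 0.2],
              [0.3, 0.8]])
true_props = np.array([0.9, 0.1])        # population quantity of interest
observed = M @ true_props                # what raw classification reports
corrected = np.linalg.solve(M, observed) # recovers the true proportions
print(observed, corrected)
```

Here 70-80% individual accuracy yields observed proportions of (0.65, 0.35) for a true (0.9, 0.1), a large aggregate bias that the inversion removes; in practice the misclassification rates must themselves be estimated from a hand-coded sample.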

Evaluating capture,recapture population and density estimation of tigers in a population with known parameters

R. K. Sharma
Abstract Conservation strategies for endangered species require accurate and precise estimates of abundance. Unfortunately, obtaining unbiased estimates can be difficult due to inappropriate estimator models and study design. We evaluate population-density estimators for tigers Panthera tigris in Kanha Tiger Reserve, India, using camera traps in conjunction with telemetry (n=6) in a known minimum population of 14 tigers. An effort of 462 trap nights over 42 days yielded 44 photographs of 12 adult tigers. Using closed population estimators, the best-fit model (program CAPTURE) accounted for individual heterogeneity (Mh). The least biased and most precise population estimate (N̂ (SE)) was obtained by the Mh Jackknife 1 (JK1) [14 (1.89)] in program CARE-2. Tiger density (D̂ (SE)) per 100 km² was estimated at 13 (2.08) when the effective trapping area was estimated using the half mean maximum distance moved (1/2 MMDM); 8.1 (2.08) using the home-range radius; 7.8 (1.59) with the full MMDM; and 8.0 (3.0) with the spatial likelihood method in program DENSITY 4.1. The actual density of collared tigers (3.27 per 100 km²) was closely estimated by home-range radius at 3.9 (0.76), full MMDM at 3.48 (0.81) and spatial likelihood at 3.78 (1.54), but overestimated by 1/2 MMDM at 6 (0.81) tigers per 100 km². Sampling costs (Rs. 450 per camera day) increased linearly with camera density, while the precision of population estimates leveled off at 25 cameras per 100 km². At simulated low tiger densities, a camera density of 50 per 100 km² with an effort of 8 trap nights km⁻² provided 95% confidence coverage, but estimates lacked precision. [source]
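The half-MMDM convention used above can be sketched as follows: buffer the trapping grid by half the mean maximum distance moved and divide the abundance estimate by the resulting effective area. All numbers below are illustrative, not from the Kanha study.

```python
# Minimal sketch: density from abundance via a half-MMDM buffer around
# a square trapping grid (buffered square with quarter-circle corners).
import math

n_hat = 14.0            # estimated abundance
grid_side = 10.0        # 10 x 10 km trapping grid
mmdm = 4.0              # mean maximum distance moved (km)

buffer = mmdm / 2.0
# Effective area = grid + four edge strips + four quarter-circle corners:
# side^2 + 4*side*buffer + pi*buffer^2
a_eff = grid_side**2 + 4 * grid_side * buffer + math.pi * buffer**2
density_per_100km2 = 100.0 * n_hat / a_eff
print(round(density_per_100km2, 2))
```

Using the full MMDM instead doubles `buffer`, enlarging the effective area and lowering the density estimate, which is the direction of the differences reported above.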

A Bayesian Spatial Multimarker Genetic Random-Effect Model for Fine-Scale Mapping

M.-Y. Tsai
Summary Multiple markers in linkage disequilibrium (LD) are usually used to localize the disease gene location. These markers may contribute to the disease etiology simultaneously. In contrast to single-locus tests, we propose a genetic random effects model that accounts for the dependence between loci via their spatial structures. In this model, the locus-specific random effects measure not only the genetic disease risk, but also the correlations between markers. In other words, the model incorporates this relation in both mean and covariance structures, and the variance components play important roles. We consider two different settings for the spatial relations. The first is our proposal, the relative distance function (RDF), which is intuitive in the sense that markers nearby are likely to correlate with each other. The second setting is a common exponential decay function (EDF). Under each setting, the inference of the genetic parameters is fully Bayesian with Markov chain Monte Carlo (MCMC) sampling. We demonstrate the validity and the utility of the proposed approach with two real datasets and simulation studies. The analyses show that the proposed model with either one of the two spatial correlations performs better than the single-locus analysis. In addition, under the RDF model, a more precise estimate for the disease locus can be obtained even when the candidate markers are fairly dense. In all simulations, the inference under the true model provides unbiased estimates of the genetic parameters, and the model with the spatial correlation structure leads to greater confidence-interval coverage probabilities. [source]
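An exponential decay function (EDF) correlation structure of the kind described can be sketched directly; marker positions and the decay scale below are invented for illustration.

```python
# Minimal sketch: correlation between marker random effects decays
# exponentially with the physical distance between loci.
import numpy as np

positions = np.array([0.0, 5.0, 20.0, 100.0])   # marker positions (e.g. kb)
theta = 25.0                                    # decay scale

d = np.abs(positions[:, None] - positions[None, :])  # pairwise distances
R = np.exp(-d / theta)          # EDF correlation matrix between loci
print(np.round(R, 3))
```

Nearby markers (5 kb apart) end up strongly correlated while distant ones are nearly independent, which is the structural assumption the random-effect covariance encodes.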

On the Role of Baseline Measurements for Crossover Designs under the Self and Mixed Carryover Effects Model

BIOMETRICS, Issue 1 2010
Yuanyuan Liang
Summary It is well known that optimal designs are strongly model dependent. In this article, we apply the Lagrange multiplier approach to the optimal design problem, using a recently proposed model for carryover effects. Generally, crossover designs are not recommended when carryover effects are present and when the primary goal is to obtain an unbiased estimate of the treatment effect. In some cases, baseline measurements are believed to improve design efficiency. This article examines the impact of baselines on optimal designs using two different assumptions about carryover effects during baseline periods and employing a nontraditional crossover design model. As anticipated, baseline observations improve design efficiency considerably for two-period designs, which use the data in the first period only to obtain unbiased estimates of treatment effects, while the improvement is rather modest for three- or four-period designs. Further, we find little additional benefit in measuring baselines at each treatment period as compared to measuring baselines only in the first period. Although our study of baselines did not change the results on optimal designs reported in the literature, the problem of strong model dependency is generally recognized. The advantage of using multiperiod designs is rather evident, as we found that extending two-period designs to three- or four-period designs significantly reduced variability in estimating the direct treatment effect contrast. [source]
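The closing point about multiperiod designs can be illustrated with a simple ordinary-least-squares variance calculation. This sketch ignores subject effects, baselines and carryover entirely, so it is only a toy version of the design comparison; the ABBA/BAAB layout for the four-period case is an assumption of the sketch, not taken from the article.

```python
# Minimal sketch: variance factor of the treatment-effect estimate,
# via (X'X)^{-1}, for a two-period AB/BA crossover versus a four-period
# ABBA/BAAB extension (i.i.d. errors, one subject per sequence).
import numpy as np

def var_factor(sequences):
    """Variance multiplier for the treatment coefficient (last column)."""
    rows = []
    n_periods = len(sequences[0])
    for seq in sequences:
        for t, treat in enumerate(seq):
            # intercept, period dummies (period 1 as reference), treatment
            period = [1.0 if j == t else 0.0 for j in range(1, n_periods)]
            rows.append([1.0] + period + [float(treat)])
    X = np.array(rows)
    return np.linalg.inv(X.T @ X)[-1, -1]

two_period = [(0, 1), (1, 0)]               # AB / BA
four_period = [(0, 1, 1, 0), (1, 0, 0, 1)]  # ABBA / BAAB
print(var_factor(two_period), var_factor(four_period))
```

Even in this stripped-down setting the four-period layout halves the variance factor of the treatment contrast, consistent with the direction of the article's finding.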

Estimating Disease Prevalence Using Relatives of Case and Control Probands

BIOMETRICS, Issue 1 2010
Kristin N. Javaras
Summary We introduce a method of estimating disease prevalence from case-control family study data. Case-control family studies are performed to investigate the familial aggregation of disease; families are sampled via either a case or a control proband, and the resulting data contain information on disease status and covariates for the probands and their relatives. Here, we introduce estimators for overall prevalence and for covariate-stratum-specific (e.g., sex-specific) prevalence. These estimators combine the proportion of affected relatives of control probands with the proportion of affected relatives of case probands and are designed to yield approximately unbiased estimates of their population counterparts under certain commonly made assumptions. We also introduce corresponding confidence intervals designed to have good coverage properties even for small prevalences. Next, we describe simulation experiments where our estimators and intervals were applied to case-control family data sampled from fictional populations with various levels of familial aggregation. At all aggregation levels, the resulting estimates varied closely and symmetrically around their population counterparts, and the resulting intervals had good coverage properties, even for small sample sizes. Finally, we discuss the assumptions required for our estimators to be approximately unbiased, highlighting situations where an alternative estimator based only on relatives of control probands may perform better. [source]
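For intuition on interval construction for small proportions, the Wilson score interval is a standard choice with good coverage near zero. This is purely illustrative and is not the authors' interval; the counts below are invented.

```python
# Minimal sketch: Wilson score interval for a small proportion, e.g. a
# low disease prevalence estimated from a sample of relatives.
import math

def wilson_interval(k, n, z=1.96):
    """Approximate 95% CI for a proportion k/n (Wilson score method)."""
    p = k / n
    denom = 1.0 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

lo, hi = wilson_interval(3, 200)   # e.g. 3 affected out of 200 relatives
print(round(lo, 4), round(hi, 4))
```

Unlike the naive Wald interval, the Wilson interval never extends below zero for small counts, which is one reason score-type intervals behave well for rare outcomes.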

Smooth Random Effects Distribution in a Linear Mixed Model

BIOMETRICS, Issue 4 2004
Wendimagegn Ghidey
Summary A linear mixed model with a smooth random effects density is proposed. A similar approach to P -spline smoothing of Eilers and Marx (1996, Statistical Science11, 89,121) is applied to yield a more flexible estimate of the random effects density. Our approach differs from theirs in that the B -spline basis functions are replaced by approximating Gaussian densities. Fitting the model involves maximizing a penalized marginal likelihood. The best penalty parameters minimize Akaike's Information Criterion employing Gray's (1992, Journal of the American Statistical Association87, 942,951) results. Although our method is applicable to any dimensions of the random effects structure, in this article the two-dimensional case is explored. Our methodology is conceptually simple, and it is relatively easy to fit in practice and is applied to the cholesterol data first analyzed by Zhang and Davidian (2001, Biometrics57, 795,802). A simulation study shows that our approach yields almost unbiased estimates of the regression and the smoothing parameters in small sample settings. Consistency of the estimates is shown in a particular case. [source]