Maximum Likelihood Estimates

Selected Abstracts


A statistical model for unwarping of 1-D electrophoresis gels

ELECTROPHORESIS, Issue 22 2005
Chris Glasbey
Abstract A statistical model is proposed which relates density profiles in 1-D electrophoresis gels, such as those produced by pulsed-field gel electrophoresis (PFGE), to databases of profiles of known genotypes. The warp in each gel lane is described by a trend that is linear in its parameters plus a first-order autoregressive process, and density differences are modelled by a mixture of two normal distributions. Maximum likelihood estimates are computed efficiently by a recursive algorithm that alternates between dynamic time warping to align individual lanes and generalised-least-squares regression to ensure that the warp is smooth between lanes. The method, illustrated using PFGE of Escherichia coli O157 strains, automatically unwarps and classifies gel lanes, and facilitates manual identification of new genotypes. [source]
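
The alternating algorithm described above pairs dynamic time warping (to align individual lanes) with generalised-least-squares smoothing of the warp. As a hedged illustration of the alignment half only (not the authors' implementation), the sketch below runs plain dynamic time warping between two synthetic 1-D density profiles; the profiles and the squared-difference local cost are illustrative assumptions.

```python
import numpy as np

def dtw_align(x, y):
    """Minimal dynamic time warping between two 1-D profiles.

    Returns the optimal alignment cost and the warping path as a list
    of (i, j) index pairs; squared difference is the local cost."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    path, i, j = [], n, m          # backtrack from the corner
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]

# Example: align a lane profile against a shifted reference profile
t = np.linspace(0, 1, 50)
lane = np.exp(-((t - 0.4) ** 2) / 0.005)
ref = np.exp(-((t - 0.5) ** 2) / 0.005)
cost, path = dtw_align(lane, ref)
```

In the paper's scheme, the path from such an alignment step would feed the regression step that keeps the warp smooth between lanes.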


The Hazard Rate of Foreign Direct Investment: A Structural Estimation of a Real-option Model

OXFORD BULLETIN OF ECONOMICS & STATISTICS, Issue 5 2006
Enrico Pennings
Abstract The hazard rate of investment is derived within a real-option model, and its properties are analysed so as to directly study the relation between uncertainty and investment. Maximum likelihood estimates of the hazard are calculated using a sample of multinational enterprises (MNEs) that invested in Central and Eastern Europe over the period 1990-98. When a standard, non-parametric specification of the hazard is employed, our measure of uncertainty has a negative effect on investment, but the reduced-form model is unable to control for nonlinearities in the relationship. The structural estimation of the option-based hazard is instead able to account for the nonlinearities and exhibits a significant value of waiting, although the latter is independent of our measure of uncertainty. This finding supports the existence of alternative channels through which uncertainty can affect investment. [source]


Maximum likelihood estimates of admixture in northeastern Mexico using 13 short tandem repeat loci

AMERICAN JOURNAL OF HUMAN BIOLOGY, Issue 4 2002
Ricardo M. Cerda-Flores
Tetrameric short tandem repeat (STR) polymorphisms are widely used in population genetics, molecular evolution, gene mapping and linkage analysis, paternity tests, forensic analysis, and medical applications. This article provides allelic distributions of the STR loci D3S1358, vWA, FGA, D8S1179, D21S11, D18S51, D5S818, D13S317, D7S820, CSF1PO, TPOX, TH01, and D16S539 in 143 Mestizos from Northeastern Mexico; estimates of the contributions of genes of European (Spanish), American Indian, and African origin to the gene pool of this admixed Mestizo population (using 10 of these loci); and a comparison of the genetic admixture of this population with that previously reported for two polymorphic molecular markers, D1S80 and HLA-DQA1 (n = 103). Genotype distributions were in agreement with Hardy-Weinberg expectations (HWE) for almost all 13 STR markers. Maximum likelihood estimates of admixture components yielded a trihybrid model of Spanish, Amerindian, and African ancestry with admixture proportions 54.99% ± 3.44, 39.99% ± 2.57, and 5.02% ± 2.82, respectively. These estimates were not significantly different from those obtained using the D1S80 and HLA-DQA1 loci (59.99% ± 5.94, 36.99% ± 5.04, and 3.02% ± 2.76). In conclusion, Mestizos of Northeastern Mexico showed a similar ancestral contribution regardless of the markers used for evolutionary purposes. Further validation of this database supports the use of the 13 STR loci along with D1S80 and HLA-DQA1 as a battery of efficient DNA forensic markers in Northeastern Mestizo populations of Mexico. Am. J. Hum. Biol. 14:429-439, 2002. © 2002 Wiley-Liss, Inc. [source]
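
Admixture estimates of this kind come from maximizing a likelihood in which the admixed population's allele frequencies are a mixture of the ancestral ones, weighted by the admixture proportions. The sketch below is a simplified, hypothetical version for a few biallelic loci (the STR loci above are multi-allelic), with invented frequencies and counts; it is not the estimator used in the article:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: frequency of one allele per locus in the three
# ancestral populations (rows: loci; columns: Spanish, Amerindian,
# African), plus observed allele counts (successes, trials) in the
# admixed sample.
p_anc = np.array([[0.70, 0.30, 0.10],
                  [0.20, 0.60, 0.40],
                  [0.50, 0.10, 0.80]])
counts = np.array([[160, 286], [95, 286], [150, 286]])

def neg_loglik(m):
    p_mix = p_anc @ m                      # admixed allele frequencies
    p_mix = np.clip(p_mix, 1e-9, 1 - 1e-9)
    k, n = counts[:, 0], counts[:, 1]
    return -np.sum(k * np.log(p_mix) + (n - k) * np.log(1 - p_mix))

# Maximize over the simplex of admixture proportions
res = minimize(neg_loglik, x0=np.ones(3) / 3, method="SLSQP",
               bounds=[(0, 1)] * 3,
               constraints={"type": "eq", "fun": lambda m: m.sum() - 1})
print("admixture proportions:", res.x)
```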


Maximum likelihood estimates for the Hildreth-Houck random coefficients model

THE ECONOMETRICS JOURNAL, Issue 1 2002
Asad Zaman
We explore maximum likelihood (ML) estimation of the Hildreth-Houck random coefficients model. We show that the global ML estimator can be inconsistent. We develop an alternative LML (local ML) estimator and prove that it is consistent and asymptotically efficient for points in the interior of the parameter space. Properties of the LML estimator and comparisons with common method of moments (MM) estimates are examined via Monte Carlo simulation. Boundary parameters lead to nonstandard asymptotic distributions for the LML, which are described. The LML is used to develop a modification of the LR test for random coefficients. Simulations suggest that the LR test is more powerful for distant alternatives than the Breusch-Pagan (BP) Lagrange multiplier test. A simple modification of the BP test also appears to be more powerful than the original BP test. [source]


Estimating the unknown change point in the parameters of the lognormal distribution

ENVIRONMETRICS, Issue 2 2007
V. K. Jandhyala
Abstract We develop change-point methodology for identifying dynamic trends in the parameters of a two-parameter lognormal distribution. The methodology primarily considers the asymptotic distribution of the maximum likelihood estimate of the unknown change point. Among other things, the asymptotic distribution enables one to construct confidence interval estimates for the unknown change point. The methodology is applied to identify changes in the monthly water discharges of the Nacetinsky Creek in the German part of the Erzgebirge Mountains. Copyright © 2006 John Wiley & Sons, Ltd. [source]
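
The maximum likelihood estimate of a single change point is typically computed by profiling the log-likelihood over candidate split points, fitting each segment by its own lognormal MLEs. A sketch on simulated data (the paper's asymptotic confidence-interval machinery is not reproduced here):

```python
import numpy as np

def lognormal_changepoint(x):
    """Profile-likelihood estimate of a single change point in the
    parameters of a lognormal sample (a sketch, not the paper's code)."""
    y = np.log(np.asarray(x))
    n = len(y)
    best_k, best_ll = None, -np.inf
    for k in range(2, n - 1):            # at least 2 obs per segment
        ll = 0.0
        for seg in (y[:k], y[k:]):
            s2 = seg.var()               # MLE of the segment variance
            ll += -0.5 * len(seg) * (np.log(2 * np.pi * s2) + 1)
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, best_ll

rng = np.random.default_rng(1)
x = np.exp(np.r_[rng.normal(0, 1, 60), rng.normal(1.2, 1, 40)])
print(lognormal_changepoint(x))          # change point near index 60
```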


Data Sparseness and On-Line Pretest Item Calibration-Scaling Methods in CAT

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 3 2002
Jae-Chun Ban
The purpose of this study was to compare and evaluate three on-line pretest item calibration-scaling methods (the marginal maximum likelihood estimate with one expectation maximization [EM] cycle [OEM] method, the marginal maximum likelihood estimate with multiple EM cycles [MEM] method, and Stocking's Method B) in terms of item parameter recovery when the item responses to the pretest items in the pool are sparse. Simulations of computerized adaptive tests were used to evaluate the results yielded by the three methods. The MEM method produced the smallest average total error in parameter estimation, and the OEM method yielded the largest total error. [source]


Estimating numbers of infectious units from serial dilution assays

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 1 2006
Nigel Stallard
Summary. The paper concerns the design and analysis of serial dilution assays to estimate the infectivity of a sample of tissue when it is assumed that the sample contains a finite number of indivisible infectious units such that a subsample will be infectious if it contains one or more of these units. The aim of the study is to estimate the number of infectious units in the original sample. The standard approach to the analysis of data from such a study is based on the assumption of independence of aliquots both at the same dilution level and at different dilution levels, so that the numbers of infectious units in the aliquots follow independent Poisson distributions. An alternative approach is based on calculation of the expected value of the total number of samples tested that are not infectious. We derive the likelihood for the data on the basis of the discrete number of infectious units, enabling calculation of the maximum likelihood estimate and likelihood-based confidence intervals. We use the exact probabilities that are obtained to compare the maximum likelihood estimate with those given by the other methods in terms of bias and standard error and to compare the coverage of the confidence intervals. We show that the methods have very similar properties and conclude that for practical use the method that is based on the Poisson assumption is to be recommended, since it can be implemented by using standard statistical software. Finally we consider the design of serial dilution assays, concluding that it is important that neither the dilution factor nor the number of samples that remain untested should be too large. [source]
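
Under the Poisson assumption recommended above for practical use, an aliquot at dilution fraction d is infectious with probability 1 - exp(-mu d), where mu is the expected number of infectious units per undiluted aliquot. A minimal sketch of that maximum likelihood fit, with hypothetical assay counts:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical assay: dilution fractions, aliquots tested and infectious
dilution = np.array([1.0, 0.1, 0.01, 0.001])
n_tested = np.array([6, 6, 6, 6])
n_infect = np.array([6, 5, 2, 0])

def neg_loglik(log_mu):
    """mu = expected infectious units per undiluted aliquot."""
    mu = np.exp(log_mu)
    p = 1.0 - np.exp(-mu * dilution)      # P(aliquot is infectious)
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.sum(n_infect * np.log(p)
                   + (n_tested - n_infect) * np.log(1 - p))

res = minimize_scalar(neg_loglik, bounds=(-5, 10), method="bounded")
print("MLE of infectious units per undiluted aliquot:", np.exp(res.x))
```

The count for the whole sample is then mu scaled by the number of aliquot volumes the original sample contains; the paper's exact discrete likelihood refines this Poisson approximation.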


Semiparametric Estimation Exploiting Covariate Independence in Two-Phase Randomized Trials

BIOMETRICS, Issue 1 2009
James Y. Dai
Summary Recent results for case-control sampling suggest that when the covariate distribution is constrained by gene-environment independence, semiparametric estimation exploiting such independence yields a great deal of efficiency gain. We consider the efficient estimation of the treatment-biomarker interaction in two-phase sampling nested within randomized clinical trials, incorporating the independence between a randomized treatment and the baseline markers. We develop a Newton-Raphson algorithm based on the profile likelihood to compute the semiparametric maximum likelihood estimate (SPMLE). Our algorithm accommodates both continuous phase-one outcomes and continuous phase-two biomarkers. The profile information matrix is computed explicitly via numerical differentiation. In certain situations where computing the SPMLE is slow, we propose a maximum estimated likelihood estimator (MELE), which is also capable of incorporating the covariate independence. This estimated likelihood approach uses a one-step empirical covariate distribution, and thus is straightforward to maximize. It offers a closed-form variance estimate with limited increase in variance relative to the fully efficient SPMLE. Our results suggest that exploiting the covariate independence in two-phase sampling increases the efficiency substantially, particularly for estimating treatment-biomarker interactions. [source]


Survival of Bowhead Whales, Balaena mysticetus, Estimated from 1981-1998 Photoidentification Data

BIOMETRICS, Issue 4 2002
Judith Zeh
Summary. Annual survival probability of bowhead whales, Balaena mysticetus, was estimated using both Bayesian and maximum likelihood implementations of Cormack and Jolly-Seber (JS) models for capture-recapture estimation in open populations and reduced-parameter generalizations of these models. Aerial photographs of naturally marked bowheads collected between 1981 and 1998 provided the data. The marked whales first photographed in a particular year provided the initial 'capture' and 'release' of those marked whales, and photographs in subsequent years the 'recaptures'. The Cormack model, often called the Cormack-Jolly-Seber (CJS) model, and the program MARK were used to identify the model with a single survival and time-varying capture probabilities as the most appropriate for these data. When survival was constrained to be one or less, the maximum likelihood estimate computed by MARK was one, invalidating confidence interval computations based on the asymptotic standard error or profile likelihood. A Bayesian Markov chain Monte Carlo (MCMC) implementation of the model was used to produce a posterior distribution for annual survival. The corresponding reduced-parameter JS model was also fit via MCMC because it is the more appropriate of the two models for these photoidentification data. Because the CJS model ignores much of the information on capture probabilities provided by the data, its results are less precise and more sensitive to the prior distributions used than results from the JS model. With priors for annual survival and capture probabilities uniform from 0 to 1, the posterior mean for bowhead survival rate from the JS model is 0.984, and 95% of the posterior probability lies between 0.948 and 1. This high estimated survival rate is consistent with other bowhead life history data. [source]
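
For readers who want the mechanics of the model identified above (a single survival probability with time-varying capture probabilities), the following is a hedged sketch of the Cormack-Jolly-Seber log-likelihood maximized directly rather than with MARK or MCMC; the capture histories are simulated, not the bowhead data.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def cjs_negloglik(theta, H):
    """CJS negative log-likelihood: constant survival phi, time-varying
    capture probabilities (conditioned on first capture)."""
    n, T = H.shape
    phi = expit(theta[0])
    p = expit(theta[1:])            # capture probs at occasions 2..T
    pcap = np.r_[np.nan, p]         # pcap[t] applies at occasion t (0-based)
    chi = np.ones(T)                # chi[t]: P(never seen after t | alive at t)
    for t in range(T - 2, -1, -1):
        chi[t] = (1 - phi) + phi * (1 - pcap[t + 1]) * chi[t + 1]
    ll = 0.0
    for h in H:
        seen = np.flatnonzero(h)
        f, l = seen[0], seen[-1]
        for t in range(f + 1, l + 1):
            ll += np.log(phi)
            ll += np.log(pcap[t]) if h[t] else np.log(1 - pcap[t])
        ll += np.log(chi[l])        # never seen again after last capture
    return -ll

rng = np.random.default_rng(0)
T, n, phi_true, p_true = 6, 200, 0.95, 0.4
H = np.zeros((n, T), dtype=int)
H[:, 0] = 1                          # all marked and released at occasion 1
alive = np.ones(n, dtype=bool)
for t in range(1, T):
    alive &= rng.random(n) < phi_true
    H[:, t] = alive & (rng.random(n) < p_true)
res = minimize(cjs_negloglik, np.zeros(T), args=(H,), method="BFGS")
print("phi-hat:", expit(res.x[0]))
```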


Estimating the Frequency Distribution of Crossovers during Meiosis from Recombination Data

BIOMETRICS, Issue 2 2001
Kai Yu
Summary. Estimation of tetrad crossover frequency distributions from genetic recombination data is a classic problem dating back to Weinstein (1936, Genetics 21, 155-199). But a number of important issues, such as how to specify the maximum number of crossovers, how to construct confidence intervals for crossover probabilities, and how to obtain correct p-values for hypothesis tests, have never been adequately addressed. In this article, we obtain some properties of the maximum likelihood estimate (MLE) for crossover probabilities that imply guidelines for choosing the maximum number of crossovers. We give these results for both normal meiosis and meiosis with nondisjunction. We also develop an accelerated EM algorithm to find the MLE more efficiently. We propose bootstrap-based methods to find confidence intervals and p-values and conduct simulation studies to check the validity of the bootstrap approach. [source]


A Multiple Imputation Approach to Cox Regression with Interval-Censored Data

BIOMETRICS, Issue 1 2000
Wei Pan
Summary. We propose a general semiparametric method based on multiple imputation for Cox regression with interval-censored data. The method consists of iterating the following two steps. First, from finite-interval-censored (but not right-censored) data, exact failure times are imputed using Tanner and Wei's poor man's or asymptotic normal data augmentation scheme based on the current estimates of the regression coefficient and the baseline survival curve. Second, a standard statistical procedure for right-censored data, such as the Cox partial likelihood method, is applied to imputed data to update the estimates. Through simulation, we demonstrate that the resulting estimate of the regression coefficient and its associated standard error provide a promising alternative to the nonparametric maximum likelihood estimate. Our proposal is easily implemented by taking advantage of existing computer programs for right-censored data. [source]


Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods

ECOLOGY LETTERS, Issue 7 2007
Subhash R. Lele
Abstract We introduce a new statistical computing method, called data cloning, to calculate maximum likelihood estimates and their standard errors for complex ecological models. Although the method uses the Bayesian framework and exploits the computational simplicity of the Markov chain Monte Carlo (MCMC) algorithms, it provides valid frequentist inferences such as the maximum likelihood estimates and their standard errors. The inferences are completely invariant to the choice of the prior distributions and therefore avoid the inherent subjectivity of the Bayesian approach. The data cloning method is easily implemented using standard MCMC software. Data cloning is particularly useful for analysing ecological situations in which hierarchical statistical models, such as state-space models and mixed effects models, are appropriate. We illustrate the method by fitting two nonlinear population dynamics models to data in the presence of process and observation noise. [source]
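
The mechanics of data cloning are easy to verify in a conjugate toy model, where the posterior given K copies of the data is available in closed form and no MCMC is needed: as K grows, the posterior mean converges to the maximum likelihood estimate and K times the posterior variance converges to its asymptotic variance. A sketch under an assumed binomial example with a deliberately informative prior:

```python
import numpy as np
from scipy.stats import beta

# k successes in n trials, Beta(2, 5) prior (informative on purpose)
k, n = 7, 20
a, b = 2.0, 5.0

for K in (1, 10, 100, 1000):
    post = beta(a + K * k, b + K * (n - k))    # posterior given K clones
    mle_approx = post.mean()                   # -> MLE as K grows
    se_approx = np.sqrt(K * post.var())        # -> asymptotic s.e.
    print(f"K={K:5d}  mean={mle_approx:.4f}  se={se_approx:.4f}")

print("exact MLE:", k / n,
      " exact se:", np.sqrt((k / n) * (1 - k / n) / n))
```

With MCMC software the same effect is obtained by entering the data K times into the likelihood, as the abstract describes; the prior's influence visibly washes out as K increases.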


Maximum likelihood estimators of population parameters from doubly left-censored samples

ENVIRONMETRICS, Issue 8 2006
Abou El-Makarim A. Aboueissa
Abstract Left-censored data often arise in environmental contexts with one or more detection limits, DLs. Estimators of the parameters are derived for left-censored data having two detection limits, DL1 and DL2, assuming an underlying normal distribution. Two different approaches for calculating the maximum likelihood estimates (MLE) are given and examined. These methods also apply to lognormally distributed environmental data with two distinct detection limits. The performance of the new estimators is compared using many simulated data sets. Examples illustrate the use of these methods via a computer program given in the Appendix. Copyright © 2006 John Wiley & Sons, Ltd. [source]
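
The likelihood being maximized mixes density contributions from quantified values with normal-CDF contributions from the two kinds of nondetects. A minimal sketch, assuming each observation is subject to one known detection limit; the data, limits and censoring mechanism are simulated stand-ins, not the paper's examples:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def negloglik(theta, detects, n_cens1, n_cens2, DL1, DL2):
    """Normal likelihood with two left-censoring detection limits.
    detects: quantified values; n_cens1/n_cens2: counts of nondetects
    reported only as < DL1 or < DL2."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    ll = norm.logpdf(detects, mu, sigma).sum()
    ll += n_cens1 * norm.logcdf((DL1 - mu) / sigma)
    ll += n_cens2 * norm.logcdf((DL2 - mu) / sigma)
    return -ll

rng = np.random.default_rng(3)
x = rng.normal(5.0, 2.0, 200)
DL1, DL2 = 3.0, 4.0                  # e.g. two labs, two detection limits
lab = rng.random(200) < 0.5          # which DL applies to each observation
cens1 = lab & (x < DL1)
cens2 = ~lab & (x < DL2)
detects = x[~(cens1 | cens2)]
res = minimize(negloglik, x0=[detects.mean(), np.log(detects.std())],
               args=(detects, cens1.sum(), cens2.sum(), DL1, DL2))
print("mu-hat:", res.x[0], " sigma-hat:", np.exp(res.x[1]))
```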


PROMISCUITY AND THE RATE OF MOLECULAR EVOLUTION AT PRIMATE IMMUNITY GENES

EVOLUTION, Issue 8 2010
Gabriela Wlasiuk
Recently, a positive correlation between basal leukocyte counts and mating system across primates suggested that sexual promiscuity could be an important determinant of the evolution of the immune system. Motivated by this idea, we examined the patterns of molecular evolution of 15 immune defense genes in primates in relation to promiscuity and other variables expected to affect disease risk. We obtained maximum likelihood estimates of the rate of protein evolution for terminal branches of the primate phylogeny at these genes. Using phylogenetically independent contrasts, we found that immunity genes evolve faster in more promiscuous species, but only for a subset of genes that interact closely with pathogens. We also observed a significantly greater proportion of branches under positive selection in the more promiscuous species. Analyses of independent contrasts also showed a positive effect of group size. However, this effect was not restricted to genes that interact closely with pathogens, and no differences were observed in the proportion of branches under positive selection in species with small and large groups. Together, these results suggest that mating system has influenced the evolution of some immunity genes in primates, possibly due to increased risk of acquiring sexually transmitted diseases in species with higher levels of promiscuity. [source]


Maximum-likelihood estimation of haplotype frequencies in nuclear families

GENETIC EPIDEMIOLOGY, Issue 1 2004
Tim Becker
Abstract The importance of haplotype analysis in the context of association fine mapping of disease genes has grown steadily over the last years. Since experimental methods to determine haplotypes on a large scale are not available, phase has to be inferred statistically. For individual genotype data, several reconstruction techniques and many implementations of the expectation-maximization (EM) algorithm for haplotype frequency estimation exist. Recent research work has shown that incorporating available genotype information of related individuals largely increases the precision of haplotype frequency estimates. We, therefore, implemented a highly flexible program written in C, called FAMHAP, which calculates maximum likelihood estimates (MLEs) of haplotype frequencies from general nuclear families with an arbitrary number of children via the EM-algorithm for up to 20 SNPs. For more loci, we have implemented a locus-iterative mode of the EM-algorithm, which gives reliable approximations of the MLEs for up to 63 SNP loci, or fewer when multi-allelic markers are incorporated into the analysis. Missing genotypes can be handled as well. The program is able to distinguish cases (haplotypes transmitted to the first affected child of a family) from pseudo-controls (non-transmitted haplotypes with respect to the child). We tested the performance of FAMHAP and the accuracy of the obtained haplotype frequencies on a variety of simulated data sets. The implementation proved to work well when many markers were considered and no significant differences between the estimates obtained with the usual EM-algorithm and those obtained in its locus-iterative mode were observed. We conclude from the simulations that the accuracy of haplotype frequency estimation and reconstruction in nuclear families is very reliable in general and robust against missing genotypes. © 2004 Wiley-Liss, Inc. [source]
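
FAMHAP itself handles nuclear families and many loci; as a far smaller illustration of the underlying EM idea, the sketch below estimates haplotype frequencies for two biallelic SNPs from unphased genotypes of unrelated individuals, where only the double heterozygotes are phase-ambiguous.

```python
import numpy as np
from collections import Counter

def haplotype_em(genos, n_iter=100):
    """EM haplotype-frequency estimates for two biallelic SNPs from
    unphased genotypes (classic two-locus sketch, not FAMHAP).

    genos: (n, 2) minor-allele counts (0/1/2) at the two SNPs.
    Returns frequencies of haplotypes [AB, Ab, aB, ab]."""
    counts = Counter(map(tuple, genos))
    n = len(genos)
    h = np.full(4, 0.25)
    for _ in range(n_iter):
        e = np.zeros(4)                      # expected haplotype counts
        for (g1, g2), c in counts.items():
            if g1 == 1 and g2 == 1:          # phase-ambiguous double het
                w = h[0] * h[3] / (h[0] * h[3] + h[1] * h[2])
                e[[0, 3]] += c * w           # resolves as AB/ab
                e[[1, 2]] += c * (1 - w)     # resolves as Ab/aB
            else:                            # phase fully determined
                a1, a2 = (g1 + 1) // 2, g1 // 2
                b1, b2 = (g2 + 1) // 2, g2 // 2
                e[2 * a1 + b1] += c
                e[2 * a2 + b2] += c
        h = e / (2 * n)                      # M-step
    return h

rng = np.random.default_rng(7)
haps = rng.choice(4, size=(500, 2), p=[0.5, 0.2, 0.2, 0.1])
genos = np.stack([(haps // 2).sum(axis=1), (haps % 2).sum(axis=1)], axis=1)
print(haplotype_em(genos))                   # close to [0.5, 0.2, 0.2, 0.1]
```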


Evaluations of maximization procedures for estimating linkage parameters under heterogeneity

GENETIC EPIDEMIOLOGY, Issue 3 2004
Swati Biswas
Abstract Locus heterogeneity is a major problem plaguing the mapping of disease genes responsible for complex genetic traits via linkage analysis. A common feature of several available methods to account for heterogeneity is that they involve maximizing a multidimensional likelihood to obtain maximum likelihood estimates. The high dimensionality of the likelihood surface may be due to multiple heterogeneity (mixing) parameters, linkage parameters, and/or regression coefficients corresponding to multiple covariates. Here, we focus on this nontrivial computational aspect of incorporating heterogeneity by considering several likelihood maximization procedures, including the expectation maximization (EM) algorithm and the stochastic expectation maximization (SEM) algorithm. The wide applicability of these procedures is demonstrated first through a general formulation of accounting for heterogeneity, and then by applying them to two specific formulations. Furthermore, our simulation studies as well as an application to the Genetic Analysis Workshop 12 asthma datasets show that, among other observations, SEM performs better than EM. As an aside, we illustrate a limitation, proved elsewhere, of the popular admixture approach for incorporating heterogeneity. We also show how to obtain standard errors (SEs) for EM and SEM estimates, using methods available in the literature. These SEs can then be combined with the corresponding estimates to provide confidence intervals of the parameters. © 2004 Wiley-Liss, Inc. [source]


Robustness of inference on measured covariates to misspecification of genetic random effects in family studies

GENETIC EPIDEMIOLOGY, Issue 1 2003
Ruth M. Pfeiffer
Abstract Family studies to identify disease-related genes frequently collect only families with multiple cases. It is often desirable to determine if risk factors that are known to influence disease risk in the general population also play a role in the study families. If so, these factors should be incorporated into the genetic analysis to control for confounding. Pfeiffer et al. [2001 Biometrika 88: 933-948] proposed a variance components or random effects model to account for common familial effects and for different genetic correlations among family members. After adjusting for ascertainment, they found maximum likelihood estimates of the measured exposure effects. Although it is appealing that this model accounts for genetic correlations as well as for the ascertainment of families, in order to perform an analysis one needs to specify the distribution of random genetic effects. The current work investigates the robustness of the proposed model with respect to various misspecifications of genetic random effects in simulations. When the true underlying genetic mechanism is polygenic with a small dominant component, or Mendelian with low allele frequency and penetrance, the effects of misspecification on the estimation of fixed effects in the model are negligible. The model is applied to data from a family study on nasopharyngeal carcinoma in Taiwan. Genet Epidemiol 24:14-23, 2003. © 2003 Wiley-Liss, Inc. [source]


Standard errors for EM estimation

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 2 2000
M. Jamshidian
The EM algorithm is a popular method for computing maximum likelihood estimates. One of its drawbacks is that it does not produce standard errors as a by-product. We consider obtaining standard errors by numerical differentiation. Two approaches are considered. The first differentiates the Fisher score vector to yield the Hessian of the log-likelihood. The second differentiates the EM operator and uses an identity that relates its derivative to the Hessian of the log-likelihood. The well-known SEM algorithm uses the second approach. We consider three additional algorithms: one that uses the first approach and two that use the second. We evaluate the complexity and precision of these three algorithms and the SEM algorithm in seven examples. The first is a single-parameter example used to give insight. The others are three examples in each of two areas of EM application: Poisson mixture models and the estimation of covariance from incomplete data. The examples show that there are algorithms that are much simpler and more accurate than the SEM algorithm. Hopefully their simplicity will increase the availability of standard error estimates in EM applications. It is shown that, as previously conjectured, a symmetry diagnostic can accurately estimate errors arising from numerical differentiation. Some issues related to the speed of the EM algorithm and algorithms that differentiate the EM operator are identified. [source]
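
The first approach (numerically differentiating the score to obtain the Hessian of the log-likelihood) is simple to sketch. Here an analytic normal-sample score stands in for the score vector an EM implementation would return at convergence; the data are simulated:

```python
import numpy as np

def observed_information(score, theta, eps=1e-5):
    """Observed information from central-difference differentiation of
    the score vector; the result is symmetrized."""
    k = len(theta)
    H = np.zeros((k, k))
    for j in range(k):
        step = np.zeros(k)
        step[j] = eps
        H[:, j] = (score(theta + step) - score(theta - step)) / (2 * eps)
    return -(H + H.T) / 2

def score_normal(theta, x):
    """Score of an i.i.d. N(mu, sigma^2) sample, theta = (mu, sigma)."""
    mu, s = theta
    return np.array([np.sum(x - mu) / s**2,
                     -len(x) / s + np.sum((x - mu) ** 2) / s**3])

rng = np.random.default_rng(5)
x = rng.normal(2.0, 1.5, 400)
mle = np.array([x.mean(), x.std()])       # MLEs of (mu, sigma)
info = observed_information(lambda th: score_normal(th, x), mle)
se = np.sqrt(np.diag(np.linalg.inv(info)))
print("SEs for (mu, sigma):", se)         # approx s/sqrt(n), s/sqrt(2n)
```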


Conditional Heteroskedasticity Driven by Hidden Markov Chains

JOURNAL OF TIME SERIES ANALYSIS, Issue 2 2001
Christian Francq
We consider a generalized autoregressive conditionally heteroskedastic (GARCH) equation where the coefficients depend on the state of a nonobserved Markov chain. Necessary and sufficient conditions ensuring the existence of a stationary solution are given. In the case of ARCH regimes, the maximum likelihood estimates are shown to be consistent. The identification problem is also considered. This is illustrated by means of real and simulated data sets. [source]


A Pareto model for classical systems

MATHEMATICAL METHODS IN THE APPLIED SCIENCES, Issue 1 2008
Saralees Nadarajah
Abstract A new Pareto distribution is introduced for pooling knowledge about classical systems. It takes the form of the product of two Pareto probability density functions (pdfs). Various structural properties of this distribution are derived, including its cumulative distribution function (cdf), moments, mean deviation about the mean, mean deviation about the median, entropy, asymptotic distribution of the extreme order statistics, maximum likelihood estimates and the Fisher information matrix. Copyright © 2007 John Wiley & Sons, Ltd. [source]


Optimum step-stress for temperature accelerated life testing

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 8 2007
Evans Gouno
Abstract Step-stress accelerated life testing is a design strategy where the stress is modified several times during the test. In this work we address the problem of designing such a test. We focus on temperature-accelerated life testing and consider the problems of setting the step durations and the stress levels. Assuming an Arrhenius model, maximum likelihood estimates of the parameters are computed. Relying on the properties of these estimators, we compare different criteria for assessing the optimality of the plans produced. Some tables are presented to illustrate the method. For a fixed number of steps and a set of temperatures, a table of optimal step lengths can be computed. For fixed step lengths, sets of temperatures leading to optimal plans are also available. Thus, this work provides useful tools to help engineers make decisions in testing strategy. Copyright © 2007 John Wiley & Sons, Ltd. [source]


A comparison of three estimators of the Weibull parameters

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 4 2001
Katina R. Skinner
Abstract Using mean square error as the criterion, we compare two least squares estimates of the Weibull parameters, based on non-parametric estimates of the unreliability, with the maximum likelihood estimates (MLEs). The two non-parametric estimators are that of Herd-Johnson and one recently proposed by Zimmer. Data were generated using computer simulation with three small sample sizes (5, 10 and 15) and three multiply-censored patterns for each sample size. Our results indicate that the MLE is a better estimator of the Weibull characteristic value, η, than the least squares estimators considered. No firm conclusions may be made regarding the best estimate of the Weibull shape parameter, although the use of maximum likelihood is not recommended for small sample sizes. Whenever least squares estimation of both Weibull parameters is appropriate, we recommend the use of the Zimmer estimator of reliability. Copyright © 2001 John Wiley & Sons, Ltd. [source]
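
The contrast between the two approaches is easiest to see in the complete-sample case (the paper itself studies multiply-censored samples): least squares on the Weibull probability plot with Herd-Johnson unreliability estimates, which reduce to i/(n+1) without censoring, against maximum likelihood via the standard profile equation for the shape. A hedged sketch with simulated data:

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(11)
beta_true, eta_true = 1.8, 100.0          # shape, characteristic life
t = np.sort(eta_true * rng.weibull(beta_true, 10))
n = len(t)

# Least squares on the Weibull plot; Herd-Johnson ranks reduce to
# i/(n+1) for a complete sample.
F = np.arange(1, n + 1) / (n + 1)
X, Y = np.log(t), np.log(-np.log(1 - F))
beta_ls, c = np.polyfit(X, Y, 1)          # slope = shape
eta_ls = np.exp(-c / beta_ls)

# Maximum likelihood: solve the profile equation for the shape, then
# recover the characteristic life.
def shape_eq(b):
    return (np.sum(t**b * np.log(t)) / np.sum(t**b)
            - 1.0 / b - np.log(t).mean())

beta_ml = brentq(shape_eq, 0.05, 20.0)
eta_ml = (np.sum(t**beta_ml) / n) ** (1.0 / beta_ml)

print(f"LS:  beta={beta_ls:.2f} eta={eta_ls:.1f}")
print(f"MLE: beta={beta_ml:.2f} eta={eta_ml:.1f}")
```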


Temporal disaggregation by state space methods: Dynamic regression methods revisited

THE ECONOMETRICS JOURNAL, Issue 3 2006
Tommaso Proietti
Summary. The paper advocates the use of state space methods to deal with the problem of temporal disaggregation by dynamic regression models, which encompass the most popular techniques for the distribution of economic flow variables, such as Chow-Lin, Fernández and Litterman. The state space methodology offers the generality that is required to address a variety of inferential issues that have not been dealt with previously. The paper contributes to the available literature in three ways: (i) it concentrates on the exact initialization of the different models, showing that this issue is of fundamental importance for the properties of the maximum likelihood estimates and for deriving encompassing autoregressive distributed lag models that nest exactly the traditional disaggregation models; (ii) it points out the role of diagnostics and revision histories in judging the quality of the disaggregated estimates; and (iii) it provides a thorough treatment of the Litterman model, explaining the difficulties commonly encountered in practice when estimating this model. [source]


AN EVALUATION OF NON-ITERATIVE METHODS FOR ESTIMATING THE LINEAR-BY-LINEAR PARAMETER OF ORDINAL LOG-LINEAR MODELS

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 3 2009
Eric J. Beh
Summary Parameter estimation for association and log-linear models is an important aspect of the analysis of cross-classified categorical data. Classically, iterative procedures, including Newton's method and iterative scaling, have typically been used to calculate the maximum likelihood estimates of these parameters. An important special case occurs when the categorical variables are ordinal and this has received a considerable amount of attention for more than 20 years. This is because models for such cases involve the estimation of a parameter that quantifies the linear-by-linear association and is directly linked with the natural logarithm of the common odds ratio. The past five years have seen the development of non-iterative procedures for estimating the linear-by-linear parameter for ordinal log-linear models. Such procedures have been shown to lead to numerically equivalent estimates when compared with iterative, maximum likelihood estimates. Such procedures also enable the researcher to avoid some of the computational difficulties that commonly arise with iterative algorithms. This paper investigates and evaluates the performance of three non-iterative procedures for estimating this parameter by considering 14 contingency tables that have appeared in the statistical and allied literature. The estimation of the standard error of the association parameter is also considered. [source]


Fitting and comparing seed germination models with a focus on the inverse normal distribution

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 3 2004
Michael E. O'Neill
Summary This paper reviews current methods for fitting a range of models to censored seed germination data and recommends adoption of a probability-based model for the time to germination. It shows that, provided the probability of a seed eventually germinating is not on the boundary, the maximum likelihood estimates, their standard errors and the resultant deviances are identical whether only those seeds which have germinated are used or all seeds (including seeds ungerminated at the end of the experiment). The paper recommends analysis of deviance when exploring whether replicate data are consistent with a hypothesis that the underlying distributions are identical, and when assessing whether data from different treatments have underlying distributions with common parameters. The inverse normal distribution, otherwise known as the inverse Gaussian distribution, is discussed as a natural distribution for the time to germination (including a parameter to measure the lag time to germination). The paper explores some of the properties of this distribution, evaluates the standard errors of the maximum likelihood estimates of the parameters and suggests an accurate approximation to the cumulative distribution function and the median time to germination. Additional material is on the web, at http://www.agric.usyd.edu.au/staff/oneill/. [source]
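
A hedged sketch of such a censored fit, under simplifying assumptions not made in the paper: germination times are treated as exactly observed rather than interval-censored, the lag parameter is omitted, and a fraction p of seeds can ever germinate, so a seed contributes either the inverse Gaussian density scaled by p or, if ungerminated at the final inspection, one minus p times the CDF there:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import invgauss

def negloglik(theta, t_germ, n_ungerm, t_end):
    """Censored likelihood: fraction p of seeds eventually germinate,
    with inverse Gaussian germination times; the rest never do."""
    p = expit(theta[0])
    mu, scale = np.exp(theta[1]), np.exp(theta[2])
    ll = len(t_germ) * np.log(p)
    ll += invgauss.logpdf(t_germ, mu, scale=scale).sum()
    ll += n_ungerm * np.log(1 - p * invgauss.cdf(t_end, mu, scale=scale))
    return -ll

rng = np.random.default_rng(2)
n, t_end, p_true = 100, 14.0, 0.8          # 14-day trial, assumed values
times = invgauss.rvs(0.5, scale=10.0, size=n, random_state=rng)
can = rng.random(n) < p_true               # seeds able to germinate
t_germ = times[can & (times <= t_end)]
n_ungerm = n - len(t_germ)
res = minimize(negloglik, x0=[0.0, np.log(0.5), np.log(10.0)],
               args=(t_germ, n_ungerm, t_end))
print("p-hat:", expit(res.x[0]))
```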


Robust linear mixed models using the skew t distribution with application to schizophrenia data

BIOMETRICAL JOURNAL, Issue 4 2010
Hsiu J. Ho
Abstract We consider an extension of linear mixed models by assuming a multivariate skew t distribution for the random effects and a multivariate t distribution for the error terms. The proposed model provides flexibility in capturing the effects of skewness and heavy tails simultaneously among continuous longitudinal data. We present an efficient alternating expectation-conditional maximization (AECM) algorithm for the computation of maximum likelihood estimates of parameters on the basis of two convenient hierarchical formulations. The techniques for the prediction of random effects and intermittent missing values under this model are also investigated. Our methodologies are illustrated through an application to schizophrenia data. [source]


Sequential designs for ordinal phase I clinical trials

BIOMETRICAL JOURNAL, Issue 2 2009
Guohui Liu
Abstract Sequential designs for phase I clinical trials which incorporate maximum likelihood estimates (MLE) as data accrue are inherently problematic because of limited data for estimation early on. We address this problem for small phase I clinical trials with ordinal responses. In particular, we explore the problem of the nonexistence of the MLE of the logistic parameters under a proportional odds model with one predictor. We incorporate the probability of an undetermined MLE as a restriction, as well as ethical considerations, into a proposed sequential optimal approach, which consists of a start-up design, a follow-on design and a sequential dose-finding design. Comparisons with nonparametric sequential designs are also performed based on simulation studies with parameters drawn from a real data set. [source]


Robust Joint Modeling of Longitudinal Measurements and Competing Risks Failure Time Data

BIOMETRICAL JOURNAL, Issue 1 2009
Ning Li
Abstract Existing methods for joint modeling of longitudinal measurements and survival data can be highly influenced by outliers in the longitudinal outcome. We propose a joint model for analysis of longitudinal measurements and competing risks failure time data which is robust in the presence of outlying longitudinal observations during follow-up. Our model consists of a linear mixed effects sub-model for the longitudinal outcome and a proportional cause-specific hazards frailty sub-model for the competing risks data, linked together by latent random effects. Instead of the usual normality assumption for measurement errors in the linear mixed effects sub-model, we adopt a t-distribution which has a longer tail and thus is more robust to outliers. We derive an EM algorithm for the maximum likelihood estimates of the parameters and estimate their standard errors using a profile likelihood method. The proposed method is evaluated by simulation studies and is applied to a scleroderma lung study. (© 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]


Fisher Information Matrix of the Dirichlet-multinomial Distribution

BIOMETRICAL JOURNAL, Issue 2 2005
Sudhir R. Paul
Abstract In this paper we derive explicit expressions for the elements of the exact Fisher information matrix of the Dirichlet-multinomial distribution. We show that exact calculation is based on the beta-binomial probability function rather than that of the Dirichlet-multinomial and this makes the exact calculation quite easy. The exact results are expected to be useful for the calculation of standard errors of the maximum likelihood estimates of the beta-binomial parameters and those of the Dirichlet-multinomial parameters for data that arise in practice in toxicology and other similar fields. Standard errors of the maximum likelihood estimates of the beta-binomial parameters and those of the Dirichlet-multinomial parameters, based on the exact and the asymptotic Fisher information matrix based on the Dirichlet distribution, are obtained for a set of data from Haseman and Soares (1976), a dataset from Mosimann (1962) and a more recent dataset from Chen, Kodell, Howe and Gaylor (1991). There is substantial difference between the standard errors of the estimates based on the exact Fisher information matrix and those based on the asymptotic Fisher information matrix. (© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
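
For comparison with the exact expressions derived in the paper, standard errors of the beta-binomial MLEs can also be approximated from a numerical observed-information matrix. A sketch with made-up litter data (not the Haseman-Soares counts):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import betabinom

# Hypothetical litter data: responders k out of litter sizes m
k = np.array([0, 1, 1, 2, 3, 0, 4, 2, 1, 5])
m = np.array([8, 10, 9, 12, 10, 7, 11, 10, 9, 12])

def negloglik(theta):
    a, b = np.exp(theta)                    # keep parameters positive
    return -betabinom.logpmf(k, m, a, b).sum()

res = minimize(negloglik, x0=[0.0, 1.0])

# Observed information via a central-difference Hessian of the negative
# log-likelihood in the log-parameters, then the delta method back to (a, b).
eps = 1e-4
H = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        ei, ej = np.eye(2)[i] * eps, np.eye(2)[j] * eps
        H[i, j] = (negloglik(res.x + ei + ej) - negloglik(res.x + ei - ej)
                   - negloglik(res.x - ei + ej)
                   + negloglik(res.x - ei - ej)) / (4 * eps**2)
cov_log = np.linalg.inv(H)
a, b = np.exp(res.x)
se = np.sqrt(np.diag(cov_log)) * np.array([a, b])   # delta method
print("a, b:", a, b, " SEs:", se)
```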


Sensitivity Analyses Comparing Outcomes Only Existing in a Subset Selected Post-Randomization, Conditional on Covariates, with Application to HIV Vaccine Trials

BIOMETRICS, Issue 2 2006
Bryan E. Shepherd
Summary In many experiments, researchers would like to compare, between treatments, an outcome that only exists in a subset of participants selected after randomization. For example, in preventive HIV vaccine efficacy trials it is of interest to determine whether randomization to vaccine causes lower HIV viral load, a quantity that only exists in participants who acquire HIV. To make a causal comparison and account for potential selection bias, we propose a sensitivity analysis following the principal stratification framework set forth by Frangakis and Rubin (2002, Biometrics 58, 21-29). Our goal is to assess the average causal effect of treatment assignment on viral load at a given baseline covariate level in the always-infected principal stratum (those who would have been infected whether they had been assigned to vaccine or placebo). We make the stable unit treatment value assumption (SUTVA), assume randomization, and assume that subjects randomized to the vaccine arm who became infected would also have become infected if randomized to the placebo arm (monotonicity). It is not known which of those subjects infected in the placebo arm are in the always-infected principal stratum, but this can be modeled conditional on covariates, the observed viral load, and a specified sensitivity parameter. Under parametric regression models for viral load, we obtain maximum likelihood estimates of the average causal effect conditional on covariates and the sensitivity parameter. We apply our methods to the world's first phase III HIV vaccine trial. [source]