Multivariate Normal Distribution

Selected Abstracts


Semiparametric variance-component models for linkage and association analyses of censored trait data

GENETIC EPIDEMIOLOGY, Issue 7 2006
G. Diao
Abstract. Variance-component (VC) models are widely used for linkage and association mapping of quantitative trait loci in general human pedigrees. Traditional VC methods assume that the trait values within a family follow a multivariate normal distribution and are fully observed. These assumptions are violated if the trait data contain censored observations. When the trait pertains to age at onset of disease, censoring is inevitable because of loss to follow-up and limited study duration. Censoring also arises when the trait assay cannot detect values below (or above) certain thresholds. The latent trait values tend to have a complex distribution. Applying traditional VC methods to censored trait data would inflate type I error and reduce power. We present valid and powerful methods for the linkage and association analyses of censored trait data. Our methods are based on a novel class of semiparametric VC models, which allows an arbitrary distribution for the latent trait values. We construct appropriate likelihood for the observed data, which may contain left or right censored observations. The maximum likelihood estimators are approximately unbiased, normally distributed, and statistically efficient. We develop stable and efficient numerical algorithms to implement the corresponding inference procedures. Extensive simulation studies demonstrate that the proposed methods outperform the existing ones in practical situations. We provide an application to the age at onset of alcohol dependence data from the Collaborative Study on the Genetics of Alcoholism. A computer program is freely available. Genet. Epidemiol. 2006. © 2006 Wiley-Liss, Inc.


Efficient Calculation of P-value and Power for Quadratic Form Statistics in Multilocus Association Testing

ANNALS OF HUMAN GENETICS, Issue 3 2010
Liping Tong
Summary. We address the asymptotic and approximate distributions of a large class of test statistics with quadratic forms used in association studies. The statistics of interest take the general form $D = X^{T} A X$, where $A$ is a general similarity matrix which may or may not be positive semi-definite, and $X$ follows the multivariate normal distribution with mean $\mu$ and variance matrix $\Sigma$, where $\Sigma$ may or may not be singular. We show that $D$ can be written as a linear combination of independent $\chi^2$ random variables with a shift. Furthermore, its distribution can be approximated by a $\chi^2$ or the difference of two $\chi^2$ distributions. In the setting of association testing, our methods are especially useful in two situations. First, when the required significance level is much smaller than 0.05 such as in a genome scan, the estimation of p-values using permutation procedures can be challenging. Second, when an EM algorithm is required to infer haplotype frequencies from un-phased genotype data, the computation can be intensive for a permutation procedure. In either situation, an efficient and accurate estimation procedure would be useful. Our method can be applied to any quadratic form statistic and therefore should be of general interest.
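The decomposition described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes a zero mean (so there is no shift term), restricts everything to 2x2 matrices so that only the standard library is needed, and uses hypothetical function names. Under those assumptions, $D = X^{T} A X$ has the same distribution as $\lambda_1 Z_1^2 + \lambda_2 Z_2^2$, where $Z_1, Z_2$ are independent standard normals and $\lambda_1, \lambda_2$ are the eigenvalues of $A\Sigma$.

```python
import math
import random


def quadratic_form_weights(A, S):
    """Eigenvalues of A @ S for a symmetric 2x2 A and a 2x2 covariance S.

    Assuming X ~ N(0, S), D = X^T A X is distributed as
    lam1 * Z1**2 + lam2 * Z2**2 with independent standard normal Z1, Z2.
    """
    # Entries of the product M = A @ S.
    m00 = A[0][0] * S[0][0] + A[0][1] * S[1][0]
    m01 = A[0][0] * S[0][1] + A[0][1] * S[1][1]
    m10 = A[1][0] * S[0][0] + A[1][1] * S[1][0]
    m11 = A[1][0] * S[0][1] + A[1][1] * S[1][1]
    trace = m00 + m11
    det = m00 * m11 - m01 * m10
    # The eigenvalues are real; max() only guards against rounding error.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def simulate_mean_D(A, S, n, rng):
    """Monte Carlo estimate of E[D] for X ~ N(0, S), S positive definite."""
    # Cholesky factor of the 2x2 covariance, so that X = L Z.
    l00 = math.sqrt(S[0][0])
    l10 = S[1][0] / l00
    l11 = math.sqrt(S[1][1] - l10 * l10)
    total = 0.0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = l00 * z1
        x2 = l10 * z1 + l11 * z2
        total += A[0][0] * x1 * x1 + 2.0 * A[0][1] * x1 * x2 + A[1][1] * x2 * x2
    return total / n


if __name__ == "__main__":
    A = [[1.0, 2.0], [2.0, -1.0]]   # indefinite similarity matrix (illustrative)
    S = [[2.0, 0.6], [0.6, 1.0]]    # covariance matrix (illustrative)
    lam1, lam2 = quadratic_form_weights(A, S)
    print("weights:", lam1, lam2)
    print("E[D] from weights:", lam1 + lam2)
    print("E[D] by simulation:", simulate_mean_D(A, S, 20000, random.Random(0)))
```

With an indefinite $A$ as in this example, one weight is positive and one is negative, which is the situation where a difference of two $\chi^2$ distributions is the natural approximation.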


New ways of looking at experimental phasing

ACTA CRYSTALLOGRAPHICA SECTION D, Issue 11 2003
Randy J. Read
In the original work by Blow and Crick, experimental phasing was formulated as a least-squares problem. For good data on good derivatives this approach works reasonably well, but we now attempt to extract more information from poorer data than in the past. As in many other crystallographic problems, the assumptions underlying the use of least squares for phasing are not satisfied, particularly for poor derivatives. The introduction of maximum likelihood (and more powerful computers) has led to substantial improvements. For computational convenience, these new methods still make many assumptions about the independence of different measurements and sources of error. A more general formulation for the probability distributions underlying likelihood-based methods for both experimental phasing and molecular-replacement phasing is reviewed. In the new formulation, all the structure factors associated with a particular hkl are considered to be related by a complex multivariate normal distribution. When it is assumed that certain errors are independent, the general formulation reduces to current likelihood targets. However, the new formulation makes the necessary assumptions more explicit and points the way to improving phasing using both isomorphous and anomalous differences.


Bayesian Inference for Smoking Cessation with a Latent Cure State

BIOMETRICS, Issue 3 2009
Sheng Luo
Summary. We present a Bayesian approach to modeling dynamic smoking addiction behavior processes when cure is not directly observed due to censoring. Subject-specific probabilities model the stochastic transitions among three behavioral states: smoking, transient quitting, and permanent quitting (absorbent state). A multivariate normal distribution for random effects is used to account for the potential correlation among the subject-specific transition probabilities. Inference is conducted using a Bayesian framework via Markov chain Monte Carlo simulation. This framework provides various measures of subject-specific predictions, which are useful for policy-making, intervention development, and evaluation. Simulations are used to validate our Bayesian methodology and assess its frequentist properties. Our methods are motivated by, and applied to, the Alpha-Tocopherol, Beta-Carotene Lung Cancer Prevention study, a large (29,133 individuals) longitudinal cohort study of smokers from Finland.
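As a rough illustration of the model structure described in this abstract, the sketch below simulates one subject's path through the three states. Everything specific here is an assumption made for illustration, not taken from the paper: the logistic link, the baseline values, the fixed cure probability, the use of only two correlated random effects, and all function names.

```python
import math
import random

SMOKING, TRANSIENT_QUIT, PERMANENT_QUIT = 0, 1, 2


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def draw_random_effects(sd_quit, sd_relapse, rho, rng):
    """Draw one pair of correlated normal random effects (bivariate normal)."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    b_quit = sd_quit * z1
    b_relapse = sd_relapse * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return b_quit, b_relapse


def simulate_subject(n_steps, rng, sd=(1.0, 1.0), rho=0.5):
    """Simulate a state path; all numeric baselines are hypothetical."""
    b_quit, b_relapse = draw_random_effects(sd[0], sd[1], rho, rng)
    p_quit = logistic(-2.0 + b_quit)        # subject-specific quit probability
    p_relapse = logistic(0.0 + b_relapse)   # subject-specific relapse probability
    p_cure = 0.1                            # chance a quit is permanent (assumed)
    state = SMOKING
    path = [state]
    for _ in range(n_steps):
        if state == SMOKING:
            if rng.random() < p_quit:
                state = PERMANENT_QUIT if rng.random() < p_cure else TRANSIENT_QUIT
        elif state == TRANSIENT_QUIT:
            if rng.random() < p_relapse:
                state = SMOKING
        # PERMANENT_QUIT is absorbing: no transition out of it.
        path.append(state)
    return path


if __name__ == "__main__":
    print(simulate_subject(20, random.Random(1)))
```

In observed data the two quitting states would look identical until a relapse occurs, which is why the abstract describes the cure state as latent.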


A new reconstruction of multivariate normal orthant probabilities

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 1 2008
Peter Craig
Summary. A new method is introduced for geometrically reconstructing orthant probabilities for non-singular multivariate normal distributions. Orthant probabilities are expressed in terms of those for auto-regressive sequences and an efficient method is developed for numerical approximation of the latter. The approach allows more efficient accurate evaluation of the multivariate normal cumulative distribution function than previously, for many situations where the original distribution arises from a graphical model. An implementation is available as a package for the statistical software R and an application is given to multivariate probit models.
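For readers unfamiliar with orthant probabilities, the following sketch shows what the quantity is in the two lowest-dimensional cases, where closed forms are known for zero-mean, unit-variance normals. It does not implement the reconstruction method of this paper or its R package; the function names are hypothetical, and the Monte Carlo routine is included only as a sanity check on the bivariate formula.

```python
import math
import random


def orthant_prob_2d(rho):
    """P(X1 > 0, X2 > 0) for a standard bivariate normal with correlation rho."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)


def orthant_prob_3d(r12, r13, r23):
    """P(X1 > 0, X2 > 0, X3 > 0) for a standard trivariate normal."""
    return 0.125 + (math.asin(r12) + math.asin(r13) + math.asin(r23)) / (4.0 * math.pi)


def orthant_prob_2d_mc(rho, n, rng):
    """Monte Carlo estimate of the bivariate orthant probability."""
    c = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x2 = rho * z1 + c * z2   # x2 has unit variance and correlation rho with z1
        if z1 > 0.0 and x2 > 0.0:
            hits += 1
    return hits / n


if __name__ == "__main__":
    print("closed form:", orthant_prob_2d(0.5))
    print("simulation: ", orthant_prob_2d_mc(0.5, 20000, random.Random(0)))
```

Beyond three dimensions no comparably simple closed form is available in general, which is the regime that numerical methods such as the one in this abstract address.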


Reparameterizing the Pattern Mixture Model for Sensitivity Analyses Under Informative Dropout

BIOMETRICS, Issue 4 2000
Michael J. Daniels
Summary. Pattern mixture models are frequently used to analyze longitudinal data where missingness is induced by dropout. For measured responses, it is typical to model the complete data as a mixture of multivariate normal distributions, where mixing is done over the dropout distribution. Fully parameterized pattern mixture models are not identified by incomplete data; Little (1993, Journal of the American Statistical Association 88, 125-134) has characterized several identifying restrictions that can be used for model fitting. We propose a reparameterization of the pattern mixture model that allows investigation of sensitivity to assumptions about nonidentified parameters in both the mean and variance, allows consideration of a wide range of nonignorable missing-data mechanisms, and has intuitive appeal for eliciting plausible missing-data mechanisms. The parameterization makes clear an advantage of pattern mixture models over parametric selection models, namely that the missing-data mechanism can be varied without affecting the marginal distribution of the observed data. To illustrate the utility of the new parameterization, we analyze data from a recent clinical trial of growth hormone for maintaining muscle strength in the elderly. Dropout occurs at a high rate and is potentially informative. We undertake a detailed sensitivity analysis to understand the impact of missing-data assumptions on the inference about the effects of growth hormone on muscle strength.
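The mixture structure described in this abstract can be written out in generic notation. The symbols below are assumptions introduced for illustration and are not the paper's own parameterization: $d$ indexes dropout patterns, $\pi_d$ is the probability of pattern $d$, $\phi$ is the multivariate normal density, and $y = (y_{\mathrm{obs}}, y_{\mathrm{mis}})$ splits the complete data into observed and missing parts.

```latex
% Assumed notation, not taken from the abstract:
% d = dropout pattern, \pi_d = P(D = d), \phi = multivariate normal density.
\begin{align*}
f(y) &= \sum_{d} \pi_d \, \phi\!\left(y;\ \mu^{(d)}, \Sigma^{(d)}\right), \\
f(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid d)
  &= f(y_{\mathrm{obs}} \mid d)\; f(y_{\mathrm{mis}} \mid y_{\mathrm{obs}}, d).
\end{align*}
```

Only $\pi_d$ and $f(y_{\mathrm{obs}} \mid d)$ can be estimated from incomplete data. The factor $f(y_{\mathrm{mis}} \mid y_{\mathrm{obs}}, d)$ has to be fixed by identifying restrictions or sensitivity parameters, so changing it alters the implied missing-data mechanism while leaving the distribution of the observed data unchanged.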