Covariance Structure (covariance + structure)
Selected Abstracts

Gaussian Process Functional Regression Modeling for Batch Data
BIOMETRICS, Issue 3 2007
J. Q. Shi
Summary: A Gaussian process functional regression model is proposed for the analysis of batch data. Covariance structure and mean structure are considered simultaneously, with the covariance structure modeled by a Gaussian process regression model and the mean structure modeled by a functional regression model. The model allows the inclusion of covariates in both the covariance structure and the mean structure. It models the nonlinear relationship between a functional output variable and a set of functional and nonfunctional covariates. Several applications and simulation studies are reported and show that the method provides very good results for curve fitting and prediction.
[source]

COVARIATE-ADJUSTED REGRESSION FOR LONGITUDINAL DATA INCORPORATING CORRELATION BETWEEN REPEATED MEASUREMENTS
AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 3 2009
Danh V. Nguyen
Summary: We propose an estimation method that incorporates the correlation/covariance structure between repeated measurements in covariate-adjusted regression models for distorted longitudinal data. In this distorted data setting, neither the longitudinal response nor (possibly time-varying) predictors are directly observable. The unobserved response and predictors are assumed to be distorted/contaminated by unknown functions of a common observable confounder. The proposed estimation methodology adjusts for the distortion effects both in estimation of the covariance structure and in the regression parameters using generalized least squares. The finite-sample performance of the proposed estimators is studied numerically by means of simulations. The consistency and convergence rates of the proposed estimators are also established. The proposed method is illustrated with an application to data from a longitudinal study of cognitive and social development in children.
[source]

Space–time modeling of 20 years of daily air temperature in the Chicago metropolitan region
ENVIRONMETRICS, Issue 5 2009
Hae-Kyung Im
Abstract: We analyze 20 years of daily minimum and maximum air temperature data in the Chicago metropolitan region and propose a parsimonious model that describes their mean function and the space–time covariance structure. The mean function contains a long-term trend, annual and semiannual harmonics, and physical covariates such as latitude, distance to Lake Michigan, and winds, each interacted with the harmonic terms, thus allowing the effects of physical covariates to vary smoothly over time. The temporal correlation at a given location is described using an ARMA(1,2) model. The residuals (innovations) from this model are treated as independent replications of a spatial process with covariance structure in the Matérn class. The space–time covariance structure parameters are allowed to vary seasonally. Using the estimated covariance structure, we interpolate the temperature to a fine grid in the Chicago metropolitan region. This procedure borrows information from temporally and spatially adjacent data. The methods presented in this paper should be useful for approaching other environmental problems where the data are discrete and regular in time but irregular in space. Copyright © 2008 John Wiley & Sons, Ltd.
[source]

Spatial-temporal model for ambient air pollutants in the state of Kuwait
ENVIRONMETRICS, Issue 7 2006
Fahimah A. Al-Awadhi
Abstract: In this paper we consider dynamic Bayesian models for four different pollutants: nitric oxide (NO), carbon monoxide (CO), sulphur dioxide (SO2) and non-methane hydrocarbon (NCH4), recorded daily at six different stations in Kuwait from 1999 to 2002. The structure of the models depends on temporal, spatial and between-pollutant dependencies.
The approach strives to incorporate the uncertainty of the covariance structure into the simulated models and the final inference; therefore, a hierarchical Bayesian model is applied. Associations between pollutant levels and different meteorological variables, such as wind speed, wind direction, temperature and humidity, are considered. The models decompose into two main components: a deterministic term representing the observed components and a stochastic term representing the unobservable components. Our analysis starts with a basic model and gradually increases its complexity. At each stage the efficiency of the model is measured. The resulting models are subsequently tested by comparing the output terms and by comparing the predictions with the real observations. Copyright © 2006 John Wiley & Sons, Ltd.
[source]

Sampling and variance estimation on continuous domains
ENVIRONMETRICS, Issue 6 2006
Cynthia Cooper
Abstract: This paper explores fundamental concepts of design- and model-based approaches to sampling and estimation for a response defined on a continuous domain. The paper discusses the concepts in design-based methods as applied in a continuous domain, the meaning of model-based sampling, and the interpretation of the design-based variance of a model-based estimate. A model-assisted variance estimator is examined for circumstances in which a direct design-based estimator may be inadequate or not available. The alternative model-assisted variance estimator is demonstrated in simulations on a realization of a response generated by a process with exponential covariance structure. The empirical results demonstrate that the model-assisted variance estimator is less biased and more efficient than the Horvitz–Thompson and Yates–Grundy variance estimators applied to a continuous-domain response. Copyright © 2006 John Wiley & Sons, Ltd.
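The phrase "a realization of a response generated by a process with exponential covariance structure" in the abstract above can be made concrete with a small sketch. It is an illustration only, not the paper's simulation design: the function names, the `sill` and `range_` parameters, the 25 random sites and the zero-mean Gaussian assumption are all hypothetical choices made here.

```python
import math
import random


def exponential_covariance(points, sill=1.0, range_=0.3):
    """Covariance matrix with C(h) = sill * exp(-h / range_), h = Euclidean distance.

    `sill` and `range_` are illustrative parameter names, not taken from the paper.
    """
    n = len(points)
    return [[sill * math.exp(-math.dist(points[i], points[j]) / range_)
             for j in range(n)] for i in range(n)]


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_realization(points, sill=1.0, range_=0.3, seed=0):
    """Draw one zero-mean Gaussian realization at `points` as L z, z ~ N(0, I)."""
    rng = random.Random(seed)
    low = cholesky(exponential_covariance(points, sill, range_))
    z = [rng.gauss(0.0, 1.0) for _ in points]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(points))]


# Hypothetical sampling sites in the unit square.
site_rng = random.Random(1)
sites = [(site_rng.random(), site_rng.random()) for _ in range(25)]
field = simulate_realization(sites)
```

Multiplying independent standard normal draws by the Cholesky factor is one standard way to obtain a vector with the requested covariance. Variance estimators such as those compared in the abstract would then be evaluated on samples taken from `field`.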
[source]

A high frequency kriging approach for non-stationary environmental processes
ENVIRONMETRICS, Issue 5 2001
Montserrat Fuentes
Abstract: Emission reductions were mandated in the Clean Air Act Amendments of 1990 with the expectation that they would result in major reductions in the concentrations of atmospherically transported pollutants. The emission reductions are intended to reduce public health risks and to protect sensitive ecosystems. To determine whether the emission reductions are having the intended effect on atmospheric concentrations, monitoring data must be analyzed taking into consideration the spatial structure shown by the data. Maps of pollutant concentrations and fluxes are useful over different geopolitical boundaries, to discover when, where, and to what extent the U.S. Nation's air quality is improving or declining. Since the spatial covariance structure shown by the data changes with location, the standard kriging methodology for spatial interpolation cannot be used because it assumes stationarity of the process. We present a new methodology for spatial interpolation of non-stationary processes. In this method the field is represented locally as a stationary isotropic random field, but the parameters of the stationary random field are allowed to vary across space. A procedure for interpolation is presented that uses an expression for the spectral density at high frequencies. New fitting algorithms are developed using spectral approaches. In cases where the data are distributed exactly or approximately on a lattice, it is argued that spectral approaches have potentially enormous computational benefits compared with maximum likelihood. The methods are extended to interpolation questions using approximate Bayesian approaches to account for parameter uncertainty. We develop applications to obtain the total loading of pollutant concentrations and fluxes over different geopolitical boundaries. Copyright © 2001 John Wiley & Sons, Ltd.
[source]

Geostatistics in fisheries survey design and stock assessment: models, variances and applications
FISH AND FISHERIES, Issue 3 2001
Pierre Petitgas
Abstract: Over the past 10 years, fisheries scientists gradually adopted geostatistical tools when analysing fish stock survey data for estimating population abundance. First, the relation between model-based variance estimates and covariance structure enabled estimation of survey precision for non-random survey designs. The possibility of using spatial covariance for optimising sampling strategy has been a second motive for using geostatistics. Kriging also offers the advantage of weighting data values, which is useful when sample points are clustered. This paper discusses, with fisheries applications, the different geostatistical models that characterise spatial variation, and their variance formulae for many different survey designs. Some anticipated developments of geostatistics related to multivariate structures, temporal variability and adaptive sampling are discussed.
[source]

Summer drought: a driver for crown condition and mortality of Norway spruce in Norway
FOREST PATHOLOGY, Issue 2 2004
S. Solberg
Summary: Summer drought, i.e. unusually dry and warm weather, has been a significant stress factor for Norway spruce in southeast Norway during the 14 years of forest monitoring. Dry and warm summers were followed by increases in defoliation, discolouration of foliage, cone formation and mortality. The causal mechanisms are discussed. Most likely, the defoliation resulted from increased needle-fall in the autumn after dry summers. During the monitoring period 1988–2001, southeast Norway was repeatedly affected by summer drought, in particular in the early 1990s. The dataset comprised 455 'Forest officers' plots' with annual data on crown condition and mortality.
Linear mixed models were used for estimation and hypothesis testing, including a variance–covariance structure for the handling of random effects and temporal autocorrelation.

Résumé (translated from French): Summer drought, that is, exceptionally dry and warm weather, was a significant stress factor for Norway spruce in southeast Norway during 14 years of monitoring. Dry and warm summers were followed by increases in defoliation, abnormal discolouration of foliage, cone formation and mortality. The causal mechanisms are discussed. The defoliation can probably be explained by autumn needle-fall after dry summers. During the monitoring period from 1988 to 2001, southeast Norway was repeatedly affected by summer droughts, particularly in the early 1990s. The dataset comprises 455 'forest officers' plots' with annual data on crown condition and mortality. Linear mixed models were used for hypothesis testing and estimation, including a variance-covariance structure to account for random effects and temporal autocorrelation.

Zusammenfassung (translated from German): Summer drought, i.e. unusually dry and warm weather, was a significant stress factor for Norway spruce (Picea abies) in southwest Norway during the 14 years in which forest condition has been recorded so far. After dry and warm summers, needle loss, needle discolouration, cone formation and mortality increased. The causal mechanisms are discussed. Most probably, the foliage loss results from increased needle-fall in the autumn after a dry summer. During the observation period from 1988 to 2001, dry summers occurred repeatedly in southwest Norway, particularly in the early 1990s.
The dataset comprised 455 sample plots with annual data on crown condition and mortality. For the statistical analysis, linear mixed-effects models were used, including a variance-covariance structure for the temporal autocorrelation.
[source]

ON-THE-JOB SEARCH, PRODUCTIVITY SHOCKS, AND THE INDIVIDUAL EARNINGS PROCESS
INTERNATIONAL ECONOMIC REVIEW, Issue 3 2010
Fabien Postel-Vinay
Individual labor earnings observed in worker panel data have complex, highly persistent dynamics. We investigate the capacity of a structural job search model with on-the-job search, wage renegotiation by mutual consent, and i.i.d. productivity shocks to replicate salient properties of these dynamics, such as the covariance structure of earnings, the evolution of the mean and variance of individual earnings with the duration of uninterrupted employment, and the distribution of year-to-year earnings changes. Structural estimation of our model on a 12-year panel of highly educated British workers shows that our simple framework produces a dynamic earnings structure that is remarkably consistent with the data.
[source]

Multivariate analyses of carcass traits for Angus cattle fitting reduced rank and factor analytic models
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 2 2007
K. Meyer
Summary: Multivariate analyses of carcass traits for Angus cattle, consisting of six traits recorded on the carcass and eight auxiliary traits measured by ultrasound scanning of live animals, are reported. Analyses were carried out by restricted maximum likelihood, fitting a number of reduced rank and factor analytic models for the genetic covariance matrix. Estimates of eigenvalues and eigenvectors for different orders of fit are contrasted and implications for the estimates of genetic variances and correlations are examined. Results indicate that at most eight principal components (PCs) are required to model the genetic covariance structure among the 14 traits.
Selection index calculations suggest that the first seven of these PCs are sufficient to obtain estimates of breeding values for the carcass traits without loss in the expected accuracy of evaluation. This implied that the number of effects fitted in genetic evaluation for carcass traits can be halved by estimating breeding values for the leading PCs directly.
[source]

Using unlabelled data to update classification rules with applications in food authenticity studies
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 1 2006
Nema Dean
Summary: An authentic food is one that is what it purports to be. Food processors and consumers need to be assured that, when they pay for a specific product or ingredient, they are receiving exactly what they pay for. Classification methods are an important tool in food authenticity studies where they are used to assign food samples of unknown type to known types. A classification method is developed where the classification rule is estimated by using both the labelled and the unlabelled data, in contrast with many classical methods which use only the labelled data for estimation. This methodology models the data as arising from a Gaussian mixture model with parsimonious covariance structure, as is done in model-based clustering. A missing data formulation of the mixture model is used and the models are fitted by using the EM and classification EM algorithms. The methods are applied to the analysis of spectra of food-stuffs recorded over the visible and near infra-red wavelength range in food authenticity studies. A comparison of the performance of model-based discriminant analysis and the method of classification proposed is given. The classification method proposed is shown to yield very good misclassification rates. The correct classification rate was observed to be as much as 15% higher than the correct classification rate for model-based discriminant analysis.
[source]

A spatiotemporal model for Mexico City ozone levels
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 2 2004
Gabriel Huerta
Summary: We consider hourly readings of concentrations of ozone over Mexico City and propose a model for spatial as well as temporal interpolation and prediction. The model is based on a time-varying regression of the observed readings on air temperature. Such a regression requires interpolated values of temperature at locations and times where readings are not available. These are obtained from a time-varying spatiotemporal model that is coupled to the model for the ozone readings. Two location-dependent harmonic components are added to account for the main periodicities that ozone presents during a given day and that are not explained through the covariate. The model incorporates spatial covariance structure for the observations and the parameters that define the harmonic components. Using the dynamic linear model framework, we show how to compute smoothed means and predictive values for ozone. We illustrate the methodology on data from September 1997.
[source]

Prediction in ARMA Models with GARCH in Mean Effects
JOURNAL OF TIME SERIES ANALYSIS, Issue 5 2001
Menelaos Karanasos
This paper considers forecasting the conditional mean and variance from an ARMA model with GARCH in mean effects. Expressions for the optimal predictors and their conditional and unconditional MSEs are presented. We also derive the formula for the covariance structure of the process and its conditional variance. JEL. C22.
[source]

Modelling covariance structure in ascending dose studies of isolated tissues and organs
PHARMACEUTICAL STATISTICS: THE JOURNAL OF APPLIED STATISTICS IN THE PHARMACEUTICAL INDUSTRY, Issue 2 2003
Richard John Brammer
This paper describes the analysis of two pharmacology assays: the guinea pig papillary muscle assay (an example of an isolated tissue assay) and an assay looking at pressure changes in isolated rat lungs. Both assays use an ascending dose design to minimize carryover effects. This is often necessary in these studies, due to the limited life span of the tissues. Various mixed models, with different covariance structures, are fitted to find the most appropriate model. These are then compared to two other possible methods of analysis: paired t-tests and two-way analysis of variance. For both assays, the mixed model was found to be the best approach. These examples illustrate the importance of modelling covariance structure correctly in any ascending dose study, whether in isolated organs/tissues, in animals or phase I volunteers. Copyright © 2003 John Wiley & Sons, Ltd.
[source]

The Covariance Structure of Italian Male Wages
THE MANCHESTER SCHOOL, Issue 6 2000
Lorenzo Cappellari
Using an unbalanced panel of Italian male wages covering the 1974–88 interval, in this study we estimate the parameters of the wage covariance structure by minimum distance. Estimated variance components models allow for a linear trend in permanent wages, so that wage profile convergence can be assessed by considering the covariance between intercepts and slopes of such individual trends. Evidence of permanent wage convergence is found in the overall wage distribution, but not within white collar workers data; this contrasts with human capital interpretations of wage dynamics and suggests that other factors, such as the egalitarian wage-setting framework fully effective until the mid-1980s, could have played a major role in shaping the wage distribution.
[source]

Bayesian Covariance Selection in Generalized Linear Mixed Models
BIOMETRICS, Issue 2 2006
Bo Cai
Summary: The generalized linear mixed model (GLMM), which extends the generalized linear model (GLM) to incorporate random effects characterizing heterogeneity among subjects, is widely used in analyzing correlated and longitudinal data. Although there is often interest in identifying the subset of predictors that have random effects, random effects selection can be challenging, particularly when outcome distributions are nonnormal. This article proposes a fully Bayesian approach to the problem of simultaneous selection of fixed and random effects in GLMMs. Integrating out the random effects induces a covariance structure on the multivariate outcome data, and an important problem that we also consider is that of covariance selection. Our approach relies on variable selection-type mixture priors for the components in a special Cholesky decomposition of the random effects covariance. A stochastic search MCMC algorithm is developed, which relies on Gibbs sampling, with Taylor series expansions used to approximate intractable integrals. Simulated data examples are presented for different exponential family distributions, and the approach is applied to discrete survival data from a time-to-pregnancy study.
[source]

Statistical Assessment of Numerical Models
INTERNATIONAL STATISTICAL REVIEW, Issue 2 2003
Montserrat Fuentes
Summary: Evaluation of physically based computer models for air quality applications is crucial to assist in control strategy selection. The high risk of getting the wrong control strategy has costly economic and social consequences. The objective comparison of modeled concentrations with observed field data is one approach to assessment of model performance. For dry deposition fluxes and concentrations of air pollutants there is a very limited supply of evaluation data sets.
We develop a formal method for evaluation of the performance of numerical models, which can be implemented even when the field measurements are very sparse. This approach is applied to a current U.S. Environmental Protection Agency air quality model. In other cases, exemplified by an ozone study from the California Central Valley, the observed field is relatively data rich, and more or less standard geostatistical tools can be used to compare model to data. Yet another situation is when the cost of model runs is prohibitive, and a statistical approach to approximating the model output is needed. We describe two ways of obtaining such approximations. A common technical issue in the assessment of environmental numerical models is the need for tools to estimate nonstationary spatial covariance structures. We describe in detail two such approaches.

Résumé (translated from French): Evaluation of physically based computer models for air quality applications is crucial to assist in the selection of a control strategy. Choosing a poor control strategy can have costly economic and social consequences. One approach to assessing model performance is the objective comparison of modeled concentrations with observed field data. For dry deposition fluxes and concentrations of air pollutants, the supply of evaluation data is very limited. We develop a formal method for evaluating the performance of numerical models, which can be implemented even when the field measurements are very sparse. This approach is applied to an air quality model of the U.S. Environmental Protection Agency. In other cases, such as an ozone study from the California Central Valley, the observed field is relatively data rich, and more or less standard geostatistical tools can be used to compare the model to the data.
Another situation arises when the cost of the model is prohibitive and a statistical approach to approximating the model output is needed. We describe two ways of obtaining such approximations. A common technical problem in the evaluation of environmental numerical models is the need for tools to estimate nonstationary spatial covariance structures. We describe two of these approaches in detail.
[source]

Bilinear modelling of batch processes. Part I: theoretical discussion
JOURNAL OF CHEMOMETRICS, Issue 5 2008
Abstract: When studying the principal component analysis (PCA) or partial least squares (PLS) modelling of batch process data, one realizes that there is a wide range of approaches. In many cases, new modelling approaches are presented just because they work properly for a particular application, for example, on-line monitoring and a given number of processes. A clear understanding of why these approaches perform successfully, and of their advantages and disadvantages relative to the others, is seldom supplied. Why does modelling after batch-wise unfolding capture changing dynamics? What are the consequences of variable-wise unfolding? Is there any best unfolding method? When should several models for a single process be used? In this paper, it is shown how these and other related questions can be answered by properly analyzing the dynamic covariance structures of the various approaches. Copyright © 2008 John Wiley & Sons, Ltd.
[source]

Wavelet-based functional mixed models
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 2 2006
Jeffrey S. Morris
Summary: Increasingly, scientific studies yield functional data, in which the ideal units of observation are curves and the observed data consist of sets of curves that are sampled on a fine grid.
We present new methodology that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach. This method is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the mixed model framework. It yields nonparametric estimates of the fixed and random-effects functions as well as the various between-curve and within-curve covariance matrices. The functional fixed effects are adaptively regularized as a result of the non-linear shrinkage prior that is imposed on the fixed effects' wavelet coefficients, and the random-effect functions experience a form of adaptive regularization because of the separately estimated variance components for each wavelet coefficient. Because we have posterior samples for all model quantities, we can perform pointwise or joint Bayesian inference or prediction on the quantities of the model. The adaptiveness of the method makes it especially appropriate for modelling irregular functional data that are characterized by numerous local features like peaks.
[source]

A Bayesian Spatial Multimarker Genetic Random-Effect Model for Fine-Scale Mapping
ANNALS OF HUMAN GENETICS, Issue 5 2008
M.-Y. Tsai
Summary: Multiple markers in linkage disequilibrium (LD) are usually used to localize the disease gene location. These markers may contribute to the disease etiology simultaneously. In contrast to the single-locus tests, we propose a genetic random effects model that accounts for the dependence between loci via their spatial structures. In this model, the locus-specific random effects measure not only the genetic disease risk, but also the correlations between markers. In other words, the model incorporates this relation in both mean and covariance structures, and the variance components play important roles. We consider two different settings for the spatial relations.
The first is our proposal, relative distance function (RDF), which is intuitive in the sense that markers nearby are likely to correlate with each other. The second setting is a common exponential decay function (EDF). Under each setting, the inference of the genetic parameters is fully Bayesian with Markov chain Monte Carlo (MCMC) sampling. We demonstrate the validity and the utility of the proposed approach with two real datasets and simulation studies. The analyses show that the proposed model with either one of two spatial correlations performs better as compared with the single locus analysis. In addition, under the RDF model, a more precise estimate for the disease locus can be obtained even when the candidate markers are fairly dense. In all simulations, the inference under the true model provides unbiased estimates of the genetic parameters, and the model with the spatial correlation structure does lead to greater confidence interval coverage probabilities.
[source]

Dynamic Conditionally Linear Mixed Models for Longitudinal Data
BIOMETRICS, Issue 1 2002
M. Pourahmadi
Summary: We develop a new class of models, dynamic conditionally linear mixed models, for longitudinal data by decomposing the within-subject covariance matrix using a special Cholesky decomposition. Here 'dynamic' means using past responses as covariates and 'conditional linearity' means that parameters entering the model linearly may be random, but nonlinear parameters are nonrandom. This setup offers several advantages and is surprisingly similar to models obtained from the first-order linearization method applied to nonlinear mixed models. First, it allows for flexible and computationally tractable models that include a wide array of covariance structures; these structures may depend on covariates and hence may differ across subjects. This class of models includes, e.g., all standard linear mixed models, antedependence models, and Vonesh-Carter models.
Second, it guarantees the fitted marginal covariance matrix of the data is positive definite. We develop methods for Bayesian inference and motivate the usefulness of these models using a series of longitudinal depression studies for which the features of these new models are well suited.
[source]
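The positive-definiteness guarantee mentioned in the last abstract can be illustrated with a short sketch. It assumes that the "special Cholesky decomposition" is the modified form T Σ T' = D, where T is unit lower triangular with entries -phi[t][j] below the diagonal and D is diagonal with positive innovation variances. The abstract does not spell this out, so treat it as an assumption; the function names and numbers are hypothetical.

```python
import math


def covariance_from_factors(phi, innovation_var):
    """Build Sigma = L D L^T, where L = T^{-1} and T has -phi[t][j] below a unit diagonal.

    Equivalent to y_t = sum_{j<t} phi[t][j] * y_j + e_t with Var(e_t) = innovation_var[t].
    The phi values are unconstrained; only the innovation variances must be positive.
    """
    n = len(innovation_var)
    if any(d <= 0 for d in innovation_var):
        raise ValueError("innovation variances must be positive")
    low = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for t in range(n):
        for j in range(t):
            # Forward substitution for the inverse of the unit lower-triangular T.
            low[t][j] = sum(phi[t][k] * low[k][j] for k in range(j, t))
    return [[sum(low[i][k] * innovation_var[k] * low[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


def is_positive_definite(a):
    """Check positive definiteness by attempting a standard Cholesky factorization."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - s
                if pivot <= 0:
                    return False
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return True


# Arbitrary (hypothetical) autoregressive coefficients and innovation variances.
phi = [[0.0, 0.0, 0.0],
       [0.9, 0.0, 0.0],
       [-2.0, 3.5, 0.0]]
innovation_var = [1.0, 0.5, 2.0]
sigma = covariance_from_factors(phi, innovation_var)
print(is_positive_definite(sigma))
```

Under this parameterization the coefficients in `phi` can be modelled freely, for example as functions of covariates, because any real values combined with positive innovation variances give a valid covariance matrix.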