Statistical Society


Kinds of Statistical Society

  • Royal Statistical Society

  • Selected Abstracts

    Statistical issues in first-in-man studies

    Professor Stephen Senn
    Preface. In March 2006 a first-in-man trial took place using healthy volunteers involving the use of monoclonal antibodies. Within hours the subjects had suffered such adverse effects that they were admitted to intensive care at Northwick Park Hospital. In April 2006 the Secretary of State for Health announced the appointment of Professor (now Sir) Gordon Duff, who chairs the UK's Commission on Human Medicines, to chair a scientific expert group on phase 1 clinical trials. The group reported on December 7th, 2006 (Expert Scientific Group on Clinical Trials, 2006a). Clinical trials have a well-established regulatory basis both in the UK and worldwide. Trials have to be approved by the regulatory authority and are subject to a detailed protocol concerning, among other things, the study design and statistical analyses that will form the basis of the evaluation. In fact, a cornerstone of the regulatory framework is the statistical theory and methods that underpin clinical trials. As a result, the Royal Statistical Society established an expert group of its own to look in detail at the statistical issues that might be relevant to first-in-man studies. The group mainly comprised senior Fellows of the Society who had expert knowledge of the theory and application of statistics in clinical trials. However, the group also included an expert immunologist and clinicians to ensure that the interface between statistics and clinical disciplines was not overlooked. In addition, expert representation was sought from Statisticians in the Pharmaceutical Industry (PSI), an organization with which the Royal Statistical Society has very close links. The output from the Society's expert group is contained in this report. It makes a number of recommendations directed towards the statistical aspects of clinical trials. As such it complements the report by Professor Duff's group and will, I trust, contribute to a safer framework for first-in-man trials in the future.
Tim Holt (President, Royal Statistical Society)

    Transactions of the Statistical Society of London (1837)

    Sidney Rosenbaum
    Summary. The Transactions of the Statistical Society of London (1837) appeared before the journal of the Royal Statistical Society began publication and represents the substantial statistical work that had been undertaken in the early years of the existence of the Society. The contents of this publication are summarized here against the historical background of the time.

    Statistical review by research ethics committees

    P. Williamson
    This paper discusses some of the issues surrounding statistical review by research ethics committees (RECs). A survey of local RECs in 1997 revealed that only 27/184 (15%) included a statistician member at that time, although 70/175 (40%) recognized the need for such. The role of the statistician member is considered and the paper includes a summary of a meeting of the Royal Statistical Society to discuss statistical issues that frequently arise in the review of REC applications. A list of minimum qualifications which RECs should expect from anyone claiming to be a statistician would be useful, together with a list of statisticians who are well qualified and willing to serve on RECs, and a list of training courses for REC members covering the appropriate statistical issues.

    Discovering the false discovery rate

    Yoav Benjamini
    Summary. I describe the background for the paper 'Controlling the false discovery rate: a practical and powerful approach to multiple testing' by Benjamini and Hochberg that was published in the Journal of the Royal Statistical Society, Series B, in 1995. I review the progress since made on the false discovery rate, as well as the major conceptual developments that followed.
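The step-up procedure introduced in that 1995 paper is short enough to state in code. The sketch below is an illustrative implementation; the function name and signature are mine, not the paper's:

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure: return the indices of the
    hypotheses rejected at false discovery rate level q."""
    m = len(pvalues)
    # Sort p-values while remembering their original positions.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k_max = rank
    # Reject the hypotheses with the k_max smallest p-values.
    return sorted(order[:k_max])
```

Note the step-up character: a p-value above its own threshold can still be rejected if a larger p-value passes.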

    Variability explained by covariates in linear mixed-effect models for longitudinal data

    Bo Hu
    Abstract Variability explained by covariates or explained variance is a well-known concept in assessing the importance of covariates for dependent outcomes. In this paper we study R2 statistics of explained variance pertinent to longitudinal data under linear mixed-effect models, where the R2 statistics are computed at two different levels to measure, respectively, within- and between-subject variabilities explained by the covariates. By deriving the limits of R2 statistics, we find that the interpretation of explained variance for the existing R2 statistics is clear only in the case where the covariance matrix of the outcome vector is compound symmetric. Two new R2 statistics are proposed to address the effect of time-dependent covariate means. In the general case where the outcome covariance matrix is not compound symmetric, we introduce the concept of compound symmetry projection and use it to define level-one and level-two R2 statistics. Numerical results are provided to support the theoretical findings and demonstrate the performance of the R2 statistics. The Canadian Journal of Statistics 38: 352–368; 2010 © 2010 Statistical Society of Canada
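As a toy illustration of a level-one (within-subject) R2, one can centre the outcome and covariate within each subject and measure the proportion of within-subject variation explained by a common slope. The sketch below assumes a single covariate and an ordinary least squares fit; it is a deliberate simplification of the statistics studied in the paper, with illustrative names:

```python
def level1_r2(groups):
    """Toy within-subject (level-one) R2: centre y and x within each
    subject, fit a common slope by least squares, and report the
    proportion of within-subject variation explained."""
    xc, yc = [], []
    for xs, ys in groups:  # one (x-series, y-series) pair per subject
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        xc += [x - mx for x in xs]
        yc += [y - my for y in ys]
    beta = sum(x * y for x, y in zip(xc, yc)) / sum(x * x for x in xc)
    sse = sum((y - beta * x) ** 2 for x, y in zip(xc, yc))
    sst = sum(y ** 2 for y in yc)
    return 1.0 - sse / sst
```

Subject-level intercepts drop out under the within-subject centring, which is what makes this a level-one (rather than level-two) quantity.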

    Small area estimation of poverty indicators

    Isabel Molina
    Abstract The authors propose to estimate nonlinear small area population parameters by using the empirical Bayes (best) method, based on a nested error model. They focus on poverty indicators as particular nonlinear parameters of interest, but the proposed methodology is applicable to general nonlinear parameters. They use a parametric bootstrap method to estimate the mean squared error of the empirical best estimators. They also study small sample properties of these estimators by model-based and design-based simulation studies. Results show large reductions in mean squared error relative to direct area-specific estimators and other estimators obtained by "simulated" censuses. The authors also apply the proposed method to estimate poverty incidences and poverty gaps in Spanish provinces by gender with mean squared errors estimated by the mentioned parametric bootstrap method. For the Spanish data, results show a significant reduction in coefficient of variation of the proposed empirical best estimators over direct estimators for practically all domains. The Canadian Journal of Statistics 38: 369–385; 2010 © 2010 Statistical Society of Canada
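The poverty incidence and poverty gap estimated here are the α = 0 and α = 1 members of the Foster–Greer–Thorbecke (FGT) family. A direct, design-unweighted sketch of the indicators themselves (the paper's contribution is replacing such direct estimates with empirical best predictions for small areas):

```python
def fgt(incomes, z, alpha):
    """Foster-Greer-Thorbecke poverty measure with poverty line z:
    alpha = 0 gives the poverty incidence (head-count ratio),
    alpha = 1 the poverty gap."""
    n = len(incomes)
    return sum(((z - y) / z) ** alpha for y in incomes if y < z) / n
```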

    Two-part regression models for longitudinal zero-inflated count data

    Marco Alfò
    Abstract Two-part models are well established in the economic literature, since they closely resemble a principal-agent model in which homogeneous, observable, counted outcomes are subject to a (prior, exogenous) selection choice. The first decision can be represented by a binary choice model, using a probit or a logit link; the second can be analyzed through a truncated discrete distribution such as a truncated Poisson or negative binomial. Only recently has attention been devoted to extending two-part models to handle longitudinal data. The authors discuss a semi-parametric estimation method for dynamic two-part models and propose a comparison with other, well-established alternatives. Heterogeneity sources that influence the first-level decision process, that is, the decision to use a certain service, are assumed to also influence the (truncated) distribution of the positive outcomes. Estimation is carried out through an EM algorithm without parametric assumptions on the random-effects distribution. Furthermore, the authors investigate the extension of the finite mixture representation to allow for unobservable transitions between components in each of these parts. The proposed models are discussed using empirical as well as simulated data. The Canadian Journal of Statistics 38: 197–216; 2010 © 2010 Statistical Society of Canada
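A minimal two-part (hurdle) data-generating process pairs a zero/non-zero decision with a zero-truncated count for the positives. The sketch below uses a constant zero probability (no covariates) and simple moment-based fitting rather than the paper's EM algorithm with nonparametric random effects; all names are illustrative:

```python
import math
import random

def zt_poisson(lam, rng):
    """Zero-truncated Poisson draw by rejection, using Knuth's sampler."""
    while True:
        L, k, p = math.exp(-lam), 0, 1.0
        while p > L:
            k += 1
            p *= rng.random()
        if k - 1 > 0:         # reject zeros
            return k - 1

def simulate_hurdle(n, p_zero, lam, rng):
    """Zero with probability p_zero, else a zero-truncated Poisson count."""
    return [0 if rng.random() < p_zero else zt_poisson(lam, rng)
            for _ in range(n)]

def fit_hurdle(data):
    """Moment fit: p_zero from the zero fraction; lam solves
    lam / (1 - exp(-lam)) = mean of the positive counts, by bisection."""
    pos = [y for y in data if y > 0]
    m = sum(pos) / len(pos)
    lo, hi = 1e-9, 1e3
    for _ in range(200):
        mid = (lo + hi) / 2
        (lo, hi) = (mid, hi) if mid / (1 - math.exp(-mid)) < m else (lo, mid)
    return data.count(0) / len(data), (lo + hi) / 2
```

The factorization into a binary part and a truncated count part is what lets the two parts be modelled, and here fitted, separately.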

    Modified weights based generalized quasilikelihood inferences in incomplete longitudinal binary models

    Brajendra C. Sutradhar
    Abstract In an incomplete longitudinal setup, a small number of repeated responses, subject to an appropriate missingness mechanism, along with a set of covariates are collected from a large number of independent individuals over a short period of time. In this setup, the regression effects of the covariates are routinely estimated by solving certain inverse-weights-based generalized estimating equations. These inverse weights are introduced to make the estimating equation unbiased so that a consistent estimate of the regression parameter vector may be obtained. In existing studies, these weights are in general formulated conditional on the past responses. Since the past responses follow a correlation structure, the present study reveals that if the longitudinal data subject to the missingness mechanism are generated by accommodating the longitudinal correlation structure, the conditional weights based on past correlated responses may yield biased, and hence inconsistent, regression estimates. The bias appears to grow as the correlation increases. As a remedy, the authors propose a modification to the formulation of the existing weights so that the weights are not affected, directly or indirectly, by the correlations. They then exploit these modified weights to form a weighted generalized quasi-likelihood estimating equation that yields unbiased, and hence consistent, estimates of the regression effects irrespective of the magnitude of the correlation. The efficiency of the regression estimates follows from the use of the true correlation structure as a separate longitudinal weights matrix in the estimating equation.
    The Canadian Journal of Statistics © 2010 Statistical Society of Canada

    Estimation methods for time-dependent AUC models with survival data

    Hung Hung
    Abstract The performance of clinical tests for disease screening is often evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Recent developments have extended the traditional setting to the AUC with binary time-varying failure status. Without considering covariates, our first theme is to propose a simple and easily computed nonparametric estimator for the time-dependent AUC. Moreover, we use generalized linear models with time-varying coefficients to characterize the time-dependent AUC as a function of covariate values. The corresponding estimation procedures are proposed to estimate the parameter functions of interest. The derived limiting Gaussian processes and the estimated asymptotic variances enable us to construct the approximated confidence regions for the AUCs. The finite sample properties of our proposed estimators and inference procedures are examined through extensive simulations. An analysis of the AIDS Clinical Trials Group (ACTG) 175 data is further presented to show the applicability of the proposed methods. The Canadian Journal of Statistics 38: 8–26; 2010 © 2009 Statistical Society of Canada
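For a fixed case/control split, the standard nonparametric AUC estimator is the Mann–Whitney statistic; the time-dependent version studied in the paper replaces the fixed split with the failure status at each time point. A static sketch, with illustrative names:

```python
def auc(cases, controls):
    """Nonparametric AUC: probability that a randomly chosen case
    scores higher than a randomly chosen control (ties count 1/2)."""
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in cases for y in controls)
    return wins / (len(cases) * len(controls))
```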

    Nonparametric covariate adjustment for receiver operating characteristic curves

    Fang Yao
    Abstract The accuracy of a diagnostic test is typically characterized using the receiver operating characteristic (ROC) curve. Summarizing indexes such as the area under the ROC curve (AUC) are used to compare different tests as well as to measure the difference between two populations. Often additional information is available on some of the covariates which are known to influence the accuracy of such measures. The authors propose nonparametric methods for covariate adjustment of the AUC. Models with normal errors and possibly non-normal errors are discussed and analyzed separately. Nonparametric regression is used for estimating mean and variance functions in both scenarios. In the model that relaxes the assumption of normality, the authors propose a covariate-adjusted Mann–Whitney estimator for AUC estimation which effectively uses available data to construct working samples at any covariate value of interest and is computationally efficient for implementation. This provides a generalization of the Mann–Whitney approach for comparing two populations by taking covariate effects into account. The authors derive asymptotic properties for the AUC estimators in both settings, including asymptotic normality, optimal strong uniform convergence rates and mean squared error (MSE) consistency. The MSE of the AUC estimators was also assessed in smaller samples by simulation. Data from an agricultural study were used to illustrate the methods of analysis. The Canadian Journal of Statistics 38: 27–46; 2010 © 2009 Statistical Society of Canada

    An efficient computational approach for prior sensitivity analysis and cross-validation

    Luke Bornn
    Abstract Prior sensitivity analysis and cross-validation are important tools in Bayesian statistics. However, due to the computational expense of implementing existing methods, these techniques are rarely used. In this paper, the authors show how it is possible to use sequential Monte Carlo methods to create an efficient and automated algorithm to perform these tasks. They apply the algorithm to the computation of regularization path plots and to assess the sensitivity of the tuning parameter in g-prior model selection. They then demonstrate the algorithm in a cross-validation context and use it to select the shrinkage parameter in Bayesian regression. The Canadian Journal of Statistics 38: 47–64; 2010 © 2010 Statistical Society of Canada

    On the Ghoudi, Khoudraji, and Rivest test for extreme-value dependence

    Noomen Ben Ghorbal
    Abstract Ghoudi, Khoudraji & Rivest [The Canadian Journal of Statistics 1998; 26: 187–197] showed how to test whether the dependence structure of a pair of continuous random variables is characterized by an extreme-value copula. The test is based on a U-statistic whose finite- and large-sample variances are determined by the present authors. They propose estimates of this variance which they compare to the jackknife estimate of Ghoudi, Khoudraji & Rivest (1998) through simulations. They study the finite-sample and asymptotic power of the test under various alternatives. They illustrate their approach using financial and geological data. The Canadian Journal of Statistics © 2009 Statistical Society of Canada
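An extreme-value copula C is characterized by max-stability, C(u^t, v^t) = C(u, v)^t for all t > 0, which is the property such tests target. A numerical illustration contrasting the Gumbel family (extreme-value) with the Clayton family (not extreme-value); this is a background check, not the U-statistic test of the paper:

```python
import math

def gumbel(u, v, theta):
    """Gumbel copula, theta >= 1 (theta = 1 is independence)."""
    return math.exp(-(((-math.log(u)) ** theta
                       + (-math.log(v)) ** theta) ** (1.0 / theta)))

def clayton(u, v, theta):
    """Clayton copula, theta > 0 (not an extreme-value copula)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def max_stability_gap(copula, u, v, t):
    """Zero for every extreme-value copula: C(u^t, v^t) - C(u, v)^t."""
    return copula(u ** t, v ** t) - copula(u, v) ** t
```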

    Modeling multiple-response categorical data from complex surveys

    Christopher R. Bilder
    Abstract Although "choose all that apply" questions are common in modern surveys, methods for analyzing associations among responses to such questions have only recently been developed. These methods are generally valid only for simple random sampling, but these types of questions often appear in surveys conducted under more complex sampling plans. The purpose of this article is to provide statistical analysis methods that can be applied to "choose all that apply" questions in complex survey sampling situations. Loglinear models are developed to incorporate the multiple responses inherent in these types of questions. Statistics to compare models and to measure association are proposed and their asymptotic distributions are derived. Monte Carlo simulations show that tests based on adjusted Pearson statistics generally hold their correct size when comparing models. These simulations also show that confidence intervals for odds ratios estimated from loglinear models have good coverage properties, while being shorter than those constructed using empirical estimates. Furthermore, the methods are shown to be applicable to more general problems of modeling associations between elements of two or more binary vectors. The proposed analysis methods are applied to data from the National Health and Nutrition Examination Survey. The Canadian Journal of Statistics © 2009 Statistical Society of Canada
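For background, under simple random sampling the empirical odds ratio for a 2×2 table carries the familiar Woolf log-scale interval; the article's contribution is model-based intervals that remain valid under complex sampling and multiple responses. A sketch of the simple empirical case, with illustrative names:

```python
import math

def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Sample odds ratio for a 2x2 table of counts, with the
    large-sample (Woolf) confidence interval on the log scale."""
    or_hat = (n11 * n22) / (n12 * n21)
    se = math.sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)
    lo = math.exp(math.log(or_hat) - z * se)
    hi = math.exp(math.log(or_hat) + z * se)
    return or_hat, lo, hi
```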

    André Dabrowski's work on limit theorems and weak dependence

    Herold Dehling
    Abstract André Robert Dabrowski, Professor of Mathematics and Dean of the Faculty of Sciences at the University of Ottawa, died October 7, 2006, after a short battle with cancer. The author of the present paper, a long-term friend and collaborator of André Dabrowski, gives a survey of André's work on weak dependence and limit theorems in probability theory. The Canadian Journal of Statistics 37: 307–326; 2009 © 2009 Statistical Society of Canada

    Large deviations of multiclass M/G/1 queues

    André Dabrowski
    Abstract Consider a multiclass M/G/1 queue where queued customers are served in their order of arrival at a rate which depends on the customer class. We model this system using a chain with states represented by a tree. Since the service time distribution depends on the customer class, the stationary distribution is not of product form, so there is no simple expression for the stationary distribution. Nevertheless, we can find a harmonic function on this chain which provides information about the asymptotics of this stationary distribution. The associated h-transformation produces a change of measure that increases the arrival rate of customers and decreases the departure rate, thus making large deviations common. The Canadian Journal of Statistics 37: 327–346; 2009 © 2009 Statistical Society of Canada
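For contrast with the multiclass setting, the single-class M/G/1 queue does admit a closed-form stationary mean via the Pollaczek–Khinchine formula, which is part of why the lack of product form in the multiclass case forces the harmonic-function approach. A sketch of the single-class formula:

```python
def pk_mean_queue(lam, es, es2):
    """Pollaczek-Khinchine formula: mean number of customers waiting
    in queue for an M/G/1 queue with arrival rate lam, mean service
    time es and second moment of service time es2 (needs rho < 1)."""
    rho = lam * es
    assert rho < 1, "queue is unstable"
    return lam * lam * es2 / (2.0 * (1.0 - rho))
```

With exponential service (es2 = 2 * es**2) this reduces to the familiar M/M/1 value rho**2 / (1 - rho).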

    Some notes on Poisson limits for empirical point processes

    André Dabrowski
    Abstract The authors define the scaled empirical point process. They obtain the weak limit of these point processes through a novel use of a dimension-free method based on the convergence of compensators of multiparameter martingales. The method extends previous results in several directions. They obtain limits at points where the density may be zero but has regular variation. The joint limit of the empirical process evaluated at distinct points is given by independent Poisson processes. They provide applications both to nearest-neighbour density estimation in high dimensions, and to the asymptotic behaviour of multivariate extremes such as those arising from bivariate normal copulas. The Canadian Journal of Statistics 37: 347–360; 2009 © 2009 Statistical Society of Canada [source]

    Likelihood analysis of joint marginal and conditional models for longitudinal categorical data

    Baojiang Chen
    MSC 2000: Primary 62H12; secondary 62F10 Abstract The authors develop a Markov model for the analysis of longitudinal categorical data which facilitates modelling both marginal and conditional structures. A likelihood formulation is employed for inference, so the resulting estimators enjoy optimal properties such as efficiency and consistency, and remain consistent when data are missing at random. Simulation studies demonstrate that the proposed method performs well under a variety of situations. Application to data from a smoking prevention study illustrates the utility of the model and interpretation of covariate effects. The Canadian Journal of Statistics © 2009 Statistical Society of Canada [source]

    On the incidence–prevalence relation and length-biased sampling

    Vittorio Addona
    MSC 2000: Primary 62N99; secondary 62G99 Abstract For many diseases, logistic constraints render large incidence studies difficult to carry out. This becomes a drawback, particularly when a new study is needed each time the incidence rate is investigated in a new population. By carrying out a prevalent cohort study with follow-up it is possible to estimate the incidence rate if it is constant. The authors derive the maximum likelihood estimator (MLE) of the overall incidence rate, λ, as well as age-specific incidence rates, by exploiting the epidemiologic relationship (prevalence odds) = (incidence rate) × (mean duration), that is, P/(1 − P) = λ × µ. The authors establish the asymptotic distributions of the MLEs and provide approximate confidence intervals for the parameters. Moreover, the MLE of λ is asymptotically most efficient and is the natural estimator obtained by substituting the marginal maximum likelihood estimators for P and µ into P/(1 − P) = λ × µ. Following up the subjects allows the authors to develop these widely applicable procedures. The authors apply their methods to data collected as part of the Canadian Study of Health and Ageing to estimate the incidence rate of dementia amongst elderly Canadians. The Canadian Journal of Statistics © 2009 Statistical Society of Canada [source]
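The relation above gives the incidence rate by simple rearrangement: λ = (P/(1 − P))/µ. A minimal sketch in Python, with hypothetical numbers (not taken from the Canadian Study of Health and Ageing):

```python
def incidence_rate(prevalence, mean_duration):
    """Solve P/(1 - P) = lambda * mu for lambda (constant incidence rate)."""
    odds = prevalence / (1.0 - prevalence)  # prevalence odds P/(1 - P)
    return odds / mean_duration

# Hypothetical example: 8% prevalence, mean disease duration of 4 years.
lam = incidence_rate(0.08, 4.0)
print(round(lam, 4))  # incidence rate per person-year -> 0.0217
```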

    Log-rank permutation tests for trend: saddlepoint p-values and survival rate confidence intervals

    Ehab F. Abd-Elfattah
    MSC 2000: Primary 62N03; secondary 62N02 Abstract Suppose p + 1 experimental groups correspond to increasing dose levels of a treatment and all groups are subject to right censoring. In such instances, permutation tests for trend can be performed based on statistics derived from the weighted log-rank class. This article uses saddlepoint methods to determine the mid-p-values for such permutation tests for any test statistic in the weighted log-rank class. Permutation simulations are replaced by analytical saddlepoint computations which provide extremely accurate mid-p-values that are exact for most practical purposes and almost always more accurate than normal approximations. The speed of mid-p-value computation allows for the inversion of such tests to determine confidence intervals for the percentage increase in mean (or median) survival time per unit increase in dosage. The Canadian Journal of Statistics 37: 5–16; 2009 © 2009 Statistical Society of Canada [source]
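The mid-p-value targeted by the saddlepoint approximation is defined under the permutation distribution of a trend statistic. A minimal Monte Carlo sketch of that quantity (the very computation the article replaces with analytic saddlepoint evaluation); treating the per-subject `scores` as generic stand-ins for weighted log-rank scores is an assumption of this illustration:

```python
import numpy as np

def permutation_mid_p(scores, doses, n_perm=10000, seed=0):
    """Monte Carlo mid-p-value for the trend statistic T = sum(dose * score).

    mid-p = P(T* > T_obs) + 0.5 * P(T* = T_obs), where T* is computed
    after randomly permuting the dose labels across subjects.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    doses = np.asarray(doses, dtype=float)
    t_obs = float(doses @ scores)
    greater = equal = 0
    for _ in range(n_perm):
        t = float(rng.permutation(doses) @ scores)
        if t > t_obs:
            greater += 1
        elif np.isclose(t, t_obs):
            equal += 1
    return (greater + 0.5 * equal) / n_perm
```

With scores perfectly aligned to dose, the observed statistic is at the top of the permutation distribution and the mid-p-value is tiny.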

    Discrete-time survival trees

    Imad Bou-hamad
    MSC 2000: Primary 62N99; secondary 62G08 Abstract Tree-based methods are frequently used in studies with censored survival time. Their structure and ease of interpretability make them useful to identify prognostic factors and to predict conditional survival probabilities given an individual's covariates. The existing methods are tailor-made to deal with a survival time variable that is measured continuously. However, survival variables measured on a discrete scale are often encountered in practice. The authors propose a new tree construction method specifically adapted to such discrete-time survival variables. The splitting procedure can be seen as an extension, to the case of right-censored data, of the entropy criterion for a categorical outcome. The selection of the final tree is made through a pruning algorithm combined with a bootstrap correction. The authors also present a simple way of potentially improving the predictive performance of a single tree through bagging. A simulation study shows that single trees and bagged trees perform well compared to a parametric model. A real data example investigating the usefulness of personality dimensions in predicting early onset of cigarette smoking is presented. The Canadian Journal of Statistics 37: 17–32; 2009 © 2009 Statistical Society of Canada [source]
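The splitting procedure extends the entropy criterion for a categorical outcome to right-censored data. For reference, a minimal sketch of the uncensored criterion being extended (the censored version in the article is more involved):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of categorical outcomes."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, split_mask):
    """Entropy reduction from splitting `labels` by a boolean mask.

    Gain = H(parent) - weighted average of child entropies; a tree
    grower picks the candidate split with the largest gain.
    """
    left = [y for y, m in zip(labels, split_mask) if m]
    right = [y for y, m in zip(labels, split_mask) if not m]
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))
```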

    Inflation of Type I error rate in multiple regression when independent variables are measured with error

    Jerry Brunner
    MSC 2000: Primary 62J99; secondary 62H15 Abstract When independent variables are measured with error, ordinary least squares regression can yield parameter estimates that are biased and inconsistent. This article documents an inflation of Type I error rate that can also occur. In addition to analytic results, a large-scale Monte Carlo study shows unacceptably high Type I error rates under circumstances that could easily be encountered in practice. A set of smaller-scale simulations indicates that the problem applies to various types of regression and various types of measurement error. The Canadian Journal of Statistics 37: 33–46; 2009 © 2009 Statistical Society of Canada [source]
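A minimal Monte Carlo sketch of the phenomenon, with illustrative parameters of my own choosing (not the article's design): the response truly depends only on x1, x2 is correlated with x1 but has no effect, and x1 is observed with error, so the nominal 5% test of x2's coefficient rejects far too often.

```python
import numpy as np
from scipy import stats

def type1_rate(n=200, reps=400, rho=0.8, err_sd=1.0, seed=0):
    """Monte Carlo Type I error rate for H0: beta2 = 0 when x1 is noisy.

    Because w1 = x1 + noise carries only part of x1's signal, the
    correlated-but-inactive x2 absorbs the rest and looks significant.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x1 = rng.standard_normal(n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        y = x1 + rng.standard_normal(n)            # beta2 = 0 in truth
        w1 = x1 + err_sd * rng.standard_normal(n)  # x1 measured with error
        X = np.column_stack([np.ones(n), w1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        s2 = resid @ resid / (n - 3)
        cov = s2 * np.linalg.inv(X.T @ X)
        t = beta[2] / np.sqrt(cov[2, 2])
        p = 2 * stats.t.sf(abs(t), df=n - 3)
        rejections += p < 0.05
    return rejections / reps
```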

    Screening for Partial Conjunction Hypotheses

    BIOMETRICS, Issue 4 2008
    Yoav Benjamini
    Summary We consider the problem of testing for partial conjunction of hypotheses, which argues that at least u out of n tested hypotheses are false. It offers an in-between approach to the testing of the conjunction of null hypotheses against the alternative that at least one is not, and the testing of the disjunction of null hypotheses against the alternative that all hypotheses are not null. We suggest powerful test statistics for testing such a partial conjunction hypothesis that are valid under dependence between the test statistics as well as under independence. We then address the problem of testing many partial conjunction hypotheses simultaneously using the false discovery rate (FDR) approach. We prove that if the FDR controlling procedure in Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289–300) is used for this purpose the FDR is controlled under various dependency structures. Moreover, we can screen at all levels simultaneously in order to display the findings on a superimposed map and still control an appropriate FDR measure. We apply the method to examples from microarray analysis and functional magnetic resonance imaging (fMRI), two application areas where the need for partial conjunction analysis has been identified. [source]
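The Benjamini and Hochberg (1995) step-up procedure referred to above is simple to state: reject the k smallest p-values, where k is the largest i with p_(i) ≤ iq/n. A minimal sketch:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up FDR procedure.

    Returns a boolean rejection mask in the original order of `pvals`:
    reject the k smallest p-values, k = largest i with p_(i) <= i*q/n.
    """
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    thresh = q * np.arange(1, n + 1) / n
    below = p[order] <= thresh
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index meeting the bound
        reject[order[: k + 1]] = True
    return reject
```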

    Stepwise Confidence Intervals for Monotone Dose–Response Studies

    BIOMETRICS, Issue 3 2008
    Jianan Peng
    Summary In dose–response studies, one of the most important issues is the identification of the minimum effective dose (MED), where the MED is defined as the lowest dose such that the mean response is better than the mean response of a zero-dose control by a clinically significant difference. Dose–response curves are sometimes monotonic in nature. To find the MED, various authors have proposed step-down test procedures based on contrasts among the sample means. In this article, we improve upon the method of Marcus and Peritz (1976, Journal of the Royal Statistical Society, Series B 38, 157–165) and implement the dose–response method of Hsu and Berger (1999, Journal of the American Statistical Association 94, 468–482) to construct the lower confidence bound for the difference between the mean response of any nonzero-dose level and that of the control under the monotonicity assumption to identify the MED. The proposed method is illustrated by numerical examples, and simulation studies on power comparisons are presented. [source]
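A simplified two-sample sketch of the step-down idea (one-sided lower confidence bounds computed from the highest dose downward, stopping at the first failure); this illustrates the logic only and is not the exact procedure of Marcus and Peritz or Hsu and Berger:

```python
import numpy as np
from scipy import stats

def stepwise_med(control, dose_groups, delta=0.0, alpha=0.05):
    """Step-down search for the minimum effective dose (MED).

    From the highest dose down, compute a one-sided 100(1-alpha)% lower
    confidence bound for mean(dose) - mean(control); under monotonicity,
    stop at the first dose whose bound does not exceed `delta`.
    Returns the index (into dose_groups) of the lowest dose declared
    effective, or None if no dose qualifies.
    """
    med = None
    for i in range(len(dose_groups) - 1, -1, -1):
        d = np.asarray(dose_groups[i], dtype=float)
        c = np.asarray(control, dtype=float)
        diff = d.mean() - c.mean()
        se = np.sqrt(d.var(ddof=1) / d.size + c.var(ddof=1) / c.size)
        df = d.size + c.size - 2
        lower = diff - stats.t.ppf(1 - alpha, df) * se
        if lower > delta:
            med = i      # effective so far; try the next lower dose
        else:
            break        # monotonicity: stop at the first failure
    return med
```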

    A Class of Multiplicity Adjusted Tests for Spatial Clustering Based on Case–Control Point Data

    BIOMETRICS, Issue 1 2007
    Toshiro Tango
    Summary A class of tests with quadratic forms for detecting spatial clustering of health events based on case–control point data is proposed. It includes Cuzick and Edwards's test statistic (1990, Journal of the Royal Statistical Society, Series B 52, 73–104). Although they used the property of asymptotic normality of the test statistic, we show that such an approximation is generally poor for moderately large sample sizes. Instead, we suggest a central chi-square distribution as a better approximation to the asymptotic distribution of the test statistic. Furthermore, not only to estimate the optimal value of the unknown parameter on the scale of cluster but also to adjust for multiple testing due to repeating the procedure by changing the parameter value, we propose the minimum of the profile p-value of the test statistic for the parameter as an integrated test statistic. We also provide a statistic to estimate the areas or cases which make large contributions to significant clustering. The proposed methods are illustrated with a data set concerning the locations of cases of childhood leukemia and lymphoma and another on early medieval grave site locations consisting of affected and nonaffected grave sites. [source]
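Cuzick and Edwards's statistic T_k counts, over all cases, how many of each case's k nearest neighbours are themselves cases; its null distribution is typically assessed by permuting the case/control labels over the fixed point locations. A minimal sketch of the count itself:

```python
import numpy as np

def cuzick_edwards_tk(points, is_case, k=1):
    """Cuzick-Edwards T_k: for every case, count how many of its k
    nearest neighbours (Euclidean distance) are also cases.

    Large values indicate spatial clustering of cases among controls.
    """
    pts = np.asarray(points, dtype=float)
    case = np.asarray(is_case, dtype=bool)
    t_k = 0
    for i in np.nonzero(case)[0]:
        d = np.linalg.norm(pts - pts[i], axis=1)
        d[i] = np.inf                 # a point is not its own neighbour
        nn = np.argsort(d)[:k]        # indices of the k nearest neighbours
        t_k += int(case[nn].sum())
    return t_k
```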

    Generalized Hierarchical Multivariate CAR Models for Areal Data

    BIOMETRICS, Issue 4 2005
    Xiaoping Jin
    Summary In the fields of medicine and public health, a common application of areal data models is the study of geographical patterns of disease. When we have several measurements recorded at each spatial location (for example, information on p ≥ 2 diseases from the same population groups or regions), we need to consider multivariate areal data models in order to handle the dependence among the multivariate components as well as the spatial dependence between sites. In this article, we propose a flexible new class of generalized multivariate conditionally autoregressive (GMCAR) models for areal data, and show how it enriches the MCAR class. Our approach differs from earlier ones in that it directly specifies the joint distribution for a multivariate Markov random field (MRF) through the specification of simpler conditional and marginal models. This in turn leads to a significant reduction in the computational burden in hierarchical spatial random effect modeling, where posterior summaries are computed using Markov chain Monte Carlo (MCMC). We compare our approach with existing MCAR models in the literature via simulation, using average mean square error (AMSE) and a convenient hierarchical model selection criterion, the deviance information criterion (DIC; Spiegelhalter et al., 2002, Journal of the Royal Statistical Society, Series B 64, 583–639). Finally, we offer a real-data application of our proposed GMCAR approach that models lung and esophagus cancer death rates during 1991–1998 in Minnesota counties. [source]

    Models for Estimating Bayes Factors with Applications to Phylogeny and Tests of Monophyly

    BIOMETRICS, Issue 3 2005
    Marc A. Suchard
    Summary Bayes factors comparing two or more competing hypotheses are often estimated by constructing a Markov chain Monte Carlo (MCMC) sampler to explore the joint space of the hypotheses. To obtain efficient Bayes factor estimates, Carlin and Chib (1995, Journal of the Royal Statistical Society, Series B 57, 473–484) suggest adjusting the prior odds of the competing hypotheses so that the posterior odds are approximately one, then estimating the Bayes factor by simple division. A byproduct is that one often produces several independent MCMC chains, only one of which is actually used for estimation. We extend this approach to incorporate output from multiple chains by proposing three statistical models. The first assumes independent sampler draws and models the hypothesis indicator function using logistic regression for various choices of the prior odds. The two more complex models relax the independence assumption by allowing for higher-lag dependence within the MCMC output. These models allow us to estimate the uncertainty in our Bayes factor calculation and to fully use several different MCMC chains even when the prior odds of the hypotheses vary from chain to chain. We apply these methods to calculate Bayes factors for tests of monophyly in two phylogenetic examples. The first example explores the relationship of an unknown pathogen to a set of known pathogens. Identification of the unknown's monophyletic relationship may affect antibiotic choice in a clinical setting. The second example focuses on HIV recombination detection. For potential clinical application, these types of analyses must be completed as efficiently as possible. [source]

    Sensitivity Analysis for Nonrandom Dropout: A Local Influence Approach

    BIOMETRICS, Issue 1 2001
    Geert Verbeke
    Summary. Diggle and Kenward (1994, Applied Statistics 43, 49–93) proposed a selection model for continuous longitudinal data subject to nonrandom dropout. It has provoked a large debate about the role for such models. The original enthusiasm was followed by skepticism about the strong but untestable assumptions on which this type of model invariably rests. Since then, the view has emerged that these models should ideally be made part of a sensitivity analysis. This paper presents a formal and flexible approach to such a sensitivity assessment based on local influence (Cook, 1986, Journal of the Royal Statistical Society, Series B 48, 133–169). The influence of perturbing a missing-at-random dropout model in the direction of nonrandom dropout is explored. The method is applied to data from a randomized experiment on the inhibition of testosterone production in rats. [source]

    Survival Analysis in Clinical Trials: Past Developments and Future Directions

    BIOMETRICS, Issue 4 2000
    Thomas R. Fleming
    Summary. The field of survival analysis emerged in the 20th century and experienced tremendous growth during the latter half of the century. The developments in this field that have had the most profound impact on clinical trials are the Kaplan-Meier (1958, Journal of the American Statistical Association 53, 457–481) method for estimating the survival function, the log-rank statistic (Mantel, 1966, Cancer Chemotherapy Reports 50, 163–170) for comparing two survival distributions, and the Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187–220) proportional hazards model for quantifying the effects of covariates on the survival time. The counting-process martingale theory pioneered by Aalen (1975, Statistical inference for a family of counting processes, Ph.D. dissertation, University of California, Berkeley) provides a unified framework for studying the small- and large-sample properties of survival analysis statistics. Significant progress has been achieved and further developments are expected in many other areas, including the accelerated failure time model, multivariate failure time data, interval-censored data, dependent censoring, dynamic treatment regimes and causal inference, joint modeling of failure time and longitudinal data, and Bayesian methods. [source]
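For reference, the Kaplan-Meier estimate multiplies, across the distinct observed event times, the fraction of subjects at risk who survive each time: S(t) = ∏_{t_j ≤ t} (1 − d_j / n_j). A minimal sketch:

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times:  observed follow-up times
    events: 1 = event observed, 0 = right-censored
    Returns (distinct event times, S(t) just after each event time),
    with S(t) = prod over event times t_j <= t of (1 - d_j / n_j).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv, s = [], 1.0
    for t in event_times:
        n_at_risk = np.sum(times >= t)            # n_j: still under observation
        d = np.sum((times == t) & (events == 1))  # d_j: events at time t
        s *= 1.0 - d / n_at_risk
        surv.append(s)
    return event_times, np.array(surv)
```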

    Report of the Council for the session 2006–2007

    Council Report
    President's foreword. This year's annual report shows another very successful year for the Society. The range of the Society's new initiatives bears testament to our vigour and to the energy and enthusiasm of Fellows and staff. It is difficult to summarize all of these but I offer a brief overview of some of the highlights. This year we have awarded the first annual prize for 'Statistical excellence in journalism'. It is too easy to bemoan the general quality of coverage of statistical issues in the press and other media. But simply moaning does not improve the situation. As a positive step, at the instigation of Sheila Bird and Andrew Garratt, the Society decided to initiate an award for the best journalistic coverage of a statistical issue. This year the first prize was awarded to Ben Goldacre of The Guardian. I hope that these annual awards will offer a positive focus on good coverage and help us to promote best practice. This year, also, we have set up the Professional Development Centre to act as a focus for statistical training both for statisticians and for others who use statistical methods as part of their work. It thus reflects our support for continuing professional development for our Fellows and at the same time provides outreach to members of the statistical user community who want to improve their statistical skills. We welcome Nicola Bright as the Director of the Centre and wish her every success. I am pleased to say that it is not just the Society centrally that has undertaken new activities this year. The Manchester Local Group have initiated a prize for final year undergraduates from any higher education institute in the north-west. At a time when there are concerns about the number of well-qualified graduates coming into the statistics profession this seems an excellent way to attract the attention of final year undergraduates. I wish this initiative every success. 
Another development to which the Society has contributed is the Higher Education Funding Council for England project 'more maths grads', which is designed to promote participation in undergraduate degrees in the mathematical sciences. A good supply of mathematically trained graduates is essential to the UK economy in general and to the health of the statistics discipline in particular. It is good that the Society is involved in practical developments that are aimed at increasing participation. The final new initiative that I shall draw attention to is the 'first-in-man' report, which is concerned with the statistical design of drug trials aimed at testing novel treatment types. The working party was set up as a result of the adverse reactions suffered by healthy volunteers in a first-in-man trial of monoclonal antibodies, who were subsequently admitted to Northwick Park Hospital. The report makes a series of recommendations about the design of such trials and will, I hope, contribute to the safety of future trials. I would like to thank Stephen Senn and the members of the working party for their considerable efforts. As well as these new initiatives there were, of course, many other continuing activities that are noteworthy. The annual conference in Belfast was a great success with many lively sessions and a good number of participants. In particular it was good to see a high number of young statisticians participating in the conference, reflecting the continuing impact of the Young Statisticians Forum on which I commented in the previous annual report. Another continuing activity for the Society is the statistical legislation going through Parliament as I write. The Society has long campaigned for legislation for official statistics. 
The issue now is to try to get good legislation which will have the required effect and will help the Government Statistical Service and other statistical producers to produce high quality, authoritative statistics in an environment that commands public confidence. The Society was disappointed with the Bill as first published, but we have worked to build support for amendments that, in our view, are essential. Time alone will tell how effective the final legislation will be in meeting our aims. I would like to draw attention to the success of the Membership Services team. We, along with other statistical societies, have experienced a decline in membership in recent years but the team have turned this round. They are helping to recruit new Fellows and to retain the commitment of existing Fellows. This is a fine achievement and I would like to thank Nicola Emmerson, Ed Swires-Hennessy and the whole team. Finally we have, at last, reached a conclusion in our dealings with the Privy Council and will implement the second phase of constitutional changes. In future our business year, financial year and year for elected appointments will all coincide on a calendar year basis. There will be transitional arrangements but in due course all our administrative arrangements will coincide, which will improve efficiency and co-ordination. This has been a long journey, steered effectively by our Director General, Ivor Goddard, and I congratulate him on a successful outcome on your behalf. As you read this report, I hope that you will share my impression of a Society that is lively and spawning many new programmes. We have a dual commitment: to the well-being of statistics as a discipline and to the promotion of statistical understanding and practice to the benefit of society at large. In both respects I feel that the Society is in good health. 
This is due to the unstinting efforts of a large number of individual volunteers, including in particular our Honorary Officers and also, of course, the staff at Errol Street. On behalf of all Fellows, I wish to express my thanks to everyone involved. Tim Holt [source]