R Package (r + package)


Selected Abstracts


A General Algorithm for Univariate Stratification

INTERNATIONAL STATISTICAL REVIEW, Issue 3 2009
Sophie Baillargeon
Summary This paper presents a general algorithm for constructing strata in a population using X, a univariate stratification variable known for all the units in the population. Stratum h consists of all the units with an X value in the interval [b_{h-1}, b_h). The stratum boundaries {b_h} are obtained by minimizing the anticipated sample size for estimating the population total of a survey variable Y with a given level of precision. The stratification criterion allows the presence of a take-none and of a take-all stratum. The sample is allocated to the strata using a general rule that features proportional allocation, Neyman allocation, and power allocation as special cases. The optimization can take into account a stratum-specific anticipated non-response and a model for the relationship between the stratification variable X and the survey variable Y. A loglinear model with stratum-specific mortality for Y given X is presented in detail. Two numerical algorithms for determining the optimal stratum boundaries, attributable to Sethi and Kozak, are compared in a numerical study. Several examples illustrate the stratified designs that can be constructed with the proposed methodology. All the calculations presented in this paper were carried out with stratification, an R package that will be available on CRAN (Comprehensive R Archive Network).
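The link between stratum boundaries, the allocation rule, and the anticipated sample size can be illustrated with a small sketch. It is not the stratification package's code. It assumes that X itself stands in for Y, that there is no take-none or take-all stratum and no non-response, and that the general allocation rule has the form a_h proportional to N_h^(2*q1) * mean_h^(2*q2) * S_h^(2*q3); that parameterization, and the function names, are assumptions made for illustration. Under it, (0.5, 0, 0) gives proportional allocation and (0.5, 0, 0.5) gives Neyman allocation.

```python
from bisect import bisect_right
from statistics import mean, variance


def stratify(x, boundaries):
    """Stratum h holds the values with b_{h-1} <= x < b_h (inner boundaries only)."""
    strata = [[] for _ in range(len(boundaries) + 1)]
    for value in x:
        strata[bisect_right(boundaries, value)].append(value)
    return strata


def allocation(strata, q1=0.5, q2=0.0, q3=0.5):
    """Assumed general rule: a_h proportional to N_h^(2 q1) * mean_h^(2 q2) * S_h^(2 q3)."""
    gammas = [
        len(s) ** (2 * q1) * mean(s) ** (2 * q2) * variance(s) ** q3
        for s in strata
    ]
    total = sum(gammas)
    return [g / total for g in gammas]


def required_n(strata, alloc, cv):
    """Sample size n (with n_h = n * a_h) so the estimated total reaches the target CV.

    Uses Var = sum N_h^2 (1/n_h - 1/N_h) S_h^2 for stratified simple random
    sampling without replacement, solved for n.
    """
    total = sum(sum(s) for s in strata)
    numerator = sum(len(s) ** 2 * variance(s) / a for s, a in zip(strata, alloc))
    correction = sum(len(s) * variance(s) for s in strata)
    return numerator / ((cv * total) ** 2 + correction)


if __name__ == "__main__":
    x = list(range(1, 101))          # hypothetical population values
    strata = stratify(x, [51])       # two strata: [1, 51) and [51, 100]
    neyman = allocation(strata)      # default exponents give Neyman allocation
    print(neyman, required_n(strata, neyman, cv=0.05))
```

A boundary search such as the Sethi or Kozak algorithms mentioned in the abstract would repeatedly evaluate a criterion of this kind for candidate boundaries and keep the set with the smallest anticipated n.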


Deletion of the late cornified envelope genes, LCE3C and LCE3B, is associated with rheumatoid arthritis

ARTHRITIS & RHEUMATISM, Issue 5 2010
Elisa Docampo
Objective The risk of rheumatoid arthritis (RA) is increased in the offspring of individuals affected with various autoimmune disorders, including psoriasis. Recently, the deletion of 2 genes from the late cornified envelope (LCE) gene cluster, LCE3C and LCE3B, has been associated with psoriasis in several populations. The purpose of this study was to assess whether this polymorphic gene deletion could also be involved in susceptibility to RA. Methods We tested for association between the LCE3C_LCE3B copy number variant and a single-nucleotide polymorphism in strong linkage disequilibrium with this variant (rs4112788) and RA in 2 independent case-control data sets (197 and 400 samples from patients with RA, respectively, and 411 and 567 samples from control subjects, respectively), collected at 4 Spanish hospitals. All samples were directly typed for presence of the LCE3C_LCE3B deletion (LCE3C_LCE3B-del) by polymerase chain reaction, and association analysis was performed using the SNPassoc R package. Results An association of homozygosity for the LCE3C_LCE3B-del and rs4112788 C allele with the risk of RA was observed in the first data set and was replicated in an independent case-control set. A combined analysis showed an overall P value of 0.0012 (odds ratio [OR] 1.45, 95% confidence interval [95% CI] 1.16-1.81) for association of the LCE3C_LCE3B-del. When the analysis was stratified for serologic data, we observed association in anti-cyclic citrullinated peptide (anti-CCP)-positive patients (P = 0.012, OR 1.51 [95% CI 1.09-2.13]) but not in anti-CCP-negative patients. Conclusion We have identified an association between the LCE3C_LCE3B-del and RA, and we have verified a pleiotropic effect of a common genetic risk factor (LCE3C_LCE3B-del) for autoimmune diseases that is involved in both psoriasis and RA.
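For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from case-control counts, the sketch below uses the standard log-odds (Woolf) approximation on a 2x2 table. It is not the SNPassoc implementation, the function name is hypothetical, and the counts in the usage example are invented for illustration; they are not the study's data.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed,
                  control_exposed, control_unexposed, z=1.96):
    """Odds ratio with a Wald-type confidence interval on the log scale."""
    odds_ratio = (case_exposed * control_unexposed) / (
        case_unexposed * control_exposed
    )
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / case_unexposed
        + 1 / control_exposed + 1 / control_unexposed
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: carriers / non-carriers among cases and controls.
    print(odds_ratio_ci(20, 80, 10, 90))
```

An interval that excludes 1, as in the combined analysis reported in the abstract, is what indicates an association at the 5% level under this approximation.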


L1 Penalized Estimation in the Cox Proportional Hazards Model

BIOMETRICAL JOURNAL, Issue 1 2010
Jelle J. Goeman
Abstract This article presents a novel algorithm that efficiently computes L1 penalized (lasso) estimates of parameters in high-dimensional models. The lasso has the property that it simultaneously performs variable selection and shrinkage, which makes it very useful for finding interpretable prediction rules in high-dimensional data. The new algorithm is based on a combination of gradient ascent optimization with the Newton-Raphson algorithm. It is described for a general likelihood function and can be applied in generalized linear models and other models with an L1 penalty. The algorithm is demonstrated in the Cox proportional hazards model, predicting survival of breast cancer patients using gene expression data, and its performance is compared with competing approaches. An R package, penalized, that implements the method, is available on CRAN.
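The claim that the lasso performs selection and shrinkage at once is easiest to see in a special case that is much simpler than the article's setting. Assuming a least-squares loss (1/2)||y - Xb||^2 + lambda * ||b||_1 with an orthonormal design, each lasso coefficient is the soft-thresholded ordinary least-squares coefficient. The sketch below shows only that operator; it is not the gradient ascent and Newton-Raphson algorithm of the article, and it is not the penalized package's code.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefficients, lam):
    """Lasso solution when the design matrix has orthonormal columns (assumed)."""
    return [soft_threshold(z, lam) for z in ols_coefficients]


if __name__ == "__main__":
    # Hypothetical least-squares coefficients; small ones are dropped,
    # large ones are kept but pulled toward zero.
    print(lasso_orthonormal([3.0, -0.5, 0.2, -3.0], lam=1.0))
```

In general likelihoods such as the Cox model there is no closed form, which is why an iterative scheme like the one described in the abstract is needed.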


Developmental microRNA expression profiling of murine embryonic orofacial tissue

BIRTH DEFECTS RESEARCH, Issue 7 2010
Partha Mukhopadhyay
Abstract BACKGROUND: Orofacial development is a multifaceted process involving precise, spatio-temporal expression of a panoply of genes. MicroRNAs (miRNAs), the largest family of noncoding RNAs involved in gene silencing, represent critical regulators of cell and tissue differentiation. MicroRNA gene expression profiling is an effective means of acquiring novel and valuable information regarding the expression and regulation of genes, under the control of miRNA, involved in mammalian orofacial development. METHODS: To identify differentially expressed miRNAs during mammalian orofacial ontogenesis, miRNA expression profiles from gestation day (GD) -12, -13 and -14 murine orofacial tissue were compared utilizing miRXplore microarrays from Miltenyi Biotech. Quantitative real-time PCR was utilized for validation of gene expression changes. Cluster analysis of the microarray data was conducted with the clValid R package and the UPGMA clustering method. Functional relationships between selected miRNAs were investigated using Ingenuity Pathway Analysis. RESULTS: Expression of over 26% of the 588 murine miRNA genes examined was detected in murine orofacial tissues from GD-12 to GD-14. Among these expressed genes, several clusters were seen to be developmentally regulated. miRNAs differentially expressed within such clusters were shown to target genes encoding proteins involved in cell proliferation, cell adhesion, differentiation, apoptosis and epithelial-mesenchymal transformation, all processes critical for normal orofacial development. CONCLUSIONS: Using miRNA microarray technology, unique gene expression signatures of hundreds of miRNAs in embryonic orofacial tissue were defined. Gene targeting and functional analysis revealed that the expression of numerous protein-encoding genes, crucial to normal orofacial ontogeny, may be regulated by specific miRNAs. Birth Defects Research (Part A), 2010. © 2010 Wiley-Liss, Inc.
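UPGMA, the clustering method named in the abstract, is agglomerative clustering with average linkage: at each step the two clusters with the smallest mean pairwise distance are merged. The following is a minimal, unoptimized sketch of that idea on a precomputed distance matrix. It is not the clValid package's code, the function name is hypothetical, and the toy data are invented.

```python
from itertools import combinations


def upgma(dist):
    """Average-linkage agglomerative clustering on a symmetric distance matrix.

    Returns the merges in order as (cluster_a, cluster_b, height), where each
    cluster is a tuple of original item indices.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []

    def average_distance(a, b):
        # Mean of all pairwise distances between members of the two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    # Hypothetical one-dimensional expression values for four items.
    values = [0.0, 1.0, 10.0, 11.0]
    dist = [[abs(u - v) for v in values] for u in values]
    for merge in upgma(dist):
        print(merge)
```

The recomputation of average distances from the original matrix keeps the sketch short; practical implementations update the distances incrementally instead.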


wombsoft: an R package that implements the Wombling method to identify genetic boundary

MOLECULAR ECOLOGY RESOURCES, Issue 4 2007
A. CRIDA
Abstract wombsoft is an R package that analyses individually georeferenced multilocus genotypes to infer genetic boundaries between populations. It is based on the Wombling method, which estimates the systemic function by looking for local variation in the allele frequencies. This study presents an original way of estimating the systemic function, based on local polynomial regression, and a binomial test to assess the significance of boundaries. The method applies to codominant or dominant markers and allows for missing data. R can be downloaded from http://www.r-project.org/ and wombsoft from http://www-leca.ujf-grenoble.fr/logiciels.htm or http://www.r-project.org/.
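The idea of locating a boundary where allele frequencies change fastest can be sketched in one dimension. The code below fits a kernel-weighted local linear regression at a point and returns its slope; a large absolute slope suggests a candidate boundary. It is a deliberate simplification and not the wombsoft code: the method described in the abstract works on two-dimensional coordinates and combines several loci, and the Gaussian kernel, the bandwidth, the function name, and the data here are assumptions for illustration.

```python
import math


def local_linear_slope(xs, ys, x0, bandwidth):
    """Slope at x0 of a Gaussian-kernel-weighted least-squares line."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, xs, ys))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    return s_xy / s_xx


if __name__ == "__main__":
    # Hypothetical allele frequencies along a transect with a sharp change
    # between positions 2 and 3.
    positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    frequencies = [0.1, 0.1, 0.1, 0.9, 0.9, 0.9]
    for x0 in (0.0, 2.5, 5.0):
        print(x0, local_linear_slope(positions, frequencies, x0, bandwidth=1.0))
```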