
Maximum Likelihood (maximum + likelihood)

Distribution by Scientific Domains

Mathematics and Statistics	29%
Life Sciences	29%
Medical Sciences	13%
Business, Economics, Finance and Accounting	7%
Humanities and Social Sciences	5%
Engineering	5%
Chemistry	3%
Earth and Environmental Science	2%
2 Other Domains	7%

Distribution within Mathematics and Statistics

Applied Probability & Statistics	17%
General & Introductory Statistics	10%

Kinds of Maximum Likelihood

simulated maximum likelihood

Terms modified by Maximum Likelihood

maximum likelihood analysis

maximum likelihood approach

maximum likelihood estimate

maximum likelihood estimation

maximum likelihood estimation method

maximum likelihood estimation procedure

maximum likelihood estimator

maximum likelihood method

maximum likelihood methods

maximum likelihood models

Selected Abstracts

SIMULATED MAXIMUM LIKELIHOOD APPLIED TO NON-GAUSSIAN AND NONLINEAR MIXED EFFECTS AND STATE-SPACE MODELS

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 4 2004
Russell B. Millar
Summary The paper presents an overview of maximum likelihood estimation using simulated likelihood, including the use of antithetic variables and evaluation of the simulation error of the resulting estimates. It gives a general purpose implementation of simulated maximum likelihood and uses it to re-visit four models that have previously appeared in the published literature: a state-space model for count data; a nested random effects model for binomial data; a nonlinear growth model with crossed random effects; and a crossed random effects model for binary salamander-mating data. In the case of the last three examples, this appears to be the first time that maximum likelihood fits of these models have been presented. [source]
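The idea of a simulated likelihood with antithetic variables can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the model (Poisson counts with a normal random effect on the log rate), the function names and the data are all hypothetical, and the same set of draws is reused for every observation.

```python
import math
import random

def poisson_pmf(y, log_rate):
    """Poisson probability of count y, parameterised by the log of the rate."""
    return math.exp(y * log_rate - math.exp(log_rate) - math.lgamma(y + 1))

def simulated_loglik(ys, mu, sigma, draws):
    """Monte Carlo log-likelihood for y_i ~ Poisson(exp(mu + sigma * u_i)),
    u_i ~ N(0, 1), integrating u_i out by simulation.

    Each standard normal draw z is paired with its mirror image -z
    (antithetic variables), which typically reduces simulation noise.
    """
    antithetic = list(draws) + [-z for z in draws]
    total = 0.0
    for y in ys:
        # Average the conditional likelihood over the simulated random effects.
        lik = sum(poisson_pmf(y, mu + sigma * z) for z in antithetic) / len(antithetic)
        total += math.log(lik)
    return total

# Hypothetical data and a fixed set of draws (fixing the draws keeps the
# simulated likelihood a smooth function of the parameters).
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [0, 2, 1, 4, 3]
print(simulated_loglik(ys, mu=0.5, sigma=0.7, draws=draws))
```

Maximising `simulated_loglik` over `mu` and `sigma` with any numerical optimiser would give the simulated maximum likelihood estimate for this toy model.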

Recent evolutionary diversification of a protist lineage

ENVIRONMENTAL MICROBIOLOGY, Issue 5 2008
Ramiro Logares
Summary Here, we have identified a protist (dinoflagellate) lineage that has diversified recently in evolutionary terms. The species members of this lineage inhabit cold-water marine and lacustrine habitats, which are distributed along a broad range of salinities (0–32) and geographic distances (0–18 000 km). Moreover, the species present different degrees of morphological and sometimes physiological variability. Altogether, we analysed 30 strains, generating 55 new DNA sequences. The nuclear ribosomal DNA (nrDNA) sequences (including rapidly evolving introns) were very similar or identical among all the analysed isolates. This very low nrDNA differentiation was contrasted by a relatively high cytochrome b (COB) mitochondrial DNA (mtDNA) polymorphism, even though the COB evolves very slowly in dinoflagellates. The 16 Maximum Likelihood and Bayesian phylogenies constructed using nr/mtDNA indicated that the studied cold-water dinoflagellates constitute a monophyletic group (supported also by the morphological analyses), which appears to be evolutionarily related to marine-brackish and sometimes toxic Pfiesteria species. We conclude that the studied dinoflagellates belong to a lineage which has diversified recently and spread, sometimes over long distances, across low-temperature environments which differ markedly in ecology (marine versus lacustrine communities) and salinity. Probably, this evolutionary diversification was promoted by the variety of natural selection regimes encountered in the different environments. [source]

Bayesian hierarchical generalized linear models for a geographical subset of recovery data

ENVIRONMETRICS, Issue 2 2002
Daniela Cocchi
Abstract The aim of this work is to check whether modifications in the length of the hunting seasons had an effect on the chance of reproduction of different species of ringed birds. We start from a national data set of ringing-recovered data on three species of game birds. Only data on birds recovered as juveniles are used. Data on recoveries are organized in a 4-way contingency table. Several generalized linear models are proposed for the counts of recovered birds. Bayesian hierarchical modeling is particularly suitable for this kind of data, for which an over-dispersion parameter can be introduced at the second level of the hierarchy. Maximum Likelihood and Bayesian solutions are computed for the different models: the Bayesian framework, in particular under an individual modeling of over-dispersion, exhibits the best fit in terms of Bayesian p-value. The results show that the modification in the length of the hunting seasons does not produce equal benefits for the three species considered. Copyright © 2002 John Wiley & Sons, Ltd. [source]

Bayesian inference strategies for the prediction of genetic merit using threshold models with an application to calving ease scores in Italian Piemontese cattle

JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 4 2002
K. Kizilkaya
Summary First parity calving difficulty scores from Italian Piemontese cattle were analysed using a threshold mixed effects model. The model included the fixed effects of age of dam and sex of calf and their interaction and the random effects of sire, maternal grandsire, and herd-year-season. Covariances between sire and maternal grandsire effects were modelled using a numerator relationship matrix based on male ancestors. Field data consisted of 23 953 records collected between 1989 and 1998 from 4741 herd-year-seasons. Variance and covariance components were estimated using two alternative approximate marginal maximum likelihood (MML) methods, one based on expectation-maximization (EM) and the other based on Laplacian integration. Inferences were compared to those based on three separate runs or sequences of Markov Chain Monte Carlo (MCMC) sampling in order to assess the validity of approximate MML estimates derived from data with similar size and design structure. Point estimates of direct heritability were 0.24, 0.25 and 0.26 for EM, Laplacian and MCMC (posterior mean), respectively, whereas corresponding maternal heritability estimates were 0.10, 0.11 and 0.12, respectively. The covariance between additive direct and maternal effects was found to be not different from zero based on MCMC-derived confidence sets. The conventional joint modal estimates of sire effects and associated standard errors based on MML estimates of variance and covariance components differed little from the respective posterior means and standard deviations derived from MCMC. Therefore, there may be little need to pursue computation-intensive MCMC methods for inference on genetic parameters and genetic merits using conventional threshold sire and maternal grandsire models for large datasets on calving ease. [source]

Estimates of environmental effects and genetic parameters for body measurements and weight in Brahman cattle raised in Mexico

JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 4 2002
C. D. U. Magnabosco
Summary A Derivative Free Restricted Maximum Likelihood (DFREML) algorithm was used with single-trait and two-trait animal models to estimate the variance and covariance components and thus, heritabilities and phenotypic, genetic and environmental correlations among nine different body measurements and weights of Brahman cattle raised in Mexico. The following measurements were considered: hip width, pin width, hip-pin width, anterior height, posterior height, body length, thorax perimeter, scrotal circumference and weight. The analysis was based on a total of 1018 animals, born between 1992 and 1995, from 17 herds in the Mexican States of Chiapas, San Luis Potosi, Tabasco, Tamaulipas and Veracruz. The model included the following fixed effects: herd, year-season of birth, sex, age of the animal and feed management. The only random effect was the direct additive genetic contribution of each animal. All fixed effects in the model were significant for all traits (p < 0.05). Estimated heritabilities for the traits were: hip width 0.57, pin width 0.32, hip-pin width 0.41, anterior height 0.56, posterior height 0.54, body length 0.32, thorax perimeter 0.49, scrotal circumference 0.02 and weight 0.66. The magnitude of the heritabilities was medium to high, with the exception of scrotal circumference. The genetic correlations among all body measurements were consistently positive and high, ranging from 0.64 to 1.00. Although other measures showed higher genetic correlations with weight, thorax perimeter combines a high value (0.70) with ease and repeatability, making it a useful field measurement to estimate body weight when scales are not available. [source]

Estimation of heritability for hip dysplasia in German Shepherd Dogs in Finland

JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 2 2000
M. Leppänen
The heritability of hip dysplasia in the German Shepherd Dog was estimated by applying the animal model and the Restricted Maximum Likelihood (REML) method to a data-set which consisted of the hip scores of 10 335 dogs. Fixed effects of the model were the month and the year of birth, screening age, the panelist responsible for screening and the origin of the animal's sire. The litter and the breeder had only minor effects on hip joints. Heritability estimates were moderate (0.31–0.35). The moderate heritability found in this study enables a much better genetic gain in the breeding programme if proper evaluation methods, such as the BLUP animal model, and effective selection are used instead of phenotypic selection. [source]

QUANTIFYING ADULTERATION IN ROAST COFFEE POWDERS BY DIGITAL IMAGE PROCESSING

JOURNAL OF FOOD QUALITY, Issue 2 2003
EDSON E. SANO
Pure arabica coffee and mixtures of coffee husks and straw, maize, brown sugar and soybean were produced in our laboratory as investigation materials. Red/Green/Blue (RGB) color composites, magnified twelve times, were generated using a Charge Coupled Device (CCD) camera connected to a stereo microscope and a personal computer with an image processing software package. The percent areas of the contaminants in each image were calculated by the Maximum Likelihood supervised classification technique. Best-fit equations relating weight percentage (g·kg⁻¹) and the percent areas were obtained for each coffee contaminant. To test the method, 247 coffee samples of different amounts and types of adulterants were analyzed in the laboratory. The results showed that the new method developed can analyze precisely and quickly a large number of ground coffee powders. [source]

Systematic position of the pelagic Thecosomata and Gymnosomata within Opisthobranchia (Mollusca, Gastropoda) - revival of the Pteropoda

JOURNAL OF ZOOLOGICAL SYSTEMATICS AND EVOLUTIONARY RESEARCH, Issue 2 2006
A. Klussmann-Kolb
Abstract The complete 18S (SSU) rRNA as well as partial 28S (LSU) rRNA and partial mitochondrial COI sequences have been used to reconstruct the phylogenetic relationships within Opisthobranchia with special focus on the pelagic orders Thecosomata and Gymnosomata. Maximum parsimony, maximum likelihood, distance as well as Bayesian analysis of a combined dataset of the three genes reveals that Thecosomata and Gymnosomata are sister groups and together are closely related to Anaspidea. Possible sister taxon to Thecosomata, Gymnosomata and Anaspidea is Cephalaspidea s. str. Analysis of a taxon-extended dataset of partial 28S sequences supported a basal position of Limacina within Euthecosomata. Within Cavolinidae, Creseis is basal to the other taxa. Other phylogenetic implications from the present results are also discussed. Investigation of the morphology and histology of Thecosomata and Gymnosomata as well as several other opisthobranch taxa helped to identify autapomorphies for Thecosomata and Gymnosomata as well as apomorphies for the clades including these taxa. [source]

A Maximum Likelihood-Based Method for Mining Major Genes Affecting a Quantitative Character

BIOMETRICS, Issue 3 2001
Rongling Wu
Summary. In this article, we present a maximum likelihood-based analytical approach for detecting a major gene of large effect on a quantitative trait in a progeny population derived from a mating design. Our analysis is based on a mixed genetic model specifying both major gene and background polygenic inheritance. The likelihood of the data is formulated by combining the information about population behaviors of the major gene during hybridization and its phenotypic distribution densities. The EM algorithm is implemented to obtain maximum likelihood estimates for population and quantitative genetic parameters of the major locus. This approach is applied to detect an overdominant gene governing stem volume growth in a factorial mating design of aspen trees. It is suggested that further molecular genetic research toward mapping single genes affecting aspen growth and production based on the same experimental data has a high probability of success. [source]
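The EM step mentioned in the abstract can be illustrated on a much simpler problem. The sketch below is not the authors' mixed genetic model: it assumes a plain two-component normal mixture with a shared variance, standing in loosely for two genotype classes at a major locus, and every name and data value in it is hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def em_two_normals(xs, n_iter=100):
    """EM for a two-component normal mixture with a common standard deviation.

    Returns the mixing proportion, the two means, the common sd, and the
    log-likelihood recorded at the start of each iteration.
    """
    n = len(xs)
    mean_all = sum(xs) / n
    # Crude starting values: components at the sample extremes.
    p, m1, m2 = 0.5, min(xs), max(xs)
    sd = math.sqrt(sum((x - mean_all) ** 2 for x in xs) / n)
    logliks = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        dens = [(p * normal_pdf(x, m1, sd), (1 - p) * normal_pdf(x, m2, sd)) for x in xs]
        logliks.append(sum(math.log(a + b) for a, b in dens))
        w = [a / (a + b) for a, b in dens]
        # M-step: weighted maximum likelihood updates.
        n1 = sum(w)
        n2 = n - n1
        p = n1 / n
        m1 = sum(wi * x for wi, x in zip(w, xs)) / n1
        m2 = sum((1 - wi) * x for wi, x in zip(w, xs)) / n2
        var = sum(wi * (x - m1) ** 2 + (1 - wi) * (x - m2) ** 2 for wi, x in zip(w, xs)) / n
        sd = math.sqrt(var)
    return p, m1, m2, sd, logliks

# Hypothetical trait values: 60 draws around 0 and 40 draws around 4.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(60)] + [rng.gauss(4.0, 1.0) for _ in range(40)]
p, m1, m2, sd, logliks = em_two_normals(xs)
print(p, m1, m2, sd)
```

A useful property to check in any EM implementation is that the recorded log-likelihood never decreases from one iteration to the next.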

Maximum likelihood constrained deconvolution. II: Application to experimental two-, three-dimensional NMR spectra

CONCEPTS IN MAGNETIC RESONANCE, Issue 2 2003
Abstract The maximum likelihood method (MLM) and related protocols were applied to the experimental 2-D nuclear Overhauser effect (NOE) spectrum of a 24-nucleotide RNA hairpin loop molecule. The output becomes more valuable when diagonal symmetrization is followed by MLM. This symmetrized maximum likelihood (SML) protocol restores the original spectral information with high fidelity by accurately partitioning components from overlapped peaks and provides substantial improvements in line shape and spectral resolution, in particular in the F1 dimension. These advantages lead to a simpler interpretation of the resonance frequencies, intensities, multiplet fine structure, and J-coupling values from a heavily overlapped peak region. This promises a more effective tool for peak picking, assignment, and integration. Also, application of MLM and related protocols to the 2-D NOE proton spectrum of a 24-mer RNA dramatically increases the number of NOE-based distance constraints that can be used for determination of its 3-D molecular structure. By application of 3-D MLM to a simple 3-D spectrum, the spectral resolution and signal-to-noise (S/N) ratio were greatly improved by effective line sharpening and reduction of cross-talk between planes. © 2003 Wiley Periodicals, Inc. Concepts Magn Reson 18A: 146–156, 2003 [source]

Maximum likelihood constrained deconvolution. I: Algorithm, qualitative, quantitative enhancement in synthetic two-dimensional NMR spectra

CONCEPTS IN MAGNETIC RESONANCE, Issue 6 2002
Abstract The maximum likelihood method is a constrained iterative spectral deconvolution technique in which a spectral fitting model is determined by minimizing the variance of fit in the time domain in a nonlinear iterative manner. Application of this method to synthetic 2-dimensional (2-D) NMR spectra, which have heavily overlapped multiplets associated with low signal-to-noise ratios, yields contrast-enhanced spectra with simultaneous noise suppression and resolution improvement. This protocol greatly facilitates peak recognition and often partitions overlapping multiplets into individual components, leading to a more accurate interpretation of resonance frequencies, coupling constants, and multiplets than does the conventional apodization or Fourier transform method. These advantages are useful for constructing reliable 3-D molecular structures for complex molecular systems. © 2002 Wiley Periodicals, Inc. Concepts Magn Reson 14: 402–415, 2002 [source]

Using linked markers to infer the age of a mutation

HUMAN MUTATION, Issue 2 2001
Bruce Rannala
Abstract Advances in sequencing and genotyping technologies over the last decade have enabled geneticists to easily characterize genetic variation at the nucleotide level. Hundreds of genes harboring mutations associated with genetic disease have now been identified by positional cloning. Using variation at closely linked genetic markers, it is possible to predict the times in the past at which particular mutations arose. Such studies suggest that many of the rare mutations underlying human genetic disorders are relatively young. Studies of variation at genetic markers linked to particular mutations can provide insights into human geographic history, and historical patterns of natural selection and disease, that are not available from other sources. We review two approaches for estimating allele age using variation at linked genetic markers. A phylogenetic approach aims to reconstruct the gene tree underlying a sample of chromosomes carrying a particular mutation, obtaining a "direct" estimate of allele age from the age of the root of this tree. A population genetic approach relies on models of demography, mutation, and/or recombination to estimate allele age without explicitly reconstructing the gene tree. Phylogenetic methods are best suited for studies of ancient mutations, while population genetic methods are better suited for studies of recent mutations. Methods that rely on recombination to infer the ages of alleles can be fine-tuned by choosing linked markers at optimal map distances to maximize the information available about allele age. A limitation of methods that rely on recombination is the frequent lack of a fine-scale linkage map. Maximum likelihood and Bayesian methods for estimating allele age that rely on intensive numerical computation are described, as well as "composite" likelihood and moment-based methods that lead to simple estimators. 
The former provide more accurate estimates (particularly for large samples of chromosomes) and should be employed if computationally practical. Hum Mutat 18:87–100, 2001. © 2001 Wiley-Liss, Inc. [source]

Population structure in the South American tern Sterna hirundinacea in the South Atlantic: two populations with distinct breeding phenologies

JOURNAL OF AVIAN BIOLOGY, Issue 4 2010
Patrícia J. Faria
The South American tern Sterna hirundinacea is a migratory species for which dispersal, site fidelity and migratory routes are largely unknown. Here, we used five microsatellite loci and 799 bp partial mitochondrial DNA sequences (Cytochrome b and ND2) to investigate the genetic structure of South American terns from the South Atlantic Ocean (Brazilian and Patagonian colonies). Brazilian and Patagonian colonies have two distinct breeding phenologies (austral winter and austral summer, respectively) and are under the influence of different oceanographic features (e.g. Brazil and Falklands/Malvinas ocean currents, respectively) that may promote genetic isolation between populations. Results show that the Atlantic populations are not completely panmictic; nevertheless, contrary to our expectations, low levels of genetic structure were detected between Brazilian and Patagonian colonies. Such low differentiation (despite temporal isolation of the colonies) could be explained by the demographic history of these populations coupled with ongoing levels of gene flow. Interestingly, estimations of gene flow through Maximum likelihood and Bayesian approaches have indicated asymmetrical long-term and contemporary gene flow from Brazilian to Patagonian colonies, approaching a source–sink metapopulation dynamic. Genetic analyses of other South American tern populations (especially those from the Pacific coast and Falklands/Malvinas Islands) and other seabird species showing similar geographical distribution (e.g. royal tern Thalasseus maximus) are fundamental in gaining a better understanding of the main processes involved in the diversification of seabirds in the southern hemisphere. [source]

Genetic divergences pre-date Pleistocene glacial cycles in the New Zealand speckled skink, Oligosoma infrapunctatum

JOURNAL OF BIOGEOGRAPHY, Issue 5 2008
Stephanie N. J. Greaves
Abstract Aim: To examine the hypothesis raised by Graham S. Hardy that Pleistocene glacial cycles suffice to explain divergence among lineages within the endemic New Zealand speckled skink, Oligosoma infrapunctatum Boulenger. Location: Populations were sampled from across the entire range of the species, on the North and South Islands of New Zealand. Methods: We sequenced the mitochondrial genes ND2 (550 bp), ND4 + tRNAs (773 bp) and cytochrome b (610 bp) of 45 individuals from 21 locations. Maximum likelihood, maximum parsimony and Bayesian methods were used for phylogenetic reconstruction. The Shimodaira–Hasegawa test was used to examine hypotheses about the taxonomic status of morphologically distinctive populations. Results: Our analysis revealed four strongly supported clades within O. infrapunctatum. Clades were largely allopatric, except on the west coast of the South Island, where representatives from all four clades were found. Divergences among lineages within the species were extremely deep, reaching over 5%. Two contrasting phylogeographical patterns are evident within O. infrapunctatum. Main conclusions: The deep genetic divisions we found suggest that O. infrapunctatum is a complex of cryptic species which diverged in the Pliocene, contrary to the existing Pleistocene-based hypothesis. Although Pleistocene glacial cycles do not underlie major divergences within this species, they may be responsible for the shallower phylogeographical patterns that are found within O. infrapunctatum, which include a radiation of haplotypes in the Nelson and Westland regions. [source]

Maximum likelihood fitting using ordinary least squares algorithms

JOURNAL OF CHEMOMETRICS, Issue 8-10 2002
Rasmus Bro
Abstract In this paper a general algorithm is provided for maximum likelihood fitting of deterministic models subject to Gaussian-distributed residual variation (including any type of non-singular covariance). By deterministic models is meant models in which no distributional assumptions are valid (or applied) on the parameters. The algorithm may also more generally be used for weighted least squares (WLS) fitting in situations where either distributional assumptions are not available or other than statistical assumptions guide the choice of loss function. The algorithm to solve the associated problem is called MILES (Maximum likelihood via Iterative Least squares EStimation). It is shown that the sought parameters can be estimated using simple least squares (LS) algorithms in an iterative fashion. The algorithm is based on iterative majorization and extends earlier work for WLS fitting of models with heteroscedastic uncorrelated residual variation. The algorithm is shown to include several current algorithms as special cases. For example, maximum likelihood principal component analysis models with and without offsets can be easily fitted with MILES. The MILES algorithm is simple and can be implemented as an outer loop in any least squares algorithm, e.g. for analysis of variance, regression, response surface modeling, etc. Several examples are provided on the use of MILES. Copyright © 2002 John Wiley & Sons, Ltd. [source]
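The notion of reaching a weighted fit through repeated ordinary least squares steps can be sketched for the simplest case the abstract mentions: uncorrelated residuals with unequal, known variances. The code below is not MILES itself. It is a generic iterative-majorization loop for a straight-line model, with hypothetical function names and data, where the weights are assumed to be inverse residual variances, so the weighted least squares solution is also the Gaussian maximum likelihood estimate.

```python
def ols_line(xs, zs):
    """Ordinary least squares fit of z = a + b * x; returns (a, b)."""
    n = len(xs)
    xbar = sum(xs) / n
    zbar = sum(zs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (z - zbar) for x, z in zip(xs, zs)) / sxx
    return zbar - slope * xbar, slope

def wls_by_iterated_ols(xs, ys, weights, n_iter=200):
    """Weighted least squares reached through repeated unweighted fits.

    Each pass builds a working response that moves the current fitted values
    towards the data in proportion to the scaled weights, then refits by OLS.
    """
    lam = max(weights)  # scaling constant, so every w / lam is at most 1
    a, b = ols_line(xs, ys)
    for _ in range(n_iter):
        fitted = [a + b * x for x in xs]
        work = [f + (w / lam) * (y - f) for f, y, w in zip(fitted, ys, weights)]
        a, b = ols_line(xs, work)
    return a, b

def wls_direct(xs, ys, weights):
    """Closed-form weighted least squares line, used only as a reference."""
    sw = sum(weights)
    xbar = sum(w * x for w, x in zip(weights, xs)) / sw
    ybar = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(weights, xs))
    slope = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(weights, xs, ys)) / sxx
    return ybar - slope * xbar, slope

# Hypothetical measurements with unequal precision.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.2, 1.9, 3.4, 3.8, 5.3]
weights = [1.0, 0.5, 1.0, 0.25, 1.0, 0.5]
print(wls_by_iterated_ols(xs, ys, weights))
print(wls_direct(xs, ys, weights))
```

The appeal of this design is that the inner solver never sees the weights, which is why such a loop can wrap an existing least squares routine unchanged.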

Phylogenetic evidence for a single, ancestral origin of a 'true' worker caste in termites

JOURNAL OF EVOLUTIONARY BIOLOGY, Issue 6 2000
G. J. Thompson
Phylogenetic analysis based on sequence variation in mitochondrial large-subunit rRNA and cytochrome oxidase II genes was used to investigate the evolutionary relationships among termite families. Maximum likelihood and parsimony analyses of a combined nucleotide data set yield a single well-supported topology, which is: (((((Termitidae, Rhinotermitidae), Serritermitidae), Kalotermitidae), (Hodotermitidae, Termopsidae)), Mastotermitidae). Although some aspects of this topology are consistent with previous schemes, overall it differs from any published. Optimization of 'true' workers onto the tree suggests that this caste originated once, early in the history of the lineage and has been lost secondarily twice. This scenario differs from the more widely accepted notion that workers are derived and of polyphyletic origin and that extant pseudergates, or 'false' workers, are their developmentally unspecialized ancestor caste. Worker gains and losses covary directly in number and direction with shifts in 'ecological life type'. A test for correlated evolution which takes phylogenetic structure into account indicates that this pattern is of biological significance and suggests that the variable occurrence of a worker caste in termites has ecological determinants, apparently linked to differences in feeding and nesting habits. [source]

Molecular phylogeny of Turkish Trachurus species (Perciformes: Carangidae) inferred from mitochondrial DNA analyses

JOURNAL OF FISH BIOLOGY, Issue 5 2008
Y. Bektas
Genetic variation among three species of Trachurus (T. trachurus, T. mediterraneus and T. picturatus) from Turkey was investigated by phylogenetic analysis of the entire mtDNA control region (CR) (862 bp, n = 182) and partial cytochrome (cyt) b (239 bp, n = 174) sequences. Individuals were collected at nine stations in four geographic locations: North-eastern Mediterranean Sea, Aegean Sea, Sea of Marmara and Black Sea. Polymerase chain reaction-direct sequencing of the CR and the partial cyt b genes produced 28 and 131 distinct haplotypes, respectively. Maximum likelihood, neighbour-joining and maximum parsimony methods produced similar tree topologies. The results of both CR and cyt b sequence analyses revealed the existence of several species-specific nucleotide sites that can be used to discriminate between the three species. Genetic distances indicated that T. mediterraneus and T. picturatus are more closely related to each other than either is to T. trachurus. Inter-nucleotide and intra-nucleotide diversities of T. picturatus were larger than those of T. mediterraneus and T. trachurus. There was no evidence that haplotype frequencies of these two mtDNA regions cluster geographically. [source]

MOLECULAR AND MORPHOLOGICAL DATA IDENTIFY A CRYPTIC SPECIES COMPLEX IN ENDOPHYTIC MEMBERS OF THE GENUS COLEOCHAETE BRÉB. (CHAROPHYTA: COLEOCHAETACEAE)

JOURNAL OF PHYCOLOGY, Issue 6 2002
Matthew T. Cimino
The genus Coleochaete Bréb. is a relatively small group of freshwater microscopic green algae with about 15 recognized species. Although Coleochaete has long been considered to be a close relative of embryophytes, a comprehensive study of the genus has not been published since Pringsheim's 1860 monograph. As part of a systematic study of Coleochaete, we investigated four accessions of the genus that are morphologically similar to the endophytic species C. nitellarum Jost. Each of the four cultures was determined to be capable of endophytic growth in Nitella C. A. Agardh, a member of the closely related order Charales. Maximum likelihood and maximum parsimony analyses were performed on nucleotide data from the chloroplast genes atpB and rbcL that were sequenced from 16 members of the Coleochaetales and from other members of the Charophyceae, embryophytes, and outgroup taxa. These analyses indicate that the Coleochaetales are monophyletic and that the endophytic accessions are members of the scutata group of species. In addition, cell size and nucleotide data suggest that at least three different endophytic species may be represented. Herbivory, nutritional benefits, and substrate competition are three hypotheses that could explain the evolution and maintenance of the endophytic habit in Coleochaete. These data also imply that diversity in the genus may be markedly underestimated. [source]

Estimating life expectancy in health and ill health by using a hidden Markov model

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 4 2009
Ardo Van Den Hout
Summary. Population studies with longitudinal follow-up and mortality information can be used to estimate transitions between healthy and unhealthy states before death. When health is defined with respect to cognitive ability during old age, the trajectory of performance is either static or downwards. The paper presents a hidden Markov model to describe the underlying categorized cognitive decline, where observed improvement of cognitive ability is modelled as misclassification. Maximum likelihood is used to estimate the transition intensities between the normal cognitive state, the cognitively impaired state and death. The methodology is extended to estimate total life expectancy and life expectancy with and without cognitive impairment. The paper presents estimates from the Medical Research Council cognitive function and ageing study that began in 1991 and where individuals have had up to eight interviews over the next 10 years. It is shown that the misclassification of the states is mainly caused by not detecting an impaired state. Individuals with more years of education have lower impaired life expectancies. [source]

Modelling the spread in space and time of an airborne plant disease

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 3 2008
Samuel Soubeyrand
Summary. A spatiotemporal model is developed to analyse epidemics of airborne plant diseases which are spread by spores. The observations consist of measurements of the severity of disease at different times, different locations in the horizontal plane and different heights in the vegetal cover. The model describes the joint distribution of the occurrence and the severity of the disease. The three-dimensional dispersal of spores is modelled by combining a horizontal and a vertical dispersal function. Maximum likelihood combined with a parametric bootstrap is suggested to estimate the model parameters and the uncertainty that is attached to them. The spatiotemporal model is used to analyse a yellow rust epidemic in a wheatfield. In the analysis we pay particular attention to the selection and the estimation of the dispersal functions. [source]
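
The combination of maximum likelihood with a parametric bootstrap mentioned in this abstract can be illustrated on a much simpler model. The sketch below is not the authors' spatiotemporal model: it assumes, purely for illustration, an exponential distribution whose rate has the closed-form ML estimate 1 / mean. The function names and the toy data are hypothetical.

```python
import random
import statistics


def fit_rate(sample):
    """ML estimate of an exponential rate: 1 / sample mean."""
    return 1.0 / statistics.fmean(sample)


def parametric_bootstrap_se(sample, n_boot=500, seed=0):
    """Return the ML rate and its parametric-bootstrap standard error."""
    rng = random.Random(seed)
    rate_hat = fit_rate(sample)
    replicates = []
    for _ in range(n_boot):
        # Simulate a data set of the same size from the FITTED model,
        # then refit it by maximum likelihood.
        simulated = [rng.expovariate(rate_hat) for _ in sample]
        replicates.append(fit_rate(simulated))
    # The spread of the refitted estimates approximates the uncertainty.
    return rate_hat, statistics.stdev(replicates)


# Toy data, made up for illustration only.
data = [0.8, 1.9, 0.3, 2.4, 1.1, 0.6, 3.0, 0.9]
rate_hat, se = parametric_bootstrap_se(data)
```

The same loop applies to any model that can be both fitted and simulated: only `fit_rate` and the simulation line would change.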

Correlating two continuous variables subject to detection limits in the context of mixture distributions

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 5 2005
Haitao Chu
Summary. In individuals who are infected with human immunodeficiency virus (HIV), distributions of quantitative HIV ribonucleic acid measurements may be highly left censored with an extra spike below the limit of detection LD of the assay. A two-component mixture model with the lower component entirely supported on [0, LD] is recommended to model the extra spike in univariate analysis better. Let LD1 and LD2 be the limits of detection for the two HIV viral load measurements. When estimating the correlation coefficient between two different measures of viral load obtained from each of a sample of patients, a bivariate Gaussian mixture model is recommended to model the extra spike on [0, LD1] and [0, LD2] better when the proportion below LD is incompatible with the left-hand tail of a bivariate Gaussian distribution. When the proportion of both variables falling below LD is very large, the parameters of the lower component may not be estimable since almost all observations from the lower component are falling below LD. A partial solution is to assume that the lower component's entire support is on [0, LD1]×[0, LD2]. Maximum likelihood is used to estimate the parameters of the lower and higher components. To evaluate whether there is a lower component, we apply a Monte Carlo approach to assess the p-value of the likelihood ratio test and two information criteria: a bootstrap-based information criterion and a cross-validation-based information criterion. We provide simulation results to evaluate the performance and compare it with two ad hoc estimators and a single-component bivariate Gaussian likelihood estimator. These methods are applied to the data from a cohort study of HIV-infected men in Rio de Janeiro, Brazil, and the data from the Women's Interagency HIV oral study. These results emphasize the need for caution when estimating correlation coefficients from data with a large proportion of non-detectable values when the proportion below LD is incompatible with the left-hand tail of a bivariate Gaussian distribution. [source]
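
To make the role of the detection limit concrete, here is a minimal sketch of a left-censored likelihood. It is a deliberate simplification and not the bivariate mixture model of the abstract: it assumes a single univariate normal distribution, and the data values, grid ranges and function names are hypothetical. Each value below the limit contributes the probability of falling below LD instead of a density value.

```python
import math


def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(x, mu, sigma):
    # erfc keeps small tail probabilities from rounding to zero.
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def censored_loglik(mu, sigma, detected, n_censored, limit):
    """Log-likelihood of a normal sample left-censored at `limit`."""
    ll = sum(normal_logpdf(x, mu, sigma) for x in detected)
    # Each non-detect contributes P(X < limit), not a density.
    ll += n_censored * math.log(normal_cdf(limit, mu, sigma))
    return ll


def fit_by_grid(detected, n_censored, limit, mus, sigmas):
    """Crude ML fit: the grid point with the highest log-likelihood."""
    return max(
        ((mu, sigma) for mu in mus for sigma in sigmas),
        key=lambda p: censored_loglik(p[0], p[1], detected, n_censored, limit),
    )


# Hypothetical log10 measurements; 4 further values fell below LD = 2.0.
detected = [2.3, 2.9, 3.4, 2.6, 3.8, 2.2]
mus = [i / 20 for i in range(0, 81)]      # 0.00 .. 4.00
sigmas = [i / 20 for i in range(4, 41)]   # 0.20 .. 2.00
mu_hat, sigma_hat = fit_by_grid(detected, 4, 2.0, mus, sigmas)
naive_mean = sum(detected) / len(detected)
```

The grid search stands in for a proper numerical optimiser. Ignoring the non-detects (`naive_mean`) overstates the centre of the distribution, which is one reason censoring has to enter the likelihood explicitly.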

Attributing Hardy-Weinberg Disequilibrium to Population Stratification and Genetic Association in Case-Control Studies

ANNALS OF HUMAN GENETICS, Issue 1 2010
Vaneeta K. Grover
Summary Loci exhibiting Hardy-Weinberg disequilibrium (HWD) are often excluded from association studies, because HWD may indicate genotyping error, population stratification or selection bias. For case-control studies, HWD can result from a genetic effect at the locus. We extend the modelling to accommodate both stratification and genetic effects. Theoretical genotype frequencies and HWD coefficients are derived under a general genetic model for a population with two strata. Maximum likelihood is used to estimate model parameters and a test for lack of fit identifies the models most consistent with the data. Simulations were used to assess the method. The technique was applied to a group of ethnically and clinically heterogeneous kidney stone formers and controls, both exhibiting HWD for the R990G SNP of the CASR gene. Results indicate the best fitting model incorporates both stratification and genetic association. The ability of our method to apportion HWD to stratification and genetic effects may well be a significant advance in dealing with heterogeneity in case-control genetic association studies. [source]
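
As background for the HWD coefficients mentioned above, the sketch below computes the standard single-population disequilibrium coefficient for a biallelic locus, D = P(AA) - p^2, using the gene-counting (maximum likelihood) estimate of the allele frequency p. It does not implement the two-strata model of the abstract; the function name and the genotype counts are illustrative assumptions.

```python
def hwd_coefficient(n_AA, n_Aa, n_aa):
    """Return (p, D) for genotype counts at a biallelic locus.

    p is the gene-counting (ML) estimate of the frequency of allele A.
    D = P(AA) - p^2 is zero under Hardy-Weinberg equilibrium.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    d = n_AA / n - p * p
    return p, d


# Counts in exact Hardy-Weinberg proportions for p = 0.5.
p_eq, d_eq = hwd_coefficient(25, 50, 25)

# Excess of homozygotes, e.g. from pooling strata that differ in p.
p_ex, d_ex = hwd_coefficient(40, 20, 40)
```

A positive D indicates more homozygotes than expected, which is the pattern produced when populations with different allele frequencies are mixed.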

A molecular phylogeny of the peacock-pheasants (Galliformes: Polyplectron spp.) indicates loss and reduction of ornamental traits and display behaviours

BIOLOGICAL JOURNAL OF THE LINNEAN SOCIETY, Issue 2 2001
REBECCA T. KIMBALL
The South-east Asian pheasant genus Polyplectron is comprised of six or seven species which are characterized by ocelli (ornamental eye-spots) in all but one species, though the sizes and distribution of ocelli vary among species. All Polyplectron species have lateral displays, but species with ocelli also display frontally to females, with feathers held erect and spread to clearly display the ocelli. The two least ornamented Polyplectron species, one of which completely lacks ocelli, have been considered the primitive members of the genus, implying that ocelli are derived. We examined this hypothesis phylogenetically using complete mitochondrial cytochrome b and control region sequences, as well as sequences from intron G in the nuclear ovomucoid gene, and found that the two least ornamented species are in fact the most recently evolved. Thus, the absence and reduction of ocelli and other ornamental traits in Polyplectron are recent losses. The only variable that may correlate with the reduction in ornamentation is habitat, as the two less-ornamented species inhabit montane regions, while the ornamented species inhabit lowland regions. The implications of these findings are discussed in light of models of sexual selection. The phylogeny is not congruent with current geographical distributions, and there is little evidence that Pleistocene sea level changes promoted speciation in this genus. Maximum likelihood and maximum parsimony analyses of cytochrome b sequences suggest that the closest relatives of Polyplectron are probably the peafowl and the argus pheasants. [source]

Bayesian Semiparametric Multiple Shrinkage

BIOMETRICS, Issue 2 2010
Richard F. MacLehose
Summary High-dimensional and highly correlated data leading to non- or weakly identified effects are commonplace. Maximum likelihood will typically fail in such situations and a variety of shrinkage methods have been proposed. Standard techniques, such as ridge regression or the lasso, shrink estimates toward zero, with some approaches allowing coefficients to be selected out of the model by achieving a value of zero. When substantive information is available, estimates can be shrunk to nonnull values; however, such information may not be available. We propose a Bayesian semiparametric approach that allows shrinkage to multiple locations. Coefficients are given a mixture of heavy-tailed double exponential priors, with location and scale parameters assigned Dirichlet process hyperpriors to allow groups of coefficients to be shrunk toward the same, possibly nonzero, mean. Our approach favors sparse, but flexible, structure by shrinking toward a small number of random locations. The methods are illustrated using a study of genetic polymorphisms and Parkinson's disease. [source]
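
The difference between shrinking toward zero and shrinking toward a nonnull value can be seen in a one-coefficient ridge-type estimator. This is only a sketch of the shrinkage idea described in the abstract, not the authors' Bayesian semiparametric method; the penalty value, shrinkage target and data are hypothetical. With `penalty=0` the estimator reduces to least squares, which coincides with maximum likelihood under Gaussian errors.

```python
def shrunk_slope(x, y, penalty, target=0.0):
    """Minimise sum((y - b*x)^2) + penalty * (b - target)^2 over b.

    Setting the derivative to zero gives
    b = (sum(x*y) + penalty * target) / (sum(x*x) + penalty).
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return (sxy + penalty * target) / (sxx + penalty)


# Toy data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

b_ls = shrunk_slope(x, y, penalty=0.0)                   # least squares
b_zero = shrunk_slope(x, y, penalty=10.0)                # shrunk toward 0
b_prior = shrunk_slope(x, y, penalty=10.0, target=2.0)   # shrunk toward 2
```

Shrinking toward zero pulls the estimate well below the least-squares value, while shrinking toward a well-chosen nonnull target barely moves it. That is the motivation for allowing several shrinkage locations.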

Phenotypic variation and FMRP levels in fragile X

DEVELOPMENTAL DISABILITIES RESEARCH REVIEW, Issue 1 2004
Danuta Z. Loesch
Abstract Data on the relationships between cognitive and physical phenotypes, and a deficit of fragile X mental retardation 1 (FMR1) gene-specific protein product, FMRP, are presented and discussed in context with earlier findings. The previously unpublished results obtained, using standard procedures of regression and correlations, showed highly significant associations in males between FMRP levels and the Wechsler summary and subtest scores and in females between these levels and the full-scale intelligence quotient (FSIQ), verbal and performance IQ, and some Wechsler subtest scores. The published results based on data from 144 extended families with fragile X, recruited from Australia and the United States within a collaborative NIH-supported project, were obtained using robust modification of maximum likelihood in pedigrees. The results indicated that processing speed, short-term memory, and the ability to control attention, especially in the context of regulating goal-directed behavior, may be primarily affected by the FMRP depletion. The effect of this depletion on physical phenotype was also demonstrated, especially on body and head height and extensibility of finger joints. It is recommended that further studies should rely on more accurate measures of FMRP levels, and use of larger samples, to overcome extensive variability in the data. MRDD Research Reviews 2004;10:31–41. © 2004 Wiley-Liss, Inc. [source]

Reconstructing ancestral ecologies: challenges and possible solutions

DIVERSITY AND DISTRIBUTIONS, Issue 1 2006
Christopher R. Hardy
ABSTRACT There are several ways to extract information about the evolutionary ecology of clades from their phylogenies. Of these, character state optimization and 'ancestor reconstruction' are perhaps the most widely used despite their being fraught with assumptions and potential pitfalls. Requirements for robust inferences of ancestral traits in general (i.e. those applicable to all types of characters) include accurate and robust phylogenetic hypotheses, complete species-level sampling and the appropriate choice of optimality criterion. Ecological characters, however, also require careful consideration of methods for accounting for intraspecific variability. Such methods include 'Presence Coding' and 'Polymorphism Coding' for discrete ecological characters, and 'Range Coding' and 'MaxMin Coding' for continuously variable characters. Ultimately, however, historical inferences such as these are, as with phylogenetic inference itself, associated with a degree of uncertainty. Statistically based uncertainty estimates are available within the context of model-based inference (e.g. maximum likelihood and Bayesian); however, these measures are only as reliable as the chosen model is appropriate. Although generally thought to preclude the possibility of measuring relative uncertainty or support for alternative possible reconstructions, certain useful non-statistical support measures (i.e. 'Sharkey support' and 'Parsimony support') are applicable to parsimony reconstructions. [source]

End-of-Sample Instability Tests

ECONOMETRICA, Issue 6 2003
D. W. K. Andrews
This paper considers tests for structural instability of short duration, such as at the end of the sample. The key feature of the testing problem is that the number, m, of observations in the period of potential change is relatively small, possibly as small as one. The well-known F test of Chow (1960) for this problem only applies in a linear regression model with normally distributed iid errors and strictly exogenous regressors, even when the total number of observations, n+m, is large. We generalize the F test to cover regression models with much more general error processes, regressors that are not strictly exogenous, and estimation by instrumental variables as well as least squares. In addition, we extend the F test to nonlinear models estimated by generalized method of moments and maximum likelihood. Asymptotic critical values that are valid as n → ∞ with m fixed are provided using a subsampling-like method. The results apply quite generally to processes that are strictly stationary and ergodic under the null hypothesis of no structural instability. [source]

Choosing the Number of Instruments

ECONOMETRICA, Issue 5 2001
Stephen G. Donald
Properties of instrumental variable estimators are sensitive to the choice of valid instruments, even in large cross-section applications. In this paper we address this problem by deriving simple mean-square error criteria that can be minimized to choose the instrument set. We develop these criteria for two-stage least squares (2SLS), limited information maximum likelihood (LIML), and a bias adjusted version of 2SLS (B2SLS). We give a theoretical derivation of the mean-square error and show optimality. In Monte Carlo experiments we find that the instrument choice generally yields an improvement in performance. Also, in the Angrist and Krueger (1991) returns to education application, when the instrument set is chosen in the way we consider, it turns out that both 2SLS and LIML give similar (large) returns to education. [source]

A Parametric Approach to Flexible Nonlinear Inference

ECONOMETRICA, Issue 3 2001
James D. Hamilton
This paper proposes a new framework for determining whether a given relationship is nonlinear, what the nonlinearity looks like, and whether it is adequately described by a particular parametric model. The paper studies a regression or forecasting model of the form y_t = μ(x_t) + ε_t where the functional form of μ(·) is unknown. We propose viewing μ(·) itself as the outcome of a random process. The paper introduces a new stationary random field m(·) that generalizes finite-differenced Brownian motion to a vector field and whose realizations could represent a broad class of possible forms for μ(·). We view the parameters that characterize the relation between a given realization of m(·) and the particular value of μ(·) for a given sample as population parameters to be estimated by maximum likelihood or Bayesian methods. We show that the resulting inference about the functional relation also yields consistent estimates for a broad class of deterministic functions μ(·). The paper further develops a new test of the null hypothesis of linearity based on the Lagrange multiplier principle and small-sample confidence intervals based on numerical Bayesian methods. An empirical application suggests that properly accounting for the nonlinearity of the inflation-unemployment trade-off may explain the previously reported uneven empirical success of the Phillips Curve. [source]