Genomic Sequence Data (genomic + sequence_data)


Selected Abstracts


Characterization of 11 microsatellite loci derived from genomic sequences of polychaete Capitella capitata complex

MOLECULAR ECOLOGY RESOURCES, Issue 6 2007
HONGWEI DU
Abstract Eleven microsatellite loci derived from the genomic sequence data of Capitella capitata were characterized using 30 samples. The observed number of alleles per locus ranged from four to 36. The observed and expected heterozygosities for polymorphic loci ranged from 0.10 to 0.87 and from 0.37 to 0.98, averaging 0.52 and 0.77, respectively. Analyses of Hardy-Weinberg equilibrium and genotypic linkage disequilibrium suggest the possible presence of both null alleles and a Wahlund effect. One of the 11 loci was difficult to amplify for genotyping. The remaining 10 loci are therefore good molecular markers for population genetic analysis.
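
As a rough illustration of the two statistics reported above, the following Python sketch computes observed and expected heterozygosity for a single locus. The function name `heterozygosity` and the genotype encoding (a list of allele pairs) are assumptions made here for illustration, and the expected value uses the simple estimator 1 - sum(p_i^2), which may differ from the estimator the authors used.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Expected heterozygosity is the simple 1 - sum(p_i^2) estimator
    (an assumption; no small-sample correction is applied).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected: based on allele frequencies pooled over all individuals.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical sample with no heterozygotes despite two common alleles,
# the kind of deficit that null alleles or a Wahlund effect can produce.
ho, he = heterozygosity([(1, 1), (1, 1), (2, 2), (2, 2)])
print(ho, he)
```

An observed value well below the expected value, as in the averages of 0.52 and 0.77 quoted in the abstract, is the pattern that motivates the null-allele and Wahlund-effect interpretation.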


Cloning and characterization of small RNAs from Medicago truncatula reveals four novel legume-specific microRNA families

NEW PHYTOLOGIST, Issue 1 2009
Guru Jagadeeswaran
Summary MicroRNAs (miRNAs) and small-interfering RNAs (siRNAs) have emerged as important regulators of gene expression in higher eukaryotes. Recent studies indicate that genomes in higher plants encode lineage-specific and species-specific miRNAs in addition to the well-conserved miRNAs. Leguminous plants are grown throughout the world for food and forage production. To date, the lack of genomic sequence data has prevented systematic examination of small RNAs in leguminous plants. Medicago truncatula, a diploid plant with a near-completely sequenced genome, has recently emerged as an important model legume. We sequenced a small RNA library generated from M. truncatula to identify not only conserved miRNAs but also novel small RNAs, if any. Eight novel small RNAs were identified, of which four (miR1507, miR2118, miR2119 and miR2199) are annotated as legume-specific miRNAs because they are conserved in related legumes. Three novel transcripts encoding TIR-NBS-LRR proteins were validated as targets for one of the novel miRNAs, miR2118. Small RNA sequence analysis, coupled with small RNA blot analysis, confirmed the expression of around 20 conserved miRNA families in M. truncatula. Fifteen transcripts were validated as targets for conserved miRNAs. We also characterized Tas3-siRNA biogenesis in M. truncatula and validated three auxin response factor (ARF) transcripts that are targeted by tasiRNAs. These findings indicate that M. truncatula, and possibly other related legumes, have complex mechanisms of gene regulation involving specific and common small RNAs operating post-transcriptionally.


An amino acid "transmembrane tendency" scale that approaches the theoretical limit to accuracy for prediction of transmembrane helices: Relationship to biological hydrophobicity

PROTEIN SCIENCE, Issue 8 2006
Gang Zhao
Abstract Hydrophobicity analyses applied to databases of soluble and transmembrane (TM) proteins of known structure were used to resolve total genomic hydrophobicity profiles into (helical) TM sequences and mainly "subhydrophobic" soluble components. This information was used to define a refined "hydrophobicity"-type TM sequence prediction scale that should approach the theoretical limit of accuracy. The refinement procedure involved adjusting scale values to eliminate differences between the average amino acid composition of populations of TM and soluble sequences of equal hydrophobicity, a required property of a scale having maximum accuracy. Application of this procedure to different hydrophobicity scales caused them to collapse to essentially a single TM tendency scale. As expected, when different scales were compared, the TM tendency scale was the most accurate at predicting TM sequences. It was especially highly correlated (r = 0.95) with the biological hydrophobicity scale, derived experimentally from the percent TM conformation formed by artificial sequences passing through the translocon. It was also found that resolution of total genomic sequence data into TM and soluble components could be used to define the percent probability that a sequence with a specific hydrophobicity value forms a TM segment. Application of the TM tendency scale to whole genomic data revealed an overlap of TM and soluble sequences in the "semihydrophobic" range. This raises the possibility that a significant number of proteins have sequences that can switch between TM and non-TM states. Such proteins may exist in moonlighting forms having properties very different from those of the predominant conformation.
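
To make the idea of a hydrophobicity-type prediction scale concrete, the sketch below averages per-residue scale values over a sliding window and flags windows above a cutoff. It uses the published Kyte-Doolittle hydropathy values as a stand-in, not the TM tendency scale described in the abstract, and the function names, the 19-residue window, and the 1.6 cutoff are illustrative assumptions rather than the authors' settings.

```python
# Kyte-Doolittle hydropathy values (a stand-in scale; NOT the TM tendency
# scale from the abstract, whose values are not given in this text).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(seq, window=19, scale=KYTE_DOOLITTLE):
    """Mean scale value of every full-length window along the sequence."""
    values = [scale[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def has_candidate_tm_segment(seq, window=19, threshold=1.6):
    """True if any window's mean hydropathy reaches the cutoff.

    The window and threshold are assumed defaults for illustration.
    """
    return any(s >= threshold for s in window_scores(seq, window))


# Hypothetical sequences: a hydrophobic stretch and a charged stretch.
print(has_candidate_tm_segment("LLLLLLLLLLLLLLLLLLL"))  # all-leucine window
print(has_candidate_tm_segment("DDDDDDDDDDDDDDDDDDD"))  # all-aspartate window
```

Swapping a different scale into the `scale` argument changes only the per-residue values; the abstract's point is that refined scales of this type converge on a single TM tendency scale.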


Nonoverlapping Clusters: Approximate Distribution and Application to Molecular Biology

BIOMETRICS, Issue 2 2001
Xiaoping Su
Summary. An approach is developed for the screening of genomic sequence data to identify gene regulatory regions. This approach is based on deciding if putative transcription factor binding sites are clustered together to a greater extent than one would expect by chance. Given n events occurring on an interval of width L (L base pairs), an r:w cluster is defined as r + 1 consecutive events all contained within a window of length wL. Accurate and easily computable approximations are derived for the distribution of the number of nonoverlapping r:w clusters under the model that the positions of the n events have a uniform distribution. Simulations demonstrate that these approximations have greater accuracy than existing methods. The approximation is applied to detect erythroid-specific regulatory regions in genomic DNA sequences, first in an artificial case where r is specified a priori and then as part of an exploratory approach.
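
The r:w cluster definition above can be sketched as a simple left-to-right scan over sorted event positions. This is one plausible reading of "nonoverlapping" (once a cluster is found, the scan resumes after its last event) and is an assumption, not necessarily the counting rule used in the paper; the function name and the `window` argument (the window length wL expressed in base pairs) are also hypothetical.

```python
def count_nonoverlapping_clusters(positions, r, window):
    """Count nonoverlapping r:w clusters among event positions.

    A cluster is r + 1 consecutive events whose first and last positions
    lie within `window` base pairs of each other. After a cluster is
    counted, scanning resumes at the event following that cluster
    (an assumed reading of "nonoverlapping").
    """
    pts = sorted(positions)
    count = 0
    i = 0
    while i + r < len(pts):
        if pts[i + r] - pts[i] <= window:
            count += 1
            i += r + 1  # skip past every event used by this cluster
        else:
            i += 1
    return count


# Hypothetical binding-site positions (base pairs) on one sequence.
sites = [10, 12, 15, 100, 101, 103, 500]
print(count_nonoverlapping_clusters(sites, r=2, window=10))
```

The observed count would then be compared with its distribution under uniformly placed events, which is the quantity the abstract's approximations are designed to provide.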