Protein Sequences (protein + sequence)



Selected Abstracts


Protein Sequences as Literature Text

MACROMOLECULAR THEORY AND SIMULATIONS, Issue 5 2006
Valentina V. Vasilevskaya
Abstract Summary: We have performed analysis of protein sequences treating them as texts written in a "protein" language. We have shown that repeating patterns (words) of various lengths can be identified in these sequences. It was found that the maximum word lengths are different for proteins belonging to different classes; therefore, the corresponding values can be used to characterize the protein type. The suggested technique was first applied to analyze (decompose into words) normal (literature) texts written as a gapless symbolic sequence without spaces and punctuation marks. The tests using fiction, scientific, and popular scientific English texts proved the relative efficiency of the technique. [Figure: maximum word length for various proteins (fibrillar, globular, and membrane proteins).] [source]
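
The abstract above does not say how words are identified, so the following is only a naive illustration of the general idea: strip a text down to a gapless symbol sequence, then look for the longest substring that occurs at least twice. The function names (`strip_text`, `repeated_words`, `max_repeated_word_length`) and the "occurs at least twice" criterion are assumptions made for this sketch, not the authors' method.

```python
from collections import Counter


def strip_text(text: str) -> str:
    """Turn ordinary text into a gapless sequence: letters only, lower case."""
    return "".join(ch.lower() for ch in text if ch.isalpha())


def repeated_words(seq: str, length: int, min_count: int = 2) -> dict[str, int]:
    """Return every substring of the given length seen at least min_count times."""
    counts = Counter(seq[i:i + length] for i in range(len(seq) - length + 1))
    return {word: n for word, n in counts.items() if n >= min_count}


def max_repeated_word_length(seq: str) -> int:
    """Length of the longest substring occurring at least twice (overlaps allowed)."""
    longest = 0
    for length in range(1, len(seq)):
        if repeated_words(seq, length):
            longest = length
        else:
            # If no word of this length repeats, no longer word can repeat,
            # because every prefix of a repeated word is itself repeated.
            break
    return longest


if __name__ == "__main__":
    gapless = strip_text("The cat. The dog!")
    print(gapless, max_repeated_word_length(gapless))
```

The same functions accept a one-letter amino acid string unchanged, since both inputs are plain symbol sequences. The scan is quadratic in the worst case; a suffix array would be the usual choice for long sequences.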


Proteome analysis of human foetal, aged and advanced nuclear cataract lenses

PROTEOMICS - CLINICAL APPLICATIONS, Issue 12 2008
Peter G. Hains
Abstract The most complete proteome of human lenses has been compiled using 2-D LC-MS/MS analysis of foetal, aged normal and advanced nuclear cataract lenses. A total of 231 proteins were identified across all lens groups, including 112 proteins that have not been reported previously. Proteins were grouped according to their PANTHER molecular function classification in order to facilitate comparisons. Previously unreported N-terminal acetylation was detected in a number of proteins, with the majority being associated with the prior removal of a methionine residue. This pattern of proteolysis may indicate that methionine aminopeptidase activity is present in human lenses. Acetylation is likely to aid in the stability of proteins that are present in the lens for many decades. Protein sequences were also used to interrogate the three human lens cDNA libraries publicly available. Surprisingly, 84 proteins we identified were not present in the cDNA libraries. [source]


The N-glycans of yellow jacket venom hyaluronidases and the protein sequence of its major isoform in Vespula vulgaris

FEBS JOURNAL, Issue 20 2005
Daniel Kolarich
Hyaluronidase (E.C. 3.2.1.35), one of the three major allergens of yellow jacket venom, is a glycoprotein of 45 kDa that is largely responsible for the cross-reactivity of wasp and bee venoms with sera of allergic patients. The asparagine-linked carbohydrate often appears to constitute the common IgE-binding determinant. Using a combination of MALDI MS and HPLC of 2-aminopyridine-labelled glycans, we found core-difucosylated paucimannosidic glycans to be the major species in the 43-45 kDa band of Vespula vulgaris and also in the corresponding bands of venoms from five other wasp species (V. germanica, V. maculifrons, V. pensylvanica, V. flavopilosa and V. squamosa). Concomitant peptide mapping of the V. vulgaris 43 kDa band identified the known hyaluronidase, Ves v 2 (SwissProt P49370), but only as a minor component. De novo sequencing by tandem MS revealed the predominating peptides to resemble a different, yet homologous, sequence. cDNA cloning retrieved a sequence with 58 and 59% homology to the previously known isoform and to the Dolichovespula maculata and Polistes annularis hyaluronidases. Close homologues of this new, putative hyaluronidase b (Ves v 2b) were also the major isoform in the other wasp venoms. [source]


Sjögren-Larsson syndrome: Diversity of mutations and polymorphisms in the fatty aldehyde dehydrogenase gene (ALDH3A2)

HUMAN MUTATION, Issue 1 2005
William B. Rizzo
Abstract Sjögren-Larsson syndrome (SLS) is an autosomal recessive disorder characterized by ichthyosis, mental retardation, and spastic diplegia or tetraplegia. The disease is caused by mutations in the ALDH3A2 gene (also known as FALDH and ALDH10) on chromosome 17p11.2 that encodes fatty aldehyde dehydrogenase (FALDH), an enzyme that catalyzes the oxidation of long-chain aldehydes derived from lipid metabolism. In SLS patients, 72 mutations have been identified, with a distribution that is scattered throughout the ALDH3A2 gene. Most mutations are private but several common mutations have been detected, which probably reflect founder effects or recurrent mutational events. Missense mutations comprise the most abundant class (38%) and expression studies indicate that most of these result in a profound reduction in enzyme activity. Deletions account for about 25% of the mutations and range from single nucleotides to entire exons. Twelve splice-site mutations have been demonstrated to cause aberrant splicing in cultured fibroblasts. To date, more than a dozen intragenic ALDH3A2 polymorphisms consisting of SNPs and one microsatellite marker have been characterized, although none of them alter the FALDH protein sequence. The striking mutational diversity in SLS offers a challenge for DNA-based diagnosis, but promises to provide a wealth of information about enzyme structure-function correlations. Hum Mutat 26(1), 1-10, 2005. © 2005 Wiley-Liss, Inc. [source]


Major histocompatibility complex class I binding predictions as a tool in epitope discovery

IMMUNOLOGY, Issue 3 2010
Claus Lundegaard
Summary Over the last decade, in silico models of the major histocompatibility complex (MHC) class I pathway have developed significantly. Before, peptide binding could only be reliably modelled for a few major human or mouse histocompatibility molecules; now, high-accuracy predictions are available for any human leucocyte antigen (HLA) -A or -B molecule with known protein sequence. Furthermore, peptide binding to MHC molecules from several non-human primates, mouse strains and other mammals can now be predicted. In this review, a number of different prediction methods are briefly explained, highlighting the most useful and historically important. Selected case stories, where these 'reverse immunology' systems have been used in actual epitope discovery, are briefly reviewed. We conclude that this new generation of epitope discovery systems has become a highly efficient tool for epitope discovery, and recommend that the less accurate prediction systems of the past be abandoned, as these are obsolete. [source]


Cloning and characterization of an immunoglobulin A Fc receptor from cattle

IMMUNOLOGY, Issue 2 2004
H. Craig Morton
Summary Here, we describe the cloning, sequencing and characterization of an immunoglobulin A (IgA) Fc receptor from cattle (bFcαR). By screening a translated EST database with the protein sequence of the human IgA Fc receptor (CD89) we identified a putative bovine homologue. Subsequent polymerase chain reaction (PCR) amplification confirmed that the identified full-length cDNA was expressed in bovine cells. COS-1 cells transfected with a plasmid containing the cloned cDNA bound to beads coated with either bovine or human IgA, but not to beads coated with bovine IgG2 or human IgG. The bFcαR cDNA is 873 nucleotides long and is predicted to encode a 269 amino-acid transmembrane glycoprotein composed of two immunoglobulin-like extracellular domains, a transmembrane region and a short cytoplasmic tail devoid of known signalling motifs. Genetically, bFcαR is more closely related to CD89, bFcγ2R, NKp46, and the KIR and LILR gene families than to other FcRs. Moreover, the bFcαR gene maps to the bovine leucocyte receptor complex on chromosome 18. Identification of the bFcαR will aid in the understanding of IgA-FcαR interactions, and may facilitate the isolation of FcαR from other species. [source]


cDNA of an arylphorin-type storage protein from Pieris rapae with parasitism inducible expression by the endoparasitoid wasp Pteromalus puparum

INSECT SCIENCE, Issue 3 2009
Jia-Ying Zhu
Abstract This report presents the cDNA cloning of a storage protein, PraAry, from Pieris rapae and investigates its expression regulated by parasitism of an endoparasitoid wasp Pteromalus puparum. The full-length cDNA of PraAry is 2 270 nucleotides and contains a 2 121 nucleotide open reading frame encoding 707 amino acids with a calculated molecular weight of approximately 83 kDa. Analysis of the primary protein sequence revealed that it possesses a signal peptide of 16 amino acids at the N-terminus and contains two highly conserved storage protein signature motifs. According to both phylogenetic analysis and the criteria for amino acid composition, PraAry belongs to the subfamily of arylphorin-type storage protein (1.42% methionine and 18.82% aromatic amino acids). Reverse transcription-polymerase chain reaction analysis indicated that the transcriptional level of PraAry mRNA in P. rapae pupae fat body is inducible in response to parasitism by P. puparum. [source]
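
The figures quoted in the abstract above can be sanity-checked with a few lines of arithmetic: a 2 121 nucleotide open reading frame corresponds to 2121 / 3 = 707 codons, and the composition criteria are simple residue percentages. The sketch below is illustrative only. The helper names are hypothetical, and the aromatic set (Phe, Trp, Tyr) is an assumption, since the abstract does not say which residues were counted. The demo string is a made-up toy sequence, not PraAry.

```python
# Assumption: "aromatic" means Phe, Trp and Tyr; the abstract does not list them.
AROMATIC = frozenset("FWY")


def codon_count(orf_nucleotides: int) -> int:
    """Number of codons in an open reading frame of the given nucleotide length."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nucleotides // 3


def percent(seq: str, residues) -> float:
    """Percentage of positions in seq whose residue belongs to `residues`."""
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(aa in residues for aa in seq) / len(seq)


if __name__ == "__main__":
    print(codon_count(2121))          # codons in the reported ORF
    toy = "MFYWAAAAAA"                # toy sequence for illustration only
    print(percent(toy, "M"), percent(toy, AROMATIC))
```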


Purification of Matrix Gla Protein From a Marine Teleost Fish, Argyrosomus regius: Calcified Cartilage and Not Bone as the Primary Site of MGP Accumulation in Fish

JOURNAL OF BONE AND MINERAL RESEARCH, Issue 2 2003
DC Simes
Abstract Matrix Gla protein (MGP) belongs to the family of vitamin K-dependent, Gla-containing proteins, and in mammals, birds, and Xenopus, its mRNA was previously detected in extracts of bone, cartilage, and soft tissues (mainly heart and kidney), whereas the protein was found to accumulate mainly in bone. However, at that time, it was not evaluated if this accumulation originated from protein synthesized in cartilage or in bone cells because both coexist in skeletal structures of higher vertebrates and Xenopus. Later reports showed that MGP also accumulated in costal calcified cartilage as well as at sites of heart valves and arterial calcification. Interestingly, MGP was also found to accumulate in vertebra of shark, a cartilaginous fish. However, to date, no information is available on sites of MGP expression or accumulation in teleost fishes, the ancestors of terrestrial vertebrates, who have in their skeleton mineralized structures with both bone and calcified cartilage. To analyze MGP structure and function in bony fish, MGP was acid-extracted from the mineralized matrix of either bone tissue (vertebra) or calcified cartilage (branchial arches) from the bony fish, Argyrosomus regius, separated from the mineral phase by dialysis, and purified by Sephacryl S-100 chromatography. No MGP was recovered from bone tissue, whereas a protein peak corresponding to the MGP position in this type of gel filtration was obtained from an extract of branchial arches, rich in calcified cartilage. MGP was identified by N-terminal amino acid sequence analysis, and the resulting protein sequence was used to design specific oligonucleotides suitable to amplify the corresponding DNA by a mixture of reverse transcription-polymerase chain reaction (RT-PCR) and 5′-rapid amplification of cDNA ends (RACE)-PCR. In parallel, ArBGP (bone Gla protein, osteocalcin) was also identified in the same fish, and its complementary DNA cloned by an identical procedure.
Tissue distribution/accumulation was analyzed by Northern blot, in situ hybridization, and immunohistochemistry. In mineralized tissues, the MGP gene was predominantly expressed in cartilage from branchial arches, with no expression detected in the different types of bone analyzed, whereas BGP mRNA was located in bone tissue as expected. Accordingly, the MGP protein was found to accumulate, by immunohistochemical analysis, mainly in the extracellular matrix of calcified cartilage. In soft tissues, MGP mRNA was mainly expressed in heart, but in situ hybridization indicated that cells expressing the MGP gene were located in the bulbus arteriosus and aortic wall, rich in smooth muscle and endothelial cells, whereas no expression was detected in the striated muscle myocardial fibers of the ventricle. These results show that in marine teleost fish, as in mammals, the MGP gene is expressed in cartilage, heart, and kidney tissues, but in contrast with results obtained in Xenopus and higher vertebrates, the protein does not accumulate in vertebra of non-osteocytic teleost fish, but only in calcified cartilage. In addition, our results also indicate that the presence of MGP mRNA in heart tissue is due, at least in fish, to the expression of the MGP gene in only two specific cell types, smooth muscle and endothelial cells, whereas no expression was found in the striated muscle fibers of the ventricle. In light of these results and recent information on expression of MGP gene in these same cell types in mammalian aorta, it is likely that the levels of MGP mRNA previously detected in Xenopus, birds, and mammalian heart tissue may be restricted to regions rich in smooth muscle and endothelial cells. Our results also emphasize the need to re-evaluate which cell types are involved in MGP gene expression in other soft tissues and bring further evidence that fish are a valuable model system to study MGP gene expression and regulation. [source]


An efficient algorithm for multistate protein design based on FASTER

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 5 2010
Benjamin D. Allen
Abstract Most of the methods that have been developed for computational protein design involve the selection of side-chain conformations in the context of a single, fixed main-chain structure. In contrast, multistate design (MSD) methods allow sequence selection to be driven by the energetic contributions of multiple structural or chemical states simultaneously. This methodology is expected to be useful when the design target is an ensemble of related states rather than a single structure, or when a protein sequence must assume several distinct conformations to function. MSD can also be used with explicit negative design to suggest sequences with altered structural, binding, or catalytic specificity. We report implementation details of an efficient multistate design optimization algorithm based on FASTER (MSD-FASTER). We subjected the algorithm to a battery of computational tests and found it to be generally applicable to various multistate design problems; designs with a large number of states and many designed positions are completely feasible. A direct comparison of MSD-FASTER and multistate design Monte Carlo indicated that MSD-FASTER discovers low-energy sequences much more consistently. MSD-FASTER likely performs better because amino acid substitutions are chosen on an energetic basis rather than randomly, and because multiple substitutions are applied together. Through its greater efficiency, MSD-FASTER should allow protein designers to test experimentally better-scoring sequences, and thus accelerate progress in the development of improved scoring functions and models for computational protein design. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2010 [source]


Prediction of integral membrane protein type by collocated hydrophobic amino acid pairs

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 1 2009
Ke Chen
Abstract A computational model, IMP-TYPE, is proposed for the classification of five types of integral membrane proteins from protein sequence. The proposed model aims not only at providing accurate predictions but most importantly it incorporates interesting and transparent biological patterns. When contrasted with the best-performing existing models, IMP-TYPE reduces the error rates of these methods by 19 and 34% for two out-of-sample tests performed on benchmark datasets. Our empirical evaluations also show that the proposed method provides even bigger improvements, i.e., 29 and 45% error rate reductions, when predictions are performed for sequences that share low (40%) identity with sequences from the training dataset. We also show that IMP-TYPE can be used in a standalone mode, i.e., it duplicates significant majority of correct predictions provided by other leading methods, while providing additional correct predictions which are incorrectly classified by the other methods. Our method computes predictions using a Support Vector Machine classifier that takes feature-based encoded sequence as its input. The input feature set includes hydrophobic AA pairs, which were selected by utilizing a consensus of three feature selection algorithms. The hydrophobic residues that build up the AA pairs used by our method are shown to be associated with the formation of transmembrane helices in a few recent studies concerning integral membrane proteins. Our study also indicates that Met and Phe display a certain degree of hydrophobicity, which may be more crucial than their polarity or aromaticity when they occur in the transmembrane segments. This conclusion is supported by a recent study on potential of mean force for membrane protein folding and a study of scales for membrane propensity of amino acids. © 2008 Wiley Periodicals, Inc. J Comput Chem, 2009 [source]
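
To make the idea of collocated hydrophobic amino acid pairs concrete, the sketch below counts pairs of hydrophobic residues separated by a fixed number of intervening positions and turns the counts into a fixed-length feature vector of the kind a classifier such as a Support Vector Machine could take as input. This is a hedged illustration, not IMP-TYPE itself: the hydrophobic residue set, the gap range, the normalization, and the function names are all assumptions, and the abstract's consensus feature selection step is not reproduced.

```python
from collections import Counter
from itertools import product

# Assumption: an illustrative hydrophobic set; the set used by IMP-TYPE may differ.
HYDROPHOBIC = "AFILMVW"


def collocated_pair_counts(seq: str, gap: int) -> Counter:
    """Count hydrophobic residue pairs separated by exactly `gap` other residues."""
    counts = Counter()
    step = gap + 1
    for i in range(len(seq) - step):
        a, b = seq[i], seq[i + step]
        if a in HYDROPHOBIC and b in HYDROPHOBIC:
            counts[(a, b)] += 1
    return counts


def feature_vector(seq: str, max_gap: int = 2) -> list[float]:
    """Pair frequencies for gaps 0..max_gap, in a fixed (gap, first, second) order."""
    features = []
    for gap in range(max_gap + 1):
        counts = collocated_pair_counts(seq, gap)
        positions = max(len(seq) - gap - 1, 1)  # number of pair positions scanned
        for pair in product(HYDROPHOBIC, repeat=2):
            features.append(counts[pair] / positions)
    return features


if __name__ == "__main__":
    print(collocated_pair_counts("LLAKL", 0))
    print(len(feature_vector("LLAKL")))
```

Because the vector has the same length and ordering for every sequence, vectors from many proteins can be stacked into a matrix for training.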


Using grey dynamic modeling and pseudo amino acid composition to predict protein structural classes

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 12 2008
Xuan Xiao
Abstract Using the pseudo amino acid (PseAA) composition to represent the sample of a protein can incorporate a considerable amount of sequence pattern information so as to improve the prediction quality for its structural or functional classification. However, how to optimally formulate the PseAA composition is an important problem yet to be solved. In this article the grey modeling approach is introduced that is particularly efficient in coping with complicated systems such as the one consisting of many proteins with different sequence orders and lengths. On the basis of the grey model, four coefficients derived from each of the protein sequences concerned are adopted for its PseAA components. The PseAA composition thus formulated is called the "grey-PseAA" composition that can catch the essence of a protein sequence and better reflect its overall pattern. In our study we have demonstrated that introduction of the grey-PseAA composition can remarkably enhance the success rates in predicting the protein structural class. It is anticipated that the concept of grey-PseAA composition can be also used to predict many other protein attributes, such as subcellular localization, membrane protein type, enzyme functional class, GPCR type, protease type, among many others. © 2008 Wiley Periodicals, Inc. J Comput Chem 2008. [source]


Using pseudo amino acid composition to predict protein structural classes: Approached with complexity measure factor

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 4 2006
Xuan Xiao
Abstract The structural class is an important feature widely used to characterize the overall folding type of a protein. How to improve the prediction quality for protein structural classification by effectively incorporating the sequence-order effects is an important and challenging problem. Based on the concept of the pseudo amino acid composition [Chou, K. C. Proteins Struct Funct Genet 2001, 43, 246; Erratum: Proteins Struct Funct Genet 2001, 44, 60], a novel approach for measuring the complexity of a protein sequence was introduced. The advantage of incorporating the complexity measure factor into the pseudo amino acid composition as one of its components is that it can catch the essence of the overall sequence pattern of a protein and hence more effectively reflect its sequence-order effects. It was demonstrated through the jackknife cross-validation test that the overall success rate by the new approach was significantly higher than those by the others. It has not escaped our notice that the introduction of the complexity measure factor can also be used to improve the prediction quality for, among many other protein attributes, subcellular localization, enzyme family class, membrane protein type, and G-protein-coupled receptor type. © 2006 Wiley Periodicals, Inc. J Comput Chem 27: 478-482, 2006 [source]


Genetic heterogeneity of G and F protein genes from Argentinean human metapneumovirus strains

JOURNAL OF MEDICAL VIROLOGY, Issue 5 2006
Monica Galiano
Abstract Human metapneumovirus (hMPV) is a newly identified paramyxovirus, associated with respiratory illnesses in all age groups. Two genetic groups of hMPV have been described. The nucleotide sequences of the G and F genes from 11 Argentinean hMPV strains (1998-2003) were determined by RT-PCR and direct sequencing. Phylogenetic analysis showed that hMPV strains clustered into two main genetic lineages, A and B. Strains clustered into A group were split into two sublineages, A1 and A2. All strains belonging to group B clustered with representative strains from sublineage B1. No Argentinean strains belonged to sublineage B2. F sequences showed high percentage identities at nucleotide and amino acid levels. In contrast, G sequences showed high diversity between A and B groups. Most changes observed in the deduced G protein sequence were amino acid substitutions in the extracellular domain, and changes in stop codon usage leading to different lengths in the G proteins. A high content of serine and threonine residues was also shown, suggesting that this protein would be highly glycosylated. The potential sites for N- and O-glycosylation seem to have a different conservation pattern between the two main groups. This is the first report on the genetic variability of the G and F protein genes of hMPV strains in South America. Two main genetic groups and at least three subgroups were revealed among Argentinean hMPV strains. The F protein seems to be highly conserved, whereas the G protein showed extensive diversity between groups A and B. J. Med. Virol. 78:631-637, 2006. © 2006 Wiley-Liss, Inc. [source]


Machine learning approaches for prediction of linear B-cell epitopes on proteins

JOURNAL OF MOLECULAR RECOGNITION, Issue 3 2006
Johannes Söllner
Abstract Identification and characterization of antigenic determinants on proteins has received considerable attention utilizing both experimental and computational methods. For computational routines, mostly structural as well as physicochemical parameters have been utilized for predicting the antigenic propensity of protein sites. However, the performance of computational routines has been low when compared to experimental alternatives. Here we describe the construction of machine learning based classifiers to enhance the prediction quality for identifying linear B-cell epitopes on proteins. Our approach combines several parameters previously associated with antigenicity, and includes novel parameters based on frequencies of amino acids and amino acid neighborhood propensities. We utilized machine learning algorithms for deriving antigenicity classification functions assigning antigenic propensities to each amino acid of a given protein sequence. We compared the prediction quality of the novel classifiers with respect to established routines for epitope scoring, and tested prediction accuracy on experimental data available for HIV proteins. The major finding is that machine learning classifiers clearly outperform the reference classification systems on the HIV epitope validation set. Copyright © 2006 John Wiley & Sons, Ltd. [source]


Arthropod defensins illuminate the divergence of scorpion neurotoxins

JOURNAL OF PEPTIDE SCIENCE, Issue 12 2004
Dr Oren Froy
Abstract Defensins are phylogenetically ancient antibacterial polypeptides found in plants and animals. Isolation of the cDNA and genomic sequences encoding the scorpion (Leiurus quinquestriatus hebraeus) defensin revealed similarity to scorpion neurotoxins in gene organization (two exons and a phase I intron) and intron characteristics (conserved acceptor, donor and putative branch sites). This commonality, alongside a similar core structure, protein sequence and bioactivity suggest that arthropod defensins and scorpion neurotoxins share a common ancestor. Interestingly, phylogenetic analysis of defensins and scorpion neurotoxins illuminates for the first time a putative evolutionary trajectory for scorpion sodium and potassium channel neurotoxins. Copyright © 2004 European Peptide Society and John Wiley & Sons, Ltd. [source]


Formulation considerations for proteins susceptible to asparagine deamidation and aspartate isomerization

JOURNAL OF PHARMACEUTICAL SCIENCES, Issue 11 2006
Aditya A. Wakankar
Abstract The asparagine (Asn) deamidation and aspartate (Asp) isomerization reactions are nonenzymatic intra-molecular reactions occurring in peptides and proteins that are a source of major stability concern in the formulation of these biomolecules. The mechanisms for the deamidation and isomerization reactions are similar since they both proceed through an intra-molecular cyclic imide (Asu) intermediate. The formation of the Asu intermediate, which involves the attack by nitrogen of the peptide backbone on the carbonyl carbon of the Asn or the Asp side chain, is the rate-limiting step in both the deamidation and the isomerization reactions at physiological pH. In this article, the influence of factors such as formulation conditions, protein primary sequence, and protein structure on the reactivity of Asn and Asp residues in proteins are reviewed. The importance of formulation conditions such as pH and solvent dielectric in influencing deamidation and isomerization reaction rates is addressed. Formulation strategies that could improve the stability of proteins to deamidation and isomerization reactions are described. The review is intended to provide information to formulation scientists, based on protein sequence and structure, to predict potential degradative sites on a protein molecule and to enable formulation scientists to set appropriate formulation conditions to minimize reactivity of Asn and Asp residues in protein therapeutics. © 2006 Wiley-Liss, Inc. and the American Pharmacists Association J Pharm Sci 95:2321-2336, 2006 [source]


Cytological Alterations Produced by Sweet Potato Mild Speckling Virus

JOURNAL OF PHYTOPATHOLOGY, Issue 7-8 2006
C. F. Nome
Abstract The biological properties and the coat protein sequence of the potyvirus sweet potato mild speckling virus (SPMSV) have already been described. In this work, the cytological alterations it produces and its intracellular localization in Ipomoea setosa and Ipomoea batatas were studied. The observations were carried out by means of transmission electron microscopy, complemented with immunogold techniques for viral localization using a locally produced SPMSV antiserum. The observations showed almost no alteration of cell components apart from the presence of cylindrical inclusions in the cytoplasm (bundles, laminate aggregates, and pinwheels, but neither circles nor scrolls) belonging to type 2 in the classification of Edwardson and Christie (Cylindrical Inclusions. Bulletin 894, 1996, pp. 1-11). Gold particles were localized in the cytoplasm of all tissues of the leaf. [source]


Applications of mass spectrometry for the structural characterization of recombinant protein pharmaceuticals

MASS SPECTROMETRY REVIEWS, Issue 3 2007
Catherine A. Srebalus Barnes
Abstract Therapeutic proteins produced using recombinant DNA technologies are generally complex, heterogeneous, and subject to a variety of enzymatic or chemical modifications during expression, purification, and long-term storage. The use of mass spectrometry (MS) for the evaluation of recombinant protein sequence and structure provides detailed information regarding amino acid modifications and sequence alterations that have the potential to affect the safety and activity of therapeutic protein products. General MS approaches for the characterization of recombinant therapeutic protein products will be reviewed with particular attention given to the standard MS tools available in most biotechnology laboratories. A number of recent examples will be used to illustrate the utility of MS strategies for evaluation of recombinant protein heterogeneity resulting from post-translational modifications (PTMs), sequence variations generated from proteolysis or transcriptional/translational errors, and degradation products which are formed during processing or final product storage. Specific attention will be given to the MS characterization of monoclonal antibodies as a model system for large, glycosylated, recombinant proteins. Detailed examples highlighting the use of MS for the analysis of monoclonal antibody glycosylation, deamidation, and disulfide mapping will be used to illustrate the application of these techniques to a wide variety of heterogeneous therapeutic protein products. The potential use of MS to support cell line/clone selection and formulation development for therapeutic antibody products will also be discussed. © 2007 Wiley Periodicals, Inc., Mass Spec Rev [source]


Functional characterization of AP3, SOC1 and WUS homologues from citrus (Citrus sinensis)

PHYSIOLOGIA PLANTARUM, Issue 3 2007
Fui-Ching Tan
Flowering and flower formation are defining features of angiosperms and the control of these developmental processes involves a common repertoire of genes which are shared among different species of flowering plants. These genes were first identified using various homeotic and flowering time mutants of Arabidopsis and snapdragon, and homologous genes have subsequently been isolated from a wide range of different plant species based on the conservation of protein sequence and function. Using degenerate reverse-transcriptase polymerase chain reaction, we have isolated one APETALA3-like (CitMADS8) and two SOC1 (SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1)-like (CsSL1 and CsSL2) homologues from sweet orange (Citrus sinensis L.). Although the translated amino acid sequence of CitMADS8 shares many similarities with other higher plant APETALA3 proteins, CitMADS8 fails to complement the floral organ identity defects of the Arabidopsis ap3-3 mutant. By contrast, the two citrus SOC1-like genes, particularly CsSL1, are able to shorten the time taken to flower in the Arabidopsis wild-type ecotypes Columbia and C24, and functionally complement the late flowering phenotype of the soc1 mutant, essentially performing the endogenous function of Arabidopsis SOC1. Once flowering has commenced, interactions between specific flowering genes and a gene required for meristem maintenance, WUSCHEL, ensure that the Arabidopsis flower is a determinate structure with four whorls. We have isolated a citrus WUSCHEL homologue (CsWUS) that is capable of restoring most of the meristem function to the shoots and flowers of the Arabidopsis wus-1 mutant, implying that CsWUS is the functional equivalent of Arabidopsis WUSCHEL. [source]


Up-Regulation of OsBIHD1, a Rice Gene Encoding BELL Homeodomain Transcriptional Factor, in Disease Resistance Responses

PLANT BIOLOGY, Issue 5 2005
H. Luo
Abstract: In the present study, we cloned and identified a full-length cDNA of a rice gene, OsBIHD1, encoding a homeodomain type transcriptional factor. OsBIHD1 is predicted to encode a 642 amino acid protein and the deduced protein sequence of OsBIHD1 contains all conserved domains, a homeodomain, a BELL domain, a SKY box, and a VSLTLGL box, which are characteristics of the BELL type homeodomain proteins. The recombinant OsBIHD1 protein expressed in Escherichia coli bound to the TGTCA motif that is the characteristic cis-element DNA sequence of the homeodomain transcriptional factors. Subcellular localization analysis revealed that the OsBIHD1 protein localized in the nucleus of the plant cells. The OsBIHD1 gene was mapped to chromosome 3 of the rice genome and is a single-copy gene with four exons and three introns. Northern blot analysis showed that expression of OsBIHD1 was activated upon treatment with benzothiadiazole (BTH), which is capable of inducing disease resistance. Expression of OsBIHD1 was also up-regulated rapidly during the first 6 h after inoculation with Magnaporthe grisea in BTH-treated rice seedlings and during the incompatible interaction between M. grisea and a resistant genotype. These results suggest that OsBIHD1 is a BELL type of homeodomain transcription factor present in the nucleus, whose induction is associated with resistance response in rice. [source]


Evolutionary constraints on structural similarity in orthologs and paralogs

PROTEIN SCIENCE, Issue 6 2009
Mark E. Peterson
Abstract Although a quantitative relationship between sequence similarity and structural similarity has long been established, little is known about the impact of orthology on the relationship between protein sequence and structure. Among homologs, orthologs (derived by speciation) more frequently have similar functions than paralogs (derived by duplication). Here, we hypothesize that an orthologous pair will tend to exhibit greater structural similarity than a paralogous pair at the same level of sequence similarity. To test this hypothesis, we used 284,459 pairwise structure-based alignments of 12,634 unique domains from SCOP as well as orthology and paralogy assignments from OrthoMCL DB. We divided the comparisons by sequence identity and determined whether the sequence-structure relationship differed between the orthologs and paralogs. We found that at levels of sequence identity between 30 and 70%, orthologous domain pairs indeed tend to be significantly more structurally similar than paralogous pairs at the same level of sequence identity. An even larger difference is found when comparing ligand binding residues instead of whole domains. These differences between orthologs and paralogs are expected to be useful for selecting template structures in comparative modeling and target proteins in structural genomics. [source]
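
The binning step described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 10-point bin width, the gap-excluding definition of percent identity, and the generic "structural score" are all assumptions made for the example.

```python
from collections import defaultdict


def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: identity is normalized by the number of gap-free columns;
    other normalizations (e.g. by the shorter sequence length) are also common.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def bin_by_identity(pairs, bin_width=10):
    """Group (identity, structural_score, relation) tuples into identity bins.

    Returns {bin_lower_bound: {relation: [scores]}}, where relation is a
    label such as "ortholog" or "paralog" (hypothetical labels).
    """
    bins = defaultdict(lambda: defaultdict(list))
    for identity, score, relation in pairs:
        # Clamp 100% identity into the top bin instead of opening a new one.
        lower = min(int(identity // bin_width) * bin_width, 100 - bin_width)
        bins[lower][relation].append(score)
    return bins
```

Within each bin, the ortholog and paralog score lists can then be compared with whatever statistical test suits the chosen structural similarity measure.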


NMR solution structure of KP-TerB, a tellurite-resistance protein from Klebsiella pneumoniae

PROTEIN SCIENCE, Issue 4 2008
Sheng-Kuo Chiang
Abstract Klebsiella pneumoniae (KP), a Gram-negative bacterium, is a common cause of hospital-acquired bacterial infections worldwide. Tellurium (Te) compounds, although relatively rare in the environment, have a long history as antimicrobial and therapeutic agents. In bacteria, tellurite (TeO₃²⁻) resistance is conferred by the ter (Ter) operon (terZABCDEF). Here, on the basis of 2593 restraints derived from NMR analysis, we report the NMR structure of TerB protein (151 amino acids) of KP (KP-TerB), which is mainly composed of seven α-helices and a 3₁₀ helix, with helices II to V apparently forming a four-helix bundle. The ensemble of 20 NMR structures was well-defined, with an RMSD of 0.32 ± 0.06 Å for backbone atoms and 1.11 ± 0.07 Å for heavy atoms, respectively. A unique property of the KP-TerB structure is that the positively and negatively charged clusters are formed by the N-terminal positively and C-terminal negatively charged residues, respectively. To the best of our knowledge, the protein sequence and structures of KP-TerB are unique. [source]


Crystal structure of the yeast His6 enzyme suggests a reaction mechanism

PROTEIN SCIENCE, Issue 6 2006
Sophie Quevillon-Cheruel
Abstract The Saccharomyces cerevisiae His6 gene codes for the enzyme phosphoribosyl-5-amino-1-phosphoribosyl-4-imidazolecarboxamide isomerase, catalyzing the fourth step in histidine biosynthesis. To get an insight into the structure and function of this enzyme, we determined its X-ray structure at a resolution of 1.30 Å using the anomalous diffraction signal of the protein's sulphur atoms at 1.77 Å wavelength. His6 folds in a (β/α)₈ barrel similar to HisA, which performs the same function in bacteria and archaea. We found a citrate molecule from the buffer bound in a pocket near the expected position of the active site and used it to model the open form of the substrate (phosphoribulosyl moiety), which is a reaction intermediate. This model enables us to identify catalytic residues and to propose a reaction mechanism where two aspartates act as acid/base catalysts: Asp134 as a proton donor for ring opening, and Asp9 as a proton acceptor and donor during enolization of the aminoaldose. Asp9 is conserved in yeast His6 and bacterial or archaeal HisA sequences, and Asp134 has equivalents in both HisA and TrpF, but they occur at a different position in the protein sequence. [source]


Simultaneous assignment and structure determination of a membrane protein from NMR orientational restraints

PROTEIN SCIENCE, Issue 3 2003
Francesca M. Marassi
Abstract A solid-state NMR approach for simultaneous resonance assignment and three-dimensional structure determination of a membrane protein in lipid bilayers is described. The approach is based on the scattering, hence the descriptor "shotgun," of 15N-labeled amino acids throughout the protein sequence (and the resulting NMR spectra). The samples are obtained by protein expression in bacteria grown on media in which one type of amino acid is labeled and the others are not. Shotgun NMR short-circuits the laborious and time-consuming process of obtaining complete sequential assignments prior to the calculation of a protein structure from the NMR data by taking advantage of the orientational information inherent to the spectra of aligned proteins. As a result, it is possible to simultaneously assign resonances and measure orientational restraints for structure determination. A total of five two-dimensional 1H/15N PISEMA (polarization inversion spin exchange at the magic angle) spectra, from one uniformly and four selectively 15N-labeled samples, were sufficient to determine the structure of the membrane-bound form of the 50-residue major pVIII coat protein of fd filamentous bacteriophage. Pisa (polarity index slant angle) wheels are an essential element in the process, which starts with the simultaneous assignment of resonances and the assembly of isolated polypeptide segments, and culminates in the complete three-dimensional structure of the protein with atomic resolution. The principles are also applicable to weakly aligned proteins studied by solution NMR spectroscopy. [The structure we determined for the membrane-bound form of the fd bacteriophage pVIII coat protein has been deposited in the Protein Data Bank as PDB file 1MZT.] [source]


Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments

PROTEIN SCIENCE, Issue 11 2000
Iddo Friedberg
Abstract The PSI-BLAST algorithm has been acknowledged as one of the most powerful tools for detecting remote evolutionary relationships by sequence considerations only. This has been demonstrated by its ability to recognize remote structural homologues and by the greatest coverage it enables in annotation of a complete genome. Although recognizing the correct fold of a sequence is of major importance, the accuracy of the alignment is crucial for the success of modeling one sequence by the structure of its remote homologue. Here we assess the accuracy of PSI-BLAST alignments on a stringent database of 123 structurally similar, sequence-dissimilar pairs of proteins, by comparing them to the alignments defined on a structural basis. Each protein sequence is compared to a nonredundant database of the protein sequences by PSI-BLAST. Whenever a pair member detects its pair-mate, the positions that are aligned both in the sequential and structural alignments are determined, and the alignment sensitivity is expressed as the percentage of these positions out of the structural alignment. Fifty-two sequences detected their pair-mates (for 16 pairs the success was bi-directional when either pair member was used as a query). The average percentage of correctly aligned residues per structural alignment was 43.5 ± 2.2%. Other properties of the alignments were also examined, such as the sensitivity vs. specificity and the change in these parameters over consecutive iterations. Notably, there is an improvement in alignment sensitivity over consecutive iterations, reaching an average of 50.9 ± 2.5% within the five iterations tested in the current study. [source]
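
The alignment sensitivity defined in the abstract above (positions aligned in both the sequence and the structural alignment, as a percentage of the structural alignment) can be sketched as follows. The representation of an alignment as a set of residue-index pairs and both function names are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
def aligned_pairs(aligned_a, aligned_b):
    """Convert two gapped, equal-length alignment rows into a set of
    (index_in_a, index_in_b) pairs for columns where both have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    pairs = set()
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(aligned_a, aligned_b):
        if a != "-" and b != "-":
            pairs.add((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


def alignment_sensitivity(sequence_pairs, structural_pairs):
    """Percentage of structurally aligned positions that the sequence
    alignment reproduces exactly."""
    if not structural_pairs:
        raise ValueError("structural alignment has no aligned positions")
    shared = sequence_pairs & structural_pairs
    return 100.0 * len(shared) / len(structural_pairs)
```

For example, a structural alignment that pairs all five residues of two sequences and a sequence alignment that shifts one residue into gaps would share four of the five pairs, giving a sensitivity of 80%.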


A combination of neutral loss and targeted product ion scanning with two enzymatic digestions facilitates the comprehensive mapping of phosphorylation sites

PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 15 2007
Juan Casado-Vela
Abstract We propose here a new strategy for the exhaustive mapping of phosphorylation sites in the Xenopus laevis Cdc25 phosphatase, which regulates cell cycle progression in eukaryotic cells. Two different MS analyses in a linear IT were used to identify the phosphorylated residues. First, a data-dependent neutral loss (DDNL) analysis triggered the fragmentation of peptides that show enhanced neutral loss of phosphoric acid. Second, a targeted product ion scanning (TPIS) mass analysis was carried out in which MS2 events are triggered for specific m/z values. Full coverage of the protein sequence was obtained by combining the two analyses with two enzymatic digestions, trypsin and chymotrypsin, yielding a comprehensive map of the phosphorylation sites. Previous reports have shown Cdc25C to be phosphorylated by Cdc2-cyclin B at four residues (Thr48, Thr67, Thr138 and Ser205). By using this combination of scan modes, we have identified four additional phosphorylation sites (Thr86, Ser99, Thr112 and Ser163) in a recombinant Cdc25C protein containing 198 residues of the NH2-terminal noncatalytic domain. The sensitivity of this combined approach makes it extremely useful for the comprehensive characterization of phosphorylation sites, virtually permitting complete coverage of the protein sequence with peptides within the mass detection range of the linear IT. [source]


GNBSL: A new integrative system to predict the subcellular location for Gram-negative bacteria proteins

PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 19 2006
Jian Guo
Abstract This paper proposes a new integrative system (GNBSL: Gram-negative bacteria subcellular localization) for subcellular localization, specialized for Gram-negative bacterial proteins. First, the system generates a position-specific frequency matrix (PSFM) and a position-specific scoring matrix (PSSM) for each protein sequence by searching the Swiss-Prot database. Then different features are extracted by four modules from the PSFM and the PSSM. The features include whole-sequence amino acid composition, N- and C-terminus amino acid composition, dipeptide composition, and segment composition. Four probabilistic neural network (PNN) classifiers are used to classify these modules. To further improve the performance, two modules trained by support vector machine (SVM) are added to this system. One module extracts the residue-couple distribution from the amino acid sequence, and the other module applies a pairwise profile alignment kernel to measure the local similarity between every two sequences. Finally, an additional SVM is used to fuse the outputs from the six modules. Tests on a benchmark dataset show that the overall success rate of GNBSL is higher than those of PSORT-B, CELLO, and PSLpred. The GNBSL web server can be visited at http://166.111.24.5/webtools/GNBSL/index.htm. [source]
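
Three of the feature types named in the abstract above (whole-sequence amino acid composition, N- and C-terminus composition, and dipeptide composition) can be sketched as follows. This is a simplified illustration only: GNBSL is described as extracting its features from the PSFM and PSSM, whereas this sketch counts residues in the raw sequence. The function names and the 20-residue terminal window are assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """20-dimensional vector of residue frequencies (non-standard letters ignored)."""
    counts = Counter(c for c in sequence.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """400-dimensional vector of frequencies of adjacent residue pairs."""
    seq = sequence.upper()
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[k] for k in keys)  # only standard dipeptides are counted
    return [counts[k] / total if total else 0.0 for k in keys]


def terminal_composition(sequence, n_residues=20):
    """Concatenated compositions of the first and last n_residues residues.

    The window size is a hypothetical choice, not taken from the abstract.
    """
    return (amino_acid_composition(sequence[:n_residues])
            + amino_acid_composition(sequence[-n_residues:]))
```

Concatenating these vectors gives a fixed-length representation that a classifier such as an SVM or a PNN can consume regardless of protein length.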


Effect of myostatin F94L on carcass yield in cattle

ANIMAL GENETICS, Issue 5 2007
G. S. Sellick
Summary In this study, a highly significant quantitative trait locus (QTL) for meat percentage, eye muscle area (EMA) and silverside percentage was found on cattle chromosome 2 at 0–15 cM, a region containing the positional candidate gene growth differentiation factor 8 (GDF8), which has the common alias myostatin (MSTN). Loss-of-function mutations in the MSTN gene are known to cause an extreme 'double muscling' phenotype in cattle. In this study, highly significant associations of MSTN with cattle carcass traits were found using maternally inherited MSTN haplotypes from outbred Limousin and Jersey cattle in a linkage disequilibrium analysis. A previously reported transversion in MSTN (AF320998.1:g.433C>A), resulting in the amino acid substitution of phenylalanine by leucine at position 94 of the protein sequence (F94L), was the only polymorphism consistently related to increased muscling. Overall, the size of the g.433C>A additive effect on carcass traits was moderately large, with the g.433A allele found to be associated with a 5.5% increase in silverside percentage and EMA and a 2.3% increase in total meat percentage relative to the g.433C allele. The phenotypic effects of the g.433A allele were partially recessive. This study provides strong evidence that a MSTN genotype can produce an intermediate, non-double muscling phenotype, which should be of significant value for beef cattle producers. [source]


Isolation, sequence, and chromosomal localisation of the human IκBR gene (NFKBIL2)

ANNALS OF HUMAN GENETICS, Issue 1 2000
D. A. M. NORMAN
The inhibitors of NF-κB (IκBs) play an important role in the regulation of the NF-κB pathway. IκBR (for IκB-Related) is proposed to be a novel member of this family. We report the cloning and characterization of the region of the human gene encoding the previously reported mRNA. This region contains 13 exons, spread over 6550 bp of genomic sequence. The coding sequence is only weakly similar to other IκBs and the exons display a more complicated structure than has been found in other members of the IκB gene family. Moreover, the positions of intron-exon junctions are different from those found in other IκB genes, even within the otherwise conserved ankyrin-like repeat region, suggesting that the IκBR gene is not a member of this extended gene family. We report a revised mRNA and protein sequence for IκBR, which predicts that the protein is larger than originally described. We also report the chromosomal localisation of the human IκBR gene (approved gene symbol NFKBIL2) to 8q24.3 using PCR-based somatic cell hybrid panel analysis and fluorescence in situ hybridization (FISH) mapping. [source]


Cloning, sequence and crystallographic structure of recombinant iron superoxide dismutase from Pseudomonas ovalis

ACTA CRYSTALLOGRAPHICA SECTION D, Issue 11 2000
Christopher J. Bond
The gene encoding the iron-dependent superoxide dismutase from Pseudomonas ovalis was cloned from a genomic library and sequenced. The ORF differs from the previously published protein sequence, which was used for the original structure determination, at 16 positions. The differences include three additional inserted residues, one deleted residue and 12 point substitutions. The gene was subcloned and the recombinant protein overexpressed, purified and crystallized in a trigonal space group. The structure was determined by molecular replacement and was refined to 2.1 Å resolution. [source]