Chemical Space (chemical + space)
Selected Abstracts

Structuring Chemical Space: Similarity-Based Characterization of the PubChem Database
MOLECULAR INFORMATICS, Issue 1-2 2010
Giovanni Cincilla
Abstract The ensemble of conceivable molecules is referred to as the Chemical Space. In this article we describe a hierarchical version of the Affinity Propagation (AP) clustering algorithm and apply it to analyze the LINGO-based similarity matrix of a 500 000-molecule subset of the PubChem database, which contains more than 19 million compounds. The combination of two highly efficient methods, namely the AP clustering algorithm and LINGO-based molecular similarity calculations, allows the unbiased analysis of large databases. Hierarchical clustering generates a numerical diagonalization of the similarity matrix. The target-independent, intrinsic structure of the database, derived without any previous information on the physical or biological properties of the compounds, maps together molecules experimentally shown to bind the same biological target or to have similar physical properties. [source]

Data and Graph Mining in Chemical Space for ADME and Activity Data Sets
MOLECULAR INFORMATICS, Issue 3 2006
Abstract We present a classification method, which is based on a coordinate-free chemical space. Thus, it does not depend on descriptor values commonly used by coordinate-based chemical space methods. In our method the molecular similarity of chemical structures is evaluated by a generalized maximum common graph isomorphism, which supports the usage of numerical physicochemical atom property labels in addition to discrete-atom-type labels. The Maximum Common Substructure (MCS) algorithm applies the Highest Scoring Common Substructure (HSCS) ranking of Sheridan and co-workers, which penalizes discontinuous fragments. For all compared classification algorithms used in this work we analyze their usefulness based on two objectives.
First, we are interested in highly accurate and general hypotheses and second, the interpretation ability is highly important to increase our structural knowledge for the ADME data sets and the activity data set investigated in this work. [source]

Organic dyes as small molecule protein–protein interaction inhibitors for the CD40–CD154 costimulatory interaction
JOURNAL OF MOLECULAR RECOGNITION, Issue 1 2010
Peter Buchwald
Abstract It is becoming increasingly clear that small molecules can often act as effective protein–protein interaction (PPI) inhibitors, an area of increasing interest for its many possible therapeutic applications. We have identified several organic dyes and related small molecules that (i) concentration-dependently inhibit the important CD40–CD154 costimulatory interaction with activities in the low micromolar (µM) range, (ii) show selectivity toward this particular PPI, (iii) seem to bind on the surface of CD154, and (iv) concentration-dependently inhibit the CD154-induced B cell proliferation. They were identified through an iterative activity screening/structural similarity search procedure starting with suramin as lead, and the best smaller compounds, the main focus of the present work, achieved an almost 3-fold increase in ligand efficiency (ΔG0/nonhydrogen atom = 0.8 kJ/NnHa) approaching the average of known promising small-molecule PPI inhibitors (~1.0 kJ/NnHa). Since CD154 is a member of the tumor necrosis factor (TNF) superfamily of cell surface interaction molecules, inhibitory activities on the TNF-R1–TNF-α interactions were also determined to test for specificity, and the compounds selected here all showed more than 30-fold selectivity toward the CD40–CD154 interaction.
Because of their easy availability in various structural scaffolds and because of their good protein-binding ability, often explored for tissue-specific staining and other purposes, such organic dyes can provide a valuable addition to the chemical space searched to identify small molecule PPI inhibitors in general. Copyright © 2009 John Wiley & Sons, Ltd. [source]

Predicting P-glycoprotein substrates by a quantitative structure–activity relationship model
JOURNAL OF PHARMACEUTICAL SCIENCES, Issue 4 2004
Vijay K. Gombar
Abstract A quantitative structure–activity relationship (QSAR) model has been developed to predict whether a given compound is a P-glycoprotein (Pgp) substrate or not. The training set consisted of 95 compounds classified as substrates or non-substrates based on the results from in vitro monolayer efflux assays. The two-group linear discriminant model uses 27 statistically significant, information-rich structure quantifiers to compute the probability of a given structure to be a Pgp substrate. Analysis of the descriptors revealed that the ability to partition into membranes, molecular bulk, and the counts and electrotopological values of certain isolated and bonded hydrides are important structural attributes of substrates. The model fits the data with sensitivity of 100% and specificity of 90.6% in the jackknifed cross-validation test. A prediction accuracy of 86.2% was obtained on a test set of 58 compounds. Examination of the eight "mispredicted" compounds revealed two distinct categories. Five mispredictions were explained by experimental limitations of the efflux assay; these compounds had high permeability and/or were inhibitors of calcein-AM transport. Three mispredictions were due to limitations of the chemical space covered by the current model. The Pgp QSAR model provides an in silico screen to aid in compound selection and in vitro efflux assay prioritization. © 2004 Wiley-Liss, Inc.
and the American Pharmacists Association J Pharm Sci 93: 957–968, 2004 [source]

Research Article: Effective and Specific Inhibition of the CD40–CD154 Costimulatory Interaction by a Naphthalenesulphonic Acid Derivative
CHEMICAL BIOLOGY & DRUG DESIGN, Issue 4 2010
Emilio Margolles-Clark
Costimulatory interactions are important regulators of T-cell activation and, hence, promising therapeutic targets in autoimmune diseases as well as in transplant recipients. Following our recent identification of the first small-molecule inhibitors of the CD40–CD154 costimulatory protein–protein interaction (J Mol Med 87, 2009, 1133), we continued our search within the chemical space of organic dyes, and we now report the identification of the naphthalenesulphonic acid derivative mordant brown 1 as a more active, more effective, and more specific inhibitor.
Flow cytometry experiments confirmed its ability to concentration-dependently inhibit the CD154(CD40L)-induced cellular responses in human THP-1 cells at concentrations well below cytotoxic levels. Binding experiments showed that it not only inhibits the CD40–CD154 interaction with sub-micromolar activity, but it also has considerably more than 100-fold selectivity toward this interaction even when compared to other members of the tumor necrosis factor superfamily pairs such as TNF-R1–TNF-α, BAFF-R(CD268)–BAFF(CD257/BLys), OX40(CD134)–OX40L(CD252), RANK(CD265)–RANKL(CD254/TRANCE), or 4-1BB(CD137)–4-1BBL. There is now sufficient structure-activity relationship information to serve as the basis of a drug discovery initiative targeting this important costimulatory interaction. [source]

Quantitative Structure–Activity Relationship Models for Predicting Biological Properties, Developed by Combining Structure- and Ligand-Based Approaches: An Application to the Human Ether-a-go-go-Related Gene Potassium Channel Inhibition
CHEMICAL BIOLOGY & DRUG DESIGN, Issue 4 2009
Alessio Coi
A strategy for developing accurate quantitative structure–activity relationship models enabling predictions of biological properties, when suitable knowledge concerning both ligands and biological target is available, was tested on a data set where molecules are characterized by high structural diversity. Such a strategy was applied to human ether-a-go-go-related gene K+ channel inhibition and consists of a combination of ligand- and structure-based approaches, which can be carried out whenever the three-dimensional structure of the target macromolecule is known or may be modeled with good accuracy. Molecular conformations of ligands were obtained by means of molecular docking, performed in a previously built theoretical model of the channel pore, so that descriptors depending upon the three-dimensional molecular structure were properly computed.
A modification of the directed sphere-exclusion algorithm was developed and exploited to properly split the whole dataset into Training/Test set pairs. Molecular descriptors, computed by means of the CODESSA program, were used for the search of reliable quantitative structure–activity relationship models that were subsequently identified through a rigorous validation analysis. Finally, pIC50 values of a prediction set, external to the initial dataset, were predicted and the results confirmed the high predictive power of the model within a quite wide chemical space. [source]

Prospective Validation of a Comprehensive In silico hERG Model and its Applications to Commercial Compound and Drug Databases
CHEMMEDCHEM, Issue 5 2010
Munikumar
Abstract Ligand-based in silico hERG models were generated for 2,644 compounds using linear discriminant analysis (LDA) and support vector machines (SVM). As a result, the dataset used for the model generation is the largest publicly available (see Supporting Information). Extended connectivity fingerprints (ECFPs) and functional class fingerprints (FCFPs) were used to describe chemical space. All models showed area under curve (AUC) values ranging from 0.89 to 0.94 in a fivefold cross-validation, indicating high model consistency. Models correctly predicted 80 % of an additional, external test set; Y-scrambling was also performed to rule out chance correlation. Additionally, models based on patch clamp data and radioligand binding data were generated separately to analyze their predictive ability when compared to combined models. To experimentally validate the models, 50 of the predicted hERG blockers from the Chembridge database and ten of the predicted non-hERG blockers from an in-house compound library were selected for biological evaluation.
Out of those 50 predicted hERG blockers, tested at a concentration of 10 µM, 18 compounds showed more than 50 % displacement of [3H]astemizole binding to cell membranes expressing the hERG channel. Ki values of four of the selected binders were determined to be in the micromolar and high nanomolar range (Ki (VH01)=2.0 µM, Ki (VH06)=0.15 µM, Ki (VH19)=1.1 µM and Ki (VH47)=18 µM). Of these four compounds, VH01 and VH47 showed also a second, even higher affinity binding site with Ki values of 7.4 nM and 36 nM, respectively. In the case of non-hERG blockers, all ten compounds tested were found to be inactive, showing less than 50 % displacement of [3H]astemizole binding at 10 µM. These experimentally validated models were then used to virtually screen commercial compound databases to evaluate whether they contain hERG blockers. 109,784 (23 %) of Chembridge, 133,175 (38 %) of Chemdiv, 111,737 (31 %) of Asinex and 11,116 (18 %) of the Maybridge database were predicted to be hERG blockers by at least two of the models, a prediction which could, for example, be used as a pre-filtering tool for compounds with potential hERG liabilities. [source]
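
The counts reported in the hERG abstract above can be turned into simple rates. The following is a minimal Python sketch of that arithmetic, not part of the published work: all variable and function names are illustrative, and the implied library sizes are only rough back-calculations from rounded percentages, which the abstract itself does not state.

```python
# Counts copied from the abstract; names are illustrative assumptions.
predicted_blockers_tested = 50
confirmed_blockers = 18          # more than 50 % displacement at 10 µM
predicted_nonblockers_tested = 10
confirmed_nonblockers = 10       # less than 50 % displacement at 10 µM


def fraction(hits, total):
    """Return hits/total as a float, rejecting an empty or negative total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# Share of experimentally tested predictions that were confirmed.
blocker_confirmation_rate = fraction(confirmed_blockers, predicted_blockers_tested)
nonblocker_confirmation_rate = fraction(confirmed_nonblockers, predicted_nonblockers_tested)

# Virtual-screening results: (compounds flagged, rounded percentage reported).
screening = {
    "Chembridge": (109_784, 23),
    "Chemdiv": (133_175, 38),
    "Asinex": (111_737, 31),
    "Maybridge": (11_116, 18),
}


def implied_library_size(flagged, percent):
    """Rough library size implied by a flagged count and a rounded percentage."""
    return round(flagged * 100 / percent)


implied_sizes = {
    name: implied_library_size(flagged, percent)
    for name, (flagged, percent) in screening.items()
}

if __name__ == "__main__":
    print(f"Predicted blockers confirmed: {blocker_confirmation_rate:.0%}")
    print(f"Predicted non-blockers confirmed: {nonblocker_confirmation_rate:.0%}")
    for name, size in implied_sizes.items():
        print(f"{name}: roughly {size:,} compounds implied")
```

Because each reported percentage is rounded to a whole number, the back-calculated library sizes carry an uncertainty of a few percent and should be read as order-of-magnitude estimates only.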