http://www.vcclab.org

Virtual Computational Chemistry Laboratory

232nd ACS National Meeting, September 10-14, 2006, San Francisco, CA

COMP 101 :  What is a property-based similarity?
Igor V. Tetko, Institute for Bioinformatics, Neuherberg D-85764, Germany

The goal of any similarity search is to identify molecules similar to the query molecule. Since target properties differ, there cannot be a "universal" similarity measure, and the similarity should be "tailored" to the property. The central question for any similarity measure is how best to select and normalize the descriptors. While similarity search is an unsupervised method, its counterpart is supervised modeling, whose goal is to predict the target property. By building a model we actually perform the variable selection and normalization that the unsupervised search also requires. Thus we implicitly introduce a new physical property-based similarity measure. The Associative Neural Network method defines the property-based similarity as a correlation of residuals of the ensemble of models. The advantages of the introduced similarity measure will be discussed and illustrated with examples of similarity search in the descriptor and property-based spaces.
download (1.3MB)
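The definition above can be illustrated with a minimal sketch. It assumes that each molecule is described by the vector of predictions it receives from the individual models of the ensemble, that the "residual" is the deviation of each model's prediction from the ensemble mean, and that the correlation is Pearson's. None of these details is stated in the abstract, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / (norm_x * norm_y)

def ensemble_residuals(predictions):
    """Deviation of each model's prediction from the ensemble mean (assumed definition)."""
    mean = sum(predictions) / len(predictions)
    return [p - mean for p in predictions]

def property_similarity(preds_a, preds_b):
    """Similarity of two molecules in the space of ensemble models (illustrative only)."""
    return pearson(ensemble_residuals(preds_a), ensemble_residuals(preds_b))

# Two molecules on which the individual models deviate in the same pattern
# are treated as similar with respect to the modeled property.
sim_same = property_similarity([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
sim_opposite = property_similarity([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
```

Because Pearson correlation is unaffected by subtracting a mean, the residual step does not change the result here; it is kept only to mirror the wording of the abstract.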


COMP 174 :  Can we estimate the accuracy of ADMET predictions?
Igor V. Tetko (1), Pierre Bruneau (2), Hans-Werner Mewes (1), Douglas Rohrer (3), and Gennadiy Poda (3).
(1) GSF - National Centre for Environment and Health, Institute for Bioinformatics, Ingolstaedter Landstrasse 1, Neuherberg, 85764, Germany, (2) Centre de Recherche, AstraZeneca, Parc Industriel Pompelle, BP 1050, Reims, France, (3) Structural & Computational Chemistry, Pfizer Global R & D, 700 Chesterfield Parkway West, Mail Zone BB4G, Chesterfield, MO 63017

This article reviews recent developments in methods to assess the accuracy of prediction and the applicability domain of ADMET models, and in methods to predict physico-chemical properties of compounds in the early stages of drug development. The methods are classified into two main groups: those based on the analysis of similarity of molecules and those based on the analysis of calculated properties. Using the example of octanol-water distribution coefficients, we demonstrate the consistency of the estimated and calculated accuracy of the ALOGPS program (http://www.vcclab.org) in predicting in-house and publicly available datasets. The importance of these methods for improving the quality of high-throughput screening and hit triage, and in particular for avoiding improper filtering of compounds that lie far from the investigated chemical space, is discussed.
download (1.2MB)


MEDI 549 :  Virtual computational chemistry laboratory (VCCLAB) http://www.vcclab.org
Igor V. Tetko (1), Johann Gasteiger (2), Roberto Todeschini (3), Andrea Mauri (3), David J. Livingstone (4), Peter Ertl (5), Vladimir A. Palyulin (6), Eugene V. Radchenko (6), Nikolay S. Zefirov (7), Alexander S. Makarenko (8), Vsevolod Y. Tanchuk (9), and Volodymyr V. Prokopenko (9).
(1) Institute of Bioorganic & Petrochemistry, Kiev, Ukraine and Institute for Bioinformatics, Neuherberg, D-85764, Germany, (2) Computer-Chemie-Centrum, University of Erlangen-Nurnberg, Nagelsbachstrasse 25, Erlangen, D-91052, Germany, (3) Department of Environmental Sciences, Milano Chemometrics and QSAR Research Group, Milano, Italy, (4) ChemQuest, Portsmouth, United Kingdom, (5) Novartis AG, Basel, Switzerland, (6) Moscow State University, Moscow, Russia, (7) Department of Chemistry, University, Vorob'evy Gory, Moscow, 119992, Russia, (8) Institute of Applied System Analysis, Kyiv, Ukraine, (9) Institute of Bioorganic & Petrochemistry, Kiev, Ukraine, 02094 Kyiv, Ukraine

Internet technology offers an excellent opportunity for developing tools through the cooperative effort of various groups and institutions. We developed a multi-platform software system, the Virtual Computational Chemistry Laboratory (http://www.vcclab.org), which allows the computational chemist to perform a comprehensive series of molecular index and property calculations and data analyses. The software contains several popular programs, including E-DRAGON, a molecular descriptor generation program; CORINA, a 3D structure generator; ALOGPS, a program to predict lipophilicity and aqueous solubility of chemicals; and others. All these programs run at host institutes located in five countries across Europe. We describe the main features and statistics of the system, which can serve as a prototype for academic and industry models, and discuss perspectives for the development of the chemoinformatics Web.
download (1.6MB)


MEDI 548:  ALOGPS (http://www.vcclab.org) is a free on-line program to predict lipophilicity and aqueous solubility of chemical compounds
Igor V. Tetko, Institute of Bioorganic & Petrochemistry, Kiev, Ukraine and Institute for Bioinformatics, Neuherberg D-85764, Germany, and Vsevolod Yu. Tanchuk, Institute of Bioorganic & Petrochemistry, Kiev, Ukraine

The ALOGPS 2.1 program was developed using the Associative Neural Network method and combines models to predict lipophilicity and aqueous solubility of chemicals. The lipophilicity module was developed using 12908 molecules from the PHYSPROP database with 75 E-state indices. Sixty-four neural networks were trained using 50% of the molecules, selected at random from the whole set. The prediction accuracy for molecules not used in the training set is root mean squared error rms=0.35 and standard mean error s=0.26. The aqueous solubility module achieved rms=0.49 and s=0.38. The main feature of ALOGPS is the ability to add the user's "in-house" molecules (the "LIBRARY" mode) without the need to retrain the neural networks and/or generate new molecular indices. The use of the LIBRARY can increase the prediction ability of the method for the user's molecules several-fold and will be illustrated using examples of data analysis at BASF, AstraZeneca and Pfizer.
download (1.1MB)
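The training scheme described in this abstract can be sketched as follows. The sketch assumes that each of the 64 networks draws its own random half of the data, so that every molecule can be evaluated with the networks that did not see it; the abstract does not spell this out. The helper names are hypothetical, and only the rms statistic is shown because the definition of s is not given.

```python
import random
from math import sqrt

def random_half_splits(n_molecules, n_models=64, seed=0):
    """Return one random 50% training subset (a set of indices) per model."""
    rng = random.Random(seed)
    half = n_molecules // 2
    return [set(rng.sample(range(n_molecules), half)) for _ in range(n_models)]

def held_out_models(molecule_index, splits):
    """Indices of the models whose training subset excludes the given molecule."""
    return [m for m, train in enumerate(splits) if molecule_index not in train]

def rms_error(predicted, observed):
    """Root mean squared error between predicted and observed values."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared) / len(squared))

# Toy run with 100 molecules instead of the 12908 used for the lipophilicity module.
splits = random_half_splits(100)
models_for_first = held_out_models(0, splits)
toy_rms = rms_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Averaging, for each molecule, only the predictions of its held-out models would give an error estimate for molecules not used in training, which is the kind of figure the abstract reports.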


229th ACS National Meeting, March 13-17, 2005, San Diego, CA

CINF 41:  Encoding molecular structures as SHUFFLED ranks of models: A new, secure way for sharing chemical data and development of ADME/T models
Igor V. Tetko, Institute of Bioorganic & Petrochemistry, Kiev, Ukraine and Institute for Bioinformatics, Neuherberg D-85764, Germany

For a lead compound to become a drug, it has to possess a number of important ADME/T properties, e.g. favorable lipophilicity and solubility. A poor ADME/T profile may cause a drug to fail during the late stages of development. Some companies have experimental databases of such properties. Sharing these data could yield much better models for the whole community, but the proprietary value of chemical structures is a major impediment to doing so. We recently developed the ALOGPS program (http://www.vcclab.org). It can incorporate user-specific data and dramatically improve its prediction ability for similar series of compounds. External molecules are represented in it as ranks of 64 neural network models, i.e. as an array of 64 numbers, each in the range [0,63]. Such a representation makes it impossible to disclose the underlying chemical structures and allows secure sharing of corporate data.
download (3.3MB)
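The rank representation mentioned in this abstract can be illustrated with a short sketch. It assumes that the rank of a model is its position when the 64 predictions for one molecule are sorted in ascending order; the function name is hypothetical, and the shuffling step named in the title is not described in the abstract, so it is not shown.

```python
def rank_encode(predictions):
    """Replace each model's prediction by its rank (0 = lowest) among all models."""
    order = sorted(range(len(predictions)), key=lambda i: predictions[i])
    ranks = [0] * len(predictions)
    for rank, model_index in enumerate(order):
        ranks[model_index] = rank
    return ranks

# Four models instead of 64, to keep the example readable.
small = rank_encode([0.3, -1.2, 2.5, 0.9])

# With 64 models the code is an array of 64 integers, each in [0, 63].
full = rank_encode([((i * 37) % 64) / 10.0 for i in range(64)])
```

Only the ordering of the model outputs survives the encoding; the descriptor values and the predicted magnitudes do not.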


CINF 80:  ALOGPS (http://www.vcclab.org) is a free on-line program to predict lipophilicity and aqueous solubility of chemical compounds
Igor V. Tetko, Institute of Bioorganic & Petrochemistry, Kiev, Ukraine and Institute for Bioinformatics, Neuherberg D-85764, Germany, and Vsevolod Yu. Tanchuk, Institute of Bioorganic & Petrochemistry, Kiev, Ukraine

The program was developed using the Associative Neural Network method and combines models to predict lipophilicity and aqueous solubility of chemicals. The cross-validation (CV) accuracy of the lipophilicity module is rms=0.35 and standard mean error s=0.26 for 12908 molecules. The aqueous solubility module achieved CV rms=0.49 and s=0.38. The main feature of ALOGPS is the ability to add the user's "in-house" molecules (the "LIBRARY" mode) without the need to retrain the neural networks and/or generate new indices. The use of the LIBRARY increases the prediction ability of the method for the user's molecules up to 5 times using just a few new compounds per series. This makes it an invaluable tool for pharmaceutical firms that have private "in-house" collections of compounds. A description of the concept of LIBRARIES, as well as an introduction to the property-based similarity space, will be provided using several illustrative examples. This study was supported by INTAS grant 00-0363.
download (2.8MB)
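One way a LIBRARY-style correction without retraining could work is sketched below. It assumes a nearest-neighbour scheme: the ensemble prediction for a query molecule is shifted by the mean prediction error of its k most similar library molecules. The abstract does not give the actual formula, so the function name, the parameter k and the averaging rule are all assumptions.

```python
def library_corrected_prediction(query_prediction, similarities, library_errors, k=3):
    """Shift the ensemble prediction by the mean error of the k most similar library molecules.

    similarities[i]   -- similarity between the query and library molecule i
    library_errors[i] -- experimental value minus ensemble prediction for library molecule i
    """
    if not library_errors:
        return query_prediction  # empty library: nothing to correct with
    nearest = sorted(range(len(library_errors)),
                     key=lambda i: similarities[i],
                     reverse=True)[:k]
    correction = sum(library_errors[i] for i in nearest) / len(nearest)
    return query_prediction + correction

# The two most similar library molecules (indices 0 and 2) were under-predicted
# by 0.5 and 0.3, so the query prediction is raised by their mean, 0.4.
corrected = library_corrected_prediction(
    2.0,
    similarities=[0.9, 0.1, 0.8, -0.5],
    library_errors=[0.5, -2.0, 0.3, 1.0],
    k=2,
)
```

The networks themselves are untouched in this scheme; adding a compound to the library only extends the two lists.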


MEDI 514 : Towards predictive ADME profiling of drug candidates: Lipophilicity and solubility
Gennadiy Poda, Structural & Computational Chemistry, Pfizer Global R & D, 700 Chesterfield Parkway West, Mail Zone BB4G, Chesterfield, MO 63017 and Igor Tetko, GSF - Forschungszentrum fuer Umwelt und Gesundheit, GmbH, Institute for Bioinformatics, Ingolstaedter Landstrasse 1, Neuherberg, D-85764, Germany

To reach the target receptor or enzyme in the human body, drugs have to pass numerous membrane barriers by passive diffusion or carrier-mediated uptake. To achieve this, drugs have to be soluble in both water and lipids. This makes lipophilicity and solubility the two major properties responsible for the absorption and bioavailability of drugs. Reliable prediction of these parameters would significantly facilitate the selection of drug candidates from virtual libraries. Within the ALOGPS approach, a statistical ensemble of associative neural networks trained on publicly available data globally maps input parameters to the target property. The final tuning of the model is done using a self-learning feature of ALOGPS based on a user-defined set of data, and was shown to remarkably improve the accuracy of logD and solubility predictions for proprietary compounds. Thus, ALOGPS combines the best properties of both global and local models.
download (1.3MB)



Copyright 2001 -- 2016 http://www.vcclab.org. All rights reserved.