Virtual Computational Chemistry Laboratory

https://vcclab.org



ALOGPS 2.1 is described in two articles:

Application of Associative Neural Networks for Prediction of Lipophilicity in ALOGPS 2.1 program

Igor V. Tetko,1,2 Vsevolod Yu. Tanchuk,2

1-Laboratoire de Neuro-Heuristique, Institut de Physiologie, Rue du Bugnon 7, Lausanne, CH-1005, Switzerland
2-Biomedical Department, Institute of Bioorganic & Petroleum Chemistry, Murmanskaya 1, Kiev-660, 253660, Ukraine
    This article provides a systematic study of several important parameters of the Associative Neural Network (ASNN), such as the number of networks in the ensemble, distance measures, neighbor functions, selection of smoothing parameters and strategies for the user-training feature of the algorithm. The performance of the different methods is assessed with several training/test sets used to predict lipophilicity of chemical compounds. The Spearman rank-order correlation coefficient and Parzen-window regression methods provide the best performance of the algorithm. If additional user data are available, the prediction of lipophilicity of chemicals can be improved by up to 2-5 times when appropriate smoothing parameters for the neural network are selected. The best combinations of parameters and strategies identified in the study are implemented in the ALOGPS 2.1 program, which is publicly available at https://vcclab.org/lab/alogps.
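As an illustration of how the two best-performing ingredients named in the abstract could fit together, the following is a minimal sketch and not the ALOGPS implementation. It assumes that each molecule is represented by the vector of predictions from the individual networks of the ensemble, that similarity between two molecules is the Spearman rank correlation of those vectors, and that the ensemble mean is corrected by a Gaussian (Parzen-window) weighted average of the residuals of stored molecules. The function names, the Gaussian kernel form and the smoothing parameter `sigma` are assumptions made for this example.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def parzen_corrected(query_preds, memory_preds, memory_targets, sigma=0.2):
    """Correct the ensemble mean of one query molecule (hypothetical helper).

    query_preds    -- predictions of each network for the query molecule
    memory_preds   -- per-network predictions for each stored molecule
    memory_targets -- experimental values for the stored molecules
    sigma          -- assumed smoothing parameter of the Gaussian window
    """
    base = sum(query_preds) / len(query_preds)
    num = den = 0.0
    for preds, target in zip(memory_preds, memory_targets):
        dist = 1.0 - spearman(query_preds, preds)   # 0 = identical ranking
        weight = math.exp(-((dist / sigma) ** 2))   # Gaussian window
        residual = target - sum(preds) / len(preds)
        num += weight * residual
        den += weight
    return base + (num / den if den > 0 else 0.0)
```

In this sketch a small `sigma` makes the correction depend only on the most similar stored molecules, while a large `sigma` approaches a uniform shift by the mean residual, which is one way to picture why the choice of smoothing parameter matters when user data are added.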

Current status: published in J. Chem. Inf. Comput. Sci., 2002, 42, 1136-1145.


Prediction of n-Octanol/Water Partition Coefficients from PHYSPROP Database Using Artificial Neural Networks and E-state Indices

Igor V. Tetko,1,2 Vsevolod Yu. Tanchuk,2 Alessandro E. P. Villa1

1-Laboratoire de Neuro-Heuristique, Institut de Physiologie, Rue du Bugnon 7, Lausanne, CH-1005, Switzerland
2-Biomedical Department, Institute of Bioorganic & Petroleum Chemistry, Murmanskaya 1, Kiev-660, 253660, Ukraine
    A new method, ALOGPS v 2.0 (https://www.lnh.unil.ch/~itetko/logp/*), for the assessment of the n-octanol/water partition coefficient, logP, was developed on the basis of neural network ensemble analysis of 12908 organic compounds available from the PHYSPROP database of Syracuse Research Corporation. The atom and bond-type E-state indices as well as the number of hydrogen and non-hydrogen atoms were used to represent the molecular structures. A preliminary selection of indices was performed by multiple linear regression analysis and 75 input parameters were chosen. Some of the parameters combined several atom-type or bond-type indices with similar physico-chemical properties. The neural network ensemble training was performed using the Efficient Partition Algorithm developed by the authors. The ensemble contained 50 neural networks and each neural network had 10 neurons in one hidden layer. The prediction ability of the developed approach was estimated using both the leave-one-out (LOO) technique and a training/test protocol. In the case of inter-series predictions, i.e. when molecules in the test and in the training sub-sets were selected by chance from the same set of compounds, both approaches provided similar results. ALOGPS performance was significantly better than the results obtained by other tested methods. For a subset of 12777 molecules the LOO results were: squared correlation coefficient r^2=0.95, root mean squared error RMSE=0.39, and mean absolute error MAE=0.29.
    For two cross-series predictions, i.e. when molecules in the training and in the test sets belong to different series of compounds, all analyzed methods performed less efficiently. The decrease in the performance could be explained by a different diversity of molecules in the training and in the test sets. However, even for such difficult cases the ALOGPS method provided better prediction ability than the other tested methods. We have shown that the diversity of the training sets rather than the design of the methods is the main factor determining their prediction ability for new data. A comparative performance of the methods as well as a dependence on the number of non-hydrogen atoms in a molecule is also presented.
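For readers who want to reproduce the kind of statistics quoted in this abstract, the following is a minimal sketch of a leave-one-out loop and of the three reported measures (r^2 taken here as the squared Pearson correlation between observed and predicted values, RMSE and MAE). It is not the authors' code; the `fit` and `predict` callables and the function names are hypothetical placeholders for whatever model is being evaluated.

```python
import math


def leave_one_out(xs, ys, fit, predict):
    """Predict each sample with a model trained on all the other samples."""
    predictions = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(predict(model, xs[i]))
    return predictions


def regression_stats(observed, predicted):
    """Return squared correlation coefficient, RMSE and MAE as a dict."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # r^2 is undefined when either series is constant.
    r2 = cov * cov / (var_o * var_p) if var_o > 0 and var_p > 0 else float("nan")
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Toy usage: a "model" that always predicts the mean of its training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


loo_predictions = leave_one_out([0, 1, 2], [1.0, 2.0, 3.0], fit_mean, predict_mean)
```

Note that r^2 measures only how well predictions track the observed values linearly, whereas RMSE and MAE measure the size of the errors, which is why abstracts such as this one report all three.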

Current status: published in J. Chem. Inf. Comput. Sci., 2001, 41, 1407-1421.

*- Unfortunately, since Dr. Tetko's work in Lausanne ended, this site is no longer supported. Please contact Dr. Tetko if you need the old version of the program.


Copyright 2001 -- 2023 https://vcclab.org. All rights reserved.