Welcome to the Partial Least Squares Regression (PLSR)
Java security issues: Java has recently tightened its security requirements for applets considerably. Please follow the instructions in this FAQ to correctly set up access to the software.

The PLSR statistical analysis module builds models and predicts activity/property values using the Partial Least Squares (PLS) regression technique [1-3]. The technique is based on a linear transition from a large number of original descriptors to a small number of orthogonal factors (latent variables) that provide the optimal linear model in terms of predictivity (characterized by the Q2 value). A more detailed explanation of the method and algorithms is available.

PLS regression is known to be quite sensitive to noise introduced by an excess of irrelevant descriptors. To achieve the best model quality, a two-step descriptor selection procedure [4] is applied. The first step eliminates low-variability (almost constant) descriptors, that is, those that differ from a constant for only a few (2-3) compounds in the training set. Such descriptors cannot provide useful statistical information and merely help to fit those particular compounds, thus decreasing predictivity. In the second step, the descriptor subset is optimized by Q2-guided descriptor selection using a genetic algorithm. Despite the stochastic nature of this technique, computational experiments demonstrate reasonable stability of the results.

The same code base is successfully employed in software implementing the Molecular Field Topology Analysis (MFTA) technique, which we proposed [5] for QSAR studies of organic compounds.

This software was developed by E.V. Radchenko, V.A. Palyulin and N.S. Zefirov, Department of Chemistry, Moscow State University, Moscow 119992 Russia.

The data input format is described here.
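As an illustration of the first descriptor selection step and of the Q2 statistic mentioned above, here is a minimal Python sketch. It is not the module's actual code. The function names `drop_low_variability` and `q2` and the parameter `max_deviating` are hypothetical. The exact criterion the software uses is not given on this page, so the sketch assumes that a descriptor is "almost constant" when at most `max_deviating` compounds differ from its most common value. It also assumes the usual cross-validation definition Q2 = 1 - PRESS / SS.

```python
from collections import Counter


def drop_low_variability(X, max_deviating=3):
    """Return the indices of the descriptor columns worth keeping.

    X is a list of rows (one row per compound), each row a list of
    descriptor values. A column is dropped when at most `max_deviating`
    compounds differ from the column's most common value (an assumed
    criterion; values are compared for exact equality).
    """
    n_columns = len(X[0])
    keep = []
    for j in range(n_columns):
        column = [row[j] for row in X]
        most_common_count = Counter(column).most_common(1)[0][1]
        deviating = len(column) - most_common_count
        if deviating > max_deviating:
            keep.append(j)
    return keep


def q2(observed, cv_predicted):
    """Cross-validated Q2 = 1 - PRESS / SS (assumed standard definition).

    `cv_predicted` holds the predictions made for each compound while it
    was left out of the training set.
    """
    mean_observed = sum(observed) / len(observed)
    press = sum((y - p) ** 2 for y, p in zip(observed, cv_predicted))
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    return 1.0 - press / ss_total
```

In the second step, a genetic algorithm would repeatedly evaluate candidate descriptor subsets and favor those with a higher Q2. That search is omitted here because its details are not described on this page.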
Copyright 2001 -- 2023 https://vcclab.org. All rights reserved.