Optimization of Gaussian Kernel Function in Support Vector Machine aided QSAR studies of C-aryl glucoside SGLT2 inhibitors

Rebekah K. Prasoona1,2, A. Jyoti1,2, Yadav Mukesh3, Sharma Nishant3, Nayarisseri S. Anuraj1,4, Joshi Shobha5
1Department of Genetics, Osmania University, Hyderabad, India
2Institute of Genetics & Hospital for Genetic Diseases, Osmania University, Begumpet, India
3Department of Pharmaceutical Chemistry, Softvision College, Indore, India
4Bioinformatics Research Laboratory, Eminent Biosciences, Indore, India
5Government Degree College, Depalpur, Indore, India

Abstract

This study applies the Support Vector Machine (SVM), a statistical learning algorithm, to identify non-linear structure-activity relationships between IC50 values and the structures of C-aryl glucoside SGLT2 inhibitors. The training set consisted of forty molecules, and the remaining six molecules were used for test-set validation. SVM with a Gaussian kernel function yielded non-linear QSAR models. A forward selection algorithm was applied after pruning the molecular descriptors and checking them for redundancy. The QSAR models were validated internally using the leave-one-out cross-validated R² (R²CV, LOO), PRESS, SDEP and Y-scrambling. The SVM-aided non-linear models were more efficient when the Gaussian kernel function was optimized. The non-linear QSAR studies further identified atomic van der Waals volumes, atomic masses, the sum of geometrical distances between O..S, and the degree of unsaturation as the molecular descriptors and crucial structural requirements for modelling the IC50 of C-aryl glucoside derivatives.
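
As a minimal sketch of the quantities named in the abstract, the Python below writes out a Gaussian kernel and the leave-one-out statistics PRESS, SDEP and cross-validated R². The function names, the `gamma` parameterization of the kernel width, and the use of the overall mean of the observed activities in R²CV are assumptions made for illustration; the abstract does not state the authors' exact definitions or software.

```python
import math


def gaussian_kernel(x, z, gamma):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2).

    `gamma` is an assumed name for the width parameter that would be
    tuned when the kernel is optimized.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def loo_statistics(y_obs, y_loo_pred):
    """Return (PRESS, SDEP, R2_CV) from leave-one-out predictions.

    y_obs      : observed activities (e.g. transformed IC50 values)
    y_loo_pred : prediction for each molecule from a model that was
                 trained with that molecule left out
    """
    n = len(y_obs)
    mean_y = sum(y_obs) / n
    # PRESS: predictive residual sum of squares
    press = sum((y - p) ** 2 for y, p in zip(y_obs, y_loo_pred))
    # Total sum of squares about the mean (assumed: overall mean)
    ss_total = sum((y - mean_y) ** 2 for y in y_obs)
    # SDEP: standard deviation of the error of prediction
    sdep = math.sqrt(press / n)
    r2_cv = 1.0 - press / ss_total
    return press, sdep, r2_cv


# Toy numbers for illustration only, not data from the study.
toy_obs = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.5, 2.0, 2.5, 4.0]
press, sdep, r2_cv = loo_statistics(toy_obs, toy_pred)
```

In a kernel optimization of the kind the abstract describes, one would typically refit the SVM for several candidate kernel widths and keep the width that gives the best leave-one-out statistics; that search loop is omitted here because it depends on the SVM implementation used.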
