Study of applications of machine learning based classification methods for virtual screening of lead molecules

Title: Study of applications of machine learning based classification methods for virtual screening of lead molecules
Publication Type: Journal Article
Year of Publication: 2015
Authors: Vyas, R, Bapat, S, Jain, E, Tambe, SS, Karthikeyan, M, Kulkarni, BD
Journal: Combinatorial Chemistry & High Throughput Screening
Volume: 18
Issue: 7
Pagination: 658-672
Date Published: AUG
ISSN: 1386-2073
Keywords: Anti-anginal, anti-arrhythmic, anti-bacterial, anti-convulsant, anti-depressant, anti-diabetic, binary QSAR, chemophore, machine learning, pharmacophore, toxicophore
Abstract

The ligand-based virtual screening of combinatorial libraries employs a number of statistical modeling and machine learning methods. A comprehensive analysis of the application of these methods for the diversity-oriented virtual screening of biological targets/drug classes is presented here. A number of classification models have been built for virtual screening using three types of inputs, namely structure-based descriptors, molecular fingerprints and therapeutic category. The activity and affinity descriptors of a set of inhibitors of four target classes (DHFR, COX, LOX and NMDA) have been utilized to train a total of six classifiers, viz. Artificial Neural Network (ANN), k-nearest neighbor (k-NN), Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT) and Random Forest (RF). Among these, the ANN was found to be the best classifier, with an AUC of 0.9 irrespective of the target. New molecular fingerprints based on pharmacophore, toxicophore and chemophore (PTC) were used to build the ANN models for each dataset. An accuracy of 87.27% was obtained using 296 chemophoric binary fingerprints for the COX-LOX inhibitors, compared with 67.82% for pharmacophoric and 70.64% for toxicophoric fingerprints. The methodology was validated on the classical Ames mutagenicity dataset of 4337 molecules. To evaluate it further, the selectivity and promiscuity of molecules from five drug classes, viz. anti-anginal, anti-convulsant, anti-depressant, anti-arrhythmic and anti-diabetic, were studied. The PTC fingerprints computed for each category were able to capture the drug-class specific features using the k-NN classifier. These models can be useful for selecting optimal molecules for drug design.
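As a rough illustration of the k-NN classification on binary fingerprints mentioned in the abstract, the following minimal sketch classifies toy fingerprints by majority vote among their nearest neighbors. It is not the authors' implementation: the abstract does not state which similarity measure was used, so Tanimoto similarity is an assumption here, and the fingerprints, labels, and function names (`tanimoto`, `knn_predict`, `accuracy`) are hypothetical and invented for the example, not taken from the paper's datasets.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def knn_predict(query, training, k=3):
    """Majority vote among the k training fingerprints most similar to query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy data (assumption): each fingerprint is the set of bit positions set to 1.
TRAIN = [
    (frozenset({0, 1, 2, 3}), "active"),
    (frozenset({0, 1, 2, 4}), "active"),
    (frozenset({1, 2, 3, 5}), "active"),
    (frozenset({6, 7, 8, 9}), "inactive"),
    (frozenset({6, 7, 9, 10}), "inactive"),
    (frozenset({7, 8, 9, 11}), "inactive"),
]
TEST = [
    (frozenset({0, 1, 2, 5}), "active"),
    (frozenset({6, 8, 9, 10}), "inactive"),
]

predictions = [knn_predict(fp, TRAIN, k=3) for fp, _ in TEST]
score = accuracy(predictions, [label for _, label in TEST])
print(predictions, score)
```

Representing each fingerprint as a set of "on" bit positions keeps the similarity calculation to two set operations; a real screening workflow would more likely use a cheminformatics toolkit to generate and compare fingerprints.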

DOI: 10.2174/1386207318666150703112447
Type of Journal (Indian or Foreign): Foreign

Impact Factor (IF): 1.041
Division category: Chemical Engineering & Process Development