MegaMiner: a tool for lead identification through text mining using chemoinformatics tools and cloud computing environment

Title: MegaMiner: a tool for lead identification through text mining using chemoinformatics tools and cloud computing environment
Publication Type: Journal Article
Year of Publication: 2015
Authors: Karthikeyan, M, Pandit, Y, Pandit, D, Vyas, R
Journal: Combinatorial Chemistry & High Throughput Screening
Volume: 18
Issue: 6
Pagination: 591-603
Date Published: JAN
ISSN: 1386-2073
Keywords: Chemoinformatics, cloud computing, malaria, text mining, virtual screening
Abstract

Virtual screening is an indispensable tool for coping with the massive amount of data generated by high-throughput omics technologies. To enhance the automation of the virtual screening process, a robust portal termed MegaMiner has been built on a cloud computing platform, wherein the user submits a text query and directly accesses the proposed lead molecules along with their drug-like, lead-like and docking scores. Textual representation of chemical structural data is fraught with ambiguity in the absence of a global identifier. We have used a combination of statistical models, a chemical dictionary and regular expressions to build a disease-specific dictionary. To demonstrate the effectiveness of this approach, a case study on malaria has been carried out in the present work. MegaMiner offered superior results compared to other text mining search engines, as established by F-score analysis. A single query term 'malaria' in the portlet led to the retrieval of related PubMed records, protein classes, drug classes and 8000 scaffolds, which were internally processed and filtered to suggest new molecules as potential anti-malarials. The results obtained were validated by docking the virtual molecules into relevant protein targets. It is hoped that MegaMiner will serve as an indispensable tool not only for identifying hidden relationships between various biological and chemical entities but also for building better corpora and ontologies.

DOI: 10.2174/1386207318666150703113525
Type of Journal (Indian or Foreign): Foreign

Impact Factor (IF): 1.041
Division category: Chemical Engineering & Process Development