Question answering Chatbot trained using machine learning and natural language processing methods on domain specific scientific articles
Journal Club Seminar
07-09-2022; 03:00 PM

Speaker - Nitin Kumar, Research Scholar

ABSTRACT

We present a proof-of-concept implementation of a personal assistant chatbot that handles input and output in both text and spoken form, using Google's gTTS (Google Text-to-Speech) speech synthesiser for the spoken output. On the backend, we use the naive Bayes classifier from the machine learning (ML) library scikit-learn for question detection, together with the natural language processing (NLP) library NLTK and its nps_chat corpus, to implement the model and process the data. The model is trained on a small sample of full-text articles, to which NLP-based preprocessing and feature extraction are applied before model training. The initial version of the bot uses a keyword-recognition-based method and is deployed as a command-line application. We also introduce ScholarBERT, a Transformer-based masked language model pretrained on a large collection of scientific research articles (2.2B tokens), for future integration. Finally, we propose an updated version trained on a larger number and wider range of articles, with added features such as contextual and semantic interpretation, the inclusion of language models, and more robust natural language understanding (NLU) frameworks such as Rasa.
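
To make the keyword-recognition-based, command-line design concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the speaker's implementation: the keyword table, the fallback message, and the function names (tokenize, reply, run_cli) are all hypothetical, and the naive Bayes question detector and gTTS speech output mentioned in the abstract are left out.

```python
import re

# Hypothetical keyword-to-answer table (an assumption for illustration);
# the rules used in the actual bot are not given in the abstract.
RESPONSES = {
    "scholarbert": "ScholarBERT is a masked language model pretrained on scientific articles.",
    "gtts": "gTTS stands for Google Text-to-Speech.",
    "nltk": "NLTK is a natural language processing library for Python.",
}
FALLBACK = "Sorry, I do not have an answer for that yet."


def tokenize(text):
    """Lowercase the input and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def reply(text):
    """Return the answer attached to the first known keyword in the input."""
    for token in tokenize(text):
        if token in RESPONSES:
            return RESPONSES[token]
    return FALLBACK


def run_cli():
    """Minimal command-line loop; type 'quit' (or send EOF) to exit."""
    while True:
        try:
            line = input("you> ")
        except EOFError:
            break
        if line.strip().lower() == "quit":
            break
        print("bot>", reply(line))
```

Calling run_cli() starts the interactive loop. In a fuller version, a trained question classifier would likely decide whether an input is a question before the keyword lookup runs, and the returned string could be passed to a speech synthesiser instead of being printed.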

Room no. 30, SCIS, JNU