Question answering (QA) is an area of natural language processing research that also gives insight into the human-machine interface. The ultimate goal of any QA system is to provide a concise and exact answer to a question asked in natural language. Most research focuses on answering open-domain questions by searching a large collection of documents such as Wikipedia. Unlike Internet search engines, which return ranked lists of documents, open-domain QA systems return short, relevant answers to questions.
Open-domain QA approaches fall into two categories: semantic parsing and information retrieval.
- Semantic parsing interprets the meaning of a question by semantic analysis. The correct interpretation converts the question into an exact database query that returns a correct answer.
- Information retrieval converts the question into a query, then retrieves a set of candidate answers by querying a corpus and a knowledge base.
Both categories require human expertise to tune the lexicons, grammars and knowledge bases.
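As a rough illustration of the information retrieval approach, the sketch below turns a question into a bag-of-words query and ranks passages by TF-IDF cosine similarity. It is a minimal sketch under stated assumptions, not the system described here: the corpus `CORPUS`, the function names, and the English example sentences are all hypothetical, and a real system would add proper Vietnamese word segmentation and a much larger index.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would index many documents.
CORPUS = [
    "An invoice is a document issued by a seller to a buyer.",
    "Depreciation spreads the cost of an asset over its useful life.",
    "A balance sheet reports assets, liabilities and equity.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_idf(docs):
    """Compute a smoothed inverse document frequency for each corpus term."""
    n = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Map text to a sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, docs, top_k=1):
    """Return up to top_k passages with non-zero similarity to the question."""
    idf = build_idf(docs)
    query = tfidf_vector(question, idf)
    scored = sorted(
        ((cosine(query, tfidf_vector(d, idf)), d) for d in docs),
        reverse=True,
    )
    return [doc for score, doc in scored[:top_k] if score > 0]


if __name__ == "__main__":
    print(retrieve("What is depreciation of an asset?", CORPUS))
```

The retrieved passage is only a candidate; extracting a short answer from it is a separate step that this sketch does not cover.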
I chose the information retrieval approach because advanced syntactic and semantic parsers for Vietnamese are not readily available. Furthermore, building a QA system for Vietnamese does not scale easily because, again, most research targets English.
I will attempt to build a QA system using Dialogflow, drawing on several scientific papers written by Vietnamese scientists. EHLAI’s QA system for the Vietnamese language will combine statistical models with knowledge-based methods.
Our QA system answers only accounting-related questions, not a wide range of general-knowledge questions.