Part-of-speech (POS) tagging, also called grammatical tagging, determines whether each word in a sentence is a noun, verb, adjective, adverb, and so on. POS tagging is not easy because a single word can belong to more than one part of speech. Vietnamese POS tagging is much harder than English POS tagging because of the ambiguity of Vietnamese words and syntactic mutation. Furthermore, POS tagging depends on word segmentation, which is itself difficult, as indicated in the previous post.
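To make the ambiguity concrete, here is a minimal sketch in Python. It is not a real tagger: the lexicon, the tag labels, and the helper names `candidate_tags` and `ambiguous_tokens` are all hypothetical and chosen only for illustration. It uses the well-known sentence "con ngựa đá con ngựa đá", where "đá" can be a noun ("stone") or a verb ("kick"), and assumes the input is already segmented into words.

```python
# Toy lexicon: each (already segmented) word maps to the tags it may take.
# Entries and tag labels are illustrative assumptions, not a real tagset.
TOY_LEXICON = {
    "con": {"Nc"},      # classifier
    "ngựa": {"N"},      # "horse"
    "đá": {"N", "V"},   # "stone" (noun) or "kick" (verb)
}


def candidate_tags(tokens, lexicon):
    """Pair each token with the sorted list of tags the lexicon allows."""
    return [(tok, sorted(lexicon.get(tok, {"UNK"}))) for tok in tokens]


def ambiguous_tokens(tokens, lexicon):
    """Return the tokens that have more than one candidate tag."""
    return [tok for tok, tags in candidate_tags(tokens, lexicon) if len(tags) > 1]


# Whitespace splitting stands in for word segmentation here; real Vietnamese
# words often span several syllables, so this only works for this toy input.
sentence = "con ngựa đá con ngựa đá".split()

for token, tags in candidate_tags(sentence, TOY_LEXICON):
    print(token, tags)

print("ambiguous:", ambiguous_tokens(sentence, TOY_LEXICON))
```

A lookup like this can only list the candidates. Choosing between them requires context, which is what a statistical or neural tagger has to learn, and any segmentation error upstream changes the tokens the tagger sees.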
Conclusion
I will attempt to adapt the QA system of Phuong Le-Hong and the DBpedia database for my accounting FAQ module. I am confident that this will work because all of the questions and answers are related to small businesses in the US. I also want to further improve the performance of our system by integrating modules developed by Phuong Le-Hong, Dat Nguyen, Thai-Hoang Pham, Mai-Vu Tran and other groups.
I will also post my progress!
References
- A Vietnamese Question Answering System in Vietnam’s Legal Documents
- Vietnamese named entity recognition using token regular expressions and bidirectional inference
- Fast Dependency Parsing using Distributed Word Representations
- A Hybrid Approach to Word Segmentation of Vietnamese Texts
- A syntactic component for Vietnamese language processing
- An empirical study of maximum entropy approach for part-of-speech tagging of Vietnamese texts
- Using Dependency Analysis to Improve Question Classification
- Ripple Down Rules for Question Answering
- Building a Semantic Role Labelling System for Vietnamese
- NNVLP: A Neural Network-Based Vietnamese Language Processing Toolkit
- An Experiment Study of Vietnamese Question Answering System