Abstract
Generating short textual descriptions from structured data is an important problem in Natural Language Generation. Neural models have recently made significant progress on this task, but they require a large amount of training data. We present an unsupervised approach to this problem. Our method finds a Hamiltonian path through a graph whose nodes are information triples and whose edges represent discourse relations. We also present a rule-based approach to domain-independent surface realization. We conduct experiments on a dataset of infoboxes extracted from Wikipedia. Comparison against human-generated discourses indicates that the discourses generated by our system are of high quality and close to textual descriptions authored by humans.
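To illustrate the ordering idea mentioned in the abstract, the following is a minimal sketch of a backtracking search for a Hamiltonian path over a small directed graph of triples. It is not the authors' implementation: the function name `hamiltonian_path`, the example triples about a fictional "Jane Doe", and the choice of which triples are linked are all assumptions made for illustration, and the edges are left unlabeled rather than carrying discourse relation names.

```python
from typing import Dict, List, Optional, Tuple

# A triple is (subject, attribute, value); this layout is an assumption.
Triple = Tuple[str, str, str]


def hamiltonian_path(
    nodes: List[Triple], edges: Dict[Triple, List[Triple]]
) -> Optional[List[Triple]]:
    """Return an ordering that visits every node exactly once along
    directed edges, or None if no such ordering exists.

    Plain backtracking; exponential in the worst case, so only suitable
    for small graphs such as a single infobox.
    """

    def extend(path: List[Triple], visited: frozenset) -> Optional[List[Triple]]:
        if len(path) == len(nodes):
            return path  # every node has been placed
        for nxt in edges.get(path[-1], ()):
            if nxt not in visited:
                result = extend(path + [nxt], visited | {nxt})
                if result is not None:
                    return result
        return None  # dead end, backtrack

    # Try each node as a possible first sentence of the discourse.
    for start in nodes:
        found = extend([start], frozenset([start]))
        if found is not None:
            return found
    return None


# Hypothetical infobox triples for a fictional person.
born = ("Jane Doe", "birth_year", "1900")
place = ("Jane Doe", "birth_place", "Springfield")
job = ("Jane Doe", "occupation", "engineer")
known = ("Jane Doe", "known_for", "bridge design")

nodes = [job, born, place, known]
# Directed edges stand in for discourse relations between triples (assumed).
edges = {
    born: [place],
    place: [job],
    job: [known, born],
}

path = hamiltonian_path(nodes, edges)
for triple in path or []:
    print(triple)
```

In this toy graph the search first tries starting from the occupation triple, hits a dead end, and then finds the ordering birth year, birth place, occupation, known for. A surface realizer would then turn each triple in that order into a sentence or clause.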
Additional information
Notes on contributors
Anjali Singh
Anjali Singh completed her Integrated MTech in Mathematics and Computing at IIT Delhi in 2017. Her MTech thesis was supervised by Prof. Niladri Chatterjee. She currently works as a Research Engineer at IBM Research, Bangalore. Her research interests include Machine Learning Applications, Natural Language Processing, and Natural Language Generation.
Niladri Chatterjee
Niladri Chatterjee is a Professor of Statistics and Computer Science in the Department of Mathematics, IIT Delhi. His primary research areas are Natural Language Processing, Semantic Web, and Statistical Modeling. He obtained his Ph.D. in Computer Science from University College London. He has nearly 100 publications in international and national journals and conferences. He was the Organizing Chair of CICLing 2012, the 13th International Conference on Computational Linguistics and Intelligent Text Processing. He has also been a Visiting Professor at the Dipartimento di Informatica, University of Pisa, Italy. He has supervised 8 PhDs and more than 70 Master's theses so far.