Research Article

Exploring graph representation strategies for text classification

Article: 2289832 | Received 16 Apr 2023, Accepted 27 Nov 2023, Published online: 21 Dec 2023
 

ABSTRACT

Since 2005, the deep learning community has been able to use graphs as input to its models, and the natural language processing (NLP) community has adopted this technique to process text. However, one challenge that graph neural networks (GNNs) may encounter is sensitivity to the representation format. Since different graphs can represent the same text, a model’s performance may change depending on the representation used. Even though many practitioners share this intuition, few works address this aspect of GNNs. Therefore, we explore twelve text representation strategies that build graphs from text and apply them to the same GNN to investigate how different graphs affect the results. We divide these strategies into four groups: reading order, dependency-based, binary tree, and graph of words. Of these groups, the binary tree group is introduced in this paper. Nevertheless, in our tests, the dependency-based representations tend to achieve better performance: they allow us to stay competitive on five relevant datasets and to surpass the state of the art on another. These results suggest that representation tuning can be a valuable technique for improving a deep learning model.
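
To make the idea of "different graphs for the same text" concrete, the following minimal sketch builds three edge lists from one token sequence, loosely matching the reading order, graph of words, and binary tree groups named above. The function names, the sliding-window size, and the bottom-up pairing used for the tree are all illustrative assumptions; the abstract does not specify the paper's exact constructions. Dependency-based graphs are omitted because they would require a syntactic parser.

```python
from itertools import combinations


def reading_order_edges(tokens):
    """Link each token position to the next one, following reading order."""
    return [(i, i + 1) for i in range(len(tokens) - 1)]


def graph_of_words_edges(tokens, window=3):
    """Link distinct words that co-occur inside a sliding window.

    Nodes are unique words (not positions), so repeated words share a node.
    The window size is a hypothetical parameter chosen for illustration.
    """
    edges = set()
    for start in range(len(tokens)):
        span = tokens[start:start + window]
        for a, b in combinations(span, 2):
            if a != b:
                edges.add(tuple(sorted((a, b))))
    return edges


def binary_tree_edges(num_tokens):
    """Pair adjacent nodes bottom-up until a single root remains.

    Leaves are token positions 0..num_tokens-1; internal nodes receive new
    ids. This pairing scheme is an assumption, not the paper's definition.
    """
    level = list(range(num_tokens))
    next_id = num_tokens
    edges = []
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            group = level[i:i + 2]
            if len(group) == 1:
                # An unpaired node is carried up to the next level unchanged.
                parents.append(group[0])
                continue
            for child in group:
                edges.append((next_id, child))
            parents.append(next_id)
            next_id += 1
        level = parents
    return edges


tokens = ["the", "cat", "sat", "on", "the", "mat"]
print(reading_order_edges(tokens))
print(sorted(graph_of_words_edges(tokens, window=2)))
print(binary_tree_edges(len(tokens)))
```

The three outputs describe the same sentence with different node sets and edge structures, which is the kind of variation a GNN would see when the representation strategy changes.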

Acknowledgements

This research was financed with funds from the National Development Fund (FNDE, its acronym in Portuguese) and the Ministry of Education (MEC, its acronym in Portuguese) of the Federal Government of Brazil, and was carried out by the Center for Scientific Computing and Free Software (C3SL, its acronym in Portuguese) of the Federal University of Paraná (UFPR, its acronym in Portuguese). We also thank the Coordination for the Improvement of Higher Education Personnel (CAPES), Program of Academic Excellence (PROEX); in Portuguese, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Programa de Excelência Acadêmica (PROEX).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes