Original Article

A word-building method based on neural network for text classification

Pages 455-474 | Received 12 Aug 2017, Accepted 20 Dec 2018, Published online: 30 Jan 2019
 

ABSTRACT

Text classification is a foundational task in many natural language processing applications. Traditional text classifiers take words as the basic units and run a pre-training process (such as word2vec) as a first step to generate word vectors directly. However, none of them consider the information contained in word structure, which has been shown to be helpful for text classification. In this paper, we propose a word-building method based on a neural network model that decomposes a Chinese word into a sequence of radicals and learns structure information from these radical-level features, which is a key difference from existing models. A convolutional neural network then extracts the structure information of each word from its radical sequence to generate a word vector, and a long short-term memory network generates the sentence vector used for prediction. Experimental results show that our model outperforms other existing models on a Chinese dataset. Our model is also applicable to English, where a word can be decomposed to the character level, which demonstrates the generalisation ability of our model. Experimental results show that our model also outperforms the others on an English dataset.
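The pipeline described in the abstract (decompose each word into sub-word units, convolve over the unit sequence to build a word vector, then run a long short-term memory network over the word vectors to obtain a sentence vector) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dimensions, the single filter width, the max-over-time pooling, the bias-free LSTM and the names `decompose`, `WordBuilder` and `LSTM` are all hypothetical, the weights are random and untrained, and the final classification layer is omitted. English characters stand in for Chinese radicals, which would need a radical lookup table not shown here.

```python
import math
import random

random.seed(0)

EMB_DIM = 4      # size of each sub-word unit embedding (assumed)
NUM_FILTERS = 3  # number of convolution filters = word-vector size (assumed)
WIDTH = 2        # convolution window over adjacent units (assumed)
HIDDEN = 5       # LSTM hidden size = sentence-vector size (assumed)


def rand_matrix(rows, cols):
    """Random, untrained weights; a real model would learn these."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def decompose(word):
    """Split a word into sub-word units. For English these are characters;
    for Chinese, a radical lookup (hypothetical, not shown) would go here."""
    return list(word)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class WordBuilder:
    """Builds a word vector by 1-D convolution over unit embeddings."""

    def __init__(self):
        self.embeddings = {}  # unit -> embedding, created on first use
        self.filters = rand_matrix(NUM_FILTERS, WIDTH * EMB_DIM)

    def embed(self, unit):
        if unit not in self.embeddings:
            self.embeddings[unit] = [random.uniform(-0.5, 0.5) for _ in range(EMB_DIM)]
        return self.embeddings[unit]

    def word_vector(self, word):
        units = [self.embed(u) for u in decompose(word)]
        # Zero-pad short words so that at least one window exists.
        while len(units) < WIDTH:
            units.append([0.0] * EMB_DIM)
        vector = []
        for filt in self.filters:
            best = -math.inf
            for i in range(len(units) - WIDTH + 1):
                window = [x for unit in units[i:i + WIDTH] for x in unit]
                activation = math.tanh(sum(w * x for w, x in zip(filt, window)))
                best = max(best, activation)  # max-over-time pooling (assumed)
            vector.append(best)
        return vector


class LSTM:
    """A plain LSTM without biases; the last hidden state is the sentence vector."""

    def __init__(self, input_dim, hidden):
        self.hidden = hidden
        # One weight matrix per gate over the concatenation [input, previous hidden].
        self.weights = {gate: rand_matrix(hidden, input_dim + hidden)
                        for gate in ("input", "forget", "output", "cand")}

    def _linear(self, gate, z):
        return [sum(w * v for w, v in zip(row, z)) for row in self.weights[gate]]

    def run(self, inputs):
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for x in inputs:
            z = x + h  # list concatenation
            i_gate = [sigmoid(p) for p in self._linear("input", z)]
            f_gate = [sigmoid(p) for p in self._linear("forget", z)]
            o_gate = [sigmoid(p) for p in self._linear("output", z)]
            cand = [math.tanh(p) for p in self._linear("cand", z)]
            c = [f * cj + i * g for f, cj, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cj) for o, cj in zip(o_gate, c)]
        return h


builder = WordBuilder()
lstm = LSTM(NUM_FILTERS, HIDDEN)
sentence = "word structure helps".split()
sentence_vector = lstm.run([builder.word_vector(w) for w in sentence])
print(len(sentence_vector))  # prints 5, the assumed HIDDEN size
```

In a trained model, the sentence vector would feed a classification layer, and the embeddings, filters and gate weights would be learned jointly from labelled text.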

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments. This work was sponsored by the National Key Research & Development Program of China (2016QY01W0200) and the open project of the Science and Technology on Communication Networks Laboratory (614210403070617).

Disclosure statement

No potential conflict of interest was reported by the authors.

Additional information

Funding

This work was supported by the National Key Research & Development Program of China [2016QY01W0200] and the Science and Technology on Communication Networks Laboratory [614210403070617].
