A discriminative random sampling strategy with individual-author feature selection for writeprint recognition of Chinese texts

Pages 94-101 | Received 14 Aug 2015, Accepted 28 Feb 2016, Published online: 23 Mar 2016
 

Abstract

Automatic authorship recognition has become a new technique for investigating cybercrimes. A central challenge is that even a moderate-sized corpus yields a huge number of features, which leads to over-training. Moreover, it is hard to distinguish between candidate authors using only a single feature set. In this paper, we propose a random-sampling ensemble method with individual-author feature selection to exploit the high-dimensional feature space. The proposed method randomly samples writing-style features from each individual-author feature set (IAFS), where the IAFSs are partitioned from the whole feature set and selected heuristically using the training set of each author. Multiple base classifiers (BCs) are then built on the sampled feature sets, and finally all BCs are fused to produce a final decision. Experimental results on real-life Chinese forum data verify the robustness of the proposed method compared with conventional ensemble methods. We also analyze the diversity of the algorithm, showing that the ensemble strategy is more effective and constructs more diverse BCs than random subspace methods.
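
The pipeline described in the abstract (partition features into IAFSs, sample from each IAFS, train one BC per sample, fuse the BCs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the selection heuristic, the base classifier, or the fusion rule, so the sketch assumes a hypothetical heuristic (each feature goes to the author with the highest mean value for it), a nearest-centroid base classifier, and majority voting. All function names, parameters, and the toy data are illustrative.

```python
import random
from collections import Counter


def class_means(X, y):
    """Return the mean feature vector of each author's training texts."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {a: [s / counts[a] for s in acc] for a, acc in sums.items()}


def partition_into_iafs(X, y):
    """Partition feature indices into one IAFS per author.

    Assumed heuristic (not from the paper): a feature belongs to the
    author whose training texts have the highest mean value for it.
    """
    means = class_means(X, y)
    iafs = {a: [] for a in means}
    for j in range(len(X[0])):
        owner = max(sorted(means), key=lambda a: means[a][j])
        iafs[owner].append(j)
    return iafs


def train_ensemble(X, y, n_classifiers=15, per_author=2, seed=0):
    """Train base classifiers, each on features sampled from every IAFS."""
    rng = random.Random(seed)
    iafs = partition_into_iafs(X, y)
    ensemble = []
    for _ in range(n_classifiers):
        feats = []
        for a in sorted(iafs):
            pool = iafs[a]
            feats.extend(rng.sample(pool, min(per_author, len(pool))))
        sub = [[row[j] for j in feats] for row in X]
        # Base classifier (assumed): nearest centroid on the sampled features.
        ensemble.append((feats, class_means(sub, y)))
    return ensemble


def predict(ensemble, x):
    """Fuse the base classifiers by majority vote (assumed fusion rule)."""
    votes = Counter()
    for feats, centroids in ensemble:
        z = [x[j] for j in feats]
        best = min(
            sorted(centroids),
            key=lambda a: sum((p - q) ** 2 for p, q in zip(z, centroids[a])),
        )
        votes[best] += 1
    return votes.most_common(1)[0][0]


# Toy data: four texts, four style features, two hypothetical authors.
X = [[5, 4, 1, 0], [6, 5, 0, 1], [1, 0, 5, 6], [0, 1, 6, 5]]
y = ["A", "A", "B", "B"]
ensemble = train_ensemble(X, y, n_classifiers=5, per_author=1, seed=0)
```

On this toy data, features 0 and 1 fall into author A's IAFS and features 2 and 3 into author B's, so every base classifier sees one feature characteristic of each author. Sampling from every IAFS, rather than from the whole feature set as in a plain random subspace method, is what guarantees that each candidate author is represented in each sampled feature set.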

Acknowledgments

The authors sincerely thank the anonymous reviewers for their constructive comments, which helped improve this paper.
