
Authorship verification of opinion articles in online newspapers using the idiolect of author: a comparative study

Mahmoud A. Al-Khatib & Juman K. Al-qaoud
Pages 1603-1621 | Received 01 Mar 2019, Accepted 06 Jan 2020, Published online: 04 Feb 2020
 

ABSTRACT

Many approaches have been introduced to solve the authorship verification problem, including the use of machine learning techniques. These techniques have proved effective in detecting a person’s distinctive way of speaking or writing. The main aim of this study was to show that every writer has an idiolect, which is revealed through the use of several types of stylometric features unique to individual authors. For this purpose, 120 online opinion articles written by non-native speakers of English were chosen from four newspapers published in the Arab world, while 145 articles written by native speakers of English were taken from four other newspapers. All of these articles were classified and compared using the SMO and MLP algorithms via a tool called ‘JStylo’. The proposed framework achieved competitive performance, with an accuracy of 80% using the SMO classifier. The results of the study indicate that each author has an individual style of writing (idiolect), and that neither group of writers, native or non-native, displays its idiolect more clearly than the other.

Acknowledgements

We would like to express our sincerest gratitude to Farouq Bani-Ata from the Center of Information and Communication Technology at Jordan University of Science and Technology for the support in understanding and dealing with JStylo.

Disclosure statement

No potential conflict of interest was reported by the authors.

Notes

3 A type of classification in which the problem set consists solely of documents whose true authors are known. These documents are divided into K (usually 10) equal-sized folds. K − 1 of these folds (nine when K = 10) are used as training data, and the remaining fold is used as testing data. The folds are then rotated so that each of the K folds takes a turn as the testing data. The results are useful for evaluating the effectiveness of the classifier and the feature set.
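
The fold rotation described in this note is commonly known as K-fold cross-validation. The following is a minimal sketch of the splitting step only, written in plain Python; it is not JStylo's implementation, and the function name `k_fold_splits` and the document identifiers are hypothetical, introduced purely for illustration.

```python
def k_fold_splits(documents, k=10):
    """Yield (train, test) pairs so that each fold is the test set exactly once.

    Illustrative sketch only: `documents` is any list of items with known
    authors; no shuffling or per-author stratification is performed.
    """
    if not 2 <= k <= len(documents):
        raise ValueError("k must be between 2 and the number of documents")
    # Fold i takes every k-th document starting at index i,
    # so fold sizes differ by at most one.
    folds = [documents[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        # Training data is the concatenation of the other k - 1 folds.
        train = [doc
                 for i, fold in enumerate(folds) if i != held_out
                 for doc in fold]
        yield train, test


if __name__ == "__main__":
    docs = [f"article_{n:02d}" for n in range(20)]  # hypothetical document IDs
    for round_no, (train, test) in enumerate(k_fold_splits(docs, k=10), start=1):
        print(round_no, len(train), len(test))
```

In practice the documents would usually be shuffled, and often stratified by author, before being split; the sketch omits both steps to keep the rotation itself visible.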

Additional information

Notes on contributors

Mahmoud A. Al-Khatib

Mahmoud A. Al-Khatib is a professor of English and linguistics in the Department of English Language and Linguistics at Jordan University of Science & Technology. His major area of specialization is sociolinguistics, but he is also interested in pragmatics, bilingualism, discourse analysis, contrastive linguistics (Arabic-English), and English for specific purposes (ESP).

Juman K. Al-qaoud

Juman K. Al-qaoud received her B.A. degree in English Language and Linguistics and her master’s degree in Applied Linguistics from Jordan University of Science and Technology. Her major areas of specialization are discourse analysis and computational linguistics.
