Computers and Computing

Semantic-Based Integrated Plagiarism Detection Approach for English Documents

Manpreet Kaur, Vishal Gupta & Ravreet Kaur
Pages 6120-6136 | Published online: 05 Dec 2021
 

Abstract

This work proposes a novel plagiarism detection system that uses semantic features to uncover cases of plagiarism. The system constructs a dynamic relation matrix for each pair of suspicious and source sentences and uses semantic features to measure their degree of similarity. Two procedures, Weighted Inverse Distance and GlossDice, draw on several text properties (synonyms, shortest path, etc.) to overcome the limitations of existing features, and a new similarity metric for plagiarism detection is presented. The research also investigates how well each feature detects plagiarized cases on its own, then combines the best features with different weight contributions to further improve system performance. Weighted Inverse Distance integrated with SynJaccard boosts system performance and shows promising results. All experiments were first performed on the PAN-PC-11 dataset, and the PAN-14 text alignment dataset was then used to validate the results. The effectiveness of the proposed system was measured using the standard performance measures: precision, recall, F-measure, granularity, and plagdet score. The proposed system outperforms the baseline systems, with precision (0.9459), recall (0.8861), F-measure (0.8917), and plagdet (0.8857) on the PAN-PC-11 dataset. For PAN-14 text alignment, the system achieves precision (0.9257), recall (0.9055), F-measure (0.8931), and plagdet (0.8806).
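
For readers unfamiliar with the plagdet score named among the measures above, the following is a minimal sketch of how it is commonly defined in the PAN evaluation framework: the harmonic mean of precision and recall, divided by log2(1 + granularity). The function names and the input values are illustrative assumptions, not taken from the paper, and the sketch does not attempt to reproduce the figures reported in the abstract, which may be aggregated differently.

```python
import math


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def plagdet(precision: float, recall: float, granularity: float) -> float:
    """Plagdet as commonly defined for PAN: F1 / log2(1 + granularity).

    Granularity is at least 1; a value of 1 means each plagiarized
    passage is reported as a single detection, so plagdet equals F1.
    """
    if granularity < 1:
        raise ValueError("granularity must be >= 1")
    return f_measure(precision, recall) / math.log2(1 + granularity)


# Illustrative values only (not the paper's results).
score_single = plagdet(0.5, 0.5, 1.0)      # one detection per case
score_fragmented = plagdet(0.5, 0.5, 3.0)  # detections split into fragments
```

With granularity 1 the denominator is log2(2) = 1, so plagdet reduces to the F-measure; higher granularity penalizes detectors that report one plagiarized passage as several fragments.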

Additional information

Notes on contributors

Manpreet Kaur

Manpreet Kaur received her master's degree in computer science and engineering from the University Institute of Engineering and Technology, Panjab University, Chandigarh, in 2021. Her research interests include natural language processing, image processing and computer vision.

Vishal Gupta

Vishal Gupta is currently working as an associate professor in the Department of Computer Science & Engineering, University Institute of Engineering & Technology, Panjab University, Chandigarh. His main research interests include natural language processing, deep learning, machine learning, information retrieval, and text mining. He has won the Young Scientist Award and the Faculty Research Award. He was named in the 2019 ranking list of the world's top 2% of scientists in computer science, released by Stanford University. Corresponding author. Email: [email protected]

Ravreet Kaur

Ravreet Kaur is currently working as an assistant professor in the Department of Computer Science & Engineering, University Institute of Engineering & Technology, Panjab University, Chandigarh. Her main research interests include parallel and distributed computing, future network architecture, deep learning, and the architecture and key technologies of the new-generation Internet of Things. Email: [email protected]
