Detecting unknown vulnerabilities in smart contracts using opcode sequences

Peiqiang Li, School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, People's Republic of China

Guojun Wang (corresponding author), School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, People's Republic of China

Xiaofei Xing, School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, People's Republic of China

Xiangbin Li, School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, People's Republic of China

Jinyao Zhu, School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, People's Republic of China

Abstract

Unknown vulnerabilities, also known as zero-day vulnerabilities, are flaws in software, systems, or networks that have not yet been publicly disclosed or fixed. Once attackers discover such a flaw, whether deliberately or by accident, it poses a major threat to network security. The threat is especially acute in the blockchain field: smart contracts hold large amounts of money, so an exploited vulnerability can cause especially heavy financial losses for users. However, current research on smart contract vulnerabilities focuses mainly on known vulnerabilities, and work on unknown vulnerabilities remains limited. To address this gap, we introduce a machine learning-based method for detecting unknown vulnerabilities in smart contracts. First, the method obtains the opcode sequences that smart contract transactions execute in the Ethereum Virtual Machine (EVM) by instrumenting Geth and replaying Ethereum transactions. Next, it uses an n-gram model and a vector weight penalty mechanism to extract features from the opcode sequences. Machine learning algorithms then detect unknown vulnerabilities based on the similarity principle. Finally, we evaluate the method with four machine learning models: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Logistic Regression (LR), and Decision Tree (DT). The SVM model performs best at detecting unknown vulnerabilities, with an accuracy of 96%, a precision of 91%, a recall of 100%, and an F1-score of 95%. We also discuss the main benefit of the method: attacks that exploit unknown vulnerabilities can be detected in a timely manner, which reduces user losses.
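The abstract does not state the n-gram order, the exact form of the vector weight penalty, or how similarity is measured, so the following is only a minimal sketch under assumptions, not the authors' implementation. It assumes bigrams (n = 2), uses a smoothed inverse-document-frequency weight as a stand-in for the penalty (n-grams that occur in many traces count less), and labels a new trace by its nearest neighbour under cosine similarity. The opcode traces and the function names (`ngrams`, `fit_idf`, `vectorize`, `cosine`, `nearest_label`) are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def ngrams(opcodes, n=2):
    """Return all runs of n consecutive opcodes as tuples."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def fit_idf(traces, n=2):
    """Build a weight function that down-weights n-grams seen in many traces.

    Assumption: smoothed IDF stands in for the paper's weight penalty.
    """
    df = Counter()
    for trace in traces:
        df.update(set(ngrams(trace, n)))  # count each n-gram once per trace
    total = len(traces)
    return lambda gram: math.log((1 + total) / (1 + df[gram])) + 1.0


def vectorize(trace, idf, n=2):
    """Map a trace to a sparse vector {n-gram: count * penalty weight}."""
    counts = Counter(ngrams(trace, n))
    return {gram: c * idf(gram) for gram, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(gram, 0.0) for gram, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_label(query, labeled, n=2):
    """Return the label of the labeled trace most similar to the query (1-NN)."""
    idf = fit_idf([trace for trace, _ in labeled], n)
    q = vectorize(query, idf, n)
    best = max(labeled, key=lambda pair: cosine(q, vectorize(pair[0], idf, n)))
    return best[1]


# Hypothetical toy traces; real traces would come from replayed transactions.
labeled = [
    (["PUSH1", "PUSH1", "MSTORE", "CALLDATALOAD", "SLOAD", "ADD", "SSTORE", "STOP"], "benign"),
    (["PUSH1", "PUSH1", "MSTORE", "CALLDATALOAD", "SLOAD", "SUB", "SSTORE", "STOP"], "benign"),
    (["PUSH1", "SLOAD", "CALL", "SLOAD", "CALL", "SLOAD", "CALL", "SSTORE", "STOP"], "attack"),
]
query = ["PUSH1", "SLOAD", "CALL", "SLOAD", "CALL", "SSTORE", "STOP"]

print(nearest_label(query, labeled))  # prints: attack
```

The query shares almost all of its bigrams with the repeated `SLOAD`/`CALL` pattern of the trace labeled "attack" and only the trailing `SSTORE`/`STOP` bigram with the benign traces, so the nearest neighbour is the attack trace. Separately, the reported metrics are internally consistent: a precision of 0.91 and a recall of 1.00 give F1 = 2(0.91)(1.00)/(0.91 + 1.00) ≈ 0.95.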

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported in part by the National Key Research and Development Program of China (2020YFB1005804), and in part by the National Natural Science Foundation of China under Grant 62372121.
