Cointime

Download App
iOS & Android

How Effective Is GPT for Auditing Smart Contracts?

Introduction

Recently, ChatGPT has gained a great deal of popularity, impressing its users with its capacity to enhance traditional text, work efficiency, and provide concise overviews. Following closely behind is CodeGPT, a GPT-based plugin that further enhances coding efficiency. With the recent release of GPT-4, can it be applied to auditing blockchain and Solidity smart contracts? Based on this question, we conducted various feasibility tests.’

Testing Environment and Methodology

The comparison models used in this test are: GPT-3.5(Web),GPT-3.5-turbo-0301,GPT-4(Web).

Prompt used in the test: Help me discover vulnerabilities in this Solidity smart contract.

Comparison of Vulnerability Code Snippet Detectio

We performed three rounds of testing. In tests 1 and 2, we utilized historical vulnerability codes commonly encountered in the past as test cases to evaluate the model’s ability to detect fundamental vulnerabilities. In Test 3, we introduced moderately challenging vulnerability codes as the primary test cases.

Test 1:

Example: “Intro to Smart Contract Audit Series: Phishing With tx.orgin”

Vulnerability Code:

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

GPT-4(Web) Response:

As you can see from the results, all three models identified critical issues related to tx.origin.

Test 2:

Example: “Intro to Smart Contract Security Audits | Overflow”

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

GPT-4(Web) Response:

It is worth noting that both GPT-3.5 (Web) and gpt-3.5-turbo-0301 were able to identify a critical overflow vulnerability, whereas surprisingly, GPT-4 (Web) did not provide any relevant prompt.

Test 3:

Example: “Empty-handed with a White Wolf — Analysis of the Popsicle Hack”

Sent to GPT:

GPT-3.5(Web) Response:

GPT-3.5-turbo-0301 Response:

Looking at the results,, we can see that none of the three versions detected any of the critical vulnerability points.

Summary of Code Snippet Detection

While the GPT models displayed adequate detection capabilities for simple vulnerability code snippets, it falls short when it comes to identifying more complex ones. Throughout the tests, GPT-4 (Web) showcased exceptional readability and a clear output format. However, its ability to audit code does not appear to surpass that of GPT-3.5 (Web) or GPT-3.5-turbo-0301. In some cases, due to the inherent uncertainties in the transformer output, GPT-4 (Web) managed to overlook certain critical issues.

Comparative Detection of Known Vulnerabilities in Full Contracts

To better accommodate the practical requirements of projects during contract audits, we raised the difficulty level by importing contracts with an extensive codebase. This allowed us to comprehensively test the GPT-4 model’s auditing capabilities, as opposed to GPT-3 which has a smaller contextual character limit and thus was not evaluated in this context.

For this instance, we used previous case studies as a test template to simulate real-world scenarios:

Example: “Detailed analysis of the $31 Million MonoX Protocol Hack”.

To initiate the audit, we inputted the complete contract in batches and submitted a vulnerability detection request towards the end of the dialogue.

The following prompt was utilized for this test:

“Here is a Solidity smart contract”

Insert Contract Code

“The above is the complete code,help me discover vulnerabilities in this smart contract.”

As demonstrated, despite GPT-4 having the highest single input character limit, according to the information published by OpenAI, it still encountered contextual challenges due to text overflow during the final vulnerability detection request. Consequently, the model can only identify a portion of the content, rendering it incapable of conducting a thorough contextual audit for large-scale contracts.

Batched Auditing: Unpacked Contracts through Incremental Input and Detection:

Prompt 1:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 1 of the contract code.

Prompt 2:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 2 of the contract code.

Prompt 3:

“Help me discover vulnerabilities in this Solidity smart contract.”

Batch 3 of the contract code.

It is worth mentioning that GPT-4 failed to identify any critical vulnerability points.

Summary: While the current state of GPT’s capabilities may not be entirely suitable for contract analysis, the potential of AI in this domain remains impressive.

Advantages:

While GPT’s detection capabilities for complex vulnerabilities in contract code may be limited, it has shown impressive partial detection capabilities for basic and simple vulnerabilities. Additionally, once a vulnerability is identified, the model provides an explanation in an easily understandable and user-readable format. This unique feature is especially beneficial for novice contract auditors who require quick guidance and straightforward answers during their initial training phase.

Challenges:

There is a certain amount of variation in GPT’s output for each dialogue, which can be adjusted through API interface parameters. However, the output is still not constant. Although such variability is beneficial for language dialogues and greatly enhances the authenticity of the conversation, it is not ideal for code analysis work. In order to cover multiple possible vulnerability answers that AI may provide, we had to make multiple requests for the same question and compare and filter the results. This inadvertently increases the workload, ultimately undermining the fundamental objective of AI in assisting humans to improve their efficiency.

For instance, we conducted an additional test by running Test 2 of the Comparison of Vulnerability Code Snippet Detection with a slight modification of the function name before generating again.

As we can see, its output results have added some additional content compared to the previous test.

There is still significant room for improvement in its vulnerability analysis capabilities.

It is worth noting that the current (as of March 16, 2024) training models of GPT are unable to accurately analyze and identify critical vulnerability points for slightly complex vulnerabilities.

Despite the current limitations of GPT’s analysis and mining capabilities for contract vulnerabilities, its ability to analyze and generate reports on simple code blocks for common vulnerabilities still sparks excitement among users. With continued training and development of GPT and other AI models, we firmly believe that assisted auditing of large and complex contracts will achieve faster, more intelligent, and more comprehensive outcomes in the foreseeable future. As technological development exponentially improves human efficiency, a transformative shift is imminent. We eagerly anticipate the benefits of AI in enhancing blockchain security and remain vigilant in monitoring the impact of emerging AI products on this vital field. In the visible future, we will inevitably integrate with AI to some extent. May AI and blockchain be with you.

Read more: https://slowmist.medium.com/how-effective-is-gpt-for-auditing-smart-contracts-cdeddfa76dbe

Comments

All Comments

Recommended for you

  • Cointime May 5th News Express

    1.The Federal Reserve reduced its balance sheet by $77 billion in April, and the size of its balance sheet fell below $7.4 trillion2.Former Bitmex CEO: Bitcoin will trade between $60,000 and $70,000 before August 3.SLERF total destruction exceeds 7 million USD4.ether.fi large staker initiates pledge withdrawal application for 37,140 ETH5.Web3 digital asset company Alpha Transform Holdings makes strategic investments in Arhasi and Cloudbench 6.A trader spent 402 ETH to buy 732,326 FRIEND, with an unrealized profit of $653,0007.A certain address has sold a total of 677,197 FRIEND airdrops through BunnySwap, making a profit of approximately $1.15 million8.A multi-signature wallet withdrew 915.85 billion PEPE from Binance9.The NFT project Blob team engraved the rune EPIC•EPIC•EPIC•EPIC on the Epic Satoshi block of Bitcoin’s fourth halving10.On-Chain Analyst Predicts Six to Twelve Months of 'Parabolic Advance' for Bitcoin

  • Cointime May 4th News Express

    1. Hong Kong Bitcoin Spot ETF has held 4,218 BTC since its listing three days ago

  • Blockchain Asset Management announces launch of a dedicated blockchain fund for accredited investors

    Blockchain Asset Management, a cryptocurrency fund with a scale of $100 million, announced the launch of an exclusive blockchain fund for qualified investors. The specific amount of funds raised by the fund has not been disclosed yet, but it is said to have reached "eight figures", which means it is in the tens of millions of dollars. In addition, the investment threshold for the new fund is $100,000, and all investors are required to meet the approved standards (annual income exceeding $200,000, net assets exceeding $1 million).

  • Renault's BWT Alpine F1 Team announces partnership with ApeCoinDAO

    The BWT Alpine F1 team under Renault announced a partnership with ApeCoinDAO on X platform, which will introduce APE into the Alpine F1 ecosystem and collaborate with global token holders to launch peripheral products and digital assets inspired by the first ApeCoin. It is reported that according to the cooperation between the two parties, in the future, BAYC NFTs may be able to wear equipment and clothing with the Alpine team logo.

  • BTC breaks through $63,000

    The market shows BTC has broken through $63,000 and is currently trading at $63,014.9, with a daily increase of 6.11%. The market is volatile, so please exercise caution in risk management.

  • The total gas consumption on the Base chain exceeds 10,000 ETH

    According to the blockchain analysis platform Dune Analytics, the total gas consumption on the Base chain has exceeded 10,000 ETH, reaching 10,839.5062 ETH at the time of writing (equivalent to over $33.6 million at current prices). The average gas usage amount is about $0.1754 per transaction (0.000059661 ETH), and the total number of blocks has reached 13.41 million, with an average transaction volume of about 14.63 transactions per block. In addition, the data shows that the total transaction volume on the Base chain has exceeded 196.2 million, with over 8.366 million users and over 184 million user transactions at the time of writing. Furthermore, the total number of contracts created on the Base chain has exceeded 64 million, reaching 64,056,573 in the current period.

  • A wallet received 2,000 ETH from Alemeda/FTX

    As monitored by The Data Nerd, 6 hours ago, wallet 0xaEa received 2,000 ETH (approximately $6.23 million) from Alemeda/FTX. Within a week, it received a total of 8,000 ETH (approximately $24.71 million) from Alameda and deposited 6,000 ETH into Binance.

  • A single transaction with a transaction fee of up to 1.5 BTC appeared on the Bitcoin chain

    According to on-chain data tracking service monitoring , there has been a single transaction on the Bitcoin network with a transaction fee as high as 1.5 BTC, worth about $100,254. It is reported that the sender of the transaction is an address starting with "bc1p4n" and the recipient is an address starting with "bc1pqv".

  • 2 wallets deposited 211 billion SHIB into Coinbase within 10 hours

    According to The Data Nerd's monitoring, within 10 hours, 2 wallets (with the same amount of SHIB) deposited a total of 211 billion SHIB (about 5.16 million US dollars) into Coinbase. These wallets accumulated these SHIBs last week, and if sold at the current price, it would cause a small loss (about 120,000 US dollars).

  • USA to forge AI partnership with Nigeria for economic growth

    The partnership aims to strengthen economic ties and ensure that AI deployment is safe, secure, transparent, and trustworthy.