dependablesystemslab / solidifi-benchmark

Repository of benchmarks to evaluate Solidity Smart contract analysis tools

Home Page: http://blogs.ubc.ca/karthik/2020/05/22/how-effective-are-smart-contract-static-analysis-tools-evaluating-smart-contract-static-analysis-tools-using-bug-injection/

License: Other

Shell 0.36% Python 99.64%

solidity fault injection benchmarks smart contracts

solidifi-benchmark's Introduction

SolidiFI Benchmark

The SolidiFI-benchmark repository contains a dataset of buggy contracts injected with 9369 bugs of 7 different types, namely reentrancy, timestamp dependency, unhandled exceptions, unchecked send, TOD (transaction-ordering dependence), integer overflow/underflow, and use of tx.origin. The bugs were injected into the contracts using SolidiFI.

In addition to the dataset of vulnerable contracts, the repository contains the injection logs, which can be used to look up the locations where the bugs were injected in the code and the type of each bug.

This dataset has been used to evaluate six smart contract static analysis tools, namely Oyente, Securify, Mythril, Smartcheck, Manticore, and Slither. Please refer to the following paper for more details: How Effective are Smart Contract Analysis Tools? Evaluating Smart Contract Static Analysis Tools Using Bug Injection. The analysis reports generated by the six evaluated tools are available in this repository as well.

This dataset can be used to evaluate other smart contract analysis tools.

Please cite this paper when you use this dataset.

@inproceedings{ghaleb2020effective,
 title={How Effective Are Smart Contract Analysis Tools? Evaluating Smart Contract Static Analysis Tools Using Bug Injection},
 author={Ghaleb, Asem and Pattabiraman, Karthik},
 booktitle={Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis},
 year={2020}
}

Structure

The folder named "buggy_contracts" contains the dataset of the buggy contracts.

The folder named "results" contains the experimental artifacts of our paper.

The following is an example of the results folder's structure:

results
│
├── Oyente
│   │
│   ├── analyzed_buggy_contracts (folder)
│   │   │
│   │   ├── Re-entrancy (there is a separate folder for each bug type), which contains the following:
│   │   │   │
│   │   │   ├── all the buggy contracts injected with this type of bug (specified by the name of the folder), along with the injection logs for each contract (BugLog)
│   │   │   │
│   │   │   ├── results (a folder that contains the analysis reports generated by the tool for each buggy contract)
│
├── Securify
├── Mythril
├── Smartcheck
├── Manticore
├── Slither
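
The layout above can be traversed programmatically. The following is a minimal sketch, not part of the repository: the helper name `bug_type_folders` is hypothetical, and it assumes only the folder names shown in the tree (`<tool>/analyzed_buggy_contracts/<bug type>`).

```python
from pathlib import Path


def bug_type_folders(results_root, tool):
    """Map each bug-type folder name to its path for one tool.

    Assumes the layout <results_root>/<tool>/analyzed_buggy_contracts/<bug type>.
    """
    base = Path(results_root) / tool / "analyzed_buggy_contracts"
    # Keep only directories; each one is expected to be a bug type.
    return {p.name: p for p in sorted(base.iterdir()) if p.is_dir()}
```

For example, `bug_type_folders("results", "Oyente")` would return one entry per bug-type folder found under the Oyente results.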

Reproducing evaluation results presented in the paper

To reproduce the results presented in the paper, please run the "inspection.py" script as below.

The script inspects the analysis reports of the evaluated tools for false negatives, false positives, and misidentified bugs.
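
The three categories can be illustrated with a minimal sketch. This is not the code of inspection.py: the function name `classify_findings`, the `(start_line, length, bug_type)` form of an injected bug, and the `(line, bug_type)` form of a tool finding are all assumptions made for illustration.

```python
def classify_findings(injected, reported):
    """Compare injected bugs against tool findings.

    injected: list of (start_line, length, bug_type) tuples (assumed log format)
    reported: list of (line, bug_type) tuples (assumed report format)
    Returns (false_negatives, false_positives, misidentified).
    """
    false_negatives, misidentified = [], []
    matched = set()  # indices of findings that fall inside an injected bug
    for start, length, bug_type in injected:
        lines = range(start, start + length)
        hits = [(i, r) for i, r in enumerate(reported) if r[0] in lines]
        matched.update(i for i, _ in hits)
        if not hits:
            # Nothing was reported at the injected location.
            false_negatives.append((start, length, bug_type))
        elif all(r[1] != bug_type for _, r in hits):
            # Something was reported there, but never as the injected type.
            misidentified.append((start, length, bug_type))
    # Findings outside every injected location are candidate false positives.
    false_positives = [r for i, r in enumerate(reported) if i not in matched]
    return false_negatives, false_positives, misidentified
```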

Running the following command reproduces the results for all evaluated tools at once:

python3 scripts/inspection.py Oyente,Securify,Mythril,Smartcheck,Manticore,Slither results

The false negatives and false positives will be printed to the console and also stored in two separate folders named "FNs" and "FPs".

To reproduce results for one or several specific tools, list only the names of those tools in the command, separated by commas. For example, the following command reproduces results only for Oyente and Securify.

 python3 scripts/inspection.py Oyente,Securify results
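
When the command is generated from another script, the comma-separated tool list can be assembled as in the sketch below. The helper `build_command` is hypothetical, and checking names against the six tools listed in this README is an assumption about what the script accepts.

```python
KNOWN_TOOLS = ("Oyente", "Securify", "Mythril", "Smartcheck", "Manticore", "Slither")


def build_command(tools, results_dir="results"):
    """Return the argv list for scripts/inspection.py for the given tools."""
    unknown = [t for t in tools if t not in KNOWN_TOOLS]
    if unknown:
        raise ValueError(f"unknown tool(s): {', '.join(unknown)}")
    # Tool names are joined by commas with no spaces, as in the commands above.
    return ["python3", "scripts/inspection.py", ",".join(tools), results_dir]
```

The returned list can be passed to `subprocess.run` from the repository root.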

solidifi-benchmark's Issues

Injected reentrancy bugs detectable as different bug classes

Several of the reentrancy bugs injected in this dataset are also detectable as a different bug class. For example, several of them are detected by MAIAN, an analysis tool that does not attempt to identify reentrancy. This can be misleading when using this dataset for tool evaluation: tools might detect the same bug for different reasons, which complicates a comparison of the tools.

For example, the following pattern is injected into the reentrancy dataset:

address payable lastPlayer_re_ent9;
uint jackpot_re_ent9;
function buyTicket_re_ent9() public {
  (bool success,) = lastPlayer_re_ent9.call.value(jackpot_re_ent9)("");
  if (!success)
    revert();
  lastPlayer_re_ent9 = msg.sender;
  jackpot_re_ent9    = address(this).balance;
}

However, there is absolutely no need to perform reentrancy here. In the first invocation, lastPlayer_re_ent9 and jackpot_re_ent9 are still zero-initialized, so the external call simply succeeds without causing any reentrancy, and no ether is transferred. Then lastPlayer_re_ent9 is set to the attacker and jackpot_re_ent9 is set to all the ether available. The next call to this function transfers the total ether balance of the victim contract. But where is the reentrancy here? In theory the code can cause reentrancy, but it would not make sense as part of an attack.
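
The two-call sequence can be traced with a toy Python model of the pattern. This is only a sketch: the class `TicketModel` and its fields are made up for illustration, `None` stands in for the zero address, and nothing here models gas or the real EVM call semantics.

```python
class TicketModel:
    """Toy model of the injected buyTicket_re_ent9 pattern (not EVM-accurate)."""

    def __init__(self, balance):
        self.balance = balance    # ether held by the victim contract
        self.last_player = None   # models the zero-initialized address
        self.jackpot = 0          # models the zero-initialized uint
        self.paid = {}            # ether received per address

    def buy_ticket(self, sender):
        # Models lastPlayer.call.value(jackpot)(""): pay the previous player.
        if self.jackpot > self.balance:
            raise RuntimeError("revert: call failed")
        self.balance -= self.jackpot
        self.paid[self.last_player] = self.paid.get(self.last_player, 0) + self.jackpot
        # State is updated after the external call, as in the injected code.
        self.last_player = sender
        self.jackpot = self.balance


victim = TicketModel(balance=10)
victim.buy_ticket("attacker")  # first call: sends 0 to the zero address
victim.buy_ticket("attacker")  # second call: sends the whole balance to the attacker
```

In this model the contract is emptied by two ordinary calls, and no call ever reenters `buy_ticket`.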

In the inspection.py script it seems that for Manticore, 'Reachable ether leak to sender' is considered as a bug code for reentrancy (see here). Why was this done? In the example above Manticore, and other analysis tools like MAIAN or teEther, will identify a leaking Ether issue, not a reentrancy bug. In my opinion this is also the correct bug class for the code above.

I believe the following bug pattern has a similar issue, although here the attacker can actually utilize reentrancy to steal more than 1 ether.

bool not_called_re_ent27 = true;
function bug_re_ent27() public{
    require(not_called_re_ent27);
    if( ! (msg.sender.send(1 ether) ) ){
        revert();
    }
    not_called_re_ent27 = false;
}

In general it seems a bit problematic to inject bug patterns that can be identified as two different bug classes, as it complicates comparisons between tools. This might invalidate some of the results presented in the paper. I suggest double-checking the results and, if necessary, publishing an addendum to the paper.

Injected reentrancy bugs are not exploitable

Most of the injected bug patterns for reentrancy are not exploitable, or are even effectively dead code. This seems like a major drawback of this dataset when trying to evaluate reentrancy detection tools: the more precise a tool becomes, the worse it will perform on this dataset. In my opinion this is quite misleading.

The following pattern is injected into many of the contracts in the reentrancy category:

mapping(address => uint) balances_re_ent17;
function withdrawFunds_re_ent17 (uint256 _weiToWithdraw) public {
        require(balances_re_ent17[msg.sender] >= _weiToWithdraw);
        // limit the withdrawal
        (bool success,)=msg.sender.call.value(_weiToWithdraw)("");
        require(success);  //bug
        balances_re_ent17[msg.sender] -= _weiToWithdraw;
}

Here balances_re_ent17 is zero-initialized, like all data types in Solidity/Ethereum, and there is no way to change the values in balances_re_ent17. As such, the only call that passes the require statement at the beginning of withdrawFunds_re_ent17 is one with _weiToWithdraw == 0 as the parameter. This will transfer 0 ether. So one can reenter as often as one likes, each time transferring 0 ether and subtracting 0 from 0. Not very useful and definitely not exploitable.
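
The same argument can be replayed with a toy Python model. This is a sketch under stated assumptions: `WithdrawModel` and `attack` are hypothetical names, the `on_receive` callback stands in for the attacker's fallback function, and gas is not modeled.

```python
from collections import defaultdict


class WithdrawModel:
    """Toy model of the injected withdrawFunds_re_ent17 pattern."""

    def __init__(self, balance):
        self.balance = balance
        # Zero-initialized mapping; nothing else ever credits it.
        self.balances = defaultdict(int)

    def withdraw(self, sender, amount, on_receive=None):
        if self.balances[sender] < amount:
            raise RuntimeError("revert: require failed")
        self.balance -= amount
        if on_receive is not None:
            on_receive()  # external call: the callee may reenter here
        self.balances[sender] -= amount


def attack(model, sender, amount, depth):
    """Reenter `depth` times and return the ether taken from the contract."""
    start = model.balance

    def reenter(level):
        if level < depth:
            model.withdraw(sender, amount, lambda: reenter(level + 1))

    reenter(0)
    return start - model.balance
```

In this model, `attack(model, "attacker", 0, 5)` reenters five times and takes nothing, while any positive amount fails the initial check.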

The next reentrancy pattern is broken in two ways:

Effectively dead code
Uses transfer, where reentrancy is not possible.

// 0 initialized and no function to update the mapping
mapping(address => uint) redeemableEther_re_ent25;
function claimReward_re_ent25() public {        
      // can never be satisfied
      require(redeemableEther_re_ent25[msg.sender] > 0);
      // unreachable
      uint transferValue_re_ent25 = redeemableEther_re_ent25[msg.sender];
      // msg.sender.transfer does not allow for reentrancy due to gas limits
      msg.sender.transfer(transferValue_re_ent25);   //bug
      redeemableEther_re_ent25[msg.sender] = 0;
}

I suggest putting a big disclaimer somewhere that this dataset should not be used to evaluate reentrancy detection.

question on validity of overflow bugs

Hi, I have a question on validity of injected overflow bugs.

It seems that some parts that are marked as injected overflow bugs are not actually bugs (i.e., they are safe).

Could you please confirm whether they are indeed bugs or not?

For example, in a code snippet

function bug_intou20(uint8 p_intou20) public{
    uint8 vundflw1=0;
    vundflw1 = vundflw1 + p_intou20;   // overflow bug
}

which comes from
https://github.com/DependableSystemsLab/SolidiFI-benchmark/blob/master/buggy_contracts/Overflow-Underflow/buggy_11.sol#L98

the expression vundflw1 + p_intou20 will not overflow, because vundflw1 is initialized to 0 and is a local variable (hence the effects of earlier transactions do not accumulate in it). Since p_intou20 is a uint8, the sum is at most 255.

To introduce an overflow bug in the function bug_intou20, vundflw1 would, for example, have to be initialized with a non-zero value.
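
Because uint8 has only 256 values, the claim can be checked exhaustively. A minimal Python sketch (the helper name `overflows_uint8` is made up for illustration):

```python
UINT8_MAX = 2**8 - 1  # largest value a Solidity uint8 can hold


def overflows_uint8(initial, p):
    """True if initial + p would wrap around in 8-bit unsigned arithmetic."""
    return initial + p > UINT8_MAX


# With the local variable starting at 0, no uint8 argument can overflow.
assert not any(overflows_uint8(0, p) for p in range(UINT8_MAX + 1))

# With a non-zero initial value, some arguments do overflow.
assert overflows_uint8(1, 255)
```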
