maintainable-security's Introduction

security-maintainability

Pre-print available here: https://arxiv.org/abs/2106.03271. Published in the Empirical Software Engineering journal.

Installation

Requirements installation:

virtualenv --python=python3.7 venv
source venv/bin/activate
pip install -r requirements.txt

Merge different caches in scripts/maintainability/caches/ folder:

source venv/bin/activate
cd scripts
python -m maintainability.merge_cache -cache maintainability/cache -output maintainability/bch_cache.zip

Data Analysis

How to collect maintainability results from BCH cache:

source venv/bin/activate
cd scripts
python report.py --report export -secdb ../dataset/db_security_changes.csv -regdb ../dataset/db_regular_changes_random.csv -baseline random -results ../results -cache maintainability/bch_cache.zip

Comparison between security and regular commits:

source venv/bin/activate
cd scripts
python report.py --report comparison -results ../results/ -reports ../reports

Get security maintainability report per guideline:

source venv/bin/activate
cd scripts
python report.py --report guideline -secdb ../results/maintainability_release_security_fixes.csv -reports ../reports

Get security maintainability report per language:

source venv/bin/activate
cd scripts
python report.py --report language -secdb ../results/maintainability_release_security_fixes.csv -reports ../reports

Get security maintainability report per severity:

source venv/bin/activate
cd scripts
python report.py --report severity -secdb ../results/maintainability_release_security_fixes.csv -reports ../reports

Get security maintainability report per CWE:

source venv/bin/activate
cd scripts
python report.py --report cwe -secdb ../results/maintainability_release_security_fixes.csv -reports ../reports

Get security maintainability report for a specific CWE (available for CWE_664 and CWE_707):

source venv/bin/activate
cd scripts
python report.py --report cwe-spec -secdb ../results/maintainability_release_security_fixes.csv -cwe CWE_664 -reports ../reports
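
The per-breakdown reports above differ only in the value passed to --report (plus -cwe for cwe-spec), so they can be generated in one pass. The following is a minimal sketch, not part of the repository: it assumes it is run from the scripts/ folder with the virtual environment active, it reuses only the flags shown in the commands above, and the helper names build_commands and run_all are hypothetical.

```python
import subprocess
import sys

# Paths copied from the commands above; adjust them if your layout differs.
SECDB = "../results/maintainability_release_security_fixes.csv"
REPORTS = "../reports"

# Report types that take only -secdb and -reports in the commands above.
SIMPLE_REPORTS = ["guideline", "language", "severity", "cwe"]
# Listed above as the CWEs for which cwe-spec is available.
SPECIFIC_CWES = ["CWE_664", "CWE_707"]


def build_commands(python=sys.executable):
    """Hypothetical helper: return one argument list per report."""
    base = [python, "report.py", "--report"]
    tail = ["-reports", REPORTS]
    commands = [base + [name, "-secdb", SECDB] + tail for name in SIMPLE_REPORTS]
    commands += [
        base + ["cwe-spec", "-secdb", SECDB, "-cwe", cwe] + tail
        for cwe in SPECIFIC_CWES
    ]
    return commands


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stops at the first failing report


if __name__ == "__main__":
    run_all(dry_run=True)
```

With dry_run=True the script only prints the six commands, which makes it easy to check them against the ones listed above before calling run_all(dry_run=False).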

Experiments

How to collect maintainability reports from BCH:

Create a config file in scripts/maintainability/config.json based on the example in scripts/maintainability/config-template.json.

virtualenv --python=python3.7 venv
source venv/bin/activate
pip install -r requirements.txt
cd scripts
python -m maintainability.eval_maintainability
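
Before the evaluation above can run, config.json has to exist. A minimal sketch of creating it from the template, assuming it is run from the repository root; the helper name init_config is hypothetical, and the values inside the copied file still have to be filled in by hand, since the template's fields are not described here.

```python
import shutil
from pathlib import Path


def init_config(base_dir="scripts/maintainability"):
    """Hypothetical helper: copy config-template.json to config.json once."""
    base = Path(base_dir)
    template = base / "config-template.json"
    target = base / "config.json"
    if target.exists():
        # Never overwrite a config that may already hold real settings.
        return target
    shutil.copyfile(template, target)
    return target
```

Calling init_config() returns the path of config.json; an existing file is left untouched, so the call is safe to repeat.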

maintainable-security's People

Contributors

luiscruz, ruimaranhao, sofiaoreis, tqrg-bot

Forkers

noah-spahn

maintainable-security's Issues

Minor issues: Section 4

Section 4:

  • Page 13, Line 3 - 969 security commits and 969 baseline commits? It's hard to know how to parse that.
  • Page 13, Line 31 "overall patches …" rather than having us rely on looking at the Figure, tell us the number or percentage of when the impact is positive like you do with the negative in the next sentence. [You do provide the specifics starting on page 15 line 49. I wanted it here.]
  • Page 19, Line 7 - for the unindoctrinated - I'd explain the concept of Research Concepts. Also explain your choice of 7a and 7b and how representative they are from the several hundred choices. Similar for your explanations in the second two paragraphs on this page. How far down the concept/vulnerability hierarchy tree are these CWE choices or are they lower level vulnerabilities chosen from the 700+ CWE types?
  • Page 20, Line 42 - how many is "a considerable number of cases"?

Minor issues: Abstract + Intro

Abstract:

  • Throughout the paper, "hypothesize" is probably a better word than "suspect" for sounding more scientific.

Intro:

  • First sentence - quality is not ONLY related to cost but also to security and safety.
  • Provide a URL to Software Improvement Group and Better Code Hub.
  • Page 3, Line 1 - Application Security Verification Standard (ASVS).
  • Page 3, Line 7 - instead of a "broad number of code metrics" - tell us exactly the number of metrics.
  • Page 3, Line 13 - "suggest" sounds more scientific than "hint at"
  • Page 3, Line 24 - "we intend to highlight the need …". Is that the broad goal of your paper? The goal should be explicitly stated in the abstract and intro.

Minor issues: Motivation + Methodology

Section 2-3:

  • Delete first paragraph. It's redundant.
  • Page 4, Line 15 - You define the OCSP acronym late in the paragraph.
  • Page 5, Line 6 - how many new branch points?
  • Page 6, Line 19 - I think you want to say "In this study, maintainability is …", since it currently implies this information is available in the CWE.
  • Page 6, line 47 - call "regular commits" non-security commits to be clearer, and maybe add a few more words to explain "randomly collected." You also call them "baseline commits" in Figure 1, which gives a different phrase for the same concept. Pick one and use it everywhere.
  • Page 7, line 4 - I don't think "resemble" is the word you are looking for - but I don't know what you are trying to say so I can't make a suggestion.
  • Sometimes you say 1330 patches (e.g. page 7, line 16) and sometimes 1300 (e.g. page 8, line 11, and the abstract). In science it's better to say the exact number and use it always and not round.
  • You say the dataset had 1330 (or 1300) patches and 1282 commits, though you also say one patch can have multiple commits assigned. So how did you end up with fewer than one commit per patch?
  • I also feel irritated that I don't understand what you really mean by "running against the BCH toolset" (page 8, line 13) since you have repeatedly mentioned BCH like it was an industry standard - but you have not told me if it's a static analysis tool or what it is.
  • Add explanation of BCH toolset to page 7.
  • Page 8, line 33 "regular commit" - pick baseline or regular (or non-security) commits and use it always.
  • I don't understand the sentence "The baseline dataset is generated from the security commits dataset." Do you mean you extract these commits from the projects in the security commits dataset?
  • Page 10, line 49 "mainly single-commit patches" … "small percentage of data points" - give us the exact numbers for each of these.
  • Page 11, line 38 - define "floss refactoring" and justify how these were objectively and repeatably identified in your set. You say "these cases", which makes it seem that you were including the set of patches that had more than one commit as your (only) floss candidates.
  • Page 11, Line 42 - I think you are saying you removed these two guidelines from the analysis of all projects (as shown in Figure 4) but this paragraph seems to say this was only for projects with large code bases.
  • Page 11 line 48. How many were disregarded?
