fabric8-analytics-license-analysis

License Analysis Service analyzes the given stack and returns the following:

  • unknown licenses, if any
  • conflicting licenses, if any
  • license based outlier packages, if any
  • stack level license, if possible

Test

How to test locally:

  • ./run-test-local.sh

    • To run with different threshold ./run-test-local.sh -t <[0-1]>

How to run the API locally:

  • ./run-api-local.sh

    • To run on different port ./run-api-local.sh -p <Port>

    • To run with different threshold ./run-api-local.sh -t <[0-1]>

    • To run with different port and threshold ./run-api-local.sh -p <Port> -t <[0-1]>

  • curl localhost:<SERVICE_PORT> should return {status: ok}

Notes:

  • By default the value of MAJORITY_THRESHOLD used is 0.6. If you wish to use any other value, modifications in the test cases will be required to reflect the new outliers.

  • To run the tests, the value of DATA_DIR is set to `.`

  • To run the API locally, the value of DATA_DIR is set to `tests`
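
The two settings mentioned in the notes could be read and validated along the following lines. This is only a sketch: it assumes that MAJORITY_THRESHOLD and DATA_DIR are passed as environment variables, which the notes do not state, and the function name read_settings is hypothetical.

```python
import os

# Default taken from the notes above.
DEFAULT_MAJORITY_THRESHOLD = 0.6


def read_settings(env=None):
    """Read the threshold and data directory from a mapping (assumed: env vars)."""
    if env is None:
        env = os.environ
    threshold = float(env.get("MAJORITY_THRESHOLD", DEFAULT_MAJORITY_THRESHOLD))
    # The run scripts accept a threshold in the range [0-1].
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("MAJORITY_THRESHOLD must be between 0 and 1")
    # "." mirrors the value used when running the tests.
    data_dir = env.get("DATA_DIR", ".")
    return {"majority_threshold": threshold, "data_dir": data_dir}


settings = read_settings({"MAJORITY_THRESHOLD": "0.8", "DATA_DIR": "tests"})
print(settings)
```

Rejecting out-of-range values early keeps a mistyped threshold from silently changing which packages are reported as outliers.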

Sample License analysis request input

ENDPOINT: /api/v1/stack_license
BODY: JSON data
{
    "packages": [
            {
                "package": "p1",
                "version": "1.1",
                "licenses": ["APACHE", "EPL 1.0"]
            }
        ]
}
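
A request like the one above can be built with the Python standard library. The sketch below is illustrative only: the port 8080 and the helper name build_request are assumptions, and the endpoint path is passed in as a parameter so it can be adjusted to match the deployed service.

```python
import json
import urllib.request


def build_request(base_url, path, packages):
    """Build a POST request carrying the package list as a JSON body."""
    body = json.dumps({"packages": packages}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The port is hypothetical; use the one passed to ./run-api-local.sh -p <Port>.
request = build_request(
    "http://localhost:8080",
    "/api/v1/stack_license",
    [{"package": "p1", "version": "1.1", "licenses": ["APACHE", "EPL 1.0"]}],
)

# To send it to a running service:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes the payload easy to inspect before the service is started.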

Tree used for license comparison

License Diagram

Check for all possible issues

The script named check-all.sh is to be used to check the sources for all detectable errors and issues. This script can be run w/o any arguments:

./check-all.sh

Expected script output:

Running all tests and checkers
  Check all BASH scripts
    OK
  Check documentation strings in all Python source file
    OK
  Detect common errors in all Python source file
    OK
  Detect dead code in all Python source file
    OK
  Run Python linter for Python source file
    OK
  Unit tests for this project
    OK
Done

Overal result
  OK

An example of script output when one error is detected:

Running all tests and checkers
  Check all BASH scripts
    Error: please look into files check-bashscripts.log and check-bashscripts.err for possible causes
  Check documentation strings in all Python source file
    OK
  Detect common errors in all Python source file
    OK
  Detect dead code in all Python source file
    OK
  Run Python linter for Python source file
    OK
  Unit tests for this project
    OK
Done

Overal result
  One error detected!

Please note that the script creates a bunch of temporary *.log and *.err files that won't be committed to the project repository.
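
If the output of check-all.sh needs to be processed further, it can be parsed by indentation. The sketch below assumes the exact layout shown above (check names indented by two spaces, results by four); the function names are hypothetical and are not part of the repository.

```python
SAMPLE_OUTPUT = """Running all tests and checkers
  Check all BASH scripts
    Error: please look into files check-bashscripts.log and check-bashscripts.err for possible causes
  Unit tests for this project
    OK
Done

Overal result
  One error detected!
"""


def parse_check_output(text):
    """Map each check name to the result line printed below it."""
    checks = {}
    pending = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent >= 4 and pending is not None:
            # Result line that belongs to the most recent check name.
            checks[pending] = stripped
            pending = None
        elif indent == 2:
            # Candidate check name; kept only if a result line follows.
            pending = stripped
        else:
            pending = None
    return checks


def failed_checks(text):
    """Return the names of all checks whose result is not OK."""
    return [name for name, result in parse_check_output(text).items() if result != "OK"]


print(failed_checks(SAMPLE_OUTPUT))
```

The overall result line is skipped on purpose, because no four-space result line follows it.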

Coding standards

  • You can use scripts run-linter.sh and check-docstyle.sh to check if the code follows PEP 8 and PEP 257 coding standards. These scripts can be run w/o any arguments:
./run-linter.sh
./check-docstyle.sh

The first script checks the indentation, line lengths, variable names, white space around operators etc. The second script checks all documentation strings for their presence and format. Please fix any warnings and errors reported by these scripts.

The list of directories containing source code that needs to be checked is stored in the file directories.txt

Code complexity measurement

The scripts measure-cyclomatic-complexity.sh and measure-maintainability-index.sh are used to measure code complexity. These scripts can be run w/o any arguments:

./measure-cyclomatic-complexity.sh
./measure-maintainability-index.sh

The first script measures the cyclomatic complexity of all Python sources found in the repository. Please see this table for an explanation of how to interpret the results.

The second script measures the maintainability index of all Python sources found in the repository. Please see the following link for an explanation of this measurement.

You can specify the command line option --fail-on-error if you need to check and use the exit code in your workflow. In this case the script returns 0 when no failures have been found and a non-zero value otherwise.
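
The exit code can be consumed from Python in a workflow step, for example. This is a minimal sketch: the helper name check_passed is hypothetical, and the demonstration uses stand-in commands so that it runs without the repository scripts.

```python
import subprocess
import sys


def check_passed(command):
    """Run a command and report whether it exited with code 0."""
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


# Stand-in commands that exit with 0 and 1 respectively.
ok = check_passed([sys.executable, "-c", "raise SystemExit(0)"])
failed = check_passed([sys.executable, "-c", "raise SystemExit(1)"])
print(ok, failed)

# With the real script, run from the repository root:
# check_passed(["./measure-cyclomatic-complexity.sh", "--fail-on-error"])
```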

Dead code detection

The script detect-dead-code.sh can be used to detect dead code in the repository. This script can be run w/o any arguments:

./detect-dead-code.sh

Please note that due to Python's dynamic nature, static code analyzers are likely to miss some dead code. Also, code that is only called implicitly may be reported as unused.

Because of these potential problems, only code detected with more than 90% confidence is reported.

The list of directories containing source code that needs to be checked is stored in the file directories.txt
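
The following illustration shows what "called implicitly" can look like. The class and function names are made up for this example. The handler methods are never referenced by name anywhere in the code, so a static analyzer may flag them as unused even though they do run.

```python
class Handlers:
    """Methods here are looked up dynamically, never referenced directly."""

    def handle_start(self):
        return "started"

    def handle_stop(self):
        return "stopped"


def dispatch(handlers, event):
    """Build the method name at run time and call it."""
    method = getattr(handlers, "handle_" + event)
    return method()


print(dispatch(Handlers(), "start"))
```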

Common issues detection

The script detect-common-errors.sh can be used to detect common errors in the repository. This script can be run w/o any arguments:

./detect-common-errors.sh

Please note that only semantic problems are reported.

The list of directories containing source code that needs to be checked is stored in the file directories.txt

Check for scripts written in BASH

The script named check-bashscripts.sh can be used to check all BASH scripts (in fact: all files with the .sh extension) for various possible issues, incompatibilities, and caveats. This script can be run w/o any arguments:

./check-bashscripts.sh

Please see the following link for an explanation of how ShellCheck works and which issues it can detect.

fabric8-analytics-license-analysis's Issues

Incorrect output from the license analysis service on prod-preview

Problem description

The license analysis service returns incorrect data with the status set to failure. The output differs from the one shown in the documentation.

Input

Post the following payload taken from the Request and Response doc for dependency editor to the analytics_license service on prod-preview:

{
  "_resolved": [
          {
              "package": "com.googlecode.xmemcached:xmemcached",
              "version": "2.3.2"
          }, 
          {
              "package": "commons-fileupload:commons-fileupload",
              "version": "1.3"
          }, 
          {
              "package": "org.springframework.boot:spring-boot-starter-web",
              "version": "1.4.1.RELEASE"
          },
          {
              "package": "com.h2database:h2",
              "version": "1.4.192"
          },
          {
              "package": "org.springframework.boot:spring-boot-starter-data-jpa",
              "version": "1.4.1.RELEASE"
          }
      ],
    "ecosystem": "maven"
}

Output from the service

{
  "conflict_packages": [], 
  "distinct_licenses": [
    "apache 2.0"
  ], 
  "message": "No declared licenses found for 3 component(s).", 
  "outlier_packages": [], 
  "packages": [
    {
      "license_analysis": {
        "_message": "Representative license found", 
        "_representative_licenses": "apache 2.0", 
        "conflict_licenses": [], 
        "outlier_licenses": [], 
        "status": "Successful", 
        "synonyms": {
          "Apache License, Version 2.0": "apache 2.0"
        }, 
        "unknown_licenses": []
      }, 
      "licenses": [
        "Apache License, Version 2.0"
      ], 
      "package": "com.googlecode.xmemcached:xmemcached", 
      "version": "2.3.2"
    }, 
    {
      "license_analysis": {
        "_message": "Representative license found", 
        "_representative_licenses": "apache 2.0", 
        "conflict_licenses": [], 
        "outlier_licenses": [], 
        "status": "Successful", 
        "synonyms": {
          "Apache 2.0": "apache 2.0"
        }, 
        "unknown_licenses": []
      }, 
      "licenses": [
        "Apache 2.0"
      ], 
      "package": "commons-fileupload:commons-fileupload", 
      "version": "1.3"
    }, 
    {
      "license_analysis": {
        "_message": "Input is invalid", 
        "_representative_licenses": null, 
        "conflict_licenses": [], 
        "outlier_licenses": [], 
        "status": "Failure", 
        "synonyms": {}, 
        "unknown_licenses": []
      }, 
      "licenses": [], 
      "package": "org.springframework.boot:spring-boot-starter-web", 
      "version": "1.4.1.RELEASE"
    }, 
    {
      "license_analysis": {
        "_message": "Input is invalid", 
        "_representative_licenses": null, 
        "conflict_licenses": [], 
        "outlier_licenses": [], 
        "status": "Failure", 
        "synonyms": {}, 
        "unknown_licenses": []
      }, 
      "licenses": [], 
      "package": "com.h2database:h2", 
      "version": "1.4.192"
    }, 
    {
      "license_analysis": {
        "_message": "Input is invalid", 
        "_representative_licenses": null, 
        "conflict_licenses": [], 
        "outlier_licenses": [], 
        "status": "Failure", 
        "synonyms": {}, 
        "unknown_licenses": []
      }, 
      "licenses": [], 
      "package": "org.springframework.boot:spring-boot-starter-data-jpa", 
      "version": "1.4.1.RELEASE"
    }
  ], 
  "stack_license": null, 
  "status": "Failure", 
  "unknown_licenses": {
    "component_conflict": [], 
    "really_unknown": []
  }
}

Expected output

{
    "conflict_packages": [],
    "license_filter": {
        "alternate_packages": {
            "compatible_packages": [],
            "conflict_packages": [],
            "unknown_license_packages": []
        },
        "companion_packages": {
            "compatible_packages": [],
            "conflict_packages": [],
            "unknown_license_packages": []
        }
    },
    "outlier_packages": {},
    "packages": [
        {
            "license_analysis": {
                "_message": "Representative license found",
                "_representative_licenses": "epl 1.0",
                "conflict_licenses": [],
                "outlier_licenses": [],
                "status": "Successful",
                "synonyms": {
                    "APACHE": "apache 2.0",
                    "Eclipse Public License": "epl 1.0"
                },
                "unknown_licenses": []
            },
            "licenses": [
                "APACHE",
                "Eclipse Public License"
            ],
            "package": "p1",
            "version": "1.1"
        },
        {
            "license_analysis": {
                "_message": "Representative license found",
                "_representative_licenses": "gplv2",
                "conflict_licenses": [],
                "outlier_licenses": [],
                "status": "Successful",
                "synonyms": {
                    "BSD": "bsd-new",
                    "GPL V2": "gplv2"
                },
                "unknown_licenses": []
            },
            "licenses": [
                "BSD",
                "GPL V2"
            ],
            "package": "p2",
            "version": "1.1"
        }
    ],
    "stack_license": "gplv2",
    "status": "Successful"
}
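
In the actual output above, every package reported with status "Failure" also has an empty licenses list. A small helper can make that pattern visible when inspecting a response. This is a sketch for diagnosis only; the function name is hypothetical, and the trimmed response below reuses two entries from the output in this report.

```python
def packages_without_licenses(response):
    """List packages that failed analysis and carry no declared licenses."""
    return [
        pkg["package"]
        for pkg in response.get("packages", [])
        if not pkg.get("licenses")
        and pkg.get("license_analysis", {}).get("status") == "Failure"
    ]


# Trimmed copy of the service output shown above.
response = {
    "packages": [
        {
            "package": "commons-fileupload:commons-fileupload",
            "licenses": ["Apache 2.0"],
            "license_analysis": {"status": "Successful"},
        },
        {
            "package": "com.h2database:h2",
            "licenses": [],
            "license_analysis": {"status": "Failure"},
        },
    ],
    "status": "Failure",
}

print(packages_without_licenses(response))
```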

Maintainability index is too low for tests modules

The maintainability index is too low (rank B) for the tests/test_license_analysis.py and tests/test_stack_license.py modules. Please try to fix it:

tests/test_license_analysis.py - B (14.71)
tests/test_stack_license.py - B (15.29)

Missing variant of Apache V2

We are missing this variant of Apache V2, so even though the license is known in the license graph, it is reported as unknown.
Apache Software License v2.0

Update license synonym to lower casing

As per the discussion here, we should update the license synonym list so that all keys and values are in lower case. This will get rid of a lot of similar but redundant values and make it easier to add new values in the future.
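
The effect of the proposal can be sketched as follows. The function names are hypothetical and the sample entries are illustrative; the real synonym list in the repository may be stored differently.

```python
def normalize_synonyms(synonyms):
    """Lower-case all keys and values so that case variants collapse."""
    normalized = {}
    for variant, canonical in synonyms.items():
        normalized[variant.strip().lower()] = canonical.strip().lower()
    return normalized


def lookup(normalized, license_name):
    """Find the canonical name for a license, ignoring case."""
    return normalized.get(license_name.strip().lower())


# Illustrative entries; "Apache 2.0" and "APACHE 2.0" differ only in case.
sample = {
    "Apache License, Version 2.0": "apache 2.0",
    "Apache 2.0": "apache 2.0",
    "APACHE 2.0": "apache 2.0",
    "APACHE": "apache 2.0",
}

normalized = normalize_synonyms(sample)
print(len(sample), len(normalized))
print(lookup(normalized, "Apache License, VERSION 2.0"))
```

Lower-casing the lookup key as well means that a new spelling variant needs an entry only if it differs by more than case.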

Update this repository to use Python 3.6 instead of Python 3.4

EPEL repositories now contain proper Python 3.6 packages, and at the same time Python 3.4 is being deprecated [1] [2].

It means that we need to upgrade this repository to use Python 3.6 instead of Python 3.4.

What needs to be changed AND tested:

  • all Dockerfiles
  • CICO setup
  • linter and pydocstyle scripts
  • CI and MI measurement scripts
  • script to start tests

References:
[1] https://lists.fedoraproject.org/archives/list/[email protected]/thread/EGUMKAIMPK2UD5VSHXM53BH2MBDGDWMO/
[2] https://www.reddit.com/r/CentOS/comments/azetyy/python_34_to_be_deprecated_this_month/

Fix CVE by updating Jinja2 library

Additional information about CVE:

CVE-2019-10906
high severity
Vulnerable versions: < 2.10.1
Patched version: 2.10.1

In Pallets Jinja before 2.10.1, str.format_map allows a sandbox escape.

CVE-2016-10745
high severity
Vulnerable versions: < 2.8.1
Patched version: 2.8.1

In Pallets Jinja before 2.8.1, str.format allows a sandbox escape.

Python 2-related code in the rest-api.py

As we have moved the production environment to Python 3, I think that the following code is no longer relevant:

try:
    from importlib import reload
except ImportError:
    from imp import reload


# Python2.x: Make default encoding as UTF-8
if sys.version_info.major == 2:
    reload(sys)
    sys.setdefaultencoding('UTF8')

Fix CVE by updating urllib3 library

Additional information about CVE:

CVE-2019-11324
high severity
Vulnerable versions: < 1.24.2
Patched version: 1.24.2

The urllib3 library before 1.24.2 for Python mishandles certain cases where the desired set of CA certificates is different from the OS store of CA certificates, which results in SSL connections succeeding in situations where a verification failure is the correct outcome. This is related to use of the ssl_context, ca_certs, or ca_certs_dir argument.

CVE-2018-20060
high severity
Vulnerable versions: < 1.23
Patched version: 1.23

urllib3 before version 1.23 does not remove the Authorization HTTP header when following a cross-origin redirect (i.e., a redirect that differs in host, port, or scheme). This can allow for credentials in the Authorization header to be exposed to unintended hosts or transmitted in cleartext.

Licensing

Hi! I am the maintainer of scancode-toolkit which is used by the f8 worker to detect licenses (see https://github.com/fabric8-analytics/fabric8-analytics-worker/blob/d055bb8c3042d0e8ce0aa78d24a5be38d6b6db74/f8a_worker/workers/license.py )
scancode-toolkit itself is Apache-licensed and I would love to possibly integrate this license analysis module there too, but there is a licensing issue because this module is GPL-licensed.
Could I interest you in considering making it available under some other terms (GPL with exception, LGPL, Apache?)?
Thanks!

Add QA-related scripts into separate subdirectory

Currently all QA-related scripts (check-bashscripts.sh, check-docstyle.sh) are stored in the repo's root directory. It might be worth moving them into a separate subdirectory to clean up the content a bit.

Fix CVE by updating requests library

Additional information about CVE:

CVE-2018-18074
moderate severity
Vulnerable versions: <= 2.19.1
Patched version: 2.20.0

The Requests package through 2.19.1 before 2018-09-14 for Python sends an HTTP Authorization header to an http URI upon receiving a same-hostname https-to-http redirect, which makes it easier for remote attackers to discover credentials by sniffing the network.

Reduce cyclomatic complexity in StackLicenseAnalyzer.compute_stack_license

Measured cyclomatic complexity is way too high in StackLicenseAnalyzer.compute_stack_license function:

src/stack_license.py
    M 180:4 StackLicenseAnalyzer.compute_stack_license - D (25)

1 blocks (classes, functions, methods) analyzed.
Average complexity: D (25.0)

To be able to use and test this code, it needs to be refactored.

HTTP code should be set to 400 in case of improper payload sent to the licence_recommender endpoint

Currently, the license-recommender endpoint does not check whether the payload sent to it is valid. As a result, the service fails later with HTTP code 500 instead of HTTP code 400.

Possible fix:

  1. check payload right after it is received
  2. respond accordingly
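
The two steps above can be sketched independently of any web framework. The payload schema of this endpoint is not shown in this report, so the shape checked below (a "packages" list with "package" and "version" strings) is an assumption borrowed from the stack_license sample request, and the function name validate_payload is hypothetical.

```python
def validate_payload(payload):
    """Return (http_status, message) for an already decoded JSON payload."""
    if not isinstance(payload, dict):
        return 400, "payload must be a JSON object"
    packages = payload.get("packages")
    if not isinstance(packages, list) or not packages:
        return 400, "'packages' must be a non-empty list"
    for index, entry in enumerate(packages):
        if not isinstance(entry, dict):
            return 400, "packages[%d] must be an object" % index
        # Assumed required keys; adjust to the real schema.
        for key in ("package", "version"):
            value = entry.get(key)
            if not isinstance(value, str) or not value:
                return 400, "packages[%d].%s must be a non-empty string" % (index, key)
    return 200, "ok"


print(validate_payload({"packages": [{"package": "p1", "version": "1.1"}]}))
print(validate_payload({"packages": "not-a-list"}))
```

Running the check right after the payload is received lets the handler answer with 400 before any analysis code touches the malformed data.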

Report made by BAF can be seen here:
https://fabric8-analytics.github.io/fuzz-tests/recommender_licence_recommender_issue_169.htm
