
liicd's People

Contributors

agamvrinos


liicd's Issues

Replace print statements with file logging

Replace print statements with file logging so that the detector's output for each processed commit can be revisited even after the app has terminated.
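A minimal sketch of what this could look like with Python's standard `logging` module. The logger name `liicd.detector`, the helper `build_commit_logger` and the message format are assumptions made for illustration; they are not taken from the project.

```python
import logging


def build_commit_logger(log_path):
    """Return a logger that appends detector output to the file at log_path.

    The logger name below is hypothetical, chosen only for this sketch.
    """
    logger = logging.getLogger("liicd.detector")
    logger.setLevel(logging.INFO)
    # Attach the file handler only once, so repeated calls do not
    # produce duplicate lines in the log file.
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A call such as `build_commit_logger("detector.log").info("commit %s: %d clone(s) detected", sha, count)` would then take the place of the corresponding `print`, leaving one timestamped line per processed commit in the file.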

File renames causing issues with detection

Currently, when the generate_config.py script parses the commits of a repository, it saves only the file changes whose status is Modification (M), Deletion (D) or Addition (A). However, when a file is moved or renamed, git reports a different status tag: R followed by a similarity score (for example, R100).

The problem that might occur is the following:

  1. The detector checks out HEAD~2 to handle the commit that corresponds to that project snapshot.
  2. The detector reads the file changes for that commit from the configuration file. Since renames are not tracked, no related changes are found, so the index is not updated with the new location or filename.
  3. The detector checks out HEAD~1 to handle the commit that corresponds to that project snapshot.
  4. That commit includes changes that refer to the now renamed/moved file.
  5. The detector tries to find the file in the index, but since the index was not updated, the lookup fails.
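One way to avoid the failure in step 5 is to record rename entries alongside M, D and A. The sketch below parses the tab-separated text produced by `git diff --name-status`, where a rename line carries the status, the old path and the new path. The function names and the dictionary-based index keyed by file path are assumptions for illustration; the actual index structure used by the detector may differ.

```python
def parse_name_status(output):
    """Parse `git diff --name-status` text into a list of change records."""
    changes = []
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        kind = fields[0][0]  # "R100" -> "R", "M" -> "M"
        if kind in ("R", "C") and len(fields) == 3:
            # Renames and copies list both the old and the new path.
            old_path, new_path = fields[1], fields[2]
        else:
            old_path = new_path = fields[1]
        changes.append(
            {"status": kind, "old_path": old_path, "new_path": new_path}
        )
    return changes


def apply_renames(index, changes):
    """Re-key index entries for renamed files (index is a hypothetical dict)."""
    for change in changes:
        if change["status"] == "R" and change["old_path"] in index:
            index[change["new_path"]] = index.pop(change["old_path"])
    return index
```

With the rename applied while processing HEAD~2, the lookup for the new path while processing HEAD~1 would find the entry instead of failing.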

List of invalid/binary file extensions not exhaustive

When reading a codebase to create the initial Clone Index, the application skips binary and invalid (non-Unicode) files. It decides what to skip from a list of file extensions. This list is not exhaustive, though: there may be binary file extensions that it does not include.

In such a case, the issue would be the following:

  1. The application ignores the file because it cannot read it (an exception is thrown, but the application continues and skips the file).
  2. A commit includes changes that affect that file. For instance, a .jpg file is renamed (supposing .jpg were not in the list, although it actually is).
  3. The application tries to find the file in the index, but since the file was not processed when the codebase was read, the lookup fails.

Possible Solutions

  1. Before reading a file, run an initial check to see whether it is binary or invalid. There are libraries that detect binary files, but they do so probabilistically, so they are still not a complete solution.
  2. Update the application to consider only the file extensions that form the majority of the codebase. For example, for a Java-based project, consider only .java files, or a combination of extensions if the project uses multiple languages.
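Both ideas can be sketched in a few lines of Python. The first function is a heuristic, not a guarantee: it treats a file as text if the sampled prefix contains no NUL byte and decodes as UTF-8. The second picks the extensions that make up at least a given share of the files. The function names, the sample size and the `min_share` threshold are all assumptions for illustration.

```python
import codecs
import os
from collections import Counter


def looks_like_text(path, sample_size=8192):
    """Heuristic check: True if the start of the file looks like UTF-8 text."""
    with open(path, "rb") as handle:
        chunk = handle.read(sample_size)
    if b"\x00" in chunk:
        return False  # NUL bytes are a strong hint of binary content
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off at the
        # end of the sample, but still rejects invalid byte sequences.
        decoder.decode(chunk, final=False)
    except UnicodeDecodeError:
        return False
    return True


def dominant_extensions(paths, min_share=0.2):
    """Return the extensions covering at least min_share of the given files."""
    counts = Counter(os.path.splitext(p)[1].lower() for p in paths)
    counts.pop("", None)  # ignore files without an extension
    total = sum(counts.values())
    if total == 0:
        return set()
    return {ext for ext, n in counts.items() if n / total >= min_share}
```

Because only a prefix of each file is inspected, `looks_like_text` can still misclassify files, which is the limitation noted for the first solution. The extension filter avoids reading file contents at all, at the cost of ignoring minority languages below the threshold.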
