ldbcollector's Introduction

ldbcollector

This is a rewrite of the old ldbcollector, which is found in ./old-ldbcollector or in the branch v1.

This rewrite is not yet stable; for stable use, the old version is preferred.

This is a rewrite, it contains

ldbcollector's People

Contributors

maxhbr

ldbcollector's Issues

Duplicates in ORT license-classifications.yml generated file make ORT fail

The license-classifications.yml file, as advertised in ORT's repo, contains a few duplicate ids that make ORT fail in the evaluator phase.

Full exception is reproduced below:

12:51:48.726 [main] INFO  org.ossreviewtoolkit.cli.commands.EvaluatorCommand - Read ORT result from 'advisor-result.json' (3.35 MiB) in 1.405892857s.
Exception in thread "main" com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of `org.ossreviewtoolkit.model.licenses.LicenseClassifications`, problem: Found multiple license categorizations with the same id: [LGPL-3.0-only, AGPL-3.0-only, ICU, GPL-3.0-only, PSF-2.0]
 at [Source: (File); line: 3264, column: 1]
	at com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
	at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:2047)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:587)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:610)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:293)
	at com.fasterxml.jackson.module.kotlin.KotlinValueInstantiator.createFromObjectWith(KotlinValueInstantiator.kt:125)
	at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:202)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:444)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:352)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
	at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4674)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3514)
	at org.ossreviewtoolkit.cli.commands.EvaluatorCommand.run(EvaluatorCommand.kt:804)
	at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:198)
	at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:211)
	at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:18)
	at com.github.ajalt.clikt.core.CliktCommand.parse(CliktCommand.kt:400)
	at com.github.ajalt.clikt.core.CliktCommand.parse$default(CliktCommand.kt:397)
	at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:415)
	at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:440)
	at org.ossreviewtoolkit.cli.OrtMainKt.main(OrtMain.kt:83)
Caused by: java.lang.IllegalArgumentException: Found multiple license categorizations with the same id: [LGPL-3.0-only, AGPL-3.0-only, ICU, GPL-3.0-only, PSF-2.0]
	at org.ossreviewtoolkit.model.licenses.LicenseClassifications.<init>(LicenseClassifications.kt:86)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:128)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:291)
	... 18 more

I've fixed (removed) these duplicates and am willing to submit a PR; however, I understand that this file is generated.

Does it make any sense to include a fix to the generated file? At least people coming from ORT's documentation would not struggle with the same issue.
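A minimal sketch of how such duplicates could be detected and merged before the file is handed to ORT. It assumes the categorizations have already been parsed into a list of dictionaries with `id` and `categories` keys; the function names and the sample data are hypothetical, and this is not how LDBcollector itself generates the file.

```python
from collections import Counter


def find_duplicate_ids(categorizations):
    """Return the ids that occur more than once, in first-seen order."""
    counts = Counter(entry["id"] for entry in categorizations)
    return [license_id for license_id, n in counts.items() if n > 1]


def merge_duplicates(categorizations):
    """Collapse entries sharing an id into one entry with the union of categories."""
    merged = {}
    for entry in categorizations:
        categories = merged.setdefault(entry["id"], [])
        for category in entry.get("categories", []):
            if category not in categories:
                categories.append(category)
    return [{"id": i, "categories": c} for i, c in merged.items()]


# Hypothetical sample data, not taken from the real file.
sample = [
    {"id": "ICU", "categories": ["permissive"]},
    {"id": "PSF-2.0", "categories": ["permissive"]},
    {"id": "ICU", "categories": ["osi-approved"]},
]
duplicates = find_duplicate_ids(sample)
deduplicated = merge_duplicates(sample)
```

Merging keeps every category that any of the duplicate entries carried; simply dropping the later entries would be an equally valid choice if the first one is considered authoritative.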

In ORT's `license-classifications.yml`, prefix categories with their origin

Currently, entries look like

- categories:
  - permissive
  - maybe-rating:Stop
  - maybe-rating:No-Go
  - osi-approved
  - fsf-free
  id: AAL

As LDBcollector aggregates origins of classifications, it's unclear where e.g. "maybe-rating:Stop" comes from. Could we extend the "colon-syntax" to add another <origin>: prefix to the categorization?

For "common" categorizations like "permissive", that approach would have the additional advantage of showing which origins agree on the "permissive" categorization.
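A small sketch of what the proposed prefixing could look like, assuming the categories are available per origin before they are flattened. The origin names `origin-a` and `origin-b` and both function names are hypothetical placeholders, not actual LDBcollector sources or APIs.

```python
def prefix_with_origin(categories_by_origin):
    """Flatten {origin: [category, ...]} into sorted "<origin>:<category>" strings."""
    return sorted(
        f"{origin}:{category}"
        for origin, categories in categories_by_origin.items()
        for category in categories
    )


def origins_agreeing_on(categories_by_origin, category):
    """List the origins that assign the given category."""
    return sorted(
        origin
        for origin, categories in categories_by_origin.items()
        if category in categories
    )


# Hypothetical origins; the category names are taken from the example entry.
aal = {
    "origin-a": ["permissive"],
    "origin-b": ["permissive", "maybe-rating:Stop"],
}
prefixed = prefix_with_origin(aal)
agreeing = origins_agreeing_on(aal, "permissive")
```

With this shape, a category that already uses the colon syntax simply gains one more segment in front, for example `origin-b:maybe-rating:Stop`.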

Prebuilt version of the built list of licenses

Is there a way I can get hold of a released/fixed version of the JSON file containing data from the various sources?

I am currently looking into adding license "translations" (e.g. "BSD 3 Clause" to "BSD-3-Clause") to flict via a separate file. Basing this on "__impliedNames" seems to be a good idea.
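A rough sketch of how such a translation table could be derived from "__impliedNames". It assumes the JSON has been loaded into a dictionary keyed by license id, with each value holding an "__impliedNames" list; that layout, the function names, and the sample entries are assumptions for illustration only.

```python
def build_translations(licenses):
    """Map each implied name (case-insensitively) to its license id.

    Names claimed by more than one license are left out of the table and
    reported separately, since they cannot be translated reliably.
    """
    translations = {}
    ambiguous = set()
    for license_id, facts in licenses.items():
        for name in facts.get("__impliedNames", []):
            key = name.casefold()
            known = translations.setdefault(key, license_id)
            if known != license_id:
                ambiguous.add(key)
    for key in ambiguous:
        del translations[key]
    return translations, ambiguous


def translate(name, translations):
    """Return the license id for a name, or None if it is unknown."""
    return translations.get(name.casefold())


# Hypothetical sample data in the assumed layout.
licenses = {
    "BSD-3-Clause": {"__impliedNames": ["BSD-3-Clause", "BSD 3 Clause"]},
    "Example-1.0": {"__impliedNames": ["Example-1.0", "Example License"]},
    "Example-2.0": {"__impliedNames": ["Example-2.0", "Example License"]},
}
translations, ambiguous = build_translations(licenses)
```

Here `translate("BSD 3 Clause", translations)` resolves to `BSD-3-Clause`, while a name shared by two licenses is reported in `ambiguous` instead of being mapped to either.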

Running on Debian/Ubuntu

Trying to generate files myself I get the following:

$ bash ./run.sh 
Downloading lts-17.9 build plan ...
RedownloadInvalidResponse Request {
  host                 = "raw.githubusercontent.com"
  port                 = 443
  secure               = True
  requestHeaders       = []
  path                 = "/fpco/lts-haskell/master//lts-17.9.yaml"
  queryString          = ""
  method               = "GET"
  proxy                = Nothing
  rawBody              = False
  redirectCount        = 10
  responseTimeout      = ResponseTimeoutDefault
  requestVersion       = HTTP/1.1
}
 "/home/hesa/.stack/build-plan/lts-17.9.yaml" (Response {responseStatus = Status {statusCode = 404, statusMessage = "Not Found"}, responseVersion = HTTP/1.1, responseHeaders = [("Connection","keep-alive"),("Content-Length","14"),("Content-Security-Policy","default-src 'none'; style-src 'unsafe-inline'; sandbox"),("Strict-Transport-Security","max-age=31536000"),("X-Content-Type-Options","nosniff"),("X-Frame-Options","deny"),("X-XSS-Protection","1; mode=block"),("Content-Type","text/plain; charset=utf-8"),("X-GitHub-Request-Id","A53E:C95B:FA0E5:102A12:60913E8B"),("Accept-Ranges","bytes"),("Date","Tue, 04 May 2021 12:31:07 GMT"),("Via","1.1 varnish"),("X-Served-By","cache-cph20636-CPH"),("X-Cache","MISS"),("X-Cache-Hits","0"),("X-Timer","S1620131467.330890,VS0,VE178"),("Vary","Authorization,Accept-Encoding"),("Access-Control-Allow-Origin","*"),("X-Fastly-Request-ID","5e461a55f828e57f3f4af153b449f84384f5ecd1"),("Expires","Tue, 04 May 2021 12:36:07 GMT"),("Source-Age","0")], responseBody = (), responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose})

Do you have any ideas?

Host
Ubuntu 20.04

LDBCollector Version

$ git log | head -7
commit bd40c3e6134b921fee359b7e476d363ba49edbc2
Author: Maximilian Huber <[email protected]>
Date:   Fri Apr 23 09:21:36 2021 +0200

    fix Flict exporter
    
    Signed-off-by: Maximilian Huber <[email protected]>

$ git rev-parse --short HEAD
bd40c3e61

In ORT's `license-classifications.yml`, list the `id` first

Just a visual thing, but currently entries look like

- categories:
  - permissive
  - maybe-rating:Stop
  - maybe-rating:No-Go
  - osi-approved
  - fsf-free
  id: AAL

and it always confuses me that the id comes after the categories. Could we change the order to match the one shown in https://github.com/oss-review-toolkit/ort-config/blob/eaff9f3ceff12724069e6c9d6ca3394402c77153/license-classifications.yml#L36-L39?
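As an illustration of the requested ordering, here is a minimal sketch that moves `id` to the front of an already-parsed entry. It relies on Python dictionaries preserving insertion order; the function name is hypothetical, and the actual generator would need an equivalent change in its own serialization code.

```python
def id_first(entry):
    """Return a copy of the entry with "id" as the first key.

    All other keys keep their original relative order.
    """
    reordered = {"id": entry["id"]}
    reordered.update((key, value) for key, value in entry.items() if key != "id")
    return reordered


entry = {
    "categories": ["permissive", "osi-approved", "fsf-free"],
    "id": "AAL",
}
reordered = id_first(entry)
```

When writing the result back out, the serializer must not sort keys alphabetically, otherwise `categories` ends up before `id` again.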

BSL problem: distinguish between `Business Source License` and `Boost Software License`

        "__impliedNames": [
            "BSL-1.0",
            "Boost Software License 1.0",
            "boost-1.0",
            "Boost 1.0",
            "bsl-1.0",
            "Business Source License 1.0",
            "boost-1.0",
            "Boost 1.0",
            "Boost Software License 1.0 (BSL-1.0)",
            "BSL (v1.0)",
            "BSL (v1)"
        ],
