packageurl-java's People

Contributors

dependabot[bot], hboutemy, jeremylong, magnusbaeck, mangoiv, maxhbr, mealingr, pablogalegoc, pombredanne, sschuberth, stevespringett, veselov

packageurl-java's Issues

Slash character is not expected to be escaped by the specification

This issue is created following package-url/purl-spec#293

Slash character in qualifiers appears to be escaped in the current implementation. For example the following code

//DEPS com.github.package-url:packageurl-java:1.5.0

import java.util.TreeMap;
import com.github.packageurl.PackageURL;

public class purl {
    public static void main(String[] args) throws Exception {

        final TreeMap<String, String> qualifiers = new TreeMap<>();
        qualifiers.put("type", "jar");
        qualifiers.put("repository_url", "https://maven.repository.redhat.com/ga/");
        var purl = new PackageURL(PackageURL.StandardTypes.MAVEN,
                    "org.apache.james",
                    "apache-mime4j-storage",
                    "0.8.9.redhat-00001",
                    qualifiers, null);
        System.out.println(purl);
    }
}

results in

pkg:maven/org.apache.james/apache-mime4j-storage@0.8.9.redhat-00001?repository_url=https%3A%2F%2Fmaven.repository.redhat.com%2Fga%2F&type=jar

whereas, according to the spec, it should be

pkg:maven/org.apache.james/apache-mime4j-storage@0.8.9.redhat-00001?repository_url=https://maven.repository.redhat.com/ga/&type=jar
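To make the difference between the two outputs concrete, here is a minimal sketch of a percent-encoder with a configurable set of characters that are left untouched. It is not the library's code: the class QualifierEncodingSketch and its encode method are hypothetical, and treating '/' and ':' as safe in qualifier values is this issue's reading of the spec, taken here as an assumption.

```java
import java.nio.charset.StandardCharsets;

final class QualifierEncodingSketch {
    // Punctuation left as-is: the unreserved URI characters plus, as an
    // assumption taken from this issue, '/' and ':'.
    private static final String SAFE_PUNCTUATION = "-._~/:";

    // Percent-encodes every UTF-8 byte that is neither an ASCII letter or
    // digit nor listed in SAFE_PUNCTUATION.
    static String encode(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric || (c < 0x80 && SAFE_PUNCTUATION.indexOf(c) >= 0)) {
                out.append((char) c);
            } else {
                out.append(String.format("%%%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the URL unchanged, because '/', ':' and '.' are in the safe set.
        System.out.println(encode("https://maven.repository.redhat.com/ga/"));
        // Characters outside the safe set are still escaped: a%20b%26c
        System.out.println(encode("a b&c"));
    }
}
```

Removing '/' and ':' from SAFE_PUNCTUATION reproduces the %3A%2F%2F form reported above.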

Instantiating a PackageURL with a version containing a newline eventually leads to an invalid URL

Context

Identified following this analysis jeremylong/DependencyCheck#6688 (comment)

Maven supports the newline character in the version as per its XSD. When instantiating a PackageURL with such a character in its version, calling PackageURL.canonicalize() returns an invalid URL.

Steps to reproduce

        PackageURL url = new PackageURL("maven", "com.google.summit", "summit-ast", "2.2.0\n", null, null);

        String canonicalize = url.canonicalize();

Expected behavior

canonicalize is equal to pkg:maven/com.google.summit/summit-ast@2.2.0%0A

Current behavior

canonicalize is equal to pkg:maven/com.google.summit/summit-ast@2.2.0%A
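A %A in place of %0A is what an encoder produces when it formats each byte as hex without zero-padding. Whether that is the actual cause in the library is an assumption; the sketch below only contrasts the two formatting approaches, using a hypothetical class HexPaddingSketch.

```java
final class HexPaddingSketch {
    // Unpadded: Integer.toHexString drops the leading zero for values below 0x10.
    static String unpadded(int b) {
        return "%" + Integer.toHexString(b).toUpperCase();
    }

    // Padded: always emits exactly two hex digits, as percent-encoding requires.
    static String padded(int b) {
        return String.format("%%%02X", b);
    }

    public static void main(String[] args) {
        System.out.println(unpadded('\n')); // %A  (invalid escape)
        System.out.println(padded('\n'));   // %0A (valid escape)
    }
}
```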

Empty namespace(groupId) for maven does not throw exception

When parsing pkg:maven//a@v?&type=e, one would expect an exception stating that the namespace (groupId) is required for maven, or at least that a namespace is expected. Instead, the purl string is parsed successfully and regenerated with encoded characters as pkg:maven/%2Fa@v?type=e

Inconsistent colon encoding

There are inconsistencies with colon encoding in different languages.
For the following input:

type:docker
name:cassandra
version: sha256:244fd47e07d1004f0aed9c

output:

java implementation: pkg:docker/cassandra@sha256%3A244fd47e07d1004f0aed9c
go implementation: pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c
python implementation: pkg:docker/cassandra@sha256:244fd47e07d1004f0aed9c

As we can see, the colon : is encoded as %3A by the Java implementation, but not by the others.
According to the purl specification:

the '#', '?', '@' and ':' characters must NOT be encoded when used as separators. They may need to be encoded elsewhere
the ':' scheme and type separator does not need to and must NOT be encoded. It is unambiguous unencoded everywhere

I think : must NOT be encoded.

Is there a new patch release planned

Hi,

I'm waiting for a patch release 1.1.2 that includes the equals/hashcode addition. Is that already planned?

With kind regards

Thomas von Siebenthal

PackageURLs, with versions including a plus sign

Take a package URL like the following (copied from Dependency-Track):

pkg:deb/debian/mailutils@1%3A3.10-3%20b1?arch=amd64&distro=debian-11&upstream=mailutils%401%3A3.10-3

This matches the mailutils package with version 1:3.10-3+b1.

However, when I decode the URL with this library, I do not get this version but 1:3.10-3 b1 instead. That is, URL decoding turns the %20 into a space, which is correct, but I would have expected a + sign.

This can be seen from this unit test too:

        // when
        PackageURL purl = new PackageURL("pkg:deb/debian/mailutils@1%3A3.10-3%20b1?arch=amd64&distro=debian-11&upstream=mailutils%401%3A3.10-3");

        // then
        assertThat(purl.getName()).isEqualTo("mailutils");
        assertThat(purl.getVersion()).isEqualTo("1:3.10-3+b1");

which results in a test failure:

expected: "1:3.10-3+b1"
 but was: "1:3.10-3 b1"

If I manually replace the %20 with a + sign I get the same results.

        // when
        PackageURL purl = new PackageURL("pkg:deb/debian/mailutils@1%3A3.10-3+b1?arch=amd64&distro=debian-11&upstream=mailutils%401%3A3.10-3");

        // then
        assertThat(purl.getName()).isEqualTo("mailutils");
        assertThat(purl.getVersion()).isEqualTo("1:3.10-3+b1");

which fails with the same message:

expected: "1:3.10-3+b1"
 but was: "1:3.10-3 b1"

I'm now wondering: should I manually replace these spaces with a + sign in my application, or is there a bug in this library?
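The observed results are consistent with form-style URL decoding, in which a literal + is read as a space. Whether the library decodes this way internally is an assumption; the sketch below only shows how the JDK's java.net.URLDecoder treats the three spellings, using a hypothetical helper formDecode.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class PlusDecodingSketch {
    // Thin wrapper around the JDK's form-style decoder.
    static String formDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(formDecode("1%3A3.10-3%20b1")); // 1:3.10-3 b1
        System.out.println(formDecode("1%3A3.10-3+b1"));   // 1:3.10-3 b1
        System.out.println(formDecode("1%3A3.10-3%2Bb1")); // 1:3.10-3+b1
    }
}
```

Under this decoding, only %2B survives as a plus sign, so a purl whose version is meant to contain + would need to carry %2B rather than %20 or a bare +.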

Add PackageURL#getCoordinates method

Description

To my knowledge, the current PackageURL.java implementation does not provide a simple way to retrieve the package's 'coordinates' (purl without subpath or qualifiers).

The package coordinates are useful for generic component information: pkg:deb/debian/curl@7.50.3-1 = cURL version 7.50.3-1.
Whereas the full purl is useful for specific component information: pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie&repository=... = cURL version 7.50.3-1 installed on Debian Jessie, i386 architecture, installed from this specific repository...

Proposed Solution

A PackageURL#getCoordinates method which returns pkg:type/namespace/name@version (no qualifiers or subpath).

For example, in Dependency-Track (a project you may know about 😉), components have separate purl and purlCoordinates fields.
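As a rough illustration of the proposal, the following sketch assembles a coordinates string from the individual components. The class CoordinatesSketch, its method, and the sample values are hypothetical, and percent-encoding of the components is deliberately left out to keep the example short.

```java
final class CoordinatesSketch {
    // Builds pkg:type/namespace/name@version, skipping the optional
    // namespace and version when they are absent. No qualifiers, no subpath.
    static String coordinates(String type, String namespace, String name, String version) {
        StringBuilder sb = new StringBuilder("pkg:").append(type).append('/');
        if (namespace != null && !namespace.isEmpty()) {
            sb.append(namespace).append('/');
        }
        sb.append(name);
        if (version != null && !version.isEmpty()) {
            sb.append('@').append(version);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(coordinates("generic", "acme", "widget", "1.0")); // pkg:generic/acme/widget@1.0
        System.out.println(coordinates("generic", null, "widget", null));    // pkg:generic/widget
    }
}
```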

Failure to adhere to spec: subpath

Currently, subpath parsing and construction do not follow the spec:

  1. During parsing, the subpath is not split on '/', and empty, '.', or '..' segments are not removed.
  2. During canonicalization, segments are not percent-encoded
    • nor are invalid segments removed
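The handling described in the list above can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the class SubpathSketch and its methods are hypothetical, the input is taken to be an already-decoded subpath, and '..' segments are simply discarded rather than resolved, as the issue text describes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SubpathSketch {
    // Splits on '/', drops empty, "." and ".." segments, and percent-encodes the rest.
    static String normalize(String subpath) {
        List<String> kept = new ArrayList<>();
        for (String segment : subpath.split("/")) {
            if (segment.isEmpty() || segment.equals(".") || segment.equals("..")) {
                continue;
            }
            kept.add(percentEncode(segment));
        }
        return String.join("/", kept);
    }

    // Encodes every UTF-8 byte that is not an unreserved URI character.
    static String percentEncode(String segment) {
        StringBuilder out = new StringBuilder();
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append(String.format("%%%02X", c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("/a//./b/../c d/")); // a/b/c%20d
    }
}
```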

`getCoordinates()` yields wrong result when `canonicalize()` was called before

For package URLs with qualifiers and/or subpaths, calling getCoordinates() on a PackageURL instance after canonicalize() has been called on the same instance returns the entire PURL instead of just the coordinates.

This happens because both canonicalize() and getCoordinates() use the canonicalize(boolean coordinatesOnly) method, which caches its result after the first invocation:

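private String canonicalize(boolean coordinatesOnly) {
    if (canonicalizedForm != null) {
        // The cached value is returned regardless of coordinatesOnly
        return canonicalizedForm;
    }

    // (construction of purl omitted in this excerpt)
    canonicalizedForm = purl.toString();
    return canonicalizedForm;
}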

In fact, the reverse is true as well: If getCoordinates() is called before canonicalize(), the result of canonicalize() will only contain coordinates instead of the complete PURL.

The behavior is easily reproducible by modifying the testGetCoordinates() test case by adding a purl.canonicalize() invocation:

@Test
public void testGetCoordinates() throws Exception {
    PackageURL purl = new PackageURL("pkg:generic/acme/[email protected]?key1=value1&key2=value2");
    purl.canonicalize();
    Assert.assertEquals("pkg:generic/acme/[email protected]", purl.getCoordinates());
}
Expected :pkg:generic/acme/[email protected]
Actual   :pkg:generic/acme/[email protected]?key1=value1&key2=value2

If caching of the canonicalized representation is required, it should be done for the "complete" and "coordinatesOnly" variants separately. The current situation can cause unpredictable behavior at runtime when objects with PackageURL fields are passed around and accessed by multiple domains or layers.
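A minimal sketch of the separate-cache idea, assuming a simplified stand-in class: CachedPurlSketch, its fields, and the sample values are hypothetical and only model the two cached variants, not the real PackageURL internals.

```java
final class CachedPurlSketch {
    private final String coordinates;          // e.g. pkg:type/namespace/name@version
    private final String qualifiersAndSubpath; // e.g. ?key=value#sub/path, may be empty

    // One cache per variant, so neither call can return the other's result.
    private String cachedFull;
    private String cachedCoordinates;

    CachedPurlSketch(String coordinates, String qualifiersAndSubpath) {
        this.coordinates = coordinates;
        this.qualifiersAndSubpath = qualifiersAndSubpath;
    }

    String canonicalize() {
        return canonicalize(false);
    }

    String getCoordinates() {
        return canonicalize(true);
    }

    private String canonicalize(boolean coordinatesOnly) {
        if (coordinatesOnly) {
            if (cachedCoordinates == null) {
                cachedCoordinates = coordinates;
            }
            return cachedCoordinates;
        }
        if (cachedFull == null) {
            cachedFull = coordinates + qualifiersAndSubpath;
        }
        return cachedFull;
    }

    public static void main(String[] args) {
        CachedPurlSketch purl = new CachedPurlSketch("pkg:generic/acme/widget@1.0", "?key1=value1");
        purl.canonicalize();                       // fills only the "full" cache
        System.out.println(purl.getCoordinates()); // pkg:generic/acme/widget@1.0
        System.out.println(purl.canonicalize());   // pkg:generic/acme/widget@1.0?key1=value1
    }
}
```

With this layout the order of the two calls no longer matters. Note that the lazily filled fields are not synchronized, so concurrent first calls may compute the same value twice.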

Requesting information

For the purl below:

pkg:maven/org.springframework.boot/[email protected]?type=jar

when PackageURL.getName() is called, "spring-boot-starter" is returned. But the name here is the artifact name, right?

Shouldn't it return "org.springframework.boot/spring-boot-starter" instead of just "spring-boot-starter"?

Could use a module name

I decided to try building my latest project with the Java module system so I can better understand how it works. I needed to generate a package URL, so I added a reference to this project, and a warning popped up.

eclipse warning

Since no module name is declared in the jar manifest, it seems the module system defaults to using the jar file name as the module name. Apparently this can cause a major problem if the name changes (either because a proper name was picked, or because the name of the jar file changed) and 2 different modules try to reference it with different names. Maven even goes as far as to beg me not to publish my project until the issue is fixed.

Maven warning

Personally, I wonder if it's really that big a deal when I could presumably just release a new version of my project referencing the new module name, but in any case it's an easy fix. You basically just need to pick a module name and add it to the manifest, which can be done while still compiling and running on JDK 8. For instance, here is the jar manifest for the Apache Commons Codec project.

Module name example

This post has a lot more info on the matter.

Validation Routines Decode

The validation routines decode values, which would be correct if they were only used during parsing. However, the validation routines are also used in the constructor. As a result, values could be percent-decoded when they should not be. While this is not likely to cause an issue in production, it should be cleaned up.

I am more than willing to submit a PR for this but wanted to discuss options. Specifically, would the team prefer:

  1. Keeping all validation and parsing methods internal to the PackageURL class
  2. Extracting the validation and parsing methods into their own classes

My preference would be to go with option 2.

Support editable builders

It would help if the library could be used to edit PURLs,
i.e., if a builder could be created from an existing package URL, edited, and then rebuilt into a (modified) package URL.
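One way such an API could look is sketched below. Everything here is hypothetical: the class EditablePurlSketch, the toBuilder and withVersion names, and the reduced set of fields are illustrations of the requested pattern, not the library's actual API.

```java
final class EditablePurlSketch {
    final String type;
    final String namespace;
    final String name;
    final String version;

    EditablePurlSketch(String type, String namespace, String name, String version) {
        this.type = type;
        this.namespace = namespace;
        this.name = name;
        this.version = version;
    }

    // Creates a builder pre-populated with this instance's values.
    Builder toBuilder() {
        return new Builder(this);
    }

    @Override
    public String toString() {
        return "pkg:" + type + "/" + namespace + "/" + name + "@" + version;
    }

    static final class Builder {
        private final String type;
        private final String namespace;
        private final String name;
        private String version;

        Builder(EditablePurlSketch source) {
            this.type = source.type;
            this.namespace = source.namespace;
            this.name = source.name;
            this.version = source.version;
        }

        Builder withVersion(String newVersion) {
            this.version = newVersion;
            return this;
        }

        // Produces a new immutable instance; the source object is left untouched.
        EditablePurlSketch build() {
            return new EditablePurlSketch(type, namespace, name, version);
        }
    }

    public static void main(String[] args) {
        EditablePurlSketch original = new EditablePurlSketch("generic", "acme", "widget", "1.0");
        EditablePurlSketch edited = original.toBuilder().withVersion("2.0").build();
        System.out.println(original); // pkg:generic/acme/widget@1.0
        System.out.println(edited);   // pkg:generic/acme/widget@2.0
    }
}
```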

NullPointerException calling `getQualifiers()` if the PURL has no qualifiers

If you parse a PURL with no qualifiers, for example pkg:maven/org.apache.commons/[email protected] from test-suite-data.json, and then try to check what the qualifiers were, a NullPointerException is thrown.

PackageURLTest.java currently skips verifying the qualifiers if none are expected to be present. It should check that it can retrieve the qualifiers and that there are none.

            if (qualifiers != null) {
                Assert.assertNotNull(purl.getQualifiers());
                Assert.assertEquals(qualifiers.length(), purl.getQualifiers().size());
                qualifiers.keySet().forEach((key) -> {
                    String value = qualifiers.getString(key);
                    Assert.assertTrue(purl.getQualifiers().containsKey(key));
                    Assert.assertEquals(value, purl.getQualifiers().get(key));
                });
            // New else case:
            } else {
                Assert.assertEquals(0, purl.getQualifiers().size());
            }
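On the library side, one way to make the new else branch pass is for the getter to return an empty map when no qualifiers were parsed. The sketch below uses a hypothetical stand-in class, QualifiersSketch, and assumes the qualifiers are stored in a nullable map; the real PackageURL internals may differ.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

final class QualifiersSketch {
    // Assumption: null when the purl carried no qualifiers.
    private final Map<String, String> qualifiers;

    QualifiersSketch(Map<String, String> qualifiers) {
        this.qualifiers = qualifiers == null ? null : new TreeMap<>(qualifiers);
    }

    // Never returns null: callers can always call size() or containsKey().
    Map<String, String> getQualifiers() {
        if (qualifiers == null) {
            return Collections.<String, String>emptyMap();
        }
        return Collections.unmodifiableMap(qualifiers);
    }

    public static void main(String[] args) {
        System.out.println(new QualifiersSketch(null).getQualifiers().size()); // 0
    }
}
```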
