Comments (19)

simon-jouet avatar simon-jouet commented on May 28, 2024 2

Finally got it working!

By default ingress-nginx has HTTP/2 enabled, and its default configuration cannot cope with headers this large. I've set http2-max-field-size: 8k in my ingress-nginx config and it's working. The issue was that with a smaller limit nginx fails and closes the connection, and because it's HTTP/2 it doesn't send or log any error (I was expecting a 414).

I think the JWT token should be stripped down to contain less information. Is there any reason for the duplicates in real_groups? What is the purpose of having both groups and real_groups? Storing this kind of info in the JWT token will always lead to this problem sooner or later; it just depends on the number of groups/repos in GitLab.
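
To get a feel for how the token grows with the number of groups, here is a rough sketch. It assumes an HS256-signed JWT whose payload lists every group twice (once in real_groups, once in groups); the helper names and the group names are made up for illustration, and real tokens also carry a few extra claims such as iat and exp.

```javascript
// Rough size estimate for a bearer token whose payload lists every group.
// Assumption: HS256, so the signature is 32 bytes long.
function base64UrlLength(byteLength) {
  // base64url without padding: 4 characters per 3 bytes, rounded up.
  return Math.ceil((byteLength * 4) / 3);
}

function estimateTokenLength(payload) {
  const header = JSON.stringify({ alg: "HS256", typ: "JWT" });
  const body = JSON.stringify(payload);
  const headerLen = base64UrlLength(Buffer.byteLength(header));
  const bodyLen = base64UrlLength(Buffer.byteLength(body));
  const signatureLen = base64UrlLength(32);
  // Two "." separators join the three parts.
  return headerLen + bodyLen + signatureLen + 2;
}

// Hypothetical group names, only to show how the size grows.
function fakeGroups(count) {
  return Array.from({ length: count }, (_, i) => `some-group/some-repo-${i}`);
}

for (const count of [10, 60, 200]) {
  const groups = fakeGroups(count);
  const length = estimateTokenLength({ real_groups: groups, name: "user", groups });
  console.log(`${count} groups -> about ${length} characters`);
}
```

The estimate grows linearly with the number of groups, so any fixed header limit in a proxy will eventually be hit by a user with enough groups.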

from verdaccio-gitlab.

buffcode avatar buffcode commented on May 28, 2024 2

Please bear in mind that while reducing the JWT size might resolve the issue for some/most users, this won't be a permanent fix.
In large(r) GitLab instances a user just needs to have access to, or be maintainer of, enough groups to trigger this issue again.

Could there be any other solution besides putting everything into the JWT? E.g. a configurable separate storage (defaulting to JWT) for this information.

simon-jouet avatar simon-jouet commented on May 28, 2024 1

@bufferoverflow thanks for that, makes sense. I will have a deeper look into the group API once I get this up and running :).

Regarding the duplicates in real_groups, do you think it's an issue with the current code or is it the expected behaviour? Maybe a simple improvement for the time being would be to filter out duplicates? (Unfortunately I can't post the decoded base64 token here because it contains sensitive info.)
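
Filtering out duplicates could be as small as the sketch below. `uniqueGroups` is a hypothetical helper, not a function from verdaccio or the plugin, and where it would have to be called is not settled in this thread.

```javascript
// Remove repeated group names while keeping the first occurrence of each.
function uniqueGroups(groups) {
  // A Set remembers insertion order, so the original ordering is preserved.
  return Array.from(new Set(groups));
}

// Made-up sample input with two repeated entries.
const groups = ["project1", "project1/repo14", "$all", "all", "project1", "project1/repo14"];
console.log(uniqueGroups(groups));
```

This only shrinks the token by a constant factor; the list still grows with the number of groups.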

In the longer term: fixing the nginx config worked for me, but I can quite easily imagine someone with significantly more repos ending up with a header far too large to be sensible. Do you think it would make sense to just provide a token to the user and store the groups in memory (or possibly Redis)?
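
A minimal sketch of that idea, assuming a single server process: the client gets a short opaque token and the group list stays server-side. `GroupStore` and its methods are hypothetical names for illustration, not part of verdaccio or verdaccio-gitlab.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store: the client only receives a short opaque token,
// while the (potentially long) group list stays on the server.
class GroupStore {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map();
  }

  // Save the groups and return a fixed-size token that refers to them.
  issue(name, groups, now = Date.now()) {
    const token = crypto.randomBytes(32).toString("hex");
    this.entries.set(token, { name, groups, expiresAt: now + this.ttlMs });
    return token;
  }

  // Return the stored user, or null when the token is unknown or expired.
  lookup(token, now = Date.now()) {
    const entry = this.entries.get(token);
    if (!entry) return null;
    if (entry.expiresAt <= now) {
      this.entries.delete(token);
      return null;
    }
    return { name: entry.name, groups: entry.groups };
  }
}

const store = new GroupStore(60 * 60 * 1000); // one hour, an arbitrary choice
const token = store.issue("some-user", ["project1", "project1/repo14"]);
console.log(token.length); // 64 hex characters, regardless of group count
console.log(store.lookup(token));
```

An in-process Map is lost on restart and is not shared between replicas, which is where an external store such as Redis would come in.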

We can close this issue if you want; I just detailed the symptoms and resolution in case anybody else runs into a similar problem.

juanpicado avatar juanpicado commented on May 28, 2024 1

I updated my comment 👍 I was wrong.

const groupedGroups = _.isNil(groups) ? real_groups : groups.concat(realGroupsValidated);

simon-jouet avatar simon-jouet commented on May 28, 2024

Okay, I think I figured out the issue. When fetching the packages the authorization token is sent, but in my case the token is very large (about 7k), which I believe exceeds the nginx buffers, so the request is not parsed properly. From the nginx logs:

62.30.156.32 - [62.30.156.32] - - [27/May/2019:09:27:28 +0000] "-" 000 0 "https://verdaccio.mydomain/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36" 5258 0.000 [] - - - - f6fbab185e38c4ac2ca692bdf83a3603

I've decoded the authorization token and it contains groups and real_groups, both of which are huge. real_groups contains all the groups as well as the repos, which in my case is about 60 entries. The more problematic one is groups, which starts with the content of real_groups, then contains the scopes (below), and then repeats the content of real_groups again, which adds up to about 120 entries:

    "$all",
    "$authenticated",
    "@all",
    "@authenticated",
    "all",

I'm doing this with the verdaccio-gitlab master Dockerfile; I simply changed it to use the tagged 4 release, which was released yesterday.

bufferoverflow avatar bufferoverflow commented on May 28, 2024

@simon-jouet Thanks for the update!

GitLab differentiates between group name and group path, see also https://docs.gitlab.com/ce/api/groups.html

dlouzan avatar dlouzan commented on May 28, 2024

Could you paste some small sample of the duplicate entries in the real_groups? You don't need to paste the whole decoded token, but just to get an idea of what might be wrong.

IIRC the entries in the token depend on what we return from the authenticate call in the plugin; I guess verdaccio uses that to fill in the token contents.

dlouzan avatar dlouzan commented on May 28, 2024

Also, are you sure it's jwt what we're talking about? I thought that verdaccio 4.x required a new jwt config entry in the verdaccio.yml file to activate them, otherwise it defaulted to the legacy behaviour. I don't see it activated in your configuration but maybe I'm mixing things.

simon-jouet avatar simon-jouet commented on May 28, 2024

Thanks @dlouzan,

Here is the anonymised content of the token. I just renamed the projects and repos but kept the naming consistent:

{
    "real_groups": [
        "project1",
        "project2",
        "project3",
        "project3/repo1",
        "project3/repo2",
        "project3/repo3",
        "project3/repo4",
        "project2/repo5",
        "project2/repo6",
        "project2/repo7",
        "project2/repo8",
        "project2/repo9",
        "project4/repo10",
        "project2/repo11",
        "project2/repo12",
        "project2/repo13",
        "project1/repo14",
        "project2/repo15",
        "project2/repo16",
        "project2/repo17",
        "project2/repo18",
        "project2/repo19",
        "project2/repo20",
        "project2/repo21",
        "project2/repo22",
        "project2/repo23",
        "project2/repo24",
        "project3/repo25",
        "project2/repo26",
        "project2/repo27",
        "project2/repo28",
        "project2/repo29",
        "project2/repo30",
        "project2/repo31",
        "project2/repo32",
        "project2/repo33",
        "project3/repo34",
        "project2/repo35",
        "project2/repo36",
        "project2/repo37",
        "project4/repo38",
        "project2/repo39",
        "project2/repo40",
        "project1/repo41",
        "project1/repo42",
        "project1/repo43",
        "project1/repo44",
        "project2/repo45",
        "project1/repo46",
        "project3/repo47",
        "project1/repo48",
        "project5/repo49",
        "project1/repo50",
        "project1/repo51",
        "project1/repo52",
        "project6/repo53"
    ],
    "name": "simon-jouet",
    "groups": [
        "project1",
        "project2",
        "project3",
        "project3/repo1",
        "project3/repo2",
        "project3/repo3",
        "project3/repo4",
        "project2/repo5",
        "project2/repo6",
        "project2/repo7",
        "project2/repo8",
        "project2/repo9",
        "project4/repo10",
        "project2/repo11",
        "project2/repo12",
        "project2/repo13",
        "project1/repo14",
        "project2/repo15",
        "project2/repo16",
        "project2/repo17",
        "project2/repo18",
        "project2/repo19",
        "project2/repo20",
        "project2/repo21",
        "project2/repo22",
        "project2/repo23",
        "project2/repo24",
        "project3/repo25",
        "project2/repo26",
        "project2/repo27",
        "project2/repo28",
        "project2/repo29",
        "project2/repo30",
        "project2/repo31",
        "project2/repo32",
        "project2/repo33",
        "project3/repo34",
        "project2/repo35",
        "project2/repo36",
        "project2/repo37",
        "project4/repo38",
        "project2/repo39",
        "project2/repo40",
        "project1/repo41",
        "project1/repo42",
        "project1/repo43",
        "project1/repo44",
        "project2/repo45",
        "project1/repo46",
        "project3/repo47",
        "project1/repo48",
        "project5/repo49",
        "project1/repo50",
        "project1/repo51",
        "project1/repo52",
        "project6/repo53",
        "$all",
        "$authenticated",
        "@all",
        "@authenticated",
        "all",
        "project1",
        "project2",
        "project3",
        "project3/repo1",
        "project3/repo2",
        "project3/repo3",
        "project3/repo4",
        "project2/repo5",
        "project2/repo6",
        "project2/repo7",
        "project2/repo8",
        "project2/repo9",
        "project4/repo10",
        "project2/repo11",
        "project2/repo12",
        "project2/repo13",
        "project1/repo14",
        "project2/repo15",
        "project2/repo16",
        "project2/repo17",
        "project2/repo18",
        "project2/repo19",
        "project2/repo20",
        "project2/repo21",
        "project2/repo22",
        "project2/repo23",
        "project2/repo24",
        "project3/repo25",
        "project2/repo26",
        "project2/repo27",
        "project2/repo28",
        "project2/repo29",
        "project2/repo30",
        "project2/repo31",
        "project2/repo32",
        "project2/repo33",
        "project3/repo34",
        "project2/repo35",
        "project2/repo36",
        "project2/repo37",
        "project4/repo38",
        "project2/repo39",
        "project2/repo40",
        "project1/repo41",
        "project1/repo42",
        "project1/repo43",
        "project1/repo44",
        "project2/repo45",
        "project1/repo46",
        "project3/repo47",
        "project1/repo48",
        "project5/repo49",
        "project1/repo50",
        "project1/repo51",
        "project1/repo52",
        "project6/repo53"
    ],
    "iat": 1558948513,
    "nbf": 1558948513,
    "exp": 1559553313
}

> Also, are you sure it's jwt what we're talking about? I thought that verdaccio 4.x required a new jwt config entry in the verdaccio.yml file to activate them, otherwise it defaulted to the legacy behaviour. I don't see it activated in your configuration but maybe I'm mixing things.

I don't have JWT explicitly enabled, but I'm talking about the bearer token passed when querying the packages, which looks very much like a JWT: its header decodes to {"alg":"HS256","typ":"JWT"}. The config I'm using is still the one I posted previously.
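
For anyone who wants to check their own token the same way, here is a small sketch that decodes (without verifying the signature) a JWT-looking bearer token and counts the entries of each array claim. `decodeJwt` and `summarise` are illustrative names, and the sample token is built on the spot with a placeholder signature.

```javascript
// Decode (NOT verify) the header and payload of a JWT-looking token.
function decodeJwt(token) {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("not a JWT: expected three dot-separated parts");
  }
  const decodePart = (part) => {
    // Convert base64url to standard base64 before decoding.
    const base64 = part.replace(/-/g, "+").replace(/_/g, "/");
    return JSON.parse(Buffer.from(base64, "base64").toString("utf8"));
  };
  return { header: decodePart(parts[0]), payload: decodePart(parts[1]) };
}

// Report how many entries each array claim holds and the total token size.
function summarise(token) {
  const { header, payload } = decodeJwt(token);
  const counts = {};
  for (const [claim, value] of Object.entries(payload)) {
    if (Array.isArray(value)) counts[claim] = value.length;
  }
  return { header, counts, tokenLength: token.length };
}

// Build an unsigned sample token so the sketch runs on its own.
const encode = (obj) =>
  Buffer.from(JSON.stringify(obj))
    .toString("base64")
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
const sample = [
  encode({ alg: "HS256", typ: "JWT" }),
  encode({ real_groups: ["project1"], name: "user", groups: ["project1", "$all", "project1"] }),
  "signature-placeholder",
].join(".");

console.log(summarise(sample));
```

Since the payload is only base64url-encoded, anyone holding the token can read the group list, which is also why the real token could not be posted here unredacted.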

EDIT: apologies for saying earlier that the duplicates were in real_groups; they are in groups, not real_groups.

dlouzan avatar dlouzan commented on May 28, 2024

The groups duplication looks suspicious and might be a bug, but I'm still puzzled about the jwt token authentication.

@juanpicado Does this ring a bell? Any idea why we're seeing JWT tokens in this configuration?

dlouzan avatar dlouzan commented on May 28, 2024

Ok, maybe the documentation is a bit misleading. According to the PR that introduced JWT, it's enabled by default on API calls, but not on web requests:
verdaccio/verdaccio#896

The following post also documents the expected behaviour that groups are added as payload of the token:
https://medium.com/verdaccio/diving-into-jwt-support-for-verdaccio-4-88df2cf23ddc

> JWT also contains an immutable payload, meaning that, once the token is being signed, we store the list of assigned user groups within the payload. Thus, for each request the API does not verify credentials against the authentication provider, it just verifies whether the token is valid and provides access to the resource.

So apart from the duplicated entries, I'm not sure we'll be able to solve the size problem directly. Additionally, since we expect to contact GitLab for authentication, we might need to document a recommended verdaccio.yml configuration in which we re-check the authentication more often than the default of 60 days (group privileges could have changed).
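
The decoded token posted in this thread carries iat and exp claims, so how long that particular token stays valid can be checked directly. A small sketch, with helper names made up for illustration:

```javascript
// Timestamps copied from the decoded token posted in this thread.
const claims = { iat: 1558948513, nbf: 1558948513, exp: 1559553313 };

// JWT timestamps are seconds since the Unix epoch.
function lifetimeInDays({ iat, exp }) {
  return (exp - iat) / (60 * 60 * 24);
}

// True once the current time has reached the exp claim.
function isExpired({ exp }, nowMs = Date.now()) {
  return nowMs / 1000 >= exp;
}

console.log(lifetimeInDays(claims)); // 7
console.log(isExpired(claims));
```

For that token the difference between exp and iat is 604800 seconds, i.e. 7 days, during which group changes in GitLab would not be reflected in the payload.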

StevenLangbroek avatar StevenLangbroek commented on May 28, 2024

We're having this issue as well. Even with these http2 settings, the UI errors out after logging in (we're on verdaccio 4.0.0 due to #81):

http2-max-field-size: 32k
http2-max-header-size: 64k

Just to add to this, we're using the Docker image with tag latest, but somehow end up with [email protected] (it's in the bottom right of the UI)? Is that intentional?

Submitted PR #81 to fix this.

juanpicado avatar juanpicado commented on May 28, 2024

@dlouzan sorry late to the party

@simon-jouet actually, here is a snippet of the logic behind the real groups:

async jwtEncrypt(user: RemoteUser, signOptions: JWTSignOptions): string {
  const { real_groups, name, groups } = user;
  const realGroupsValidated = _.isNil(real_groups) ? [] : real_groups;
  const groupedGroups = _.isNil(groups) ? real_groups : groups.concat(realGroupsValidated);
  const payload: RemoteUser = {
    real_groups: realGroupsValidated,
    name,
    groups: groupedGroups,
  };

  const token: string = await signPayload(payload, this.secret, signOptions);

  // $FlowFixMe
  return token;
}

> I think the JWT token should be stripped to contain less information, is there any reason for the duplicates in real_groups? What is the purpose of both groups and real_groups?

Why does real_groups exist? I have NO CLUE 😆. Many things were added to Sinopia with 0 backup, 0 context and 0 code review in the post-Sinopia, pre-Verdaccio era. For backward compatibility we have kept them all this time, and I personally was more concerned about other topics. Perhaps they should be removed; I'll consider this for Verdaccio 5.

> The groups duplication looks suspicious and might be a bug, but I'm still puzzled about the jwt token authentication.

> Does this ring a bell? Any idea why we're seeing jwt tokens in this configuration?

The logic above runs every time a token is signed. @dlouzan (I removed my other comment, it was nonsense; I was wrong), I have the same thoughts about the duplication: as I mentioned above, it is legacy logic. It might be a bug or something intended; I'm not sure who is behind this logic.
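
The doubling seen in the decoded token can be reproduced with a stand-alone imitation of the grouping step in the jwtEncrypt snippet above. This is only a sketch: it drops lodash and the signing, and it assumes that by the time the token is signed, groups already contains the plugin's groups plus the default ones, which matches the decoded token but has not been confirmed in this thread.

```javascript
// Stand-alone imitation of the grouping step in jwtEncrypt, without lodash;
// `value == null` plays the role of _.isNil.
function buildPayload(user) {
  const { real_groups, name, groups } = user;
  const realGroupsValidated = real_groups == null ? [] : real_groups;
  const groupedGroups = groups == null ? real_groups : groups.concat(realGroupsValidated);
  return { real_groups: realGroupsValidated, name, groups: groupedGroups };
}

// Assumption: `groups` already holds real_groups plus the default groups.
const realGroups = ["project1", "project1/repo14"];
const user = {
  name: "user",
  real_groups: realGroups,
  groups: [...realGroups, "$all", "$authenticated", "@all", "@authenticated", "all"],
};

const payload = buildPayload(user);
console.log(payload.groups.length); // 9: 2 real groups + 5 defaults + 2 real groups again
console.log(payload.groups);
```

Under that assumption the concat appends real_groups a second time, giving the pattern from the decoded token: real groups, default groups, real groups again.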

dlouzan avatar dlouzan commented on May 28, 2024

@juanpicado I'll try to reserve some time tomorrow to take a detailed look at this, it might well be a bug from our end

dlouzan avatar dlouzan commented on May 28, 2024

@juanpicado Is my assumption above right?

> Ok, maybe the documentation is a bit misleading. According to the PR that introduced JWT, it's enabled by default on API calls, but not on web requests:
> verdaccio/verdaccio#896

That would mean that by default the jwt token is the approach taken for all api calls, is this so?

juanpicado avatar juanpicado commented on May 28, 2024

Nope. JWT has been enabled by default in the Web API since that part was developed, so there is no legacy mode for the Web API; the CLI API uses the legacy auth token by default.

GFWagnitz avatar GFWagnitz commented on May 28, 2024

I encountered this problem when using a cluster with the Nginx Ingress Controller. When I configured it on a K8s cluster that uses Traefik, the listing worked OK at first, but now I'm getting 400 (Bad Request) on every request.

Other coworkers with access to almost the same repositories are not having the same issue.

Roboroads avatar Roboroads commented on May 28, 2024

My header is over 18k with this plugin. Our company has a group with a huge list of repositories (in subgroups as well). I might be able to PR a solution that saves the group details in another way, like a database.

juanpicado avatar juanpicado commented on May 28, 2024

Seems related: verdaccio/verdaccio#2068
