Comments (21)

carocad commented on June 21, 2024

It is hard for us to fix the problem without a working On-premise instance, would you like to work on the issue and fix it for others?

Sure thing, I opened a PR for it. I tested it on the GitHub Enterprise instance we use and it worked fine. I was not able to find any tests for the code, though. I assume that, as you mentioned, there are none, since they would require both a github.com and a GitHub Enterprise organization.

from incubator-devlake.

klesh commented on June 21, 2024

I think we do collect Github Deployment as well

DomainTypes: []string{plugin.DOMAIN_TYPE_CICD},

Select the CICD entities in the configuration and it should be good to go, just ignore the Github Action if you don't need them.

carocad commented on June 21, 2024

Maybe I misunderstood the code (I haven't taken a deep look), but it looks like that is only used by the GraphQL implementation, whereas in our case we are using the "plain API".

just ignore the Github Action if you don't need them.

We tried this. We have two projects:

  • one with GitHub deployments but no GitHub Actions, and
  • another one with GitHub deployments through GitHub Actions.

In the first case we don't get any deployment information in the DevLake/Grafana dashboards, but in the second case we do, which is what pointed me to the distinction mentioned in the bug description.

Please let me know if I can provide you some debug logs or similar to further clarify this :)
PS: we are using Github Enterprise edition not github.com

klesh commented on June 21, 2024

That is weird; the subtasks for collecting and converting GitHub Deployments are enabled by default.
Can you share the configuration, which you can find by following this screenshot?
image

carocad commented on June 21, 2024

That is weird, the subtasks for collecting and converting Github Deployment are enabled by default. Can you share the configuration which you can find by following this screenshot

Sure thing, here you go.

ghe+actions.json
ghe+jenkins.json

The first file is the one where deployments are made with GitHub Actions, whereas the second one is where the deployments are made outside of it. In the JSON files I can see that "Collect Workflow Runs" is a subtask, but there is no "Collect Deployments" or similar.

klesh commented on June 21, 2024

I see, did you disable the github graphql feature on the connection detail?

klesh commented on June 21, 2024

Try enabling it and see if the problem is fixed.

carocad commented on June 21, 2024

Try enabling it and see if the problem is fixed.

Just tried it, but unfortunately it crashes. See the logs below.

time="2024-05-14 08:46:28" level=info msg=" [pipeline service] [pipeline #87] [task #618] start executing task: 618"
time="2024-05-14 08:46:28" level=info msg=" [pipeline service] [pipeline #87] [task #618] start plugin"
time="2024-05-14 08:46:28" level=info msg=" [pipeline service] [pipeline #87] [task #618] [api async client] creating scheduler for api \"https://github.boschdevcloud.com/api/v3/\", number of workers: 13, 9500 reqs / 1h0m0s (interval: 378.947368ms)"
time="2024-05-14 08:46:28" level=error msg=" [pipeline service] [pipeline #87] [task #618] run task failed with panic\n\tcaused by: run task failed with panic (github.com/apache/incubator-devlake/helpers/pluginhelper/api.CreateAsyncGraphqlClient:71)\n\tWraps: (2) non-200 OK status code: 404 Not Found body: \"{\\\"message\\\":\\\"Not Found\\\",\\\"documentation_url\\\":\\\"https://docs.github.com/enterprise-server@3.11/rest\\\"}\"\n\tWraps: (3) non-200 OK status code: 404 Not Found body: \"{\\\"message\\\":\\\"Not Found\\\",\\\"documentation_url\\\":\\\"https://docs.github.com/enterprise-server@3.11/rest\\\"}\"\n\tError types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *errors.errorString"
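
For reference, the interval reported in the scheduler line of this log is consistent with spreading the request budget evenly over the one-hour window. A minimal Go sketch of that arithmetic follows; the function name is hypothetical and this is not DevLake's actual scheduler code, only an illustration of where a value like 378.947368ms can come from.

```go
package main

import (
	"fmt"
	"time"
)

// requestInterval is a hypothetical helper: it spreads a request budget
// evenly over a time window (integer division, truncated to nanoseconds).
func requestInterval(requests int, window time.Duration) time.Duration {
	return window / time.Duration(requests)
}

func main() {
	// Budget from the log line above: 9500 requests per hour.
	fmt.Println(requestInterval(9500, time.Hour)) // 378.947368ms
	// A smaller budget of 4500 requests per hour gives a longer interval.
	fmt.Println(requestInterval(4500, time.Hour)) // 800ms
}
```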

klesh commented on June 21, 2024

Seems like you are using the enterprise version, is it Cloud or On-premise?

carocad commented on June 21, 2024

Seems like you are using the enterprise version, is it Cloud or On-premise?

GitHub Enterprise Server 3.11.9 (OnPremise)

klesh commented on June 21, 2024

It sounds like there might be some specific behaviors with GitHub Enterprise Server 3.11.9 (On-Premise). Unfortunately, since we don't currently have access to this version for testing, it limits our ability to directly replicate the issue.

Here are a few ways we can move forward:

  • Community Resources: Have you checked the GitHub Enterprise Server documentation or community forums for known quirks or workarounds related to your specific version? There might be existing solutions or insights from other users.
  • Consider Upgrading (Optional): If feasible, upgrading to a newer version of GitHub Enterprise Server might resolve the issue and provide access to the latest features and bug fixes.

carocad commented on June 21, 2024

Hello @klesh,
It seems that rate limiting is disabled by default on GitHub Enterprise Server. See the docs: https://docs.github.com/en/enterprise-server@3.11/graphql/overview/rate-limits-and-node-limits-for-the-graphql-api#primary-rate-limit

Rate limits are disabled by default for GitHub Enterprise Server.

Would it be possible for DevLake to handle that case with a default value or another method which avoids a panic? I can also contact the administrators on our side and ask if they can set one, but since this is the default behavior of GitHub Enterprise Server, I would assume that this wouldn't be an isolated case.

klesh commented on June 21, 2024

@carocad Sure, you may find the "Custom Rate Limit" on the connection page.

image

carocad commented on June 21, 2024

@carocad Sure, you may find the "Custom Rate Limit" on the connection page.

Unfortunately that didn't work either. See logs below

time="2024-05-21 07:48:30" level=info msg=" [pipeline service] [pipeline #145] [task #971] start executing task: 971"
time="2024-05-21 07:48:30" level=info msg=" [pipeline service] [pipeline #145] [task #971] start plugin"
time="2024-05-21 07:48:30" level=info msg=" [pipeline service] [pipeline #145] [task #971] [api async client] creating scheduler for api \"https://github.boschdevcloud.com/api/v3/\", number of workers: 6, 4500 reqs / 1h0m0s (interval: 800ms)"
time="2024-05-21 07:48:31" level=error msg=" [pipeline service] [pipeline #145] [task #971] run task failed with panic\n\tcaused by: run task failed with panic (github.com/apache/incubator-devlake/helpers/pluginhelper/api.CreateAsyncGraphqlClient:71)\n\tWraps: (2) non-200 OK status code: 404 Not Found body: \"{\\\"message\\\":\\\"Not Found\\\",\\\"documentation_url\\\":\\\"https://docs.github.com/enterprise-server@3.11/rest\\\"}\"\n\tWraps: (3) non-200 OK status code: 404 Not Found body: \"{\\\"message\\\":\\\"Not Found\\\",\\\"documentation_url\\\":\\\"https://docs.github.com/enterprise-server@3.11/rest\\\"}\"\n\tError types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *errors.errorString"

From this I can only assume that DevLake queries the rate limit endpoint regardless of the custom rate limit, and only overrides the value after it receives a response. If you agree that this is a bug, I would happily try to create a PR to fix it :)
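
To make the proposed order of operations concrete, here is a minimal Go sketch. Everything in it is hypothetical (the function name, the `probe` callback, the placeholder default of 1000, and the sentinel error); it is not DevLake's implementation, only an illustration of "use the custom limit if one is set, otherwise probe, and fall back to a default instead of panicking".

```go
package main

import (
	"errors"
	"fmt"
)

// defaultRateLimit is an arbitrary placeholder, not a value taken from DevLake.
const defaultRateLimit = 1000

// errProbeFailed stands in for any failure of the rate limit query (e.g. a 404).
var errProbeFailed = errors.New("rate limit probe failed")

// resolveRateLimit is a hypothetical helper. It returns the limit to use and
// where it came from, and never calls probe when a custom limit is configured.
func resolveRateLimit(custom int, probe func() (int, error)) (limit int, source string) {
	if custom > 0 {
		return custom, "custom"
	}
	if probed, err := probe(); err == nil && probed > 0 {
		return probed, "probed"
	}
	// Probe failed or returned nothing usable: fall back instead of panicking.
	return defaultRateLimit, "default"
}

func main() {
	failing := func() (int, error) { return 0, errProbeFailed }

	limit, source := resolveRateLimit(4500, failing)
	fmt.Println(limit, source) // 4500 custom

	limit, source = resolveRateLimit(0, failing)
	fmt.Println(limit, source) // 1000 default
}
```

Silently swallowing the probe error is a simplification; a real implementation would presumably log it, since it can point at a different underlying problem such as a wrong endpoint URL.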

carocad commented on June 21, 2024

... I can also contact the administrators on our side and ask if they can set one but since this is the default behavior from Github Server I would assume that this wouldn't be an isolated case.

I just got an answer from the administrators on my side. The GraphQL endpoints on our side already have a rate limit, so this shouldn't be the cause. I also tried querying it myself and got a response without any issues. I submitted the query from the github.com docs (see ref):

query {
  viewer {
    login
  }
  rateLimit {
    limit
    remaining
    used
    resetAt
  }
}

response

{
    "data": {
        "viewer": {
            "login": "<redacted>"
        },
        "rateLimit": {
            "limit": 5000,
            "remaining": 4999,
            "used": 1,
            "resetAt": "2024-05-21T10:17:01Z"
        }
    }
}

I don't know which query DevLake uses, but I can only guess that one of the fetched fields was introduced in a GitHub version newer than 3.11.9. In any case, my proposal above still stands.
If you point me in the right direction, I can definitely give it a try :)

d4x1 commented on June 21, 2024

Did you run this query on GitHub (https://docs.github.com/en/graphql/overview/explorer) or on your GitHub Enterprise version?

query {
  viewer {
    login
  }
  rateLimit {
    limit
    remaining
    used
    resetAt
  }
}

From the log:

run task failed with panic\n\tcaused by: run task failed with panic (github.com/apache/incubator-devlake/helpers/pluginhelper/api.CreateAsyncGraphqlClient:71)\n\tWraps: (2) non-200 OK status code: 404 Not Found body: "{\"message\":\"Not Found\",\"documentation_url\":\"https://docs.github.com/enterprise-server@3.11/rest\\\"}\"\n\tWraps: (3) non-200 OK status code: 404 Not Found body: "{\"message\":\"Not Found\",\"documentation_url\":\"https://docs.github.com/enterprise-server@3.11/rest\\\"}\"\n\tError types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *errors.errorString"

I think your GitHub version doesn't support querying rate limit info. You can give it a try.

And btw, DevLake's query is:

query {
  rateLimit {
    limit
    remaining
    resetAt
  }
}
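
To try this query against a given endpoint outside of DevLake, something like the following Go sketch could be used. It is only an illustration: the function names, the `GITHUB_GRAPHQL_URL` and `GITHUB_TOKEN` environment variables, and the `Bearer` authorization scheme are assumptions of this sketch, not taken from DevLake's code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// rateLimitQuery is the query quoted above, on one line.
const rateLimitQuery = `query { rateLimit { limit remaining resetAt } }`

// rateLimit mirrors the fields requested by rateLimitQuery.
type rateLimit struct {
	Limit     int    `json:"limit"`
	Remaining int    `json:"remaining"`
	ResetAt   string `json:"resetAt"`
}

// newRateLimitRequest builds the POST request for the given GraphQL endpoint.
func newRateLimitRequest(endpoint, token string) (*http.Request, error) {
	payload, err := json.Marshal(map[string]string{"query": rateLimitQuery})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// parseRateLimit extracts data.rateLimit from a GraphQL response body.
func parseRateLimit(body []byte) (rateLimit, error) {
	var envelope struct {
		Data struct {
			RateLimit rateLimit `json:"rateLimit"`
		} `json:"data"`
	}
	err := json.Unmarshal(body, &envelope)
	return envelope.Data.RateLimit, err
}

func main() {
	endpoint := os.Getenv("GITHUB_GRAPHQL_URL") // assumed variable name
	token := os.Getenv("GITHUB_TOKEN")          // assumed variable name
	if endpoint == "" || token == "" {
		fmt.Println("set GITHUB_GRAPHQL_URL and GITHUB_TOKEN to run the query")
		return
	}
	req, err := newRateLimitRequest(endpoint, token)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading response failed:", err)
		return
	}
	if resp.StatusCode != http.StatusOK {
		fmt.Printf("non-200 status %s: %s\n", resp.Status, body)
		return
	}
	rl, err := parseRateLimit(body)
	if err != nil {
		fmt.Println("parsing response failed:", err)
		return
	}
	fmt.Printf("limit=%d remaining=%d resetAt=%s\n", rl.Limit, rl.Remaining, rl.ResetAt)
}
```

Running it once per candidate endpoint URL shows which URL actually answers the query and which one returns a non-200 status.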

carocad commented on June 21, 2024

Did you run this query on GitHub (https://docs.github.com/en/graphql/overview/explorer) or on your GitHub Enterprise version?

query {
  viewer {
    login
  }
  rateLimit {
    limit
    remaining
    used
    resetAt
  }
}

On my github enterprise version of course :)

image

I think your GitHub version doesn't support querying rate limit info. You can give it a try.

And btw, DevLake's query is: ...

Yeah, it works on my github enterprise version. That query seems like a subset of the one I posted above on #7435 (comment) (see response section).

carocad commented on June 21, 2024

@klesh we continued to follow up on this, and now I am 99% sure that this is an issue in DevLake. For GitHub Enterprise Server it is required to set the endpoint URL, which generally ends in /api/v3/{{suffix}} as per GitHub's REST guide. Unfortunately, the GraphQL endpoint doesn't follow this convention: its endpoint is /api/graphql (notice the missing /v3/).

This line here assumes that the graphql suffix can be appended to the previously defined endpoint URL, resulting in /api/v3/graphql, which doesn't exist and therefore results in a 404 error --> panic. Since GraphQL doesn't follow the suffix convention, simply hardcoding it to /api/graphql should solve the issue. Please let me know if you would like me to make a PR for it or if you prefer to do it yourselves. Either way works for me :)
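
A rough Go sketch of the idea, deriving the GraphQL URL from the configured REST endpoint instead of appending the suffix to it. The helper name and the example host are hypothetical, and this is not the actual DevLake code or the patch itself; it only illustrates the /api/v3/ versus /api/graphql difference described above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// graphqlEndpoint is a hypothetical helper. It derives the GraphQL URL from a
// configured REST endpoint:
//
//	Enterprise Server: https://HOST/api/v3/    -> https://HOST/api/graphql
//	github.com:        https://api.github.com/ -> https://api.github.com/graphql
func graphqlEndpoint(restEndpoint string) (string, error) {
	u, err := url.Parse(restEndpoint)
	if err != nil {
		return "", err
	}
	path := strings.TrimSuffix(u.Path, "/")
	// The REST API lives under /api/v3 on Enterprise Server, GraphQL does not.
	path = strings.TrimSuffix(path, "/v3")
	u.Path = path + "/graphql"
	return u.String(), nil
}

func main() {
	for _, rest := range []string{
		"https://github.example.com/api/v3/", // placeholder Enterprise Server host
		"https://api.github.com/",
	} {
		gql, err := graphqlEndpoint(rest)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(rest, "->", gql)
	}
}
```

Deriving the URL this way keeps the configured REST endpoint (with /v3/) untouched for everything else that relies on it.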

image

klesh commented on June 21, 2024

Try removing v3/ suffix from the endpoint and see if it solves your problem, e.g. https://github.boschdevcloud.com/api/

carocad commented on June 21, 2024

Try removing v3/ suffix from the endpoint and see if it solves your problem, e.g. https://github.boschdevcloud.com/api/

It seems that some other functionality relies on the /v3 suffix. After changing the connection endpoint, token validation fails, and even when ignoring that error, collecting/analysing the data still fails. See the logs below.

image
time="2024-05-28 00:00:02" level=info msg=" [pipeline service] [pipeline #211] [task #1411] start executing task: 1411"
time="2024-05-28 00:00:03" level=info msg=" [pipeline service] [pipeline #211] [task #1411] start plugin"
time="2024-05-28 00:00:03" level=info msg=" [pipeline service] [pipeline #211] [task #1411] [api async client] creating scheduler for api \"https://github.boschdevcloud.com/api/\", number of workers: 6, 4500 reqs / 1h0m0s (interval: 800ms)"
time="2024-05-28 00:00:03" level=info msg=" [pipeline service] [pipeline #211] [task #1411] github graphql init success with remaining 5000/5000 and will reset at 2024-05-28 01:00:03 +0000 UTC"
time="2024-05-28 00:00:03" level=info msg=" [pipeline service] [pipeline #211] [task #1411] total step: 37"
time="2024-05-28 00:00:03" level=info msg=" [pipeline service] [pipeline #211] [task #1411] executing subtask Collect Milestones"
time="2024-05-28 00:00:03" level=info msg=" [pipeline service] [pipeline #211] [task #1411] [Collect Milestones] start api collection"
time="2024-05-28 00:00:03" level=error msg=" [pipeline service] [pipeline #211] [task #1411] [Collect Milestones] end api collection error\n\tcaused by: error parsing response from repos/eBike/devops-launch-assist/milestones (200)"
time="2024-05-28 00:00:03" level=error msg=" [pipeline service] [pipeline #211] [task #1411] subtask Collect Milestones ended unexpectedly\n\tWraps: (2) Error waiting for async Collector execution\n\tWraps: (3) error parsing response from repos/eBike/devops-launch-assist/milestones (200)\n\tError types: (1) *hintdetail.withDetail (2) *hintdetail.withDetail (3) *errors.errorString"

klesh commented on June 21, 2024

@carocad You are right, it appears that the URL patterns are different between Cloud and On-premise...

However, it is hard to tell whether it is caused by the implementation of the On-premise version or something specific to your configuration.

It is hard for us to fix the problem without a working On-premise instance, would you like to work on the issue and fix it for others?
