Comments (17)

dims commented on June 28, 2024

FYI #122828 documents that [Feature:GPUDevicePlugin] run Nvidia GPU Device Plugin tests is getting dropped!

from kubernetes.

Vyom-Yadav commented on June 28, 2024

cc @BenTheElder @dims @ameukam

BenTheElder commented on June 28, 2024

Note that while the job is "green" after kubernetes/test-infra#32635, we are no longer running [Feature:GPUDevicePlugin] run Nvidia GPU Device Plugin tests, and the Windows test is skipped, so no real tests are run.

https://testgrid.k8s.io/sig-release-master-blocking#gce-device-plugin-gpu-master&show-stale-tests=&width=5

This issue is a sub-variant of kubernetes/test-infra#32242

BenTheElder commented on June 28, 2024

/remove-sig k8s-infra
/sig node
/triage accepted

aojea commented on June 28, 2024

It seems fixed https://testgrid.k8s.io/sig-release-master-blocking#gce-device-plugin-gpu-master

BenTheElder commented on June 28, 2024

> It seems fixed https://testgrid.k8s.io/sig-release-master-blocking#gce-device-plugin-gpu-master

It's not, see above #124950 (comment)

[Feature:GPUDevicePlugin] run Nvidia GPU Device Plugin tests no longer even appears in stale tests because it hasn't run in so long. It was still listed as a stale test yesterday. We're running no actual GPU tests currently.

BenTheElder commented on June 28, 2024

I don't even see [Feature:GPUDevicePlugin] run Nvidia GPU Device Plugin tests in the skipped results.

The only matching test is for Windows. I'm asking in SIG Node Slack whether something changed with the test cases.
https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-device-plugin-gpu/1795477868910743552/artifacts/junit_01.xml
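
A rough sketch of how a junit artifact like the one above could be checked for a given test name. It assumes the file follows the common JUnit XML layout (`<testcase name="...">` elements, with a `<skipped>` child on skipped cases). The function `summarize_junit` and the sample data are hypothetical and are not part of the Kubernetes test tooling.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text, needle):
    """Split test cases whose name contains `needle` into ran vs. skipped."""
    root = ET.fromstring(xml_text)
    ran, skipped = [], []
    # iter() finds <testcase> at any depth, so the root may be
    # either <testsuites> or a single <testsuite>.
    for case in root.iter("testcase"):
        name = case.get("name", "")
        if needle not in name:
            continue
        if case.find("skipped") is not None:
            skipped.append(name)
        else:
            ran.append(name)
    return ran, skipped


# Hypothetical sample data; the test names are made up for illustration.
SAMPLE = """<testsuites>
  <testsuite name="Kubernetes e2e suite">
    <testcase name="[It] [sig-windows] [Feature:GPUDevicePlugin] hypothetical windows case"><skipped/></testcase>
    <testcase name="[It] [sig-node] hypothetical unrelated case"/>
  </testsuite>
</testsuites>"""

ran, skipped = summarize_junit(SAMPLE, "[Feature:GPUDevicePlugin]")
print(len(ran), len(skipped))  # prints: 0 1
```

An empty `ran` list for the GPU feature tag would match the situation described here: the job is green, but no matching test actually executed.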

BenTheElder commented on June 28, 2024

The test cases were all deleted ... bf268f0#diff-7629c065680da0396ef2e8d190ce7cdd1dbf2c336f99c22ec543a4be61d74ccd

BenTheElder commented on June 28, 2024

NOTE: This also impacts the EC2 job, which is no longer running any test cases.

The GCE job is now running and "passing", same as the EC2 job, and neither of them runs any tests.

/retitle EC2 + GCE GPU CI Jobs not running any test cases

See an old run: https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-ec2-device-plugin-gpu/1781225142752382976

(ran: Kubernetes e2e suite: [It] [sig-scheduling] [Feature:GPUDevicePlugin] run Nvidia GPU Device Plugin tests, 8 tests passed, the other "tests" are just cluster bringup / test runner etc)

Current:

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-ec2-device-plugin-gpu/1795464027577520128

(7 "tests" passed, none of which are actual e2e tests)

aojea commented on June 28, 2024

😅

pacoxu commented on June 28, 2024

https://testgrid.k8s.io/sig-release-1.30-blocking#gce-device-plugin-gpu-1.30
keeps failing

BenTheElder commented on June 28, 2024

Yes, I think the job is coming up but the driver install or device plugin is not. We need to add more log dumping there. I've been discussing with dims in #sig-node what we should do about the test removal: https://kubernetes.slack.com/archives/C0BP8PW9G/p1716914276819719?thread_ts=1716913485.823089&cid=C0BP8PW9G

BenTheElder commented on June 28, 2024

Talked to @elfinhe this morning about the driver install.

BenTheElder commented on June 28, 2024

I'm going to revisit test cases once we figure out the driver install issue on 1.30 with existing test cases on that branch. There are WIP PRs for this and I'm in contact with the team supporting us on the driver problems, providing upstream CI pointers. #125208 / #125206

BenTheElder commented on June 28, 2024

Googlers: b/344684158 is the bug tracking the driver issue at GCP.

[We'll update the PR and comment back here when it's sorted]

AnishShah commented on June 28, 2024

Notes from sig-node CI meeting:

  • This is not release blocking

BenTheElder commented on June 28, 2024

For the GPU tests, we now have a driver install fix for the 1.30 branch at #125208.
We will need to backport it to other branches, forward-port it to master, and then re-introduce the device plugin GPU tests on master.
