Comments (13)

sfc-gh-hyu commented on September 23, 2024

@PedroRibeiro95 Have you done any new research on this? I was doing some research myself, and it looks like it should work with the following configuration:

k8s-device-plugin has a config option called DEVICE_LIST_STRATEGY, which allows the device list to be returned as CDI. Once kubelet receives the allocate response from the device plugin, it should populate the CDI spec file and start containerd (assuming we are just using containerd). containerd will then parse the CDI devices, convert them into devices in the OCI spec file, and pass that spec to runc or runsc. runsc should then just create the Linux devices here, as @ayushr2 described. (I am assuming nvidia-container-runtime is not needed in this case, since we don't need the prestart hook?)

I haven't actually tested any of this; everything above is just a guess on my part, but let me know whether my reasoning makes sense.
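
For concreteness, here is a minimal Go sketch (using the OCI runtime-spec types that runsc also consumes) of roughly what a CDI-resolved spec could hand to the runtime for a single GPU. This is purely illustrative and untested; the device paths, majors, and minors are assumptions about a typical single-GPU node, not values taken from this issue.

    package main

    import (
    	"fmt"
    	"os"

    	specs "github.com/opencontainers/runtime-spec/specs-go"
    )

    func main() {
    	mode := os.FileMode(0666)
    	// Roughly what containerd's CDI injection could add to spec.Linux.Devices
    	// before the spec reaches runc/runsc. Numbers are host-specific guesses.
    	spec := specs.Spec{
    		Linux: &specs.Linux{
    			Devices: []specs.LinuxDevice{
    				{Path: "/dev/nvidiactl", Type: "c", Major: 195, Minor: 255, FileMode: &mode},
    				{Path: "/dev/nvidia0", Type: "c", Major: 195, Minor: 0, FileMode: &mode},
    				{Path: "/dev/nvidia-uvm", Type: "c", Major: 245, Minor: 0, FileMode: &mode},
    			},
    		},
    	}
    	fmt.Printf("devices the runtime would be asked to create: %+v\n", spec.Linux.Devices)
    }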

PedroRibeiro95 commented on September 23, 2024

Adding the logs for the container:
logs.zip

ayushr2 commented on September 23, 2024

Thanks for the very detailed report! Apologies for the delay. nvproxy is not supported with k8s-device-plugin yet, and we haven't investigated what needs to be done to add support. We would appreciate OSS contributions!

We are currently focused on establishing support in GKE. GKE uses a different GPU+container stack: it does not use k8s-device-plugin, but instead has its own device plugin (https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/cmd/nvidia_gpu), which configures the container in a different way. nvproxy in GKE is still experimental, but it works! Please let me know if you want to experiment on GKE, and we can provide more detailed instructions.

To summarize, nvproxy works in the following environments:

  1. Docker: docker run --gpus= ... Needs --nvproxy-docker flag.
  2. nvidia-container-runtime with legacy mode. Needs --nvproxy-docker flag.
  3. GKE. Does not need --nvproxy-docker flag.

PedroRibeiro95 commented on September 23, 2024

Thanks for the follow-up @ayushr2. In the meantime I've made some progress: just using nvproxy, bootstrapping the host node with the NVIDIA driver, and then mounting the driver into the container using hostPath lets me run nvidia-smi successfully. However, it seems it can't fully access the GPU:

==============NVSMI LOG==============

Timestamp                                 : Mon Oct 30 15:53:01 2023
Driver Version                            : 525.60.13
CUDA Version                              : 12.0

Attached GPUs                             : 1
GPU 00000000:00:1E.0
    Product Name                          : Tesla T4
    Product Brand                         : NVIDIA
    Product Architecture                  : Turing
    Display Mode                          : Disabled
    Display Active                        : Disabled
    Persistence Mode                      : Enabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : GPU access blocked by the operating system
    Accounting Mode Buffer Size           : 4000
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : GPU access blocked by the operating system
    GPU UUID                              : GPU-3ec3e89a-b2ec-68d1-bb38-3becc2cf55cd
    Minor Number                          : 0
    VBIOS Version                         : Unknown Error
    MultiGPU Board                        : No
    Board ID                              : 0x1e
    Board Part Number                     : GPU access blocked by the operating system
    GPU Part Number                       : GPU access blocked by the operating system
    Module ID                             : GPU access blocked by the operating system
    Inforom Version
        Image Version                     : GPU access blocked by the operating system
        OEM Object                        : Unknown Error
        ECC Object                        : GPU access blocked by the operating system
        Power Management Object           : Unknown Error
    GPU Operation Mode
        Current                           : GPU access blocked by the operating system
        Pending                           : GPU access blocked by the operating system
    GSP Firmware Version                  : 525.60.13
    GPU Virtualization Mode
        Virtualization Mode               : Pass-Through
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x00
        Device                            : 0x1E
        Domain                            : 0x0000
        Device Id                         : 0x1EB810DE
        Bus Id                            : 00000000:00:1E.0
        Sub System Id                     : 0x12A210DE
        GPU Link Info
            PCIe Generation
                Max                       : Unknown Error
                Current                   : Unknown Error
                Device Current            : Unknown Error
                Device Max                : Unknown Error
                Host Max                  : Unknown Error
            Link Width
                Max                       : Unknown Error
                Current                   : Unknown Error
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : GPU access blocked by the operating system
        Replay Number Rollovers           : GPU access blocked by the operating system
        Tx Throughput                     : GPU access blocked by the operating system
        Rx Throughput                     : GPU access blocked by the operating system
        Atomic Caps Inbound               : GPU access blocked by the operating system
        Atomic Caps Outbound              : GPU access blocked by the operating system
    Fan Speed                             : N/A
    Performance State                     : P8
    Clocks Throttle Reasons
        Idle                              : Active
        Applications Clocks Setting       : Not Active
        SW Power Cap                      : Not Active
        HW Slowdown                       : Not Active
            HW Thermal Slowdown           : Not Active
            HW Power Brake Slowdown       : Not Active
        Sync Boost                        : Not Active
        SW Thermal Slowdown               : Not Active
        Display Clock Setting             : Not Active
    FB Memory Usage
        Total                             : 15360 MiB
        Reserved                          : 399 MiB
        Used                              : 2 MiB
        Free                              : 14957 MiB
    BAR1 Memory Usage
        Total                             : 256 MiB
        Used                              : 2 MiB
        Free                              : 254 MiB
    Compute Mode                          : Default
    Utilization
        Gpu                               : 0 %
        Memory                            : 0 %
        Encoder                           : 0 %
        Decoder                           : 0 %
    Encoder Stats
        Active Sessions                   : GPU access blocked by the operating system
        Average FPS                       : GPU access blocked by the operating system
        Average Latency                   : GPU access blocked by the operating system
    FBC Stats
        Active Sessions                   : GPU access blocked by the operating system
        Average FPS                       : GPU access blocked by the operating system
        Average Latency                   : GPU access blocked by the operating system
    Ecc Mode
        Current                           : GPU access blocked by the operating system
        Pending                           : GPU access blocked by the operating system
    ECC Errors
        Volatile
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
        Aggregate
            SRAM Correctable              : N/A
            SRAM Uncorrectable            : N/A
            DRAM Correctable              : N/A
            DRAM Uncorrectable            : N/A
    Retired Pages
        Single Bit ECC                    : GPU access blocked by the operating system
        Double Bit ECC                    : GPU access blocked by the operating system
        Pending Page Blacklist            : GPU access blocked by the operating system
    Remapped Rows                         : GPU access blocked by the operating system
    Temperature
        GPU Current Temp                  : 22 C
        GPU Shutdown Temp                 : 96 C
        GPU Slowdown Temp                 : 93 C
        GPU Max Operating Temp            : 85 C
        GPU Target Temperature            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : Supported
        Power Draw                        : 13.16 W
        Power Limit                       : 70.00 W
        Default Power Limit               : 70.00 W
        Enforced Power Limit              : 70.00 W
        Min Power Limit                   : 60.00 W
        Max Power Limit                   : 70.00 W
    Clocks
        Graphics                          : 300 MHz
        SM                                : 300 MHz
        Memory                            : 405 MHz
        Video                             : 540 MHz
    Applications Clocks
        Graphics                          : 1590 MHz
        Memory                            : 5001 MHz
    Default Applications Clocks
        Graphics                          : 585 MHz
        Memory                            : 5001 MHz
    Deferred Clocks
        Memory                            : N/A
    Max Clocks
        Graphics                          : 1590 MHz
        SM                                : 1590 MHz
        Memory                            : 5001 MHz
        Video                             : 1470 MHz
    Max Customer Boost Clocks
        Graphics                          : 1590 MHz
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Voltage
        Graphics                          : N/A
    Fabric
        State                             : N/A
        Status                            : N/A
    Processes                             : None

I also tried running this under runtimeClass: nvidia and this didn't happen, so it's definitely a gVisor issue. Unfortunately, GKE is not viable for our use case. I'll try the options you described to see if I can get it working.

ayushr2 commented on September 23, 2024

However, it seems it can't fully access the GPU

Yeah, I don't think it will work just yet. In GKE, the container spec defines which GPUs to expose in spec.Linux.Devices. However, in the boot logs you attached above, I could not see any such devices defined, so gVisor will not expose any devices.

My best guess is that k8s-device-plugin is creating bind mounts of the /dev/nvidia* devices in the container's root filesystem and then expecting the container to be able to access them. That won't work with gVisor with any combination of our --nvproxy flags, because even though the devices exist on the host filesystem, they don't exist in our sentry's /dev filesystem (which is an in-memory filesystem).

In docker mode, the GPU devices are explicitly exposed like this. In GKE, the device files are automatically created here because spec.Linux.Devices defines them. So you could look into adding similar support for the k8s-device-plugin environment.
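
To make that distinction concrete, here is a small hypothetical Go sketch (again using the OCI runtime-spec types) contrasting the two shapes the runtime can receive: a bind mount of a host device node, which gVisor's in-memory /dev will not turn into a usable device, versus an explicit spec.Linux.Devices entry, which it can create. The field values are illustrative, not taken from the attached logs.

    package main

    import (
    	"fmt"

    	specs "github.com/opencontainers/runtime-spec/specs-go"
    )

    func main() {
    	// Guessed k8s-device-plugin style: bind-mount the host device node into
    	// the container rootfs. Inside gVisor this lands on the sentry's
    	// in-memory /dev, so no real device node backs it.
    	bindMount := specs.Mount{
    		Destination: "/dev/nvidia0",
    		Source:      "/dev/nvidia0",
    		Type:        "bind",
    		Options:     []string{"rbind", "rw"},
    	}

    	// GKE device-plugin style: an explicit device entry, which the runtime
    	// (including runsc) is asked to create. Major/minor are illustrative.
    	device := specs.LinuxDevice{Path: "/dev/nvidia0", Type: "c", Major: 195, Minor: 0}

    	fmt.Printf("bind-mount shape: %+v\n", bindMount)
    	fmt.Printf("device-entry shape: %+v\n", device)
    }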

PedroRibeiro95 commented on September 23, 2024

Thanks for the detailed reply @ayushr2! Though I'm a bit out of my depth here, your guidance has been very helpful. I'm trying to better understand the differences for GKE; could you please point me to where the container spec/sandbox is defined? I'm not sure if it's possible to try to port that configuration over to Amazon Linux or if I should just try to add the feature directly to the gVisor code you pointed me to.

PedroRibeiro95 commented on September 23, 2024

I've very naively tried adding the following snippet to runsc/boot/vfs.go:createDeviceFiles:

		mode := os.FileMode(0777)
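		// NVIDIA's frontend devices (/dev/nvidia#, /dev/nvidia-modeset, /dev/nvidiactl)
		// use char major 195 on the host; /dev/nvidia-uvm's major is allocated
		// dynamically, so 245 below is specific to this particular node.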
		info.spec.Linux.Devices = append(info.spec.Linux.Devices, []specs.LinuxDevice{
			{
				Path:     "/dev/nvidia0",
				Type:     "c",
				Major:    195,
				Minor:    0,
				FileMode: &mode,
			},
			{
				Path:     "/dev/nvidia-modeset",
				Type:     "c",
				Major:    195,
				Minor:    254,
				FileMode: &mode,
			},
			{
				Path:     "/dev/nvidia-uvm",
				Type:     "c",
				Major:    245,
				Minor:    0,
				FileMode: &mode,
			},
			{
				Path:     "/dev/nvidia-uvm-tools",
				Type:     "c",
				Major:    245,
				Minor:    1,
				FileMode: &mode,
			},
		}...)

in order to try to expose the devices at runtime, but it seems like even this isn't enough.

ayushr2 commented on September 23, 2024

You probably also want /dev/nvidiactl. You basically want to call this. Usually that is only called for --nvproxy-docker. JUST FOR TESTING, try adding a new flag --nvproxy-k8s and changing the condition on line 1221 to if info.conf.NVProxyDocker || info.conf.NVProxyK8s { ...

Also note that the minor number of /dev/nvidia-uvm is different inside the sandbox, so just copying it from the host won't work.
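
If it helps, here is a tiny, runnable Go model of the condition change being suggested. The Config type and the NVProxyK8s field are hypothetical stand-ins for runsc's real config plumbing; only NVProxyDocker is quoted from the comment above.

    package main

    import "fmt"

    // Config loosely models the runsc flags relevant here. NVProxyK8s is the
    // hypothetical test-only flag suggested above, not an existing runsc option.
    type Config struct {
    	NVProxyDocker bool
    	NVProxyK8s    bool
    }

    // shouldCreateNvidiaDevs models the condition around line 1221: create the
    // /dev/nvidia* files in the sandbox when either mode asks for them.
    func shouldCreateNvidiaDevs(conf Config) bool {
    	return conf.NVProxyDocker || conf.NVProxyK8s
    }

    func main() {
    	conf := Config{NVProxyK8s: true}
    	fmt.Println("create nvidia device files:", shouldCreateNvidiaDevs(conf))
    }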

PedroRibeiro95 commented on September 23, 2024

Yeah, from reading the code and looking at the logs, it seems gVisor automatically assigns a minor number to the device. Unfortunately, your suggestion still didn't work. I'll leave the logs for the container here in case you (or anyone who comes across this issue) want to use them for debugging (note that I had already added an --nvproxy-automount-dev flag for the same purpose you suggested --nvproxy-k8s for).
runsc.tar.gz

ayushr2 commented on September 23, 2024

Got it, thanks for working with me on this.

Just to set expectations, adding support for k8s-device-plugin is currently not on our roadmap. We are focused on maturing GPU support in GKE first. OSS contributions for GPU support in additional environments are appreciated in the meantime!

PedroRibeiro95 commented on September 23, 2024

No worries! We don't have a strict requirement for getting NVIDIA working with gVisor right now, so we can work around it. I'd love to help bring in this feature, but I'd still need to get more familiar with gVisor; I'll help in any way I can!

github-actions commented on September 23, 2024

A friendly reminder that this issue has had no activity for 120 days.

PedroRibeiro95 commented on September 23, 2024

Hey @sfc-gh-hyu, thanks for the detailed instructions. I haven't revisited this in the meantime as other priorities came up, but I will be testing it again very soon. I will try to follow what you suggested and I will report back with more details.
