Giter Club home page Giter Club logo

Comments (2)

WaterKnight1998 avatar WaterKnight1998 commented on August 21, 2024

I saw that probably I was setting the configuration in a bad place: kserve/modelmesh#46 (comment)

from modelmesh-serving.

WaterKnight1998 avatar WaterKnight1998 commented on August 21, 2024

I get much better GPU utilization using:

apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  annotations:
    maxLoadingConcurrency: "2"
  labels:
    app.kubernetes.io/instance: modelmesh-controller
    app.kubernetes.io/managed-by: modelmesh-controller
    app.kubernetes.io/name: modelmesh-controller
    name: modelmesh-serving-triton-2.x-SR
  name: triton-2.x
  # namespace: inference-server
spec:
  builtInAdapter:
    memBufferBytes: 134217728
    modelLoadingTimeoutMillis: 90000
    runtimeManagementPort: 8001
    serverType: triton
    env:
    - name: CONTAINER_MEM_REQ_BYTES
      value: "12884901888" # Works for T4
    - name: MODELSIZE_MULTIPLIER
      value: "2"
  containers:
  - args:
    - -c
    - 'mkdir -p /models/_triton_models; chmod 777 /models/_triton_models; exec tritonserver
      "--model-repository=/models/_triton_models" "--model-control-mode=explicit"
      "--strict-model-config=false" "--strict-readiness=false" "--allow-http=true"
      "--allow-sagemaker=false" '
    command:
    - /bin/sh
    image: nvcr.io/nvidia/tritonserver:21.06.1-py3
    livenessProbe:
      exec:
        command:
        - curl
        - --fail
        - --silent
        - --show-error
        - --max-time
        - "9"
        - http://localhost:8000/v2/health/live
      initialDelaySeconds: 5
      periodSeconds: 30
      timeoutSeconds: 10
    name: triton
    resources:
      limits:
        nvidia.com/gpu: 1
      requests:
        cpu: 500m
        memory: 1Gi
        nvidia.com/gpu: 1

from modelmesh-serving.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.