Comments (2)
It looks like I was setting the configuration in the wrong place; see kserve/modelmesh#46 (comment)
from modelmesh-serving.
I get much better GPU utilization using:
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  annotations:
    maxLoadingConcurrency: "2"
  labels:
    app.kubernetes.io/instance: modelmesh-controller
    app.kubernetes.io/managed-by: modelmesh-controller
    app.kubernetes.io/name: modelmesh-controller
    name: modelmesh-serving-triton-2.x-SR
  name: triton-2.x
  # namespace: inference-server
spec:
  builtInAdapter:
    memBufferBytes: 134217728
    modelLoadingTimeoutMillis: 90000
    runtimeManagementPort: 8001
    serverType: triton
    env:
      - name: CONTAINER_MEM_REQ_BYTES
        value: "12884901888" # Works for T4
      - name: MODELSIZE_MULTIPLIER
        value: "2"
  containers:
    - args:
        - -c
        - 'mkdir -p /models/_triton_models; chmod 777 /models/_triton_models; exec tritonserver
          "--model-repository=/models/_triton_models" "--model-control-mode=explicit"
          "--strict-model-config=false" "--strict-readiness=false" "--allow-http=true"
          "--allow-sagemaker=false" '
      command:
        - /bin/sh
      image: nvcr.io/nvidia/tritonserver:21.06.1-py3
      livenessProbe:
        exec:
          command:
            - curl
            - --fail
            - --silent
            - --show-error
            - --max-time
            - "9"
            - http://localhost:8000/v2/health/live
        initialDelaySeconds: 5
        periodSeconds: 30
        timeoutSeconds: 10
      name: triton
      resources:
        limits:
          nvidia.com/gpu: 1
        requests:
          cpu: 500m
          memory: 1Gi
          nvidia.com/gpu: 1
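The two large byte values in the runtime above are easier to sanity-check when derived from human-readable sizes: 12884901888 is 12 GiB and 134217728 is 128 MiB. Here is a minimal Python sketch of that arithmetic; the helper names `gib_to_bytes` and `mib_to_bytes` are hypothetical and not part of ModelMesh, and the right sizes for a GPU other than the T4 mentioned in the comment are something you would need to determine yourself.

```python
# Hypothetical helpers for deriving the byte values used in the ServingRuntime.
# They are plain arithmetic and not part of any ModelMesh or KServe API.

GIB = 1024 ** 3  # bytes in one gibibyte
MIB = 1024 ** 2  # bytes in one mebibyte


def gib_to_bytes(gib: int) -> int:
    """Convert a whole number of GiB to bytes."""
    return gib * GIB


def mib_to_bytes(mib: int) -> int:
    """Convert a whole number of MiB to bytes."""
    return mib * MIB


# Value used for CONTAINER_MEM_REQ_BYTES in the comment (12 GiB).
container_mem_req_bytes = gib_to_bytes(12)

# Value used for memBufferBytes in the comment (128 MiB).
mem_buffer_bytes = mib_to_bytes(128)

# Env var values are strings in the manifest, so quote them when templating.
print(f'CONTAINER_MEM_REQ_BYTES: "{container_mem_req_bytes}"')
print(f"memBufferBytes: {mem_buffer_bytes}")
```

Computing the values this way makes it harder to drop a digit when adapting the manifest to a GPU with a different amount of memory.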