Comments (6)
@kevinnegy yes, that's correct regarding the Nvidia implementation.
The Nvidia implementation runs inside Docker, and I believe that by passing the appropriate docker run flag you can enable/disable the required GPUs. By default, the implementation uses all the GPUs it sees inside the Docker container. If you can wait until the middle of next week, I should be able to give you this option via CM, working with the 4.0 inference submissions - this is in progress now.
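For illustration (a sketch, not the exact CM invocation - the image name below is made up), Docker's --gpus flag can restrict which devices the container sees:

```shell
# Expose only physical GPU 1 to the container; you can also pass a GPU UUID.
# "nvidia/mlperf-inference:example" is an illustrative image name, not the
# actual Nvidia MLPerf image tag.
docker run --gpus '"device=1"' --rm -it nvidia/mlperf-inference:example bash

# Inside the container, nvidia-smi should then list a single GPU,
# and the benchmark will use only that device.
```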
For onnxruntime, adding --env.CUDA_VISIBLE_DEVICES="1"
to the CM run command should let you run on device 1. But the reference implementations are at least 10x slower than the Nvidia TensorRT implementation, so this may not be a worthwhile exercise.
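To illustrate what CUDA_VISIBLE_DEVICES does (a minimal sketch, independent of CM): it filters and re-indexes the physical GPUs before the framework's CUDA runtime initializes, so the framework's device 0 maps to the first listed physical GPU.

```python
import os

# Restrict this process to physical GPU 1. This must be set before the CUDA
# runtime (torch, onnxruntime, etc.) initializes, or it has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

def physical_gpus(env_value: str) -> list[int]:
    """Parse a CUDA_VISIBLE_DEVICES string into physical GPU indices.

    The framework re-indexes the visible devices from 0, so with "1" the
    framework's cuda:0 is physical GPU 1.
    """
    return [int(tok) for tok in env_value.split(",") if tok.strip()]

print(physical_gpus(os.environ["CUDA_VISIBLE_DEVICES"]))  # → [1]
print(physical_gpus("2,3"))  # → [2, 3]: cuda:0 is GPU 2, cuda:1 is GPU 3
```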
from ck.
@arjunsuresh Awesome! That environment variable worked for me! And yes, I can wait for the 4.0 CM option. Thanks so much!
@kevinnegy Are you trying to run the reference implementation using CM? Currently CM doesn't have such a flag, but one can be easily added. The problem is that the underlying implementation for each framework across all the benchmarks must also support it. Since the reference implementations are meant as a "reference" and not really for benchmarking, we haven't seen such a request so far. Are you targeting some specific benchmark? If so, this can be done.
@arjunsuresh Yes, I'm trying to run the reference implementations using CM. My understanding was that MLPerf (including the reference implementations) could be used for benchmarking GPUs to measure performance, is that not correct?
I had hoped to benchmark the 9 reference workloads in the mlperf inference repo, but at the very least getting just Bert and RNNT to have the GPU option would be super helpful.
I really appreciate any help you can provide.
@kevinnegy The reference implementations are not well suited for benchmarking systems because most of them lack basic optimizations such as batching and multi-GPU support. If you want to benchmark Nvidia GPUs, the Nvidia implementation is the way to go, and it is supported in CM. None of the benchmarks should take more than a day or two to complete, except DLRM, which needs days just to get the dataset.
@arjunsuresh Thank you for the suggestion. I'm assuming this is what you had in mind for the Nvidia implementation of Bert with CM?
As for specifying which GPU: I was able to brute-force RNNT with PyTorch by switching every reference to torch.device("cuda:0") in the RNNT Python scripts
to whatever device I want. I was unable to do the same with the reference BERT with onnxruntime. Is there any hacky method like this to pick a device for BERT (reference onnxruntime and the Nvidia TensorRT implementation), just so I can get up and running in case a global CM parameter will take a while to implement?