Comments (2)
CPU testing should be done only on CPU instances when possible, to avoid incurring GPU runtime costs. The containers are mostly tailored towards GPUs, since the primary use case for containers is multi-GPU or multi-node deployments for training.
For Nemo, the vast majority of users can get by with a conda env and no Docker, so I do agree that it's a niche problem for a subset of users.
That subset of users does include the entire Nemo research team plus a few external research teams, who build containers and run multi-node jobs on the clusters. So a breakage of support usually means we hold off on upgrading our containers for 1-2 months.
Also, the CI tests running in Nemo are in the container, but we simulate the install environment of the user, i.e. we use a bare-bones PyTorch base container and then follow the regular pip and conda install steps. Of course, the torch environment is based on a container, which is not what normal users will face, but it's still close to real-world install scenarios.
from ecosystem-ci.
Maybe test the ecosystem CI (or even just PTL alone) on the latest public NGC PyTorch container (or really any cloud container which has PyTorch built into it). Of course this is a big task, so it's just a suggestion.
I see, and it would be quite a useful feature to allow custom images... I think it could be very feasible for the GPU testing, which is already running a custom Docker image, but how much do you think it is also needed for base CPU testing?
One more point, and it is more thinking out loud than complaining... we are talking about two kinds of users: (a) those who rely heavily on containers/Docker [probably some corporate users] and (b) casual users relying mostly on the PyPI/Conda registries... so I would say that we would like to serve both; for example, for Nemo I would include testing for both (a) and (b) 🐰