Comments (1)
Thanks!
The companion project at github.com/nvlabs/tensorcom provides distributed preprocessing for just this reason.
Besides distributed data augmentation, it has a broadcast mode for training multiple clients from a single data server, for example, for hyperparameter optimization.