Comments (7)
Please give more details when you open an issue. What have you tried?
Anyway, for DDP, use the standard PyTorch multi-GPU command:
python -u -m torch.distributed.launch --nproc_per_node=2 train_single_label.py
Currently, the script does not save the model. There is nothing special about saving a DDP model: just call torch.save(...) (on rank==0 only, of course).
You are welcome to open a merge request that adds the saving feature.
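For anyone who wants to add saving, here is a minimal sketch of the rank-0-only `torch.save(...)` idea described above. It is not code from the repository: the helper names `is_main_process` and `save_checkpoint` and the file name `checkpoint.pth` are hypothetical, and unwrapping `.module` before saving is an assumption about what is convenient, not something the script requires.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def is_main_process() -> bool:
    """True when not running distributed, or when this process is rank 0."""
    if dist.is_available() and dist.is_initialized():
        return dist.get_rank() == 0
    return True


def save_checkpoint(model: nn.Module, path: str) -> bool:
    """Save weights from rank 0 only; returns True if a file was written."""
    if not is_main_process():
        return False
    # DDP keeps the wrapped network in `.module`; saving that state_dict
    # avoids a "module." prefix on every key (an illustrative choice).
    to_save = model.module if isinstance(model, DDP) else model
    torch.save(to_save.state_dict(), path)
    return True


# Single-process demo with a stand-in model (hypothetical, not the real network).
model = nn.Linear(4, 2)
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pth")
saved = save_checkpoint(model, ckpt_path)
```

Under DDP the same call simply returns False on every rank other than 0, so only one process writes the file.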
from imagenet21k.
When I use the above command:
`
root@1ebda974bc7a:/home/ImageNet21K# python -u -m torch.distributed.launch --nproc_per_node=2 train_single_label.py --batch_size=4 --data_path=/mnt/dataset --model_name=mobilenetv3_large_100 --model_path=/home/ImageNet21K/mobilenetv3_large_100_miil_21k.pth --epochs=10
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
creating model mobilenetv3_large_100...
creating model mobilenetv3_large_100...
done
done
`
The program seems to block and does not continue.
Is this normal?
Using "python -u -m torch.distributed.launch --nproc_per_node=2 train_single_label.py", I encounter the same problem.
@jaffe-fly thanks for the feedback.
I changed the order: 'torch.distributed.init_process_group' should be called before model initialization.
Please update and let me know whether this fixes it.
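A minimal sketch of the ordering described above: initialize the process group first, then build the model and wrap it. This is not the repository's code. To stay runnable on one CPU process it assumes the `gloo` backend, a hard-coded `rank=0` / `world_size=1`, a temporary file for rendezvous, and a stand-in `nn.Linear` model; a real multi-GPU run would take the rank and world size from the launcher instead.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Step 1: initialize the process group BEFORE creating the model.
# Assumption: single CPU process, so rank and world size are fixed here.
init_file = os.path.join(tempfile.mkdtemp(), "ddp_init")
dist.init_process_group(
    backend="gloo",
    init_method=f"file://{init_file}",
    rank=0,
    world_size=1,
)

# Step 2: only now build the model (stand-in network) and wrap it in DDP.
model = nn.Linear(8, 2)
ddp_model = DDP(model)

# Step 3: a forward pass through the wrapped model works as usual.
out = ddp_model(torch.randn(4, 8))
out_shape = tuple(out.shape)

dist.destroy_process_group()
```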
@mrT23 Now it works well, thanks a lot!
@mrT23 It's OK now, thank you.
But there is one warning:
UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
Thanks @jaffe-fly and @Stephen-Hao.
The lr_scheduler warning does not hurt the training results.
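For reference, the call order that the PyTorch warning asks for looks roughly like this. It is a generic sketch, not the training loop from `train_single_label.py`: the tiny model, the `SGD` optimizer, and the `StepLR` schedule are all stand-ins chosen for illustration.

```python
import torch
import torch.nn as nn

# Stand-in model, optimizer and schedule (assumptions, not the repo's settings).
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

lrs_used = []
for _ in range(3):
    # Record the learning rate that this iteration actually trains with.
    lrs_used.append(optimizer.param_groups[0]["lr"])

    loss = model(torch.randn(8, 4)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()

    optimizer.step()   # update the weights first
    scheduler.step()   # then advance the learning-rate schedule
```

With this order the first iteration trains with the initial learning rate, which is the value the warning says would otherwise be skipped.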