aiot-mlsys-lab / deepaa
[ICLR 2022] "Deep AutoAugment" by Yu Zheng, Zhi Zhang, Shen Yan, Mi Zhang
Hi, as specified in the README, I tried to evaluate the augmentation policies by running:

```shell
python -m DeepAA_evaluate.train -c confs/wresnet28x10_cifar10_DeepAA_1.yaml --dataroot ./data --tag Exp_DeepAA_cifar10
```
And got the following results (mean of five runs):
| Acc | CIFAR-10 | CIFAR-100 |
|---|---|---|
| WRN-40-2 | 96.37 | 79.30 |
| WRN-28-10 | 97.44 | 83.64 |
These are considerably lower than the results reported in the paper for WRN-28-10. I also evaluated WRN-40-2, the architecture on which the search is performed.
Am I doing something wrong?
After training, is there a specific policy produced for the dataset? Where is the result stored?
As far as I know, it uses tf.autodiff.ForwardAccumulator to calculate the gradient importance of a given augmentation policy, but how is this realized through tf.autodiff.ForwardAccumulator? I've read the introduction to tf.autodiff.ForwardAccumulator in its documentation, but I'm still puzzled about how the gradient importance (a scalar) is calculated from gradients (several vectors) and tangents (several vectors of the same shapes). Could you please provide more explanation of this process, or point me to other resources on JVPs, so I can understand it better? Thanks!
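As I understand it (this is only my own sketch of the math, not the authors' code): a forward-mode accumulator pushes a tangent vector through the computation and returns the Jacobian-vector product. For a scalar-valued loss, that JVP is the directional derivative, which collapses the per-tensor gradients and tangents into a single scalar by summing their inner products:

```python
import numpy as np

# Hypothetical per-tensor gradients of a scalar loss w.r.t. several weight
# tensors, plus tangent vectors of matching shapes (one per tensor).
grads    = [np.array([1.0, 2.0]), np.array([[0.5, -1.0], [0.0, 2.0]])]
tangents = [np.array([3.0, 0.0]), np.array([[2.0,  1.0], [1.0, 0.0]])]

# Forward-mode autodiff (the role tf.autodiff.ForwardAccumulator plays)
# yields the JVP: for a scalar loss this is the directional derivative,
# i.e. the sum over tensors of <grad_i, tangent_i> -- one scalar.
importance = sum(float(np.vdot(g, t)) for g, t in zip(grads, tangents))
print(importance)   # 3.0  (= 1*3 + 2*0, plus 0.5*2 - 1*1 + 0 + 0)
```

So the "several vectors in, scalar out" step is just this flattened dot product; the accumulator computes it without materializing full Jacobians.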
Running DeepAA_search on CIFAR-100 with the command

```shell
python3 DeepAA_search.py --dataset cifar100 --n_classes 100 --use_model WRN_40_2 --n_policies 6 --search_bno 1024 --pretrain_lr 0.1 --seed 1 --batch_size 128 --test_batch_size 512 --policy_lr 0.025 --l_mags 13 --use_pool --pretrain_size 5000 --nb_epochs 45 --EXP_G 16 --EXP_gT_factor=4 --train_same_labels 16 2>&1 | tee SomeFile.txt
```

causes an assertion failure:

```
Traceback (most recent call last):
  File "DeepAA_search.py", line 170, in <module>
    pretrain()
  File "DeepAA_search.py", line 144, in pretrain
    images, labels = get_pretrain_data()
  File "DeepAA_search.py", line 115, in get_pretrain_data
    use_post_aug=True, pool=pool)
  File "/opt/tiger/app/DeepAA/data_generator.py", line 184, in __call__
    self.check_data_type(images, labels)
  File "/opt/tiger/app/DeepAA/data_generator.py", line 179, in check_data_type
    assert type(labels[0]) == np.uint8
AssertionError
```
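The failing assertion requires the labels to arrive as np.uint8 scalars. One possible workaround (my assumption about the caller, not a verified fix) is to cast the labels before they reach the data generator:

```python
import numpy as np

# e.g. labels come back from a loader as plain Python ints or int64
labels = [3, 7, 1]
labels = np.asarray(labels, dtype=np.uint8)

# Now each element satisfies the check in data_generator.py:
#   assert type(labels[0]) == np.uint8
print(type(labels[0]))   # <class 'numpy.uint8'>
```

uint8 caps labels at 255, which is fine for CIFAR-100's 100 classes but would silently wrap for datasets with more classes.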
Running DeepAA_evaluate on CIFAR-10 with

```shell
python3 -m DeepAA_evaluate.train -c confs/wresnet28x10_cifar10_DeepAA_1.yaml --dataroot ./data --save ckpt/DeepAA_cifar10.pth --tag Exp_DeepAA_cifar10
```

incurred:

```
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/tiger/app/DeepAA/DeepAA_evaluate/train.py", line 482, in <module>
    assert world_size == C.get()['gpus'], f"Did not specify the number of GPUs in Config with which it was started: {world_size} vs {C.get()['gpus']}"
AssertionError: Did not specify the number of GPUs in Config with which it was started: 8 vs 1
```
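The assertion compares the launched world size (8 processes) against a `gpus` entry in the config (1). Assuming the YAML uses a top-level `gpus` key, which the assertion message suggests but I have not verified, setting it to the number of GPUs you actually launch with should clear the check:

```yaml
# confs/wresnet28x10_cifar10_DeepAA_1.yaml (excerpt; key name inferred
# from the assertion message -- adjust to match the real config)
gpus: 8
```

Alternatively, restrict visibility with `CUDA_VISIBLE_DEVICES=0` so that only one process starts and `world_size` matches the config's value of 1.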
First, I created the conda environment and ran the first command, DeepAA_search.py, on CIFAR-10. I didn't have any problems except some library conflicts, which I solved by finding a version of TensorFlow Probability compatible with the environment. To clarify, I'm working on AWS EC2.
When I run the second command, train.py, on CIFAR-10, I get an error: `FileNotFoundError: [Errno 2] No such file or directory: 'ckpt/DeepAA_cifar10.`
In the DeepAA_evaluate repository there is no ckpt directory. Should I create it myself? What should the file DeepAA_cifar10.pth contain?
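My reading of the error (a guess, not a confirmed diagnosis): the .pth file is the output target of `--save`, so it doesn't need to contain anything beforehand, but the parent directory must exist for the write to succeed. A minimal workaround:

```shell
# train.py writes the checkpoint to the --save path but does not create
# parent directories, so create the ckpt directory before launching.
mkdir -p ckpt
ls -d ckpt
```

After that, rerunning the train.py command with `--save ckpt/DeepAA_cifar10.pth` should be able to write the checkpoint.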
Hi,
In your paper, you indicate that the search on CIFAR-10/100 takes 9 GPU-hours. In my case, it takes almost 9 wall-clock hours using 8 V100 GPUs, i.e. around 70 GPU-hours. Is this normal, or am I doing something wrong?
Thanks!
Juliette
The README only covers how to run augmentation policy search on CIFAR-10/100 and ImageNet.
Now I want to use a new dataset which, unlike ImageNet, does not contain image class labels.
The file DeepAA_evaluate/lr_scheduler seems to be missing...
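While the missing file is restored, here is a stopgap sketch (entirely my own guess at what the module provided, since cosine annealing is the usual choice for WRN training, not the actual DeepAA code):

```python
import math

def cosine_lr(step, total_steps, base_lr=0.1, min_lr=0.0):
    """Cosine-annealed learning rate.

    Hypothetical stand-in for the missing DeepAA_evaluate/lr_scheduler:
    decays smoothly from base_lr at step 0 to min_lr at total_steps.
    """
    t = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * t))

print(cosine_lr(0, 100))     # 0.1  (starts at base_lr)
print(cosine_lr(100, 100))   # 0.0  (decays to min_lr)
```

If the real scheduler used step decay or warmup, the training curves would differ, so treat this only as a way to unblock a run.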