Comments (18)
Hi @ldd91,
- What is your partition ratio for training and validation? Considering the time cost, we recommend using a sampled subset of ImageNet, as mentioned in the paper.
- This means that we do not update the architecture parameters until epoch >= 35; this code controls the epoch at which those updates start (a sketch follows below). You can find similar strategies in Auto-DeepLab and P-DARTS.
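A minimal sketch of that warm-up gating, assuming a DARTS-style search loop; architect, train_queue, valid_queue, and args.warmup_epochs are placeholders rather than the repo's exact names, and the architect.step signature is abbreviated:

    for epoch in range(args.epochs):
        for step, (x, target) in enumerate(train_queue):
            if epoch >= args.warmup_epochs:  # e.g. 35 on ImageNet, per the comment above
                x_s, t_s = next(iter(valid_queue))
                architect.step(x_s, t_s)     # architecture params update only after warm-up
            optimizer.zero_grad()
            loss = criterion(model(x), target)  # network weights update from epoch 0
            loss.backward()
            optimizer.step()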
Hi @yuhuixu1993, I didn't set a partition ratio; I just used the original train and val data of ImageNet.
@ldd91, we can only use the training data. It needs to be partitioned into two parts: one part to train the supernet weights and the other for the architecture parameters, as described in the original DARTS and in follow-up works (ProxylessNAS, P-DARTS, ...).
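A minimal sketch of that partition, assuming a torchvision ImageFolder training set; the path, batch size, and the 50/50 proportion are illustrative:

    import numpy as np
    import torch
    import torchvision.datasets as dset
    import torchvision.transforms as T

    transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
    train_data = dset.ImageFolder('/path/to/imagenet/train', transform=transform)

    indices = np.random.permutation(len(train_data))
    split = len(train_data) // 2  # first half trains the supernet weights

    train_queue = torch.utils.data.DataLoader(
        train_data, batch_size=1024,
        sampler=torch.utils.data.sampler.SubsetRandomSampler(indices[:split]))
    valid_queue = torch.utils.data.DataLoader(  # second half drives the architecture updates
        train_data, batch_size=1024,
        sampler=torch.utils.data.sampler.SubsetRandomSampler(indices[split:]))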
@yuhuixu1993, thank you, I will change the code and have a try.
@ldd91, I still recommend that you use a subset.
@yuhuixu1993, thank you, I will try to use a subset.
Hi @yuhuixu1993, I use

    split = int(np.floor(portion * num_train))
    dataloader = torch.utils.data.DataLoader(
        train_data, batch_size=1024,
        sampler=torch.utils.data.sampler.SubsetRandomSampler(indices[:split]))

with portion set to 0.1 to use a sampled subset of ImageNet. In the log there are only 3 steps in each epoch; after the first epoch the train_acc is 3.37, and each epoch takes about 25 minutes.
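For reference, a quick sanity check of the expected step count, assuming the full ImageNet-1k training set of 1,281,167 images (the exact number depends on your copy of the data):

    num_train = 1281167                        # assumed ImageNet-1k training size
    portion, batch_size = 0.1, 1024
    split = int(portion * num_train)           # ~128k sampled images
    steps_per_epoch = -(-split // batch_size)  # ceiling division
    print(steps_per_epoch)                     # -> 126

Roughly 126 steps per epoch would be expected at batch size 1024, so seeing only 3 steps suggests num_train or indices covered far fewer images than intended.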
I'd like to know how many steps you have in each epoch.
Hi @yuhuixu1993, I split the train data into train_data and valid_data, and then sampled 0.1 of train_data and 0.025 of valid_data. Is that correct?
Please refer to our paper, thanks. The number of steps is not important, as it depends on the batch size you use. About the split proportion: yes, just follow the settings described in the paper. Note that I wrote the sampling code myself to make sure the data in each class is sampled evenly.
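A minimal sketch of such class-even sampling, reusing the ImageFolder train_data from the split sketch above; sample_evenly is a hypothetical helper, not the author's actual script, and a real split should keep the two index sets disjoint:

    import random
    from collections import defaultdict

    def sample_evenly(dataset, portion):
        # Group image indices by class label, then draw the same fraction per class.
        by_class = defaultdict(list)
        for idx, (_, label) in enumerate(dataset.samples):
            by_class[label].append(idx)
        chosen = []
        for idxs in by_class.values():
            k = max(1, int(portion * len(idxs)))
            chosen.extend(random.sample(idxs, k))
        return chosen

    weight_indices = sample_evenly(train_data, 0.10)  # e.g. 10% for supernet weights
    arch_indices = sample_evenly(train_data, 0.025)   # e.g. 2.5% for architecture params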
Thank you for your reply. I encountered another issue: when I use 1 V100 and set batch size = 128, one epoch finishes in 13 minutes, which is faster than the 8 V100 experiment (batch size = 1024 took 25 minutes per epoch).
Sorry, I have not tried one V100 on ImageNet. You may want to check carefully.
Hi @yuhuixu1993, I found that my last experiment could run with batch_size=1024 only because architect.step() was set to execute when epoch > 15; once epoch > 16 it went out of memory (8 V100s), and I could only set batch_size=256. I ran nvidia-smi and found that GPU0 was out of memory while the other seven GPUs used less memory than GPU0 (and the same amount as each other).
@xxsgcjwddsg had the same problem in this issue; I think he can help you.
The multi-GPU training failure may be because model.module.loss cannot run across multiple GPUs, so do not put the loss inside the network. You can delete the loss from the network and then calculate it from the network's output.
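A minimal sketch of that change under nn.DataParallel (which gathers every replica's outputs onto cuda:0, one reason GPU0's memory can run ahead of the others); network, train_queue, and optimizer are placeholders:

    import torch.nn as nn

    model = nn.DataParallel(network).cuda()  # network returns logits only, no internal loss
    criterion = nn.CrossEntropyLoss().cuda()

    for x, target in train_queue:
        x = x.cuda(non_blocking=True)
        target = target.cuda(non_blocking=True)
        optimizer.zero_grad()
        logits = model(x)                    # forward pass is replicated across GPUs
        loss = criterion(logits, target)     # loss computed outside, on the gathered outputs
        loss.backward()
        optimizer.step()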
@xxsgcjwddsg thank you. Do you mean deleting the loss from the network in model_search_imagenet.py?
@xxsgcjwddsg I can use multi-GPU, but the memory usage on GPU0 is different from the other GPUs.
Thanks a lot for this project and @yuhuixu1993. I implemented a distributed version with PyTorch 1.1.0 on CIFAR-10. Anyone interested can test and verify it here: https://github.com/bitluozhuang/Distributed-PC-Darts
Related Issues (20)
- Is a channel sampling mask fixed?
- Is there any plan to release the pretrained ImageNet model?
- Why modify the architecture after epoch 15?
- Data preparation of ImageNet
- How to change the channel proportion K?
- Cannot re-implement your claimed result
- GPU utilization is bad
- We cannot obtain your claimed result on ImageNet after trying many configurations
- Question about search on a custom dataset
- test.py throws an error when run
- Understanding the two sets of architecture hyperparameters
- How do you report the final accuracy in evaluation? Possibly touching the test set for the best acc?
- Learning rate schedule
- Hello, the results are inconsistent
- Searched genotype remains unchanged for a great number of epochs
- RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
- Hello, how can I get the code of the searched network architecture after the search finishes?
- About the license of this repository
- Hello, is PC-DARTS like DARTS with extra dropout?
- Not enough comments in the code