"To reduce search time, we randomly sample two subsets from the 1.3M training set of I

ImageNet search question about pc-darts HOT 13 CLOSED

yuhuixu1993 commented on July 18, 2024

ImageNet search question

from pc-darts.

Comments (13)

yuhuixu1993 commented on July 18, 2024

@xxsgcjwddsg, the subset still has 1000 classes. Actually, we sampled 10% and 2.5% of the images from each class for training and validation, respectively. Yes, the hyper-parameters are the architecture weights.
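
A minimal sketch of the per-class sampling described in this reply, using only the Python standard library. The function name `sample_per_class`, the fixed seed, and the choice to keep the two subsets disjoint are assumptions for illustration; the thread does not say how the authors implemented the split.

```python
import random
from collections import defaultdict


def sample_per_class(labels, train_frac=0.10, valid_frac=0.025, seed=0):
    """Pick train/valid indices by sampling a fixed fraction of every class."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * train_frac)
        n_valid = round(len(indices) * valid_frac)
        # Assumption: the validation slice does not overlap the training slice.
        train_idx.extend(indices[:n_train])
        valid_idx.extend(indices[n_train:n_train + n_valid])
    return train_idx, valid_idx


# Toy data: 5 classes with 1000 samples each (ImageNet would have 1000 classes).
labels = [c for c in range(5) for _ in range(1000)]
train_idx, valid_idx = sample_per_class(labels)
```

Because the sampling is done class by class, every class stays represented in both subsets, which matches the statement that the subset still has 1000 classes.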

pawopawo commented on July 18, 2024

Thanks.
How do you search with multiple GPUs? I added model = nn.DataParallel(model) before model = model.cuda() and model = model.module after it, but the search still runs on one GPU.

yuhuixu1993 commented on July 18, 2024

@xxsgcjwddsg, you need to comment out this line: torch.cuda.set_device(args.gpu)

pawopawo commented on July 18, 2024

Thanks !
I have commented out torch.cuda.set_device(args.gpu), but it still doesn't work.

yuhuixu1993 commented on July 18, 2024

Can you show the errors? I think the error may still come from model.module. E.g. in the SGD optimizer, it should be model.parameters(), not model.module.parameters(). Besides, did you change model.module in the train function and the validation function?

pawopawo commented on July 18, 2024

Thanks a lot.
I shouldn't have added "model = model.module" after "model = nn.DataParallel(model).cuda()".
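
A small sketch of the fix reached in this exchange: wrap the model once and keep using the wrapper. The tiny `nn.Sequential` model and the optimizer settings are hypothetical stand-ins, not the PC-DARTS search network or its hyper-parameters, and the CUDA guard is only there so the snippet also runs on a CPU-only machine.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the search network; not the PC-DARTS model.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Wrap once and keep the wrapper. Reassigning model = model.module afterwards
# would discard the DataParallel wrapper and fall back to a single device.
parallel_model = nn.DataParallel(model)
if torch.cuda.is_available():
    parallel_model = parallel_model.cuda()

# Build the optimizer from the wrapper's parameters, as suggested in the thread.
optimizer = torch.optim.SGD(parallel_model.parameters(), lr=0.1, momentum=0.9)

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(16, 8, device=device)

# Calling the wrapper (not .module) is what splits the batch across GPUs.
logits = parallel_model(x)
```

The underlying network is still reachable as `parallel_model.module` when it is needed, for example to read attributes, without replacing the wrapper itself.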

pawopawo commented on July 18, 2024

Hi, when running architecture.step(), I got an out-of-memory (OOM) error.
Thanks for your reply!

yuhuixu1993 commented on July 18, 2024

@xxsgcjwddsg, just to be sure: are you using 8 V100 GPUs? Besides, I notice that your parameter size is nearly twice mine. How many layers are stacked during the search? We use 8 in our experiments, and the initial number of channels is 16.

pawopawo commented on July 18, 2024

Thanks for your reply.
I use 8 V100 GPUs, but the first one takes up most of the memory. As the picture shows, the batch size is 256.
I think the cause is the call to self.model.module._loss(input, target), which runs on only one GPU.
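
To illustrate the point about `.module`, here is a hedged sketch contrasting the two call paths. `TinyNet` is a hypothetical stand-in, and the shape of its `_loss` method (forward pass followed by a criterion) is an assumption about how such a helper is typically written, not a copy of the repository's code.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the search network."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 3)
        self._criterion = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.fc(x)

    def _loss(self, x, target):
        # self(x) calls the bare module, so the batch is not scattered.
        return self._criterion(self(x), target)


device = "cuda" if torch.cuda.is_available() else "cpu"
wrapped = nn.DataParallel(TinyNet()).to(device)
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 8, device=device)
target = torch.randint(0, 3, (32,), device=device)

# Path 1: goes through .module, bypassing DataParallel (single device).
single_device_loss = wrapped.module._loss(x, target)

# Path 2: forward through the wrapper, then apply the criterion to the
# gathered logits, so the forward pass can be split across GPUs.
parallel_loss = criterion(wrapped(x), target)
```

Both paths compute the same loss value; they differ only in where the forward pass runs. Note that even on the second path, outputs are gathered on the first device by default, so that GPU would still be expected to use somewhat more memory than the others.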

yuhuixu1993 commented on July 18, 2024

@xxsgcjwddsg, hi, maybe you need to remove the .to(..... .device) calls, e.g. to(xtemp.device), in model_search_imagenet.py. I ran this code at my company, so there may be differences between devices. If it works, please tell me and I will update the code.

pawopawo commented on July 18, 2024

Thanks for your reply. It doesn't work.
I have another question. The paper randomly samples two subsets from the 1.3M training set of ImageNet, with 10% and 2.5% of the images, respectively. Is the batch_size of valid_queue also 1024? If it is 1024, is the frequency of architect.step() then 1/4 that of optimizer.step()?

yuhuixu1993 commented on July 18, 2024

@xxsgcjwddsg, hi, I have no idea right now; I can run the code with 8 V100s (16 GB each). Maybe you can add me on WeChat so we can talk through more details. Yes, and I also changed the validation batch size to balance the frequency and found no difference. Which version of PyTorch are you using, by the way?

HeathHose commented on July 18, 2024

@yuhuixu1993, hi. I notice that the validation batch size is the same as the training batch size. As I understand it, does this mean that the validation dataset will be used four times?
Or, alternatively, does the architect step once while the optimizer steps four times?
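
For a rough sense of the ratio being asked about, here is a back-of-the-envelope calculation. It assumes both queues use a batch size of 1024 and that each queue is traversed once per epoch; the thread does not confirm how the search script actually draws validation batches, and the variable names are illustrative only.

```python
import math

total_images = 1_300_000      # "1.3M training set" as stated in the thread
batch_size = 1024             # assumed to be the same for both queues

train_images = total_images * 10 // 100    # 10% subset  -> 130000
valid_images = total_images * 25 // 1000   # 2.5% subset -> 32500

train_batches = math.ceil(train_images / batch_size)
valid_batches = math.ceil(valid_images / batch_size)

# How many training batches there are per validation batch.
ratio = train_batches / valid_batches

print(train_batches, valid_batches, round(ratio, 2))  # 127 32 3.97
```

Under these assumptions there are roughly four training batches for every validation batch, so either the validation subset is cycled about four times per pass over the training subset, or the architecture update runs about once per four weight updates.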
