Hi there, I'm Yuyang Zhao 👋
Contact Me:
✉️ Email: [email protected]
🌐 Website: https://yuyangzhao.com
🎓 Google Scholar: https://scholar.google.com/citations?user=u5M6XPAAAAAJ
PyTorch implementation of M^3L (CVPR 2021).
Hi, nice job!
I am curious about how the baseline is implemented.

for dataset in datasets:
    for index in range(iters):
        loss = cls + tri + cent

Is the baseline written like this?
And for Table 4, line 2: did you unify the classification loss over the label space of all datasets, together with the triplet and center loss?
Looking forward to your reply.
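A minimal sketch of the loop the question describes, assuming a single unified label space and a summed objective. The names (`baseline_epoch`, the loss callables, `step`) are placeholders for illustration, not identifiers from the M^3L repo:

```python
# Toy baseline epoch matching the pseudocode above. `losses` holds
# stand-in callables for the classification, triplet and center losses;
# `step` is a placeholder for the optimizer step.
def baseline_epoch(datasets, iters, losses, step):
    total = 0.0
    for dataset in datasets:              # iterate over all source domains
        for index in range(iters):
            batch = dataset[index % len(dataset)]
            # unified objective: cls + tri + cent over one label space
            loss = sum(fn(batch) for fn in losses)
            step(loss)                    # backward + optimizer update
            total += loss
    return total
```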
I ran your project with three GPUs, and I did not change the code except for replacing MSMT17_V1 with MSMT17_V2, but I hit this error:

File "D:\papercode\M3Lmaster\reid\trainers.py", line 64, in train
    f_out, tri_features = self.model(inputs, MTE='', save_index=save_index)
ValueError: too many values to unpack (expected 2)

Is there something wrong with my dataset? Have you encountered this before?
Thank you for reading!
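For debugging, a small guard around the forward call can report what the model actually returned instead of failing with a bare unpacking error. This helper is a sketch of mine, not part of the repo, and the cause may be a version or config mismatch rather than the dataset:

```python
def unpack_model_outputs(outputs, expected=2):
    """Debugging helper (sketch only): surface how many values the model's
    forward returned, rather than dying on 'too many values to unpack'."""
    if not isinstance(outputs, tuple):
        outputs = (outputs,)
    if len(outputs) != expected:
        raise ValueError(
            f"model returned {len(outputs)} value(s), expected {expected}: "
            f"{[type(o).__name__ for o in outputs]}")
    return outputs
```

Wrapping the call as `f_out, tri_features = unpack_model_outputs(self.model(...))` turns the crash into a message listing the actual return types.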
Hi, why did you write the copyModel function instead of using copy.deepcopy?
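One plausible reason (my reading, not confirmed by the authors): copy.deepcopy clones the entire object graph, so the copy's parameters lose their identity links to the originals, whereas a hand-written copyModel can control which tensors are shared or re-linked so the meta-gradient path survives. A framework-free illustration of the severed link:

```python
import copy

class Node:
    """Stand-in for a parameter that remembers which node produced it
    (a crude analogue of an autograd graph edge)."""
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent

base = Node(1.0)
derived = Node(2.0, parent=base)

cloned = copy.deepcopy(derived)
# deepcopy duplicates the upstream node too: the clone no longer points
# at `base`, so anything tracked through `base` is severed in the copy
assert cloned.parent is not base
assert cloned.parent.value == base.value
```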
Hello, I am very interested in your work. I ran your code and have a few questions:

1. I see that your code implements some network layers itself, and they differ from the standard torch.nn layers in that the layer parameters can be plain tensors instead of Parameters, which enables meta-learning. But I found that training speed differs greatly with and without meta-learning; that is, training with meta-learning becomes very slow. Do the layer parameters have to be written as buffers? Have you tried directly taking the gradient of mteloss and then writing an optimizer that updates the model parameters directly from the mteloss gradient and grad_info (the mtrloss gradient)?

2. When I run your code, there is a feature fusion step in the meta-test phase, where the fused feature is sampled from a Normal distribution. I found that this raises the error "the parameter scale has invalid values", and the larger the learning rate, the more likely this error occurs. Have you met this?

3. After deleting the feature fusion code because of the error in 2, I found that the results are much higher than in your article; for example, MS+C+D → M reached mAP = 52.1%. At the same time, I found that without meta-learning it reaches mAP = 52.0%.

Thanks for reading!
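On question 1: the usual reason for tensor-instead-of-Parameter layers is to allow a functional inner update, where new parameter values are computed from the old ones rather than written in place, keeping the meta-gradient path alive. A framework-free sketch of that update, with plain floats standing in for tensors (not the repo's actual code):

```python
def inner_update(params, grads, lr):
    """MAML-style functional step: return NEW parameter values instead of
    mutating the originals. This is what the custom layers emulate so the
    meta-test loss can still differentiate through the inner step."""
    return [p - lr * g for p, g in zip(params, grads)]
```

Because the originals are untouched, the outer (meta) optimizer can still update them after the meta-test loss is computed on the new values.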
Thank you for making the code available.
I was trying to run the repo as-is; I only changed the batch size from 64 to 32 due to memory constraints.
I am running the code on 2 Nvidia 1080Ti GPUs, each with 12 GB of memory.
However, randomly after a few epochs I keep getting a ValueError:
ValueError: Expected parameter scale (Tensor of shape (2048,)) of distribution Normal(loc: torch.Size([2048]), scale: torch.Size([2048])) to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
tensor([1.2194e-04, 1.5050e-04, 2.8594e-03, ..., 3.8839e-05, 1.8705e-05,
1.1311e-05], device='cuda:0')
I get it randomly after about 10 epochs. Below is the full stack trace.
Kindly help me with this so I can run your code.
Epoch: [25][160/200] Time 2.152 (2.173) Total loss 6.960 (7.223) Loss 3.233(3.638) LossMeta 3.728(3.585)
Epoch: [25][165/200] Time 2.192 (2.173) Total loss 7.966 (7.204) Loss 4.791(3.644) LossMeta 3.174(3.560)
Traceback (most recent call last):
  File "main.py", line 286, in <module>
    main()
  File "main.py", line 108, in main
    main_worker(args)
  File "main.py", line 202, in main_worker
    print_freq=args.print_freq, train_iters=args.iters)
  File "/home/sarosij/M3L/reid/trainers.py", line 89, in train
    f_test, mte_tri = self.newMeta(testInputs, MTE=self.args.BNtype)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sarosij/M3L/reid/models/resMeta.py", line 180, in forward
    bn_x = self.feat_bn(x, MTE, save_index)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/sarosij/M3L/reid/models/MetaModules.py", line 362, in forward
    Distri1 = Normal(self.meta_mean1, self.meta_var1)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/distributions/normal.py", line 50, in __init__
    super(Normal, self).__init__(batch_shape, validate_args=validate_args)
  File "/home/sarosij/anaconda3/envs/reid/lib/python3.6/site-packages/torch/distributions/distribution.py", line 56, in __init__
    f"Expected parameter {param} "
ValueError: Expected parameter scale (Tensor of shape (2048,)) of distribution Normal(loc: torch.Size([2048]), scale: torch.Size([2048])) to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
tensor([1.2194e-04, 1.5050e-04, 2.8594e-03, ..., 3.8839e-05, 1.8705e-05,
        1.1311e-05], device='cuda:0')
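A common workaround for this family of errors (my suggestion, not an official fix): since the printed entries are all positive, the offending values are probably NaNs hidden in the elided part of the tensor, so flooring the variance estimate before constructing the Normal keeps `scale` inside the GreaterThan(0) constraint. A scalar sketch with a hypothetical epsilon:

```python
import math

EPS = 1e-12  # hypothetical floor; tune for your setup

def safe_std(var, eps=EPS):
    """Clamp a variance estimate to a positive floor before the sqrt, so
    Normal(loc, scale) never receives scale <= 0 or NaN. Sketch only; the
    repo's meta_var1 would need the tensor equivalent (clamp + nan_to_num)."""
    if not math.isfinite(var) or var < eps:
        var = eps
    return math.sqrt(var)
```

Lowering the learning rate (as the earlier comment observed) also reduces how often the variance estimate collapses or overflows in the first place.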