corenel / pytorch-adda
A PyTorch implementation for Adversarial Discriminative Domain Adaptation
License: MIT License
I am wondering whether anybody has tried to visualize the features obtained from the source and target encoders, to see if they really look similar. Once adversarial training is complete and the discriminator and target encoder have reached a Nash equilibrium, the two encoders should generate similar features, shouldn't they? Has anybody tried visualizing them?
Thanks for the code.
I changed the dataset, but the adaptation process didn't converge (does acc=0.5 represent convergence?), and I only get 14% accuracy after domain adaptation.
Epoch [457/600] Step [100/200]:d_loss=0.32023 g_loss=3.27941 acc=0.89000
Which parameters do I need to change?
Should I normalize the source and target datasets?
Thank you!
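On the acc=0.5 question above: assuming the discriminator sees balanced source/target batches, chance level is an accuracy of 0.5, and a discriminator that outputs probability 0.5 for every sample has a cross-entropy of ln 2 (about 0.693). The sketch below parses a log line in the format shown above and compares it with those reference values. The helper name and the regular expression are illustrative assumptions, not part of the repository.

```python
import math
import re

# Hypothetical pattern for log lines shaped like the one quoted above.
LOG_PATTERN = re.compile(
    r"d_loss=(?P<d_loss>[\d.]+)\s+g_loss=(?P<g_loss>[\d.]+)\s+acc=(?P<acc>[\d.]+)"
)


def parse_log_line(line):
    """Extract d_loss, g_loss and acc from one adaptation log line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        raise ValueError("unrecognized log line: " + line)
    return {key: float(value) for key, value in match.groupdict().items()}


# Reference values for a discriminator that cannot tell the domains apart
# (assumes two balanced classes).
CHANCE_ACC = 0.5
CHANCE_LOSS = -math.log(0.5)  # ln 2

stats = parse_log_line(
    "Epoch [457/600] Step [100/200]:d_loss=0.32023 g_loss=3.27941 acc=0.89000"
)
print("accuracy above chance by:", stats["acc"] - CHANCE_ACC)
print("g_loss relative to ln 2:", stats["g_loss"] / CHANCE_LOSS)
```

For the quoted line, both the accuracy and g_loss sit well above the chance-level references, which suggests the discriminator could still separate the two domains at that step.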
In adapt.py, line 81:
Why is optimizer_critic.zero_grad() needed?
# zero gradients for optimizer
optimizer_critic.zero_grad()
optimizer_tgt.zero_grad()
You have done great work! But I have some questions about the loss function.
I can't find where the adversarial loss is in the code.
You used nn.LogSoftmax() in adapt.py, and then used nn.CrossEntropyLoss() in main.py to train the discriminator. As we know, nn.CrossEntropyLoss() combines nn.LogSoftmax() and nn.NLLLoss() in one single class, so isn't this redundant?
My other question is whether nn.CrossEntropyLoss() together with nn.LogSoftmax() is equivalent to the adversarial loss.
Thank you very much!
Looking forward to your reply!
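On the redundancy question above: log-softmax is idempotent, because its output already has a log-sum-exp of zero, so applying it a second time returns the same values. The plain-Python sketch below mimics the per-sample computation (log-softmax followed by negative log-likelihood) to show that the loss value is the same with or without the extra layer. The function names are illustrative, and this is a numerical check under that assumption rather than a statement about what the repository intends.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum_exp for x in logits]


def cross_entropy(logits, target):
    """Per-sample cross-entropy: log-softmax followed by negative log-likelihood."""
    return -log_softmax(logits)[target]


raw_logits = [2.0, -1.0]          # hypothetical discriminator output for one sample
once = log_softmax(raw_logits)    # what an extra LogSoftmax layer would emit
twice = log_softmax(once)         # what cross-entropy then computes internally

loss_on_raw = cross_entropy(raw_logits, 1)
loss_on_log_probs = cross_entropy(once, 1)

print(once, twice)
print(loss_on_raw, loss_on_log_probs)
```

So the extra layer looks redundant but should not change the loss or its gradients with respect to the raw logits.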
Hello, recently I've been working with your PyTorch implementation of ADDA so first of all thanks for your code!
For now I am only interested in testing the network Ms (trained on the source domain) on the target domain (src_only baseline):
Surprisingly, on your GitHub page you report ~84% accuracy for this src_only baseline, which is around 9% above the accuracy reported in the original paper (75.2%, https://arxiv.org/abs/1702.05464). How can you explain such a difference?
I have tried limiting the number of samples in the source domain (MNIST) to 2000 (as in the original paper), and yet I observed ~87% accuracy (the last modifications I made from your master branch correspond to that experiment: https://github.com/emmanuelrouxfr/pytorch-adda)
To be as close as possible to the paper's setup, I have also tried setting the batch size to the original 128 (adjusting the number of epochs to 625 to fit the mentioned 10,000 learning iterations) and using the original learning rates and parameters:
d_learning_rate = 2e-4
c_learning_rate = 2e-4
beta1 = 0.5
beta2 = 0.999
but I can't reproduce the results presented in the original paper (~75% accuracy for src_only tested on USPS). The accuracy is always much higher than it is supposed to be.
I hope you can help me identify a possible reason for this phenomenon, thanks!
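The epoch count mentioned above can be checked with a little arithmetic. This sketch assumes 2000 source samples, a batch size of 128, and that the last partial batch of each epoch is kept. The helper name and the drop_last flag are illustrative assumptions.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of optimizer steps in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


num_samples = 2000         # source samples, as stated in the post
batch_size = 128
target_iterations = 10000  # iterations mentioned in the post

steps = steps_per_epoch(num_samples, batch_size)  # 16 steps per epoch
epochs_needed = target_iterations / steps         # 625.0 epochs

# If the partial batch were dropped instead, each epoch would have 15 steps.
steps_if_dropped = steps_per_epoch(num_samples, batch_size, drop_last=True)

print(steps, epochs_needed, steps_if_dropped)
```

With the partial batch kept, 16 steps per epoch times 625 epochs gives exactly 10,000 iterations, which matches the numbers in the post.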
The URL "https://raw.githubusercontent.com/mingyuliutw/CoGAN/master/cogan_pytorch/data/uspssample/usps_28x28.pkl" for the USPS dataset no longer works. Where can I find the current address? Thanks!
I get urllib.error.HTTPError: HTTP Error 404: Not Found
when trying to load the USPS .pkl file.
How can we run the experiment for USPS to MNIST?
The ADDA results are indeed impressive. But I am wondering how it compares to:
I tried 1) and 2) on a document classification (NLP) task and found that both worked very well, improving target classification results from 0.74 to 0.87. So how does ADDA compare to 1) and 2)?
1. I used an RNN as the encoder, saved the model trained on the source domain, and tested it on the USPS dataset. The accuracy is only 10%, which is worse than with the LeNet network. Why is this?
2. Building on GAN, does the ADDA algorithm replace the noise input with the target-domain dataset?
3. After switching to an RNN as the encoder, there is no obvious change in g_loss during adversarial training.
=== Evaluating classifier for encoded target domain ===
only source <<<
Avg Loss = 14.961788177490234, Avg Accuracy = 56.140000%
source and target <<<
Avg Loss = 8366.6220703125, Avg Accuracy = 11.350000%
I got 11% accuracy after domain adaptation.
RuntimeError: The expanded size of the tensor (1) must match the existing size (3) at non-singleton dimension 0
I don't understand line 25 in test.py: labels = make_variable(labels).squeeze_()
Why do the labels need squeeze_() there, while in 'pretrain.py' the labels don't need squeeze()? Is it because of the USPS dataset?
Can somebody explain why the ADDA codes (this one as well as the TensorFlow one) use two output feature maps from the discriminator instead of one? I am also wondering why, here in adapt.py, we concatenate the source and target features and then pass the concatenated features to the discriminator for prediction.
Why not use one output and do one prediction at a time, as is done in most GAN examples (say, here: https://github.com/pytorch/examples/blob/master/dcgan/main.py)?
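If the "two outputs" above are read as two class logits trained with softmax cross-entropy, that setup is mathematically equivalent to a single logit with a sigmoid, where the single logit is the difference of the two. Likewise, with equal-sized batches, the mean loss over a concatenated batch equals the average of the two per-domain mean losses. The plain-Python sketch below checks both identities numerically. All logit values and the label convention (1 for source, 0 for target) are made-up assumptions for illustration.

```python
import math


def softmax_ce(logits, target):
    """Cross-entropy of a two-logit softmax for one sample."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]


def sigmoid_bce(logit, target):
    """Binary cross-entropy of a single-logit sigmoid for one sample."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -math.log(prob) if target == 1 else -math.log(1.0 - prob)


def mean(values):
    return sum(values) / len(values)


# Hypothetical discriminator outputs: [logit for class 0, logit for class 1].
src_logits = [[-0.3, 1.2], [0.1, 0.4]]   # assumed label 1 (source)
tgt_logits = [[0.8, -0.5], [0.6, 0.0]]   # assumed label 0 (target)

# Identity 1: two-logit softmax matches a sigmoid on the logit difference.
two_logit = softmax_ce(src_logits[0], 1)
one_logit = sigmoid_bce(src_logits[0][1] - src_logits[0][0], 1)

# Identity 2: one concatenated pass matches two separate passes (equal batch sizes).
src_losses = [softmax_ce(x, 1) for x in src_logits]
tgt_losses = [softmax_ce(x, 0) for x in tgt_logits]
loss_concat = mean(src_losses + tgt_losses)
loss_separate = 0.5 * (mean(src_losses) + mean(tgt_losses))

print(two_logit, one_logit)
print(loss_concat, loss_separate)
```

Under these assumptions the concatenation looks like a convenience (one forward and backward pass instead of two) rather than a different objective.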
Thanks for sharing the code. I have a question about line 25 in adapt.py (criterion = nn.CrossEntropyLoss()). Since nn.LogSoftmax() is added in discriminator.py (line 21), is it correct to use CrossEntropyLoss as the criterion? Or am I missing something? Thanks in advance.
I changed the dataset (source data count: 20000, target data count: 2100).
Result:
source only:
mydata set: Average loss: 2.1571, Accuracy: 1311/2100 (62.00%)
domain adaptation:
mydata set: Average loss: 4.5971, Accuracy: 327/2100 (15.00%)
Because my GPU has little memory, I set the batch size to 16. Is the batch size the problem?
Thank you for your help!
In ADDA, a classifier loss and an adversarial loss are used. In which files are these two losses used?
I used the code with pytorch=1.0.1 and torchvision=0.2.0, and I get 95.1% using only the source dataset and 95.8% after adaptation. It is confusing!
Thank you for your code!
I ran the code following the instructions, but only got 13% accuracy on the target domain.
Is there something wrong?
I used your code but did not get 95%+ accuracy, only around 75%. Is there any trick that would boost the performance?
Downgrading to (py)torch==0.3.1 (which required re-processing the data) fixed the issue. This issue is mainly to help other people who run into the same problem.
Here are my results with torch 0.3.1 and torchvision 0.2.0:
>>> source only <<<
Avg Loss = 0.309243381023407, Avg Accuracy = 91.182796%
>>> domain adaption <<<
Avg Loss = 0.15142789483070374, Avg Accuracy = 95.913978%
E.g., the source domain has more than 8000 samples while the target domain has about 200 samples.
Accuracy decreases after adaptation; reducing the learning rate and the number of adaptation epochs seems to be ineffective.
When I run 'main.py', I get the following error.
`Traceback (most recent call last):
  File "Domain_Adaption/pytorch-adda/main.py", line 41, in <module>
    src_encoder, src_classifier, src_data_loader)
  File "Domain_Adaption/pytorch-adda/core/pretrain.py", line 32, in train_src
    for step, (images, labels) in enumerate(data_loader):
  File "/envs/*/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/envs/*/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/envs/*/lib/python3.6/site-packages/torchvision/datasets/mnist.py", line 95, in __getitem__
    img = self.transform(img)
  File "/envs/*/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 60, in __call__
    img = t(img)
  File "/envs/*/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 163, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/envs/*/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 208, in normalize
    tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: output with shape [1, 28, 28] doesn't match the broadcast shape [3, 28, 28]
`
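The shapes in the error above can be reproduced with the usual broadcasting rules: a one-channel image of shape [1, 28, 28] combined with a three-element mean reshaped to [3, 1, 1] broadcasts to [3, 28, 28], and an in-place operation cannot write that result back into the one-channel tensor. The plain-Python sketch below only models the shape logic. Both helper names are illustrative, and the suggested fix in the note that follows is an assumption about the cause, not something confirmed in this thread.

```python
def broadcast_shape(a, b):
    """Broadcast two shapes using NumPy/PyTorch-style rules, or raise ValueError."""
    length = max(len(a), len(b))
    # Pad the shorter shape with leading ones so both have the same rank.
    a = (1,) * (length - len(a)) + tuple(a)
    b = (1,) * (length - len(b)) + tuple(b)
    result = []
    for x, y in zip(a, b):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError("shapes %r and %r cannot be broadcast" % (a, b))
    return tuple(result)


def can_update_in_place(output_shape, other_shape):
    """An in-place op needs the broadcast result to fit the output tensor exactly."""
    return broadcast_shape(output_shape, other_shape) == tuple(output_shape)


image = (1, 28, 28)           # one-channel MNIST-style image
three_channel_mean = (3, 1, 1)  # mean[:, None, None] for a 3-element mean
one_channel_mean = (1, 1, 1)    # mean[:, None, None] for a 1-element mean

print(broadcast_shape(image, three_channel_mean))      # (3, 28, 28)
print(can_update_in_place(image, three_channel_mean))  # False
print(can_update_in_place(image, one_channel_mean))    # True
```

If the transform really is given three-element statistics for one-channel images, passing one-element tuples to transforms.Normalize (for example a hypothetical mean of (0.5,) and std of (0.5,)), or converting the images to three channels first, would make the shapes agree.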