pytorch-adda's Issues

Anybody visualized feature adaptation?

I am wondering whether anybody has tried to visualize the features produced by the source and target encoders, to see whether they actually look similar. Once adversarial training is complete and the discriminator and target encoder have reached a Nash equilibrium, the two encoders should generate similar features. Has anybody tried visualizing them?
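
One way to check this is to project both feature sets into two dimensions and look at how much they overlap. The following is a minimal sketch, assuming the features are already available as 2-D tensors. `project_2d` is a hypothetical helper, PCA is used here instead of t-SNE only to keep the example free of extra dependencies, and the random tensors (with an arbitrary feature size of 500) stand in for real encoder outputs.

```python
import torch

def project_2d(feat_src, feat_tgt):
    """Project source and target features onto their two shared principal axes."""
    feats = torch.cat([feat_src, feat_tgt], dim=0)
    centered = feats - feats.mean(dim=0, keepdim=True)
    # Rows of vh are principal directions, ordered by explained variance.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    coords = centered @ vh[:2].T
    n_src = feat_src.size(0)
    return coords[:n_src], coords[n_src:]

# Placeholder features; in practice these would be src_encoder(images_src)
# and tgt_encoder(images_tgt), evaluated under torch.no_grad().
torch.manual_seed(0)
src_2d, tgt_2d = project_2d(torch.randn(64, 500), torch.randn(64, 500) + 0.5)
print(src_2d.shape, tgt_2d.shape)
```

Plotting `src_2d` and `tgt_2d` as two scatter series in different colours gives a quick visual check: if adaptation worked, the two clouds should largely overlap.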

adaptation question, discriminator acc=0.5 representing best adaptation?

Thanks for the code.
I changed the dataset, but the adaptation process didn't converge (does acc=0.5 represent convergence?), and I only get 14% accuracy after domain adaptation.

Epoch [457/600] Step [100/200]:d_loss=0.32023 g_loss=3.27941 acc=0.89000

Which parameters do I need to change?
Should I normalize the source and target datasets?
Thank you!
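
As a reference point for reading such logs: a critic that cannot separate the two domains assigns probability 0.5 to each, which gives an accuracy near 0.5 and a cross-entropy loss of ln 2 ≈ 0.693. A small sketch of that arithmetic, assuming a two-class critic trained with cross-entropy (the function name is hypothetical):

```python
import math

def binary_cross_entropy(p_correct):
    """Cross-entropy for one sample when the critic assigns p_correct to the true domain."""
    return -math.log(p_correct)

chance_loss = binary_cross_entropy(0.5)
print(f"d_loss at chance level: {chance_loss:.5f}")

# The log line above reports d_loss=0.32023. exp(-mean loss) is the geometric
# mean of the probabilities the critic assigns to the correct domain.
logged_d_loss = 0.32023
print(f"geometric-mean p(correct): {math.exp(-logged_d_loss):.3f}")
```

With d_loss well below 0.693 and acc=0.89, the critic in that log is still telling the domains apart, so the target encoder has not yet fooled it.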

Questions about zero_grad()

In adapt.py, line 81:

Why is optimizer_critic.zero_grad() needed?

# zero gradients for optimizer
optimizer_critic.zero_grad()
optimizer_tgt.zero_grad()
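
For context, PyTorch accumulates gradients into `.grad` across `backward()` calls, and a target-encoder loss that is backpropagated through the critic also fills the critic's gradients. A minimal sketch of that effect, using stand-in `nn.Linear` modules rather than the repository's actual networks:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
tgt_encoder = nn.Linear(4, 3)   # stand-in for the target encoder
critic = nn.Linear(3, 2)        # stand-in for the discriminator
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 4)
flipped_labels = torch.ones(8, dtype=torch.long)

# Target-encoder step: the loss flows back through the critic as well.
loss_tgt = criterion(critic(tgt_encoder(images)), flipped_labels)
loss_tgt.backward()
grad_after_one = critic.weight.grad.clone()

# Without zero_grad(), a second backward pass adds to the stored gradient.
loss_tgt = criterion(critic(tgt_encoder(images)), flipped_labels)
loss_tgt.backward()
grad_after_two = critic.weight.grad.clone()

print(torch.allclose(grad_after_two, 2 * grad_after_one))
```

Clearing the critic's gradients keeps such leftovers out of its next update. If the loop already calls `optimizer_critic.zero_grad()` at the start of the critic step, the extra call is harmless and may simply be redundant.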

Question about the loss

You have done great work! But I have some questions about the loss function.
Where is the adversarial loss in the code?
You used nn.LogSoftmax() in adapt.py, and then used nn.CrossEntropyLoss() in main.py to train the discriminator. Since nn.CrossEntropyLoss() already combines nn.LogSoftmax() and nn.NLLLoss() in a single class, isn't the log-softmax applied twice?
My other question: is the combination of nn.CrossEntropyLoss() and nn.LogSoftmax() equivalent to the adversarial loss?
Thank you very much!
Looking forward to your reply!
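
On the "repeated" question: applying log-softmax twice is redundant but does not change the result, because log-softmax outputs already have a log-sum-exp of zero, so a second application leaves them unchanged. A small pure-Python sketch (the helper names are illustrative and only mimic what `nn.LogSoftmax` and `nn.CrossEntropyLoss` compute for a single sample):

```python
import math

def log_softmax(xs):
    """Numerically stable log-softmax over a list of floats."""
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]

def cross_entropy(logits, target):
    """Per-sample cross-entropy: negative log-softmax at the target index."""
    return -log_softmax(logits)[target]

logits = [1.5, -0.3]
once = cross_entropy(logits, target=1)
# Feeding LogSoftmax output into the loss applies log-softmax a second time.
twice = cross_entropy(log_softmax(logits), target=1)
print(abs(once - twice) < 1e-12)
```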

Thanks ! + testing the "src_only" baseline ...

Hello, recently I've been working with your PyTorch implementation of ADDA so first of all thanks for your code!

For now I am only interested in testing the network Ms (trained on the source domain) on the target domain (src_only baseline):

Surprisingly, your GitHub page reports ~84% accuracy for this src_only baseline, which is around 9 percentage points above the accuracy reported in the original paper (75.2%, https://arxiv.org/abs/1702.05464). How can you explain such a difference?

I have tried limiting the number of samples in the source domain (MNIST) to 2000 (as in the original paper), and yet I observed ~87% accuracy (the latest modifications to my fork of your master branch correspond to that experiment: https://github.com/emmanuelrouxfr/pytorch-adda).

To stay as close as possible to the paper's setup, I have also tried setting the batch size to 128 (adjusting the number of epochs to 625 to match the 10,000 training iterations mentioned) and using the original learning rates and parameters:
d_learning_rate = 2e-4
c_learning_rate = 2e-4
beta1 = 0.5
beta2 = 0.999

But I can't reproduce the results presented in the original paper (~75% accuracy of src_only tested on USPS). My accuracy is always much higher than it is supposed to be.

I hope you can help me identify a possible reason for this phenomenon, thanks!
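
The epoch count quoted above follows from the batch arithmetic. A sketch, assuming the 2000-sample MNIST subset and that the final partial batch of each epoch is kept rather than dropped (the function name is hypothetical):

```python
import math

def epochs_for_iterations(total_iterations, num_samples, batch_size):
    """Number of epochs needed to reach total_iterations optimizer steps."""
    steps_per_epoch = math.ceil(num_samples / batch_size)  # last partial batch is kept
    return math.ceil(total_iterations / steps_per_epoch)

# 2000 samples / 128 per batch -> 16 steps per epoch; 10000 / 16 = 625 epochs.
print(epochs_for_iterations(10_000, 2_000, 128))
```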

Can't load usps_28x28.pkl

It says urllib.error.HTTPError: HTTP Error 404: Not Found when trying to load the usps .pkl file.

Pretrain+Finetune results?

The ADDA results are indeed impressive. But I am wondering how it compares to:

  1. training on MNIST, then fine-tuning on the small USPS dataset
  2. mixing MNIST and the small USPS dataset, and training on the mixed dataset.

I tried 1) and 2) on a document classification (NLP) task and found that both worked very well, improving target classification results from 0.74 to 0.87. So how does ADDA compare to 1) and 2)?

RNN network invalid

1. I use an RNN as the encoder, save the model trained on the source domain, and test it on the USPS dataset. The accuracy is only 10%, which is worse than the LeNet network. Why is this?
2. Does the ADDA algorithm build on GAN by replacing the noise input with the target-domain dataset?
3. After using an RNN as the encoder, there is no obvious change in g_loss during adversarial training.

ERROR!!!

RuntimeError: The expanded size of the tensor (1) must match the existing size (3) at non-singleton dimension 0

something about eval

In test.py, line 25: labels = make_variable(labels).squeeze_(). I don't understand why the labels need squeeze_() here, while in pretrain.py the labels don't need squeeze(). Is it because of the USPS dataset?
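
A likely explanation (an assumption, not verified against the repository's data loaders) is that one loader yields labels of shape `(N, 1)` while the other yields `(N,)`, and `nn.CrossEntropyLoss` expects class indices of shape `(N,)`. A minimal sketch with made-up labels:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
torch.manual_seed(0)
logits = torch.randn(4, 10)                      # batch of 4, 10 digit classes
labels_2d = torch.tensor([[3], [1], [7], [0]])   # shape (4, 1), hypothetical loader output

labels_1d = labels_2d.squeeze(1)                 # shape (4,), what the loss expects
loss = criterion(logits, labels_1d)
print(labels_2d.shape, labels_1d.shape)
```

Squeezing only dimension 1 is slightly safer than a bare `squeeze_()`, which would also collapse the batch dimension when a batch happens to contain a single sample.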

Two feature maps output from discriminator?

Can somebody explain why the ADDA implementations (this one as well as the TensorFlow one) use two outputs from the discriminator instead of one? I am also wondering why, in adapt.py, we concatenate the source and target features and then pass the concatenated features to the discriminator for prediction.

Why not use one output and do one prediction at a time, as is done in most GAN examples (say here - https://github.com/pytorch/examples/blob/master/dcgan/main.py)?
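
On the concatenation part: with equal batch sizes and a mean-reduced cross-entropy, one pass over the concatenated batch gives the same loss as two separate passes averaged together, so the choice is mostly one of convenience. A minimal sketch, assuming a stand-in linear critic without batch normalization and the label convention source = 1, target = 0 (both are assumptions for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
critic = nn.Linear(16, 2)          # stand-in for the discriminator
criterion = nn.CrossEntropyLoss()  # mean reduction by default

feat_src = torch.randn(8, 16)
feat_tgt = torch.randn(8, 16)
label_src = torch.ones(8, dtype=torch.long)
label_tgt = torch.zeros(8, dtype=torch.long)

# One pass over the concatenated batch.
loss_concat = criterion(critic(torch.cat([feat_src, feat_tgt])),
                        torch.cat([label_src, label_tgt]))

# Two separate passes, averaged.
loss_separate = 0.5 * (criterion(critic(feat_src), label_src)
                       + criterion(critic(feat_tgt), label_tgt))

print(torch.allclose(loss_concat, loss_separate))
```

The equality would no longer hold exactly if the two batches had different sizes or if the critic contained layers that depend on batch statistics.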

Criterion problem

Thanks for sharing the code. I have a question about line 25 in adapt.py (criterion = nn.CrossEntropyLoss()). Since nn.LogSoftmax() is already applied in discriminator.py (line 21), is it correct to use CrossEntropyLoss as the criterion? Or am I missing something? Thanks in advance.

Adaptation leads to lower precision.

I changed the dataset (source data count: 20000, target data count: 2100).

Result:
source only:
mydata set: Average loss: 2.1571, Accuracy: 1311/2100 (62.00%)
domain adaptation:
mydata set: Average loss: 4.5971, Accuracy: 327/2100 (15.00%)

Because my GPU has little memory, I set batchsize=16. Is the batch size the problem?

Thank you for your help!

Losses

ADDA uses a classifier loss and an adversarial loss. In which files are these two losses used?
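
In ADDA-style training both losses can be written as ordinary cross-entropy: the classifier loss uses class labels on source data, and the adversarial loss uses domain labels, with the target encoder trained against flipped labels. A pure-Python sketch of the adversarial part, assuming the convention source = 1, target = 0 (the repository may use a different convention; all names and probabilities here are illustrative):

```python
import math

def nll(prob_of_label):
    """Negative log-likelihood of the label the loss is computed against."""
    return -math.log(prob_of_label)

# Critic's predicted probability that a feature comes from the source domain.
p_src_given_src_feat = 0.9   # critic looking at a source feature
p_src_given_tgt_feat = 0.2   # critic looking at a target feature

# Critic loss: source features are labelled 1, target features 0.
loss_critic = 0.5 * (nll(p_src_given_src_feat) + nll(1.0 - p_src_given_tgt_feat))

# Target-encoder loss: same critic output, scored against the flipped label
# (the target feature is treated as if it were source).
loss_tgt_encoder = nll(p_src_given_tgt_feat)

print(f"critic loss: {loss_critic:.4f}, target encoder loss: {loss_tgt_encoder:.4f}")
```

A confident critic therefore shows up as a small critic loss together with a large target-encoder loss.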

The adaptation does not work

I use the code with pytorch=1.0.1 and torchvision=0.2.0. I get 95.1% using only the source dataset, and 95.8% after adaptation. This is confusing!

0% accuracy with pytorch >= 0.4.0

Downgrading to (py)torch==0.3.1 fixed the issue (this required re-processing the data). I am posting this mainly to help other people who run into the same problem.

Here are my results with torch 0.3.1 and torchvision 0.2.0:

>>> source only <<<
Avg Loss = 0.309243381023407, Avg Accuracy = 91.182796%
>>> domain adaption <<<
Avg Loss = 0.15142789483070374, Avg Accuracy = 95.913978%

error about mnist data shape

When I run main.py, I get the following error:

`Traceback (most recent call last):
  File "Domain_Adaption/pytorch-adda/main.py", line 41, in <module>
    src_encoder, src_classifier, src_data_loader)
  File "Domain_Adaption/pytorch-adda/core/pretrain.py", line 32, in train_src
    for step, (images, labels) in enumerate(data_loader):
  File "/envs//lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 615, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/envs//lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 615, in <listcomp>
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/envs//lib/python3.6/site-packages/torchvision/datasets/mnist.py", line 95, in __getitem__
    img = self.transform(img)
  File "/envs//lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 60, in __call__
    img = t(img)
  File "/envs//lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 163, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/envs//lib/python3.6/site-packages/torchvision/transforms/functional.py", line 208, in normalize
    tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: output with shape [1, 28, 28] doesn't match the broadcast shape [3, 28, 28]
`
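
The last frame suggests that the `Normalize` transform was given three-channel statistics while MNIST tensors have a single channel. Below is a minimal sketch that reproduces the same arithmetic with plain tensors (the 0.5 values are placeholders, not necessarily the repository's settings). One possible fix is to pass single-element tuples, for example `transforms.Normalize((0.5,), (0.5,))`.

```python
import torch

def normalize_(tensor, mean, std):
    """Same in-place arithmetic as the failing line in the traceback."""
    mean = torch.as_tensor(mean, dtype=tensor.dtype)
    std = torch.as_tensor(std, dtype=tensor.dtype)
    return tensor.sub_(mean[:, None, None]).div_(std[:, None, None])

gray = torch.rand(1, 28, 28)  # MNIST images have a single channel

try:
    normalize_(gray.clone(), (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    failed = False
except RuntimeError as err:
    failed = True
    print("three-channel stats fail:", err)

# One mean/std value per channel matches the grayscale input.
out = normalize_(gray.clone(), (0.5,), (0.5,))
print(out.shape)
```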
