Hi @ttaa9, thanks for your questions! The training splits follow the splits available for the datasets. For example, CIFAR-10 has train and test splits, and, similar to many existing works, I used the train split. For STL-10, I used the unlabeled split of the dataset, which is also the most commonly used one. As for the scores, they are obtained from a single training run; no retraining of the model to find the best one was done.
from mimicry.
Hi @kwotsin, thanks so much for the quick reply. It might be useful to put this information about splits somewhere in the README, so we know which splits to evaluate on when comparing against your scores.
Also, the reason I asked about multiple runs is that when I run the GANs myself I don't get quite the same scores. E.g., on CelebA (128 x 128), I get FID/KID of 13.08/0.00956 (versus your 12.93/0.0076) when training on the train set and evaluating on the test set. (It's a bit worse, 13.39/0.010, when evaluating on the train set, unsurprisingly.) The FID is quite close, but the KID is a bit off, so I am wondering whether this is simply stochasticity across training runs or a difference in the training settings. Perhaps you could post your Trainer settings/object as well, in addition to just the architectures, which you currently have?
Hi @ttaa9, no worries! On the splits, the information is currently listed under the "Baselines" section, which covers all the datasets tested. To clarify, similar to many existing works, the same split was used for both training and evaluation for each dataset. The training settings and architectures are listed on the README page as well, and they are the same ones used for the checkpoint.
On the CelebA run, I think your obtained FID score looks correct, with the difference quite similar to the error interval (which, as you mentioned, is probably due to stochasticity across different training runs). For the KID score, could you check whether the JSON file contains any anomalous readings? For example, my current JSON file for the KID scores has the following values:
[
0.007495319259681859,
0.007711712250735898,
0.007619357938282523
]
I suspect an anomalous reading could affect the KID score significantly. This is not surprising, since I noticed it can happen even for FID -- e.g. at the same checkpoint, generating with a different random seed can sometimes give a few hundred FID points instead of the 20+ points from the other readings, although this is very rare. I've re-run the evaluation with the given checkpoint and obtained a similar score as well: 0.007659641506459136 (± 7.556746387021168e-06). Given that your obtained FID is similar to the one I got, I suspect the KID score might have an anomaly in one of the readings.
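A quick way to check for such an anomaly is to load the JSON file and compare each reading against the median. This is only a sketch: the file path and the 50% deviation threshold are my own choices, not part of mimicry, and the readings below are the example values from my JSON file (replace them with your own, e.g. via `json.load`):

```python
import statistics

# Example readings copied from the JSON file above; replace with your own,
# e.g. readings = json.load(open("path/to/kid_scores.json")).
readings = [0.007495319259681859, 0.007711712250735898, 0.007619357938282523]

median = statistics.median(readings)
# Flag readings that deviate from the median by more than 50% -- a sign of
# an anomalous run rather than ordinary seed-to-seed noise.
outliers = [r for r in readings if abs(r - median) > 0.5 * median]
print(f"median KID: {median:.6f}, outliers: {outliers}")
```

With healthy readings like these, the outlier list should come back empty; a single few-hundred-point-style anomaly would stand out immediately.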
To reproduce the KID scores for CelebA, you can download the checkpoint file and run this minimal script:
import torch
import torch_mimicry as mmc
from torch_mimicry.nets import sngan

# Replace with checkpoint file from CelebA 128x128, SNGAN.
# https://drive.google.com/open?id=1rYnv2tCADbzljYlnc8Ypy-JTTipJlRyN
ckpt_file = "/path/to/checkpoints/netG/netG_100000_steps.pth"

# Default variables
log_dir = './examples/example_log_celeba'
dataset = 'celeba_128'
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Restore model
netG = sngan.SNGANGenerator128().to(device)
netG.restore_checkpoint(ckpt_file)

# Metrics
scores = []
for seed in range(3):
    score = mmc.metrics.kid_score(num_samples=50000,
                                  netG=netG,
                                  seed=seed,
                                  dataset=dataset,
                                  log_dir=log_dir,
                                  device=device)
    scores.append(score)

print(scores)
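For reporting, the per-seed values can then be collapsed into the mean ± standard deviation format used above. A minimal sketch, assuming each entry of `scores` is a scalar KID value (the placeholder numbers below stand in for real evaluation output):

```python
import statistics

# Placeholder per-seed KID values; in practice, use the `scores` list
# produced by the evaluation loop above.
scores = [0.00750, 0.00771, 0.00762]

mean = statistics.mean(scores)
std = statistics.stdev(scores)  # sample standard deviation across seeds
print(f"KID: {mean:.6f} (± {std:.2e})")
```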
Feel free to let me know if this is helpful!
Closing this issue for now, but feel free to let me know if you have more questions!