hanzhanggit / stackgan-pytorch
License: MIT License
In the TensorFlow version, you use a separate network to compute the "conditional augmentation", but in this PyTorch version you take the mean value computed by the Generator as the "conditional augmentation" and pass it to the Discriminator: https://github.com/hanzhanggit/StackGAN-Pytorch/blob/master/code/trainer.py#L189. In the Discriminator, you concatenate 'mu' directly to the encoded images without computing another "conditional augmentation".
However, in the Generator, you compute c_code and concatenate it to the noise.
Did you do this on purpose? Does it improve the quality of the generated images, or serve some other purpose?
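For readers following this question, the difference between 'mu' and c_code can be sketched as below. This assumes the usual reparameterization form of conditioning augmentation, c_code = mu + sigma * eps with sigma = exp(0.5 * logvar). The function name and the plain-list representation are illustrative and are not the repository's code.

```python
import math
import random


def sample_c_code(mu, logvar, rng=random):
    """Draw c_code = mu + sigma * eps, with sigma = exp(0.5 * logvar).

    Assumption: this mirrors the standard reparameterization trick; the
    repository's CA_NET may differ in detail.
    """
    c_code = []
    for m, lv in zip(mu, logvar):
        sigma = math.exp(0.5 * lv)
        eps = rng.gauss(0.0, 1.0)  # fresh noise on every call
        c_code.append(m + sigma * eps)
    return c_code


mu = [0.5, -1.0, 2.0]
logvar = [0.0, -2.0, -4.0]

# In the setup described above, the Generator would consume a noisy sample
# like this one, while the Discriminator would be given mu itself.
noisy = sample_c_code(mu, logvar)
```

Under this reading, c_code changes on every call while 'mu' is deterministic for a given text embedding.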
I am getting the following error trace while trying to run Stage I training on the COCO dataset.
It looks like tensorboard 1.10.0 (which is what I have installed in my virtual env) does not have a class called FileWriter. My PyCharm IDE is complaining about the same thing.
Traceback (most recent call last):
File "C:/PyCharmProjects/StackGAN-Pytorch/code/main.py", line 19, in <module>
from trainer import GANTrainer
File "C:\PyCharmProjects\StackGAN-Pytorch\code\trainer.py", line 24, in <module>
from tensorboard import FileWriter
ImportError: cannot import name 'FileWriter' from 'tensorboard' (C:\PythonVEnvs\StackGANPyTorch\lib\site-packages\tensorboard\__init__.py)
Process finished with exit code 1
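One possible cause (an assumption, not confirmed in this thread) is that trainer.py was written against a different package that also installs under the name tensorboard. A hedged way to check which installed package, if any, exposes FileWriter is a small lookup helper. The helper name find_attr and the candidate module list are hypothetical.

```python
import importlib


def find_attr(module_names, attr_name):
    """Return (module_name, attribute) for the first module exposing attr_name."""
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # package not installed; try the next candidate
        if hasattr(module, attr_name):
            return name, getattr(module, attr_name)
    raise ImportError("none of %r provides %r" % (module_names, attr_name))


# Hypothetical usage in trainer.py (the candidate names are assumptions):
# _, FileWriter = find_attr(["tensorboard", "tensorboardX"], "FileWriter")
```

If no candidate provides the name, the helper raises ImportError with the list it tried, which at least makes the missing dependency explicit.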
@hanzhanggit
@taoxugit
Please help me: what is the main problem behind this?
(base) H:\StackGAN\StackGAN-Pytorch-master\code>python main.py --cfg cfg/coco_eval.yml --gpu 0
Using config:
{'CONFIG_NAME': 'stageI',
'CUDA': True,
'DATASET_NAME': 'coco',
'DATA_DIR': '../data/coco',
'EMBEDDING_TYPE': 'cnn-rnn',
'GAN': {'CONDITION_DIM': 128, 'DF_DIM': 96, 'GF_DIM': 192, 'R_NUM': 4},
'GPU_ID': '0',
'IMSIZE': 64,
'NET_D': '',
'NET_G': '',
'STAGE': 1,
'STAGE1_G': '',
'TEXT': {'DIMENSION': 1024},
'TRAIN': {'BATCH_SIZE': 128,
'COEFF': {'KL': 2.0},
'DISCRIMINATOR_LR': 0.0002,
'FLAG': True,
'GENERATOR_LR': 0.0002,
'LR_DECAY_EPOCH': 20,
'MAX_EPOCH': 120,
'PRETRAINED_EPOCH': 600,
'PRETRAINED_MODEL': '',
'SNAPSHOT_INTERVAL': 10},
'VIS_COUNT': 64,
'WORKERS': 4,
'Z_DIM': 100}
Load filenames from: ../data/coco\train\filenames.pickle (82783)
embeddings: (82783, 5, 1024)
This section runs successfully...
STAGE1_G(
(ca_net): CA_NET(
(fc): Linear(in_features=1024, out_features=256, bias=True)
(relu): ReLU()
)
(fc): Sequential(
(0): Linear(in_features=228, out_features=24576, bias=False)
(1): BatchNorm1d(24576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
(upsample1): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample2): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample3): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample4): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(img): Sequential(
(0): Conv2d(96, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): Tanh()
)
)
STAGE1_D(
(encode_img): Sequential(
(0): Conv2d(3, 96, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): LeakyReLU(negative_slope=0.2, inplace)
(2): Conv2d(96, 192, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(3): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): LeakyReLU(negative_slope=0.2, inplace)
(5): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(6): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): LeakyReLU(negative_slope=0.2, inplace)
(8): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(9): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(10): LeakyReLU(negative_slope=0.2, inplace)
)
(get_cond_logits): D_GET_LOGITS(
(outlogits): Sequential(
(0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): LeakyReLU(negative_slope=0.2, inplace)
(3): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
(4): Sigmoid()
)
)
)
Preparing training data...
Traceback (most recent call last):
File "main.py", line 77, in <module>
algo.train(dataloader, cfg.STAGE)
File "H:\StackGAN\StackGAN-Pytorch-master\code\trainer.py", line 158, in train
for i, data in enumerate(data_loader, 0):
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 501, in __iter__
return _DataLoaderIter(self)
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 289, in __init__
w.start()
File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
OSError: [Errno 22] Invalid argument
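The traceback above ends while a DataLoader worker process is being pickled for a spawned child process on Windows. A workaround sometimes tried for this class of failure is to load data in the main process, that is, with zero workers. The sketch below is only an illustration of that idea: the helper name is hypothetical, and it assumes the 'WORKERS' config value is what gets passed to the DataLoader.

```python
import sys


def pick_num_workers(requested, platform=sys.platform):
    """Return 0 on Windows, otherwise the requested worker count.

    Assumption: with zero workers the loader stays in the main process, so
    nothing has to be pickled and sent to a child process.
    """
    if platform.startswith("win"):
        return 0
    return max(0, int(requested))
```

Setting 'WORKERS' to 0 in the YAML config would be the equivalent change without touching code, at the cost of slower data loading.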
Thanks for the great work! Could you please provide the steps for using this model on a new dataset? The dataset has multiple captions per image. It would be really helpful if these steps could be elaborated upon.
Thanks
@hanzhanggit The README says:
"Our current implementation has a higher inception score (10.62±0.19) than reported in the StackGAN paper"
With the provided pre-trained model, I get 8.50526228697. Is the 10+ score for StackGAN v2?
Hi,
I notice that the output saves fake and real images. Are they supposed to be aligned? Or is only the last image aligned?
Note: I only ran Stage 1.
In model.py, it looks like the Stage II Generator takes a text embedding and noise as input. In the paper, it also takes the Stage I Generator output.
In the trainer, there appears to be no difference between Stage I and Stage II training, as both take only the text embedding as input.
From the paper: "The Stage-II GAN takes Stage-I results and text descriptions as inputs".
Let me know if I have misinterpreted anything.
Hi, I'm working on Windows 10 and I get this issue:
DATAPATH: ../data/coco/test/val_captions.t7
Traceback (most recent call last):
File "main.py", line 77, in <module>
algo.sample(datapath, cfg.STAGE)
File "D:\documenti\Monica\StackGAN-Pytorch\code\trainer.py", line 243, in sample
t_file = torchfile.load(datapath)
File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 424, in load
return reader.read_obj()
File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 386, in read_obj
v = self.read_obj()
File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 386, in read_obj
v = self.read_obj()
File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 414, in read_obj
"unknown object type / typeidx: {}".format(typeidx))
torchfile.T7ReaderException: unknown object type / typeidx: -1112529805
Can anyone help me? Does StackGAN-Pytorch work on Windows?
I wanted to download the preprocessed char-CNN-RNN text embeddings for COCO and the pretrained StackGAN model, but my access to the Google Drive folder was denied. I left a message requesting access but did not get any response. Is there another way to access the embeddings and pretrained model, or can someone share these files? I would be very grateful for any help.
Traceback (most recent call last):
File "main.py", line 22, in <module>
from trainer import GANTrainer
File "/media/server009/seagate/liuhan/text2img/StackGAN-Pytorch-master/code/trainer.py", line 24, in <module>
from tensorboard import FileWriter
ImportError: cannot import name FileWriter
@hanzhanggit
Did I do something wrong?
Why can't FileWriter be imported?
Thank you very much!
same as above
How do I perform this action?
Currently I have typed the following terminal commands:
python
import sys
sys.path.append("path/to/Modules")
print(sys.path)
Hi,
Thank you for sharing the code. We are trying to run evaluation of pre-trained StackGAN model but encountered a problem with loading caption features for validation images.
Specifically, when we try to load the features from val_captions.t7 using torchfile, the following error occurs:
*** error: unpack requires a string argument of length 4
We suspect that the caption feature file may be truncated, considering that its size is only ~13.5 MB. Could you please check whether the file is valid?
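The truncation hypothesis is consistent with how Python's struct module behaves: unpacking a 4-byte value from fewer than 4 bytes raises struct.error (the exact message wording varies between Python versions). The sketch below reproduces that, and also computes a rough lower bound for the file size. The (40470, 1024) shape is taken from a log elsewhere on this page, and float32 storage is an assumption.

```python
import struct


def read_int32(buf, offset=0):
    """Read one little-endian 32-bit integer, or raise if the bytes run out."""
    chunk = buf[offset:offset + 4]
    return struct.unpack("<i", chunk)[0]


complete = struct.pack("<i", 40470)
truncated = complete[:2]  # simulate a file cut off mid-value

value = read_int32(complete)
try:
    read_int32(truncated)
    truncated_ok = True
except struct.error:
    truncated_ok = False

# Rough lower bound on the embedding payload, assuming float32 values.
expected_min_bytes = 40470 * 1024 * 4
```

Under those assumptions the embeddings alone would occupy roughly 166 MB, so a ~13.5 MB file would indeed be incomplete.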
I am new to GANs and I am wondering about the generator loss used in this project: why not use a pixel-level BCE loss between real images and fake images?
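As background for this question: in the standard GAN formulation, the generator's BCE loss is computed on the discriminator's scores for fake images, not on pixels. A pixel-level loss would need one specific real image as the target for each generated image, whereas many different images can fit the same caption. The sketch below is a generic illustration of the adversarial form in plain Python. It is not taken from this repository, and the function names are hypothetical.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def generator_loss(d_scores_on_fakes):
    """Mean BCE between D's scores on fake images and the 'real' label 1."""
    return sum(bce(s, 1.0) for s in d_scores_on_fakes) / len(d_scores_on_fakes)


fooled = generator_loss([0.9, 0.8])  # D believes the fakes: low loss
caught = generator_loss([0.1, 0.2])  # D rejects the fakes: high loss
```

The loss falls as the discriminator is fooled, so the generator is rewarded for plausibility rather than for copying any particular training image.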
Hi Han,
I'm getting a "cuda runtime error (2) : out of memory" error when I try to evaluate the model using the pretrained weights. What are the hardware requirements to run this code? I have an NVIDIA GTX 1080.
Console:
$ python main.py --cfg cfg/coco_eval.yml --gpu 0
Using config:
{'CONFIG_NAME': 'stageII',
'CUDA': True,
'DATASET_NAME': 'coco',
'DATA_DIR': '../data/coco',
'EMBEDDING_TYPE': 'cnn-rnn',
'GAN': {'CONDITION_DIM': 128, 'DF_DIM': 96, 'GF_DIM': 192, 'R_NUM': 2},
'GPU_ID': '0',
'IMSIZE': 256,
'NET_D': '',
'NET_G': '../models/coco/netG_epoch_90.pth',
'STAGE': 2,
'STAGE1_G': '',
'TEXT': {'DIMENSION': 1024},
'TRAIN': {'BATCH_SIZE': 40,
'COEFF': {'KL': 2.0},
'DISCRIMINATOR_LR': 0.0002,
'FLAG': False,
'GENERATOR_LR': 0.0002,
'LR_DECAY_EPOCH': 600,
'MAX_EPOCH': 600,
'PRETRAINED_EPOCH': 600,
'PRETRAINED_MODEL': '',
'SNAPSHOT_INTERVAL': 50},
'VIS_COUNT': 64,
'WORKERS': 4,
'Z_DIM': 100}
STAGE2_G (
(STAGE1_G): STAGE1_G (
(ca_net): CA_NET (
(fc): Linear (1024 -> 256)
(relu): ReLU ()
)
(fc): Sequential (
(0): Linear (228 -> 24576)
(1): BatchNorm1d(24576, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU (inplace)
)
(upsample1): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(upsample2): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(upsample3): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(upsample4): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(img): Sequential (
(0): Conv2d(96, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): Tanh ()
)
)
(ca_net): CA_NET (
(fc): Linear (1024 -> 256)
(relu): ReLU ()
)
(encoder): Sequential (
(0): Conv2d(3, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): ReLU (inplace)
(2): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(3): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
(4): ReLU (inplace)
(5): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(6): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(7): ReLU (inplace)
)
(hr_joint): Sequential (
(0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU (inplace)
)
(residual): Sequential (
(0): ResBlock (
(block): Sequential (
(0): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU (inplace)
(3): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
)
(relu): ReLU (inplace)
)
(1): ResBlock (
(block): Sequential (
(0): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(2): ReLU (inplace)
(3): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
)
(relu): ReLU (inplace)
)
)
(upsample1): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(upsample2): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(upsample3): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(upsample4): Sequential (
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(96, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True)
(3): ReLU (inplace)
)
(img): Sequential (
(0): Conv2d(48, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): Tanh ()
)
)
Load from: ../models/coco/netG_epoch_90.pth
STAGE2_D (
(encode_img): Sequential (
(0): Conv2d(3, 96, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): LeakyReLU (0.2, inplace)
(2): Conv2d(96, 192, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(3): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True)
(4): LeakyReLU (0.2, inplace)
(5): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(6): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
(7): LeakyReLU (0.2, inplace)
(8): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(9): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(10): LeakyReLU (0.2, inplace)
(11): Conv2d(768, 1536, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(12): BatchNorm2d(1536, eps=1e-05, momentum=0.1, affine=True)
(13): LeakyReLU (0.2, inplace)
(14): Conv2d(1536, 3072, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(15): BatchNorm2d(3072, eps=1e-05, momentum=0.1, affine=True)
(16): LeakyReLU (0.2, inplace)
(17): Conv2d(3072, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(18): BatchNorm2d(1536, eps=1e-05, momentum=0.1, affine=True)
(19): LeakyReLU (0.2, inplace)
(20): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(21): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(22): LeakyReLU (0.2, inplace)
)
(get_cond_logits): D_GET_LOGITS (
(outlogits): Sequential (
(0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
(2): LeakyReLU (0.2, inplace)
(3): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
(4): Sigmoid ()
)
)
(get_uncond_logits): D_GET_LOGITS (
(outlogits): Sequential (
(0): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
(1): Sigmoid ()
)
)
)
Successfully load sentences from: ../data/coco/test/val_captions.t7
Total number of sentences: 40470
num_embeddings: 40470 (40470, 1024)
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
File "main.py", line 77, in <module>
algo.sample(datapath, cfg.STAGE)
File "/home/shenkev/Downloads/StackGAN-Pytorch/code/trainer.py", line 278, in sample
nn.parallel.data_parallel(netG, inputs, self.gpus)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/data_parallel.py", line 102, in data_parallel
return module(*inputs[0], **module_kwargs[0])
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/home/shenkev/Downloads/StackGAN-Pytorch/code/model.py", line 257, in forward
h_code = self.upsample4(h_code)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/upsampling.py", line 80, in forward
return F.upsample(input, self.size, self.scale_factor, self.mode)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 911, in upsample
return _functions.thnn.UpsamplingNearest2d(_pair(size), scale_factor)(input)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/thnn/upsampling.py", line 52, in forward
self.scale_factor
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:66
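The failure occurs inside upsample4 of the Stage II generator, where the activations are largest. A back-of-the-envelope estimate of a single tensor at that point, assuming float32 values and that the configured BATCH_SIZE of 40 is also used during sampling (both assumptions), can be sketched as:

```python
def activation_bytes(batch, channels, height, width, bytes_per_value=4):
    """Size in bytes of one dense activation tensor (float32 by default)."""
    return batch * channels * height * width * bytes_per_value


# Output of the nearest-neighbour Upsample inside upsample4:
# 96 channels at 256x256, per the printed model and IMSIZE above.
full_batch = activation_bytes(40, 96, 256, 256)   # 1006632960 bytes, about 0.94 GiB
small_batch = activation_bytes(8, 96, 256, 256)   # 201326592 bytes, about 0.19 GiB
```

That is close to 1 GiB for one intermediate tensor, before counting the weights and the other layers' activations, so lowering TRAIN.BATCH_SIZE in the evaluation config may be worth trying.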
@hanzhanggit Hello!
Thank you for your contributions to this code.
I'm trying to train this on my own dataset.
I followed reedscot/icml2016 and trained a char-CNN-RNN text encoder.
But its output is a .t7 file, not a .pickle file like your preprocessed char-CNN-RNN text embeddings.
So I'm wondering how to convert the char-CNN-RNN text embeddings to a .pickle file?
Thanks again for your contributions.
I'm looking forward to your reply!
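A minimal sketch of such a conversion is shown below, under several assumptions: the .t7 file can be read with the third-party torchfile package (torchfile.load is the call that appears in the tracebacks on this page), and the result can be pickled as-is. The helper names are hypothetical, and the layout the training code expects is only inferred from the "embeddings: (82783, 5, 1024)" log line, so the loaded object may need to be rearranged into [images, captions per image, dimension] first.

```python
import pickle


def save_embeddings_pickle(embeddings, out_path):
    """Write already-loaded embeddings to a pickle file."""
    with open(out_path, "wb") as f:
        # Protocol 2 can be read by both Python 2 and Python 3.
        pickle.dump(embeddings, f, protocol=2)


def load_embeddings_pickle(path):
    """Read the embeddings back, e.g. to verify the round trip."""
    with open(path, "rb") as f:
        return pickle.load(f)


def convert_t7_to_pickle(t7_path, out_path):
    """Hypothetical helper: needs the third-party `torchfile` package."""
    import torchfile  # imported lazily so the functions above work without it
    data = torchfile.load(t7_path)
    save_embeddings_pickle(data, out_path)
```

It is probably safest to extract plain numeric arrays from the loaded object before pickling, so that reading the .pickle file does not depend on torchfile's own classes.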
THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
Traceback (most recent call last):
File "main.py", line 77, in <module>
algo.sample(datapath, cfg.STAGE)
File "C:\Users\hunte\OneDrive\Documents\Projects\EAD Project\StackGAN-Pytorch-master\code\trainer.py", line 238, in sample
netG, _ = self.load_network_stageII()
File "C:\Users\hunte\OneDrive\Documents\Projects\EAD Project\StackGAN-Pytorch-master\code\trainer.py", line 110, in load_network_stageII
netG.cuda()
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 260, in cuda
return self._apply(lambda t: t.cuda(device))
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
module._apply(fn)
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
module._apply(fn)
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
module._apply(fn)
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 193, in _apply
param.data = fn(param.data)
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 260, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 162, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87
Hi Han,
I wanted to train the model on the Bird and Flower datasets (as in the TensorFlow version). Would it be as straightforward as downloading the datasets and calling main? I'm guessing you haven't tried this yet; do you see any potential pitfalls?
Hi,
In trainer.py, line 180, the text condition that is input to netD is 'mu'.
I wonder why it is not 'c_code', which is the text condition for netG.
Also, 'mu' only has half the information of the text embedding, so I'm a little confused about it.
Waiting for your reply.
Best regards!
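On the "half the information" point: the printed models on this page show CA_NET's fc layer mapping 1024 inputs to 256 outputs, with CONDITION_DIM set to 128. One reading (an assumption, including the order of the two halves) is that the 256 outputs are split into a 128-dimensional 'mu' and a 128-dimensional log-variance. In that case 'mu' is half of the layer's output, but every element is still computed from the full 1024-dimensional text embedding. A sketch of that split, with a hypothetical helper name:

```python
def split_mu_logvar(fc_output, condition_dim):
    """Split a 2*condition_dim vector into (mu, logvar) halves."""
    if len(fc_output) != 2 * condition_dim:
        raise ValueError(
            "expected %d values, got %d" % (2 * condition_dim, len(fc_output))
        )
    return fc_output[:condition_dim], fc_output[condition_dim:]


fc_output = [float(i) for i in range(256)]  # stand-in for Linear(1024 -> 256)
mu, logvar = split_mu_logvar(fc_output, 128)
```

Under this reading, the 896 input channels of the discriminator's conditional branch would correspond to 768 image-feature channels plus 128 condition channels.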
What are the proper parameters for training on the CUB-200-2011 birds dataset?
I have tried several parameter sets, but the G loss always increases and then oscillates around 2.0 after 50 epochs, and the KL loss keeps increasing.
How can I improve the loss trend to get better images? Any suggestions? Thanks!
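For reference when reading the KL curve, the closed-form KL divergence between a diagonal Gaussian N(mu, exp(logvar)) and the standard normal can be sketched as below. Whether the repository averages or sums over dimensions is not stated on this page; the mean used here is an assumption, and the function name is hypothetical.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, 1) ), averaged over dimensions."""
    terms = [1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)]
    return -0.5 * sum(terms) / len(terms)
```

The value is zero only when mu is 0 and logvar is 0, and it grows as 'mu' moves away from zero or the variance moves away from one, so a rising KL term means the conditioning distribution is drifting from the prior. The 'COEFF': {'KL': 2.0} entry in the config suggests this term is weighted against the adversarial loss.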
File "", line 1, in <module>
runfile('D:/Projects/GAN-Text2Image/code/main.py', args='--cfg cfg/coco_s1.yml --gpu 0', wdir='D:/Projects/GAN-Text2Image/code')
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "D:/Projects/GAN-Text2Image/code/main.py", line 73, in <module>
algo.train(dataloader, cfg.STAGE)
File "D:\Projects\GAN-Text2Image\code\trainer.py", line 205, in train
self.summary_writer.add_summary(summary_D, count).eval()
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\tensorboard\writer.py", line 94, in add_summary
event = event_pb2.Event(summary=summary)
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 520, in __init__
_ReraiseTypeErrorWithFieldName(message_descriptor.name, field_name)
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 448, in _ReraiseTypeErrorWithFieldName
six.reraise(type(exc), exc, sys.exc_info()[2])
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\six.py", line 692, in reraise
raise value.with_traceback(tb)
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 518, in __init__
copy.MergeFrom(new_val,)
File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 1230, in MergeFrom
'expected %s got %s.' % (cls.__name__, msg.__class__.__name__))
TypeError: Parameter to MergeFrom() must be instance of same class: expected Summary got Tensor. for field Event.summary
I have finished training both stages on CUB and I can generate samples during testing.
During testing, I also use the .pickle file extracted from the CUB char-CNN-RNN text embeddings file as the embedding, but I failed to match the captions to the generated images. I use the corresponding descriptions from self.captions (self.captions = self.load_all_captions()) during testing, but they do not match.
How can I get the correct caption for each generated image?
Much appreciated!
I have the trained model, but I do not know how to generate an image using my own sentence as input. Please help me.
Hi, I am using the pretrained COCO model shared in this repo. How can I use it with other sentences as input? Since the mode will not be training, it will use val_captions.t7. However, I am not clear on how to convert a text file to a .t7 file. Could you please elaborate on this?
Thanks