
stackgan-pytorch's People

Contributors

hanzhanggit

stackgan-pytorch's Issues

ImportError: cannot import name 'FileWriter' from 'tensorboard'

I am getting the following error trace while trying to run Stage I of training on the COCO dataset.

It looks like tensorboard 1.10.0 (which is what I have installed in my virtual env) does not have a class called FileWriter. My PyCharm IDE complains about the same thing.

Traceback (most recent call last):
File "C:/PyCharmProjects/StackGAN-Pytorch/code/main.py", line 19, in
from trainer import GANTrainer
File "C:\PyCharmProjects\StackGAN-Pytorch\code\trainer.py", line 24, in
from tensorboard import FileWriter
ImportError: cannot import name 'FileWriter' from 'tensorboard' (C:\PythonVEnvs\StackGANPyTorch\lib\site-packages\tensorboard_init_.py)

Process finished with exit code 1
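
A hedged workaround (a sketch, not the repo's official fix): the code was written against the old standalone `tensorboard` pip package, which exposed a FileWriter class; modern tensorboard releases do not. One option is to swap in the writer that ships with recent PyTorch and log scalars directly in trainer.py:

    # Hypothetical patch for code/trainer.py, assuming a recent PyTorch install.
    # Replaces: from tensorboard import FileWriter
    from torch.utils.tensorboard import SummaryWriter

    summary_writer = SummaryWriter(log_dir='../output/log')   # log dir is an assumption
    summary_writer.add_scalar('D_loss', 0.5, global_step=0)   # example scalar logging
    summary_writer.close()

The remaining summary calls in trainer.py would need the same treatment, since the old FileWriter API differs from SummaryWriter's.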

Inconsistent with the paper (Stage-II inputs)

In models.py, it looks like the Stage-II generator takes only a text embedding and noise as input, whereas in the paper it also takes the Stage-I generator's output.
In trainer.py, one can likewise see no difference between Stage-I and Stage-II training: both pass only the text embedding (plus noise) as input.
From paper- "The Stage-II GAN takes Stage-I results and text descriptions as inputs".
Let me know if I have misinterpreted anything.
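
For what it's worth, the repo does appear to route the Stage-I output into Stage II, just inside the model rather than in the trainer: STAGE2_G holds a STAGE1_G and runs it first in its forward pass. A minimal runnable sketch of that wiring (a toy illustration, not the repo's code; the layer sizes are placeholders):

    import torch
    import torch.nn as nn

    class Stage2Sketch(nn.Module):
        """Toy module showing how a Stage-II generator can consume the
        Stage-I image even though the trainer passes only (embedding, noise)."""
        def __init__(self, stage1_g):
            super().__init__()
            self.stage1_g = stage1_g                      # frozen Stage-I generator
            self.encoder = nn.Conv2d(3, 8, 3, padding=1)  # stand-in for the real encoder

        def forward(self, text_embedding, noise):
            stage1_img = self.stage1_g(text_embedding, noise).detach()  # Stage-I result is an input here
            return self.encoder(stage1_img)               # Stage II refines the Stage-I image

So the trainer's identical call signatures for the two stages do not by themselves contradict the paper.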

ImportError: cannot import name FileWriter

Traceback (most recent call last):
  File "main.py", line 22, in <module>
    from trainer import GANTrainer
  File "/media/server009/seagate/liuhan/text2img/StackGAN-Pytorch-master/code/trainer.py", line 24, in <module>
    from tensorboard import FileWriter
ImportError: cannot import name FileWriter

@hanzhanggit
Did I do something wrong?
Why can't FileWriter be imported?
Thank you very much!

How to preprocess char-CNN-RNN?

@hanzhanggit Hello!
Thank you for your contributions on this code.
I'm trying to train this on my own dataset.
I followed reedscot/icml2016 and trained a char-CNN-RNN text encoder.
But it outputs a .t7 file, not a .pickle file like your preprocessed char-CNN-RNN text embeddings.
So how can I convert the char-CNN-RNN text embeddings into a .pickle file?
Thanks again for your contributions.
I'm looking forward to your reply!
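
A hedged conversion sketch (the field name inside the .t7 is an assumption; inspect your own file to find the right key). The repo's dataset loader expects a pickled numpy array of shape (num_images, captions_per_image, 1024):

    import pickle
    import numpy as np
    import torchfile  # pip install torchfile

    # Load the Torch7 file produced by reedscot/icml2016 and re-save it as a
    # pickled numpy array. The key b'fea_txt' is hypothetical -- print(t7) to
    # see what your encoder actually saved.
    t7 = torchfile.load('char-CNN-RNN-embeddings.t7')
    embeddings = np.asarray(t7[b'fea_txt'], dtype=np.float32)

    with open('char-CNN-RNN-embeddings.pickle', 'wb') as f:
        pickle.dump(embeddings, f, protocol=2)  # protocol 2 stays readable from Python 2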

Generator loss

I am new to GANs, and I wonder about the generator loss used in this project: why not use a pixel-level BCE loss between real images and fake images?
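
Briefly: a pixel-level loss against the one ground-truth image would penalize every other plausible image for the same caption, which tends to produce blurry averages; the adversarial loss only asks that the output look real to the discriminator. A minimal sketch of the non-saturating generator loss this kind of trainer uses (assuming, as in this repo, a discriminator that ends in a Sigmoid):

    import torch
    import torch.nn as nn

    criterion = nn.BCELoss()

    def generator_loss(d_out_fake):
        # Push D's probability on fake images toward the "real" label 1.
        # No pixel-wise comparison with a ground-truth image is involved.
        real_labels = torch.ones_like(d_out_fake)
        return criterion(d_out_fake, real_labels)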

Does it work on Windows 10?

Hi, I'm working on Windows 10 and I get this issue:

DATAPATH:  ../data/coco/test/val_captions.t7
Traceback (most recent call last):
  File "main.py", line 77, in <module>
    algo.sample(datapath, cfg.STAGE)
  File "D:\documenti\Monica\StackGAN-Pytorch\code\trainer.py", line 243, in sample
    t_file = torchfile.load(datapath)
  File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 424, in load
    return reader.read_obj()
  File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 386, in read_obj
    v = self.read_obj()
  File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 386, in read_obj
    v = self.read_obj()
  File "C:\Users\Utente\venv\lib\site-packages\torchfile.py", line 414, in read_obj
    "unknown object type / typeidx: {}".format(typeidx))
torchfile.T7ReaderException: unknown object type / typeidx: -1112529805

Can anyone help me? Does StackGAN-Pytorch work on Windows?
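
One thing worth trying (an assumption based on the symptom, not a confirmed fix): a nonsense typeidx from torchfile often means the .t7 was written by 64-bit Torch, which stores longs as 8 bytes. The torchfile package accepts a flag for this case:

    import torchfile

    # force_8bytes_long tells the reader to decode 8-byte longs, which is the
    # usual cause of "unknown object type / typeidx" on files from 64-bit Torch.
    t_file = torchfile.load('../data/coco/test/val_captions.t7', force_8bytes_long=True)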

Use 'mu' as the "conditioning augmentation" and pass it to the discriminator?

In the TensorFlow version, you use a separate network to compute the "conditioning augmentation", but in this PyTorch version you take the mean value (mu) computed in the generator and pass it to the discriminator: https://github.com/hanzhanggit/StackGAN-Pytorch/blob/master/code/trainer.py#L189. In the discriminator, you concatenate mu to the encoded images directly, without computing another "conditioning augmentation".

However, in the generator, you compute c_code and concatenate it with the noise.

Did you do this on purpose? Does it improve the quality of the generated images, or serve some other purpose?
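
For context, conditioning augmentation draws the generator's condition as a sample around mu rather than using mu itself. A paraphrased, runnable sketch of the reparameterization as CA_NET does it (dimensions follow the printed config: 1024-d text embedding, 128-d condition):

    import torch
    import torch.nn as nn

    class CASketch(nn.Module):
        def __init__(self, t_dim=1024, c_dim=128):
            super().__init__()
            self.fc = nn.Linear(t_dim, c_dim * 2)  # predicts mu and logvar jointly

        def forward(self, text_embedding):
            x = torch.relu(self.fc(text_embedding))
            mu, logvar = x.chunk(2, dim=1)
            std = (0.5 * logvar).exp()
            c_code = mu + std * torch.randn_like(std)  # reparameterization trick
            return c_code, mu, logvar

The question above is then: G consumes the sampled c_code, while D is conditioned on the deterministic mu.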

Some losses keep increasing

What are the proper parameters for training on the CUB-200-2011 birds dataset?

I have tried several parameter sets, but the G loss always increases and then oscillates around 2.0 after 50 epochs, and the KL loss increases throughout.

How can I improve the loss trends to get better images? Any suggestions? Thanks!
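
For reference, the KL term being discussed is the standard closed form for KL(N(mu, sigma) || N(0, I)), weighted by TRAIN.COEFF.KL (2.0 in the shipped configs). A sketch equivalent to the repo's KL loss:

    import torch

    def kl_loss(mu, logvar):
        # KL divergence between N(mu, exp(0.5*logvar)) and the unit Gaussian.
        return torch.mean(-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()))

Lowering that coefficient or the learning rates is a common first experiment when the KL term climbs without bound, though whether it helps here is untested.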

Output images aligned?

Hi,

I notice that the output saves fake and real images. Are they supposed to be aligned, or is only the last image aligned?

NB: I only ran Stage 1.

Why is the text condition input of netD `mu`?

Hi,
in trainer.py line 180, the text condition input to netD is 'mu'.
I wonder why it is not 'c_code', which is the text condition for netG.

And 'mu' carries only half the information of the text embedding, so I'm a little confused about it.

Waiting for your reply.

Best regards!

cuda runtime error

THCudaCheck FAIL file=..\aten\src\THC\THCGeneral.cpp line=87 error=30 : unknown error
Traceback (most recent call last):
  File "main.py", line 77, in <module>
    algo.sample(datapath, cfg.STAGE)
  File "C:\Users\hunte\OneDrive\Documents\Projects\EAD Project\StackGAN-Pytorch-master\code\trainer.py", line 238, in sample
    netG, _ = self.load_network_stageII()
  File "C:\Users\hunte\OneDrive\Documents\Projects\EAD Project\StackGAN-Pytorch-master\code\trainer.py", line 110, in load_network_stageII
    netG.cuda()
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 260, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
    module._apply(fn)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
    module._apply(fn)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 187, in _apply
    module._apply(fn)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 193, in _apply
    param.data = fn(param.data)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 260, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\cuda\__init__.py", line 162, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at ..\aten\src\THC\THCGeneral.cpp:87

How to create images for new text?

Hi, I am using the pretrained COCO model shared in this repo. How can I use it on some other sentences as input? Since the model will not be in training mode, it will use val_captions.t7. However, I am not clear on how to convert a text file to a .t7 file. Could you please elaborate on this?
Thanks
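
One way to sidestep the .t7 step entirely (a sketch under assumptions: `netG` is the Stage-II generator loaded as in the trainer's sample(), and `embeddings` holds 1024-d char-CNN-RNN vectors for your sentences, produced with the Torch encoder from reedscot/icml2016):

    import torch

    embeddings = torch.randn(4, 1024)  # stand-in for real encoder output, one row per sentence
    noise = torch.randn(embeddings.size(0), 100)  # Z_DIM = 100 in the shipped configs

    with torch.no_grad():
        # STAGE2_G returns (stage1_img, fake_img, mu, logvar) in this repo.
        _, fake_imgs, _, _ = netG(embeddings, noise)

The hard part remains producing real embeddings: the char-CNN-RNN encoder itself is Torch/Lua, so new sentences must pass through it (or a reimplementation) before this step.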

Images are chaotic

I ran the pre-trained model and the images are pretty incoherent. Did I load something wrong or is this the current state-of-the-art?

[image attached in the original issue]

Error: Parameter to MergeFrom() must be instance of same class:

File "", line 1, in
runfile('D:/Projects/GAN-Text2Image/code/main.py', args='--cfg cfg/coco_s1.yml --gpu 0', wdir='D:/Projects/GAN-Text2Image/code')

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 786, in runfile
execfile(filename, namespace)

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "D:/Projects/GAN-Text2Image/code/main.py", line 73, in
algo.train(dataloader, cfg.STAGE)

File "D:\Projects\GAN-Text2Image\code\trainer.py", line 205, in train
self.summary_writer.add_summary(summary_D, count).eval()

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\tensorboard\writer.py", line 94, in add_summary
event = event_pb2.Event(summary=summary)

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 520, in init
_ReraiseTypeErrorWithFieldName(message_descriptor.name, field_name)

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 448, in _ReraiseTypeErrorWithFieldName
six.reraise(type(exc), exc, sys.exc_info()[2])

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\six.py", line 692, in reraise
raise value.with_traceback(tb)

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 518, in init
copy.MergeFrom(new_val,)

File "C:\Users\Anaconda3\envs\tf_gpu\lib\site-packages\google\protobuf\internal\python_message.py", line 1230, in MergeFrom
'expected %s got %s.' % (cls.name, msg.class.name))

TypeError: Parameter to MergeFrom() must be instance of same class: expected Summary got Tensor. for field Event.summary
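
A hedged reading of the trace: the Summary object produced by one tensorboard package is being handed to another package's writer, which expects its own proto class. Logging plain floats avoids the proto mismatch altogether; a sketch using PyTorch's bundled writer (`errD`/`errG` are the trainer's existing loss variables, and the log dir is an assumption):

    from torch.utils.tensorboard import SummaryWriter

    summary_writer = SummaryWriter('../output/log')
    # Replaces: self.summary_writer.add_summary(summary_D, count)
    summary_writer.add_scalar('D_loss', errD.item(), count)
    summary_writer.add_scalar('G_loss', errG.item(), count)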

GPU out of memory during evaluation.

Hi Han,

I'm getting a "cuda runtime error (2) : out of memory" error when I try to evaluate the model using the pretrained weights. What is the hardware requirement to run this code? I have an Nvidia gtx 1080.

Console:

$ python main.py --cfg cfg/coco_eval.yml --gpu 0

Using config:
{'CONFIG_NAME': 'stageII',
 'CUDA': True,
 'DATASET_NAME': 'coco',
 'DATA_DIR': '../data/coco',
 'EMBEDDING_TYPE': 'cnn-rnn',
 'GAN': {'CONDITION_DIM': 128, 'DF_DIM': 96, 'GF_DIM': 192, 'R_NUM': 2},
 'GPU_ID': '0',
 'IMSIZE': 256,
 'NET_D': '',
 'NET_G': '../models/coco/netG_epoch_90.pth',
 'STAGE': 2,
 'STAGE1_G': '',
 'TEXT': {'DIMENSION': 1024},
 'TRAIN': {'BATCH_SIZE': 40,
           'COEFF': {'KL': 2.0},
           'DISCRIMINATOR_LR': 0.0002,
           'FLAG': False,
           'GENERATOR_LR': 0.0002,
           'LR_DECAY_EPOCH': 600,
           'MAX_EPOCH': 600,
           'PRETRAINED_EPOCH': 600,
           'PRETRAINED_MODEL': '',
           'SNAPSHOT_INTERVAL': 50},
 'VIS_COUNT': 64,
 'WORKERS': 4,
 'Z_DIM': 100}
STAGE2_G (
  (STAGE1_G): STAGE1_G (
    (ca_net): CA_NET (
      (fc): Linear (1024 -> 256)
      (relu): ReLU ()
    )
    (fc): Sequential (
      (0): Linear (228 -> 24576)
      (1): BatchNorm1d(24576, eps=1e-05, momentum=0.1, affine=True)
      (2): ReLU (inplace)
    )
    (upsample1): Sequential (
      (0): Upsample(scale_factor=2, mode=nearest)
      (1): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (2): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
      (3): ReLU (inplace)
    )
    (upsample2): Sequential (
      (0): Upsample(scale_factor=2, mode=nearest)
      (1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
      (3): ReLU (inplace)
    )
    (upsample3): Sequential (
      (0): Upsample(scale_factor=2, mode=nearest)
      (1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True)
      (3): ReLU (inplace)
    )
    (upsample4): Sequential (
      (0): Upsample(scale_factor=2, mode=nearest)
      (1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True)
      (3): ReLU (inplace)
    )
    (img): Sequential (
      (0): Conv2d(96, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (1): Tanh ()
    )
  )
  (ca_net): CA_NET (
    (fc): Linear (1024 -> 256)
    (relu): ReLU ()
  )
  (encoder): Sequential (
    (0): Conv2d(3, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): ReLU (inplace)
    (2): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (3): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
    (4): ReLU (inplace)
    (5): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (6): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
    (7): ReLU (inplace)
  )
  (hr_joint): Sequential (
    (0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
    (2): ReLU (inplace)
  )
  (residual): Sequential (
    (0): ResBlock (
      (block): Sequential (
        (0): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
        (2): ReLU (inplace)
        (3): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (4): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
      )
      (relu): ReLU (inplace)
    )
    (1): ResBlock (
      (block): Sequential (
        (0): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
        (2): ReLU (inplace)
        (3): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (4): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
      )
      (relu): ReLU (inplace)
    )
  )
  (upsample1): Sequential (
    (0): Upsample(scale_factor=2, mode=nearest)
    (1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
    (3): ReLU (inplace)
  )
  (upsample2): Sequential (
    (0): Upsample(scale_factor=2, mode=nearest)
    (1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True)
    (3): ReLU (inplace)
  )
  (upsample3): Sequential (
    (0): Upsample(scale_factor=2, mode=nearest)
    (1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True)
    (3): ReLU (inplace)
  )
  (upsample4): Sequential (
    (0): Upsample(scale_factor=2, mode=nearest)
    (1): Conv2d(96, 48, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (2): BatchNorm2d(48, eps=1e-05, momentum=0.1, affine=True)
    (3): ReLU (inplace)
  )
  (img): Sequential (
    (0): Conv2d(48, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): Tanh ()
  )
)
Load from:  ../models/coco/netG_epoch_90.pth
STAGE2_D (
  (encode_img): Sequential (
    (0): Conv2d(3, 96, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): LeakyReLU (0.2, inplace)
    (2): Conv2d(96, 192, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (3): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True)
    (4): LeakyReLU (0.2, inplace)
    (5): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (6): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True)
    (7): LeakyReLU (0.2, inplace)
    (8): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (9): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
    (10): LeakyReLU (0.2, inplace)
    (11): Conv2d(768, 1536, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (12): BatchNorm2d(1536, eps=1e-05, momentum=0.1, affine=True)
    (13): LeakyReLU (0.2, inplace)
    (14): Conv2d(1536, 3072, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (15): BatchNorm2d(3072, eps=1e-05, momentum=0.1, affine=True)
    (16): LeakyReLU (0.2, inplace)
    (17): Conv2d(3072, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (18): BatchNorm2d(1536, eps=1e-05, momentum=0.1, affine=True)
    (19): LeakyReLU (0.2, inplace)
    (20): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (21): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
    (22): LeakyReLU (0.2, inplace)
  )
  (get_cond_logits): D_GET_LOGITS (
    (outlogits): Sequential (
      (0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True)
      (2): LeakyReLU (0.2, inplace)
      (3): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
      (4): Sigmoid ()
    )
  )
  (get_uncond_logits): D_GET_LOGITS (
    (outlogits): Sequential (
      (0): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
      (1): Sigmoid ()
    )
  )
)
Successfully load sentences from:  ../data/coco/test/val_captions.t7
Total number of sentences: 40470
num_embeddings: 40470 (40470, 1024)
THCudaCheck FAIL file=/pytorch/torch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
Traceback (most recent call last):
  File "main.py", line 77, in <module>
    algo.sample(datapath, cfg.STAGE)
  File "/home/shenkev/Downloads/StackGAN-Pytorch/code/trainer.py", line 278, in sample
    nn.parallel.data_parallel(netG, inputs, self.gpus)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/parallel/data_parallel.py", line 102, in data_parallel
    return module(*inputs[0], **module_kwargs[0])
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/shenkev/Downloads/StackGAN-Pytorch/code/model.py", line 257, in forward
    h_code = self.upsample4(h_code)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.py", line 67, in forward
    input = module(input)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/upsampling.py", line 80, in forward
    return F.upsample(input, self.size, self.scale_factor, self.mode)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 911, in upsample
    return _functions.thnn.UpsamplingNearest2d(_pair(size), scale_factor)(input)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/thnn/upsampling.py", line 52, in forward
    self.scale_factor
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/torch/lib/THC/generic/THCStorage.cu:66
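
Two low-effort mitigations worth trying (assumptions, not guarantees, for an 8 GB GTX 1080): shrink the eval batch by setting TRAIN.BATCH_SIZE in cfg/coco_eval.yml to 16 or 8, and make sure no autograd buffers are kept while sampling:

    import torch

    # On current PyTorch; on the 0.x API this repo targets, the equivalent
    # was Variable(..., volatile=True). netG/embeddings/noise as in sample().
    with torch.no_grad():
        _, fake_imgs, _, _ = netG(embeddings, noise)

Since Stage II runs the full 256x256 pipeline plus the embedded Stage-I generator, a batch of 40 can plausibly exceed 8 GB on its own.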

Truncated caption feature file?

Hi,

Thank you for sharing the code. We are trying to run evaluation of pre-trained StackGAN model but encountered a problem with loading caption features for validation images.

Specifically, when we try to load the feature from (val_captions.t7) using torchfile, the following error occurs:

*** error: unpack requires a string argument of length 4

We suspect that the caption feature file may be truncated, considering that the file is only ~13.5 MB. Could you please check whether the file is valid?

add the project folder to PYTHONPATH

How do I perform this action?

So far I have typed the following commands in a terminal:
python
import sys
sys.path.append("path/to/Modules")
print (sys.path)
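
Note that sys.path changes made inside an interactive interpreter vanish when it exits, so the commands above do not affect a later `python main.py` run. A sketch of a persistent in-code alternative (the path is hypothetical; use your checkout's location):

    import sys
    sys.path.insert(0, "/path/to/StackGAN-Pytorch/code")  # must come before the imports that need it
    import trainer  # now resolvable regardless of the working directory

Alternatively, set the PYTHONPATH environment variable to the project folder before launching Python, which is presumably what the instruction in the title refers to.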

OSError: [Errno 22] Invalid argument

@hanzhanggit
@taoxugit
Please help me: what is the main problem behind this?

(base) H:\StackGAN\StackGAN-Pytorch-master\code>python main.py --cfg cfg/coco_eval.yml --gpu 0
Using config:
{'CONFIG_NAME': 'stageI',
'CUDA': True,
'DATASET_NAME': 'coco',
'DATA_DIR': '../data/coco',
'EMBEDDING_TYPE': 'cnn-rnn',
'GAN': {'CONDITION_DIM': 128, 'DF_DIM': 96, 'GF_DIM': 192, 'R_NUM': 4},
'GPU_ID': '0',
'IMSIZE': 64,
'NET_D': '',
'NET_G': '',
'STAGE': 1,
'STAGE1_G': '',
'TEXT': {'DIMENSION': 1024},
'TRAIN': {'BATCH_SIZE': 128,
'COEFF': {'KL': 2.0},
'DISCRIMINATOR_LR': 0.0002,
'FLAG': True,
'GENERATOR_LR': 0.0002,
'LR_DECAY_EPOCH': 20,
'MAX_EPOCH': 120,
'PRETRAINED_EPOCH': 600,
'PRETRAINED_MODEL': '',
'SNAPSHOT_INTERVAL': 10},
'VIS_COUNT': 64,
'WORKERS': 4,
'Z_DIM': 100}
Load filenames from: ../data/coco\train\filenames.pickle (82783)
embeddings: (82783, 5, 1024)
This section is run successfully...
STAGE1_G(
(ca_net): CA_NET(
(fc): Linear(in_features=1024, out_features=256, bias=True)
(relu): ReLU()
)
(fc): Sequential(
(0): Linear(in_features=228, out_features=24576, bias=False)
(1): BatchNorm1d(24576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
(upsample1): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(1536, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample2): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(768, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample3): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(384, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(upsample4): Sequential(
(0): Upsample(scale_factor=2, mode=nearest)
(1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): ReLU(inplace)
)
(img): Sequential(
(0): Conv2d(96, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): Tanh()
)
)
STAGE1_D(
(encode_img): Sequential(
(0): Conv2d(3, 96, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(1): LeakyReLU(negative_slope=0.2, inplace)
(2): Conv2d(96, 192, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(3): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(4): LeakyReLU(negative_slope=0.2, inplace)
(5): Conv2d(192, 384, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(6): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): LeakyReLU(negative_slope=0.2, inplace)
(8): Conv2d(384, 768, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
(9): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(10): LeakyReLU(negative_slope=0.2, inplace)
)
(get_cond_logits): D_GET_LOGITS(
(outlogits): Sequential(
(0): Conv2d(896, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): LeakyReLU(negative_slope=0.2, inplace)
(3): Conv2d(768, 1, kernel_size=(4, 4), stride=(4, 4))
(4): Sigmoid()
)
)
)
Preparing training data...
Traceback (most recent call last):
  File "main.py", line 77, in <module>
    algo.train(dataloader, cfg.STAGE)
  File "H:\StackGAN\StackGAN-Pytorch-master\code\trainer.py", line 158, in train
    for i, data in enumerate(data_loader, 0):
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 501, in __iter__
    return _DataLoaderIter(self)
  File "C:\ProgramData\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 289, in __init__
    w.start()
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\ProgramData\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
OSError: [Errno 22] Invalid argument
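
A common Windows workaround (an inference from the multiprocessing frames above, not a confirmed fix): on Windows, DataLoader workers are spawned processes, and the dataset must be pickled to reach them, which is what fails here. Running the loader single-process sidesteps the pickling, either by setting WORKERS: 0 in the .yml config or directly where main.py builds the loader:

    import torch

    # num_workers=0 loads batches in the main process; `dataset` and `cfg` are
    # the objects main.py already has in scope.
    dataloader = torch.utils.data.DataLoader(
        dataset, batch_size=cfg.TRAIN.BATCH_SIZE,
        shuffle=True, num_workers=0)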

access denied

I wanted to download the preprocessed char-CNN-RNN text embeddings for COCO and the pretrained StackGAN model, but my access was denied on Google Drive. I tried to leave a message to request access, but I didn't get any response. Is there any way to access the embeddings and pretrained model, or can someone share these files? I would be very grateful for any help.

Steps to train on a new dataset

Thanks for the great work! Could you please provide the steps for using this model on a new dataset? The dataset has multiple captions per image. It would be really helpful if these steps could be elaborated upon.
Thanks

How to match the generated images with the caption?

I have finished training both stages on CUB, and I can generate samples during testing.
During testing, I also use the .pickle file extracted from the char-CNN-RNN text-embedding file of CUB as the embedding, but I failed to match the captions with the generated images. I just use the corresponding description from self.captions (# self.captions = self.load_all_captions()), but they do not match.
How do I get the correct caption for each generated image?

[image attached in the original issue]

Much appreciation!
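
A sketch of the alignment logic under the dataset's standard layout (treat the exact paths as assumptions): filenames, embeddings, and captions are all indexed by the same position, so the sample generated from embeddings[i][j] should correspond to sentence j of the caption file for filenames[i].

    import pickle

    with open('../data/birds/train/filenames.pickle', 'rb') as f:
        filenames = pickle.load(f, encoding='latin1')   # encoding needed for Python-2-era pickles
    with open('../data/birds/train/char-CNN-RNN-embeddings.pickle', 'rb') as f:
        embeddings = pickle.load(f, encoding='latin1')  # shape (len(filenames), captions_per_file, 1024)

    i, j = 0, 0
    print(filenames[i])  # the j-th sentence of this file's captions matches embeddings[i][j]

A frequent source of mismatch is ordering: if the dataloader shuffles or the sampler permutes the batch, the saved image order no longer equals the caption order, so saving the indices alongside the images is worth checking.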
