jhhuang96 / convlstm-pytorch

ConvLSTM/ConvGRU (Encoder-Decoder) with PyTorch on Moving-MNIST

License: MIT License

Python 100.00%

convlstm convgru time-series spatio-temporal pytorch-implementation lstm gru rnn encoder-decoder

convlstm-pytorch's Introduction

ConvLSTM-Pytorch

ConvRNN cell

Implements ConvLSTM and ConvGRU cells in PyTorch. The ConvLSTM cell was proposed in the paper: Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
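To illustrate the cell described above, here is a minimal sketch of a ConvLSTM cell. It is not the repository's actual implementation: the class name ConvLSTMCellSketch, the channel sizes and the 3x3 kernel are assumptions chosen for the example, and the peephole terms from the paper are left out for brevity.

```python
import torch
import torch.nn as nn


class ConvLSTMCellSketch(nn.Module):
    """Illustrative ConvLSTM cell (hypothetical name, not the repository's class)."""

    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        self.hidden_channels = hidden_channels
        # A single convolution over [input, hidden] produces all four gates at once.
        self.gates = nn.Conv2d(
            in_channels + hidden_channels,
            4 * hidden_channels,
            kernel_size,
            padding=kernel_size // 2,  # keeps height and width unchanged for odd kernels
        )

    def forward(self, x, state=None):
        b, _, height, width = x.shape
        if state is None:
            zeros = x.new_zeros(b, self.hidden_channels, height, width)
            state = (zeros, zeros)
        h_prev, c_prev = state
        stacked = self.gates(torch.cat([x, h_prev], dim=1))
        i, f, g, o = torch.chunk(stacked, 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c_new = f * c_prev + i * g        # update the cell state
        h_new = o * torch.tanh(c_new)     # expose the gated hidden state
        return h_new, c_new


cell = ConvLSTMCellSketch(in_channels=1, hidden_channels=8)
frames = torch.randn(2, 10, 1, 64, 64)  # (batch, time, channels, height, width)
state = None
for t in range(frames.shape[1]):
    h_t, c_t = cell(frames[:, t], state)
    state = (h_t, c_t)
print(h_t.shape)  # torch.Size([2, 8, 64, 64])
```

The hidden and cell states keep the spatial layout of the input, which is the main difference from a fully connected LSTM.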

Experiments with ConvLSTM on MovingMNIST

Encoder-decoder structure. The model takes a sequence of 10 Moving MNIST frames as input and attempts to predict the remaining frames.
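The input/target split can be sketched as below. This assumes 20-frame clips of 64x64 pixels, as in the standard Moving MNIST download; the all-zero array is only a stand-in for a real clip.

```python
import numpy as np

n_frames_input, n_frames_output = 10, 10

# Stand-in for one clip, laid out as (time, channels, height, width).
clip = np.zeros((n_frames_input + n_frames_output, 1, 64, 64), dtype=np.float32)

inputs = clip[:n_frames_input]                                    # fed to the encoder
targets = clip[n_frames_input:n_frames_input + n_frames_output]  # compared with the decoder output
print(inputs.shape, targets.shape)  # (10, 1, 64, 64) (10, 1, 64, 64)
```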

Instructions

Requires PyTorch v1.1 or later (and GPUs)

Clone repository

git clone https://github.com/jhhuang96/ConvLSTM-PyTorch.git

To run the encoder-decoder network for Moving MNIST prediction:

python main.py

Moving MNIST Generator

The script data/mm.py generates a customized Moving MNIST dataset from MNIST.

MovingMNIST(is_train=True,
            root='data/',
            n_frames_input=args.frames_input,
            n_frames_output=args.frames_output,
            num_objects=[3])

is_train: If True, generate the data with the script. If False, use the Moving MNIST data downloaded from http://www.cs.toronto.edu/~nitish/unsupervised_video/
root: Path to the MNIST data
n_frames_input: Number of input frames (int)
n_frames_output: Number of output frames (int)
num_objects: Number of digits per frame (list). [3] means each frame contains 3 digits
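To show what generating Moving MNIST involves, here is a minimal sketch that bounces digit-sized patches around a 64x64 canvas. It is not the code in data/mm.py: the function name bounce_sequence, the velocity range and the random 28x28 patches standing in for MNIST digits are all assumptions made for the example.

```python
import numpy as np


def bounce_sequence(digits, n_frames, canvas=64, seed=0):
    """Render square patches bouncing inside a square canvas.

    Returns a float32 array of shape (n_frames, canvas, canvas).
    """
    rng = np.random.default_rng(seed)
    size = digits[0].shape[0]
    limit = canvas - size  # largest valid top-left coordinate
    pos = rng.uniform(0, limit, size=(len(digits), 2))
    vel = rng.uniform(-3, 3, size=(len(digits), 2))  # pixels per frame (assumed range)
    video = np.zeros((n_frames, canvas, canvas), dtype=np.float32)
    for t in range(n_frames):
        for k, digit in enumerate(digits):
            y, x = np.rint(pos[k]).astype(int)
            region = video[t, y:y + size, x:x + size]
            # Where patches overlap, keep the brighter pixel.
            np.maximum(region, digit, out=region)
        pos += vel
        # Reflect off the walls: mirror the overshoot and flip the velocity.
        below, above = pos < 0, pos > limit
        pos[below] = -pos[below]
        pos[above] = 2 * limit - pos[above]
        vel[below | above] *= -1
    return video


num_objects = 3
patch_rng = np.random.default_rng(1)
digits = [patch_rng.random((28, 28), dtype=np.float32) for _ in range(num_objects)]
video = bounce_sequence(digits, n_frames=20)
print(video.shape)  # (20, 64, 64)
```

With n_frames set to n_frames_input + n_frames_output, one such clip supplies both the input and the target frames.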

Result

The first row shows the real data for the first 10 frames.
The second row shows the model's prediction for the last 10 frames.

Citation

@inproceedings{xingjian2015convolutional,
  title={Convolutional LSTM network: A machine learning approach for precipitation nowcasting},
  author={Xingjian, SHI and Chen, Zhourong and Wang, Hao and Yeung, Dit-Yan and Wong, Wai-Kin and Woo, Wang-chun},
  booktitle={Advances in neural information processing systems},
  pages={802--810},
  year={2015}
}
@inproceedings{xingjian2017deep,
    title={Deep learning for precipitation nowcasting: a benchmark and a new model},
    author={Shi, Xingjian and Gao, Zhihan and Lausen, Leonard and Wang, Hao and Yeung, Dit-Yan and Wong, Wai-kin and Woo, Wang-chun},
    booktitle={Advances in Neural Information Processing Systems},
    year={2017}
}

convlstm-pytorch's Issues

Blurry results

Hello, awesome repo.
I have been playing with various ConvLSTM/ConvGRU implementations, as we don't have an official one in PyTorch.
I am having trouble getting good images as output. I am unable to get sharp images like the ones you showed.
I modified your model to output 2 classes per image, to produce binary values and train with CrossEntropy (I just set all pixels greater than 0.5 to 1, and the others to zero).
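A minimal sketch of that binarization and loss, assuming frames scaled to [0, 1] and a model whose last layer emits 2 channels per pixel. The random tensors are stand-ins for real targets and model outputs.

```python
import torch
import torch.nn.functional as F

frames = torch.rand(4, 1, 64, 64)           # stand-in for target frames in [0, 1]
targets = (frames > 0.5).long().squeeze(1)  # (batch, H, W) holding class ids 0 or 1

logits = torch.randn(4, 2, 64, 64)          # stand-in for a 2-channel model output
loss = F.cross_entropy(logits, targets)     # mean over all pixels in the batch

prediction = logits.argmax(dim=1)           # back to a binary image per sample
print(prediction.shape)  # torch.Size([4, 64, 64])
```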
I am also currently trying this UpsampleBlock from fastai2 Unet for the decoder with good results:

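# Assumes fastai v2 (early releases were packaged as `fastai2`); the star import
# provides Module, ConvLayer, PixelShuffle_ICNR, SelfAttention, delegates, defaults,
# apply_init and nn.
from fastai.vision.all import *

class UpsampleBlock(Module):
    "A quasi-UNet block, using `PixelShuffle_ICNR` upsampling."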
    @delegates(ConvLayer.__init__)
    def __init__(self, in_ch, out_ch, final_div=True, blur=False, act_cls=defaults.activation,
                 self_attention=False, init=nn.init.kaiming_normal_, norm_type=None, **kwargs):
        self.shuf = PixelShuffle_ICNR(in_ch, in_ch//2, blur=blur, act_cls=act_cls, norm_type=norm_type)
        ni = in_ch//2
        nf = out_ch
        self.conv1 = ConvLayer(ni, nf, act_cls=act_cls, norm_type=norm_type, **kwargs)
        self.conv2 = ConvLayer(nf, nf, act_cls=act_cls, norm_type=norm_type,
                               xtra=SelfAttention(nf) if self_attention else None, **kwargs)
        self.relu = act_cls()
        apply_init(nn.Sequential(self.conv1, self.conv2), init)

    def forward(self, up_in):
        up_out = self.shuf(up_in)
        return self.conv2(self.conv1(up_out))
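
For readers without fastai, the upsampling step in the block above can be approximated in plain PyTorch. This is a sketch under assumptions, not fastai's implementation: it omits the ICNR weight initialization, the activation and the optional blur, and the channel count of 64 is chosen only for the example.

```python
import torch
import torch.nn as nn

in_ch = 64
scale = 2

upsample = nn.Sequential(
    # Expand channels by scale ** 2 so PixelShuffle can trade them for resolution.
    nn.Conv2d(in_ch, (in_ch // 2) * scale ** 2, kernel_size=1),
    nn.PixelShuffle(scale),
)

x = torch.randn(1, in_ch, 16, 16)
y = upsample(x)
print(y.shape)  # torch.Size([1, 32, 32, 32])
```

The output has half the channels and twice the height and width of the input, matching the in_ch to in_ch//2 mapping used by self.shuf.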