zhengchang467 / strpm
STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction, CVPR2022
License: MIT License
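Hello authors, I am very interested in the STRPM model you proposed. Could you release or describe how the training dataset is processed and how the model is trained? Many thanks.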
Dear authors @ZhengChang467 :
Many thanks for your excellent work. However, some names in STRPM.py are unclear to me. What do c_att, m_att, c_net, and m_net mean? Which formulas in the paper do they correspond to?
I am looking forward to your reply.
Thanks
Dear authors of STRPM,
Thank you so much for sharing this fantastic work with the community! I know you are going to release the training code soon, but I am wondering if it is possible to get some details in advance about how to train STRPM on UCF Sports and match the performance reported in the paper. Especially,
Any information would be highly appreciated. Thank you so much!
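I may not understand the code well enough, so I hope the authors can explain how the spatiotemporal features are actually extracted. If the model only applies convolutions to images, doesn't it extract spatial features only?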
Dear authors of STRPM,
I'm trying to train on my own dataset with your code.
From your code:
    print('Loading train dataset')
    self.path = data_train_path
    with codecs.open(self.path) as f:
        self.file_list = f.readlines()
    print('Loading train dataset finished, with size:', len(self.file_list))
I think the content of the train file should look like:
data/ucfsport/Kicking-Side/,6
data/ucfsport/Kicking-Side/,7
data/ucfsport/Kicking-Side/,8
data/ucfsport/Kicking-Side/,9
The first line means using frames 6-9 to predict frame 10.
The second line means using frames 7-10 to predict frame 11,
and so on.
Am I right?
I appreciate your reply!
Thanks a lot!
Chu
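The list format guessed at in the post above can be sketched as a small parser. This is only an illustration of the poster's assumption (one `<clip_dir>,<start_index>` entry per line, 4 input frames, 1 target frame); the authors have not confirmed this format, and the names `parse_train_list` and `frame_indices` are hypothetical, not part of the STRPM code.

```python
from typing import List, Tuple


def parse_train_list(lines: List[str]) -> List[Tuple[str, int]]:
    """Parse lines of the ASSUMED form '<clip_dir>,<start_index>'."""
    samples = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last comma so commas inside the path are tolerated.
        clip_dir, start = line.rsplit(",", 1)
        samples.append((clip_dir, int(start)))
    return samples


def frame_indices(start: int, input_length: int = 4, pred_length: int = 1):
    """Return (input frame indices, target frame indices) for one sample.

    Assumes the start index names the first input frame, as in the post.
    """
    inputs = list(range(start, start + input_length))
    targets = list(range(start + input_length,
                         start + input_length + pred_length))
    return inputs, targets


# Lines as f.readlines() would return them, newline included.
lines = [
    "data/ucfsport/Kicking-Side/,6\n",
    "data/ucfsport/Kicking-Side/,7\n",
]
samples = parse_train_list(lines)
for clip_dir, start in samples:
    print(clip_dir, frame_indices(start))
```

Under this reading, the entry with start index 6 uses frames 6-9 as input and frame 10 as the target, which matches the interpretation given in the post.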
First, thank you for your code.
Could you share the whole processed (decoded) test set, or explain how to pre-process the videos and provide the train-test split file?
Especially for the SJTU4K and UCF
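Hello, while reading the code I noticed that the actual implementation of STRPM_cell adds a forget gate to handle attn. Is this similar to ST-LSTM? I would also like to ask why a check is needed before the residual operation at the end of the cell.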
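Hello, thank you for sharing. I have a few questions about the code (I wrote them in Chinese because I was afraid I could not explain them clearly in English).
The role of mask_true
In the forward function in strpm.py, the following code uses mask_true:
    if t < self.configs.input_length:
        net = frames[:, t]  # t=0: shape (1, 192, 64, 64); take a single frame
    else:
        time_diff = t - self.configs.input_length  # input_length = 4: the prediction period, i.e. every 4 frames predict 1 frame
        net = mask_true[:, time_diff] * frames[:, t] + (1 - mask_true[:, time_diff]) * x_gen
After debugging, I found that mask_true is all 1s during training and all 0s during testing.
Question 1: Does this mean that during training the ground-truth frames are always fed in, whereas during testing, once the first input_length frames have been fed in, only the predicted frames are used to continue predicting recurrently?
Question 2: If so, why does the loss computation in the training code use all 9 output frames? Wouldn't it be better to exclude the outputs at t=0,1,2,3 when computing the loss?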
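The behaviour described in Question 1 can be reproduced with a toy version of the quoted branch. This is a minimal sketch, not the authors' implementation: the batch dimension is dropped, each "frame" is a single float instead of a tensor, and `select_input` is a hypothetical helper name. It only shows what the blending expression does when the mask is all 1s or all 0s, as the poster observed.

```python
from typing import List


def select_input(t: int, input_length: int, frames: List[float],
                 mask_true: List[float], x_gen: float) -> float:
    """Pick the network input for step t, mirroring the quoted branch."""
    if t < input_length:
        # Warm-up steps always see the ground-truth frame.
        return frames[t]
    time_diff = t - input_length
    m = mask_true[time_diff]
    # m = 1 -> ground-truth frame, m = 0 -> previously generated frame.
    return m * frames[t] + (1 - m) * x_gen


frames = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]  # toy scalar "frames"
input_length = 4
x_gen = 99.0  # stand-in for the model's previous prediction

train_mask = [1.0, 1.0]  # observed during training in the post
test_mask = [0.0, 0.0]   # observed during testing in the post

train_inputs = [select_input(t, input_length, frames, train_mask, x_gen)
                for t in range(len(frames))]
test_inputs = [select_input(t, input_length, frames, test_mask, x_gen)
               for t in range(len(frames))]

print(train_inputs)  # ground truth at every step
print(test_inputs)   # ground truth for the first 4 steps, then x_gen
```

With an all-ones mask every step receives the ground-truth frame, and with an all-zeros mask the steps after the first `input_length` frames receive only the generated value, which is consistent with the reading proposed in Question 1.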
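Setting tau
Is the value of tau related to input_length?
In my scenario, 8 frames are used to predict the following 2 frames, so input_length=8 and total_length=10. Should tau be set to 9? I do not fully understand this part.
Thank you! I look forward to your reply!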
Qc
2020.6.20