weirme / fcsn
A PyTorch reimplementation of FCSN from the paper "Video Summarization Using Fully Convolutional Sequence Networks"
Why can't the dataset be downloaded?
I tried running the gen_summary.py module and got this traceback:
Traceback (most recent call last):
File "C:/Users/Sachin/Documents/MTech Dissertation/Video_Summary_using_FCSN/gen_summary.py", line 112, in
gen_summary()
File "C:/Users/Sachin/Documents/MTech Dissertation/Video_Summary_using_FCSN/gen_summary.py", line 104, in gen_summary
get_keys(id)
File "C:/Users/Sachin/Documents/MTech Dissertation/Video_Summary_using_FCSN/gen_summary.py", line 50, in get_keys
keyshots.append(frames[i])
IndexError: index 202 is out of bounds for axis 0 with size 0
I am getting the above error. Please help me solve it.
As mentioned in the paper, the training and testing sets should be 80% and 20%.
But in
https://github.com/weirme/Video_Summary_using_FCSN/blob/96b40851b7805afd1f1fc69f2beb5143d5727b4e/data_loader.py#L25
should it be train_dataset, test_dataset = torch.utils.data.random_split(dataset, [int(len(dataset)*0.8), int(len(dataset)*0.2)])
?
Thank you.
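One caveat with that suggestion: `int(len(dataset)*0.8) + int(len(dataset)*0.2)` can fall short of `len(dataset)` due to truncation, and `torch.utils.data.random_split` requires the lengths to sum exactly to the dataset size. A small sketch of a safe way to compute the lengths (the helper name is mine, not from the repo):

```python
# Sketch of an 80/20 split whose lengths always sum to the dataset size.
# int(n * 0.8) + int(n * 0.2) can be less than n (e.g. n = 27 gives
# 21 + 5 = 26), which makes torch.utils.data.random_split raise an error.
# Computing the test length as the remainder avoids that.

def split_lengths(n, train_frac=0.8):
    """Return (train_len, test_len) that always sum to n."""
    train_len = int(n * train_frac)
    return train_len, n - train_len

# Usage with random_split:
#   train_set, test_set = torch.utils.data.random_split(
#       dataset, split_lengths(len(dataset)))
print(split_lengths(50))  # -> (40, 10)
print(split_lengths(27))  # -> (21, 6)
```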
In the code I use features of different dimensions, and the model reports a dimension error. I was wondering if the model could be changed so that it supports inputs beyond the fixed 320-frame features.
I get this error when running make_dataset.py. What should I do?
Could you share the SumMe dataset on your Google Drive? I need the original videos, not the h5 file.
I can't find the SumMe dataset online. Thank you very much!
Could you please provide a pre-trained model for the same?
Hi.
Thanks for sharing your code.
Could you help me with testing this code on single video?
I appreciate your help in advance.
Because GoogLeNet is only used for feature extraction, it should be in eval mode.
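As a toy illustration (using a small stand-in module rather than GoogLeNet, so it runs self-contained), layers like Dropout and BatchNorm behave differently in train and eval mode, which is why a feature extractor should be switched to `eval()`:

```python
import torch
import torch.nn as nn

# Toy stand-in for a feature extractor (GoogLeNet also contains
# BatchNorm and Dropout layers, which behave differently in train mode).
extractor = nn.Sequential(
    nn.Linear(8, 8),
    nn.BatchNorm1d(8),
    nn.Dropout(p=0.5),
)

x = torch.randn(4, 8)

# In train mode, Dropout randomly zeroes activations and BatchNorm uses
# batch statistics, so repeated calls generally give different outputs.
extractor.train()
y1, y2 = extractor(x), extractor(x)

# In eval mode, Dropout is the identity and BatchNorm uses running
# statistics, so feature extraction is deterministic.
extractor.eval()
with torch.no_grad():  # also skip autograd bookkeeping during extraction
    z1, z2 = extractor(x), extractor(x)

print(torch.equal(z1, z2))  # -> True
```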
Can you provide the '.h5' files for the three dataset settings?
Could you please point me to the implementation of Reconstruction and Diversity losses? Is there an option to reproduce the scores for your unsupervised model?
The original frame feature shape is [320, 1024],
but the code https://github.com/weirme/Video_Summary_using_FCSN/blob/96b40851b7805afd1f1fc69f2beb5143d5727b4e/data_loader.py#L18
reshapes it directly to [1024, 320].
Should it use transpose instead of reshape?
Thank you.
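A tiny NumPy example (stand-in shapes, [2, 3] instead of [320, 1024]) shows why reshape and transpose are not interchangeable here:

```python
import numpy as np

# Toy feature matrix shaped [frames, dims] = [2, 3],
# standing in for [320, 1024].
feat = np.array([[0, 1, 2],
                 [3, 4, 5]])

# reshape just reinterprets the same row-major buffer: rows get cut up,
# so each frame's feature vector ends up scrambled across the new axes.
reshaped = feat.reshape(3, 2)   # [[0, 1], [2, 3], [4, 5]]

# transpose actually swaps the axes, keeping each frame's features intact.
transposed = feat.T             # [[0, 3], [1, 4], [2, 5]]

print(np.array_equal(reshaped, transposed))  # -> False
```

The same distinction applies to PyTorch tensors (`reshape`/`view` vs `permute`/`t()`).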
Can you tell me how to run this project in Google Colab, step by step? Please help.
I don't understand how to do it in Google Colab.
As in FCSN Table 1, they use the method from this paper (Section 1.3) to convert frame-level scores to keyframes.
But you use a different method to get keyframes, which does not seem identical to the FCSN paper.
After reading Section 3.3 of the FCSN paper several times, I cannot figure out the exact structure of the unsupervised part. Does it mean:
batch * 2 * Y
batch * 2 * Y -> batch * 10 * Y (shape of the output of conv8)
batch * 1024 * Y -> batch * 10 * Y
batch * 10 * Y -> batch * 1024 * Y
Hello, your code is not complete: the test code does not report the accuracy and recall rates.
The [start:end] slice excludes the end element.
Should it be
pred_value = np.array([pred_score[cp[0]:(cp[1]+1)].mean() for cp in cps])
?
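A small sketch with made-up scores and change points confirms why the +1 is needed when the change points are inclusive ranges:

```python
import numpy as np

# Hypothetical per-frame scores and change points, where each cp = [start, end]
# gives an INCLUSIVE frame range for a shot.
pred_score = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
cps = [[0, 2], [3, 5]]

# pred_score[cp[0]:cp[1]] would drop the last frame of every shot;
# the +1 makes Python's half-open slice cover the inclusive range.
pred_value = np.array([pred_score[cp[0]:(cp[1] + 1)].mean() for cp in cps])

print(pred_value)  # -> [0.2 0.5]
```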
Has anyone tried testing with variable-length inputs?
The input feature is (1 x T x 1024), where
T is 4494 or 1234 or whatever (the number of frames in each video).
I tried this setting but the NLL loss does not decrease...
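For what it's worth, a common workaround (an assumption on my part, not something the repo necessarily does) is to resample each variable-length feature sequence to the fixed training length before the forward pass, e.g. with linear interpolation:

```python
import torch
import torch.nn.functional as F

# Sketch: FCSN-style models here are trained with a fixed temporal length
# (320), so a variable-length feature sequence can be resampled to that
# length before the forward pass.
T = 4494                              # original frame count (any value)
feats = torch.randn(1, 1024, T)       # (batch, channels, time)

resampled = F.interpolate(feats, size=320, mode='linear',
                          align_corners=False)
print(resampled.shape)                # -> torch.Size([1, 1024, 320])
```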
Can someone tell me what the content format of fcsn_dataset.h5 is?
In the paper, the final layer outputs a tensor of shape 1 x T x C, where C is the number of classes, so the output tensor should be 320 x 2 for 2 classes? Can you clarify this?
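If helpful, here is a minimal sketch (my own toy example, not the repo's model) of how a final 1x1 temporal convolution with 2 output channels produces that 1 x T x C shape for T = 320 and C = 2:

```python
import torch
import torch.nn as nn

# A 1x1 temporal conv with 2 output channels over T = 320 frames yields
# (batch, C, T) = (1, 2, 320); a permute gives the (1, T, C) = (1, 320, 2)
# layout described in the paper.
head = nn.Conv1d(in_channels=1024, out_channels=2, kernel_size=1)
feats = torch.randn(1, 1024, 320)     # (batch, channels, frames)

scores = head(feats)                  # (1, 2, 320)
scores_tc = scores.permute(0, 2, 1)   # (1, 320, 2): per-frame 2-class scores

print(scores.shape, scores_tc.shape)
```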
Hello,
I am from China and cannot open your Google link. Could you send the dataset to my email?
my email is [email protected]
Thank you very much!
the code is:
log_p = torch.log_softmax(pred_score, dim=1).reshape(-1, n_class)
where "pred_score" is an (n_batch, n_class, n_frame) tensor: log_softmax is applied and the result is reshaped to (-1, n_class). However, reshape works in row-major order, while column-major order is needed here. The correct code is:
log_p = torch.log_softmax(pred_score, dim=1).permute(0,2,1).reshape(-1, n_class)
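A tiny tensor makes the axis issue visible (log_softmax omitted so the raw values stay easy to compare; the ordering problem is identical):

```python
import torch

n_batch, n_class, n_frame = 1, 2, 3
# Distinct values so the two orderings are easy to compare.
pred_score = torch.arange(n_batch * n_class * n_frame, dtype=torch.float32)
pred_score = pred_score.reshape(n_batch, n_class, n_frame)

# Wrong: reshape reads the row-major buffer, so each output row mixes
# scores of different frames instead of the classes of one frame.
wrong = pred_score.reshape(-1, n_class)

# Right: move the class axis last, then flatten; row f now holds
# the n_class scores of frame f.
right = pred_score.permute(0, 2, 1).reshape(-1, n_class)

print(torch.equal(right[0], pred_score[0, :, 0]))  # -> True
print(torch.equal(wrong[0], pred_score[0, :, 0]))  # -> False
```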
As described, How can I test on a custom video and get the summary?
Can you please provide me with this data file
(ydata-tvsum50-v1_1)?
I could not find it on the internet or in your data files.
Also, can you please provide the data root, which I think is the folder named (TVSum)? I did not find it either. Can you post the link here in a comment or send it to me by email: [email protected]?
I am very thankful for your help.
I tried to get change points using the KTS code,
but I couldn't get proper change points.
If someone has obtained change points using KTS, please help me.
These values are for the ImageNet dataset. Do they also fit the dataset we use here?
Traceback (most recent call last):
File "gen_summary.py", line 112, in
gen_summary()
File "gen_summary.py", line 104, in gen_summary
get_keys(id)
File "gen_summary.py", line 50, in get_keys
keyshots.append(frames[i])
IndexError: index 5866 is out of bounds for axis 0 with size 5846
Any idea how to solve this?
Hello @weirme ,
Thank you for the great implementation.
I tried to use gen_summary.py to generate summaries for TVSum videos but failed. I used the default settings in the code and laid out the dataset accordingly, but an IndexError is thrown.
I found that this is because the video IDs are not matched correctly. The IDs in the original TVSum dataset are random names, but in your case the IDs are 1-50. Do you have a mapping between the two?
Thanks.
Hello,
How should I interpret the importance scores in the tsv file of the original TVSum50 dataset?
Are they for each frame? If yes, what is the frame rate used?
What is the significance of a shot being of 2 seconds?
The data annotation file has importance scores for each video. The readme said that each shot is 2 seconds. Hence while going through the data for the 1st video (length 5 min 54 sec), the number of annotations provided was over 10000. I am not able to understand how the length of the video is related to the number of annotations. Multiplying each video duration with commonly used frame rates (24-30) doesn't help as well.
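Assuming the scores in the tsv are per frame, per annotator, at the video's native frame rate (commonly 30 fps for TVSum; this is my reading, not confirmed here), the count works out:

```python
# Rough arithmetic under the assumption that TVSum importance scores are
# per frame, per annotator, at the video's native frame rate.
duration_s = 5 * 60 + 54          # first video: 5 min 54 s
fps = 30                          # assumed native frame rate
frames = duration_s * fps
print(frames)                     # -> 10620, consistent with "over 10000" rows

# The 2-second shots matter mainly for evaluation: frame scores are
# averaged within each 2 s segment, i.e. roughly fps * 2 = 60 frames/shot.
print(fps * 2)                    # -> 60
```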
What is the meaning of this function?
As described, is it able to summarize a custom video?
This link doesn't seem to work:
http://www.cs.umanitoba.ca/~mrochan/projects/eccv18/fcsn.html
I only modified the path of the dataset:
Traceback (most recent call last):
File "train.py", line 142, in
solver.train()
File "train.py", line 60, in train
for batch_i, (feature, label, ) in enumerate(tqdm(self.train_loader, desc='Batch', ncols=80, leave=False)):
File "/home/student/maruidi/anaconda2/envs/FCSN/lib/python3.6/site-packages/tqdm/std.py", line 1129, in iter
for obj in iterable:
File "/home/student/maruidi/anaconda2/envs/FCSN/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 615, in next
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/home/student/maruidi/anaconda2/envs/FCSN/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 615, in
batch = self.collate_fn([self.dataset[i] for i in indices])
File "/home/student/maruidi/anaconda2/envs/FCSN/lib/python3.6/site-packages/torch/utils/data/dataset.py", line 103, in getitem
return self.dataset[self.indices[idx]]
File "/home/student/maruidi/Frames/Video_Summary_using_FCSN-master/data_loader.py", line 17, in getitem
video = self.data_file['video'+str(index)]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/student/maruidi/anaconda2/envs/FCSN/lib/python3.6/site-packages/h5py/_hl/group.py", line 264, in getitem
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'video_tensor(41)' doesn't exist)"