
sceneseg's Introduction

Hi there πŸ‘‹


sceneseg's Issues

Low AP score on BBC dataset

Hi, great job on this work!

I have been trying to test on the BBC dataset using only the place feature (generated with pre/place/extract_feat.py) and the place-only LGSS model you provide, but I got much worse results (AP around 40) than your paper reports (AP 79.5). I understand that the AP score in your paper was probably obtained using all four features as well as global optimization, but I would still expect the place feature alone to perform better than what I got. Do you know where I could be going wrong?

Another question: the BBC dataset provides 5 sets of ground-truth labels; which one did you use in your evaluation?

Thanks a lot
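
A note for anyone cross-checking such numbers: the AP in the logs elsewhere on this page is the standard average precision over per-boundary predictions, which can be sanity-checked with scikit-learn. A minimal sketch with toy labels (not the repo's evaluation code):

from sklearn.metrics import average_precision_score

gts = [0, 1, 0, 0, 1]              # toy ground-truth boundary labels per shot
preds = [0.1, 0.8, 0.3, 0.2, 0.6]  # toy predicted boundary probabilities
print(average_precision_score(gts, preds))  # 1.0: both positives are ranked first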

Encountered an Error Running the Default Demo

I got a 'NoneType' error from pair_list in utilis/dataset_utilis.py:

...visualize scene video in demo mode, the above quantitive metrics are invalid
Traceback (most recent call last):
 File "run.py", line 210, in <module>
   main()
 File "run.py", line 205, in main
   scene_dict, scene_list = pred2scene(cfg, threshold=0.8)
 File "/shared/nas/data/users/yifung/data/MovieNet/yi/data_preproc/SceneSeg/lgss/utilis/dataset_utilis.py", line 166, in pred2scene
   scene_list,pair_list = get_demo_scene_list(cfg,pred_list)
 File "/shared/nas/data/users/yifung/data/MovieNet/yi/data_preproc/SceneSeg/lgss/utilis/dataset_utilis.py", line 55, in get_demo_scene_list
   for pair in pair_list:
TypeError: 'NoneType' object is not iterable

Update: okay, I read in previously closed issues that the default demo video may not have multiple scenes to segment. I will try another video file and see.
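
If that is indeed the cause, a defensive guard would at least fail gracefully. A hypothetical sketch around the repo's own functions (not an official fix; the assumption is that get_demo_scene_list returns None for pair_list when no boundary scores above the threshold):

# hypothetical guard inside pred2scene (lgss/utilis/dataset_utilis.py)
scene_list, pair_list = get_demo_scene_list(cfg, pred_list)
if pair_list is None:  # assumed: no predicted boundary above the threshold
    print("No scene boundary predicted; treating the whole video as one scene.")
    return {}, []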

What do some parameters in the config file mean?

Hello,
Thank you for the brilliant work. I have read the paper and am now studying the code, and I have some doubts.
I have a question about the parameters in the config file: I don't clearly understand what shot_num and seq_len mean.

I also don't understand exactly why the _image config file uses a network called "LGSS_image" while the other config files use "LGSS". What is the difference between them?

Thanks for the help,
Best
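
For readers with the same question, the paper suggests one reading: shot_num appears to be the number of shots gathered around a candidate boundary when building the boundary representation, and seq_len the number of consecutive boundary positions scored together by the temporal model. A hypothetical annotated excerpt (values illustrative, not copied from the repo):

shot_num = 4   # assumed: shots gathered around each candidate boundary
seq_len = 10   # assumed: consecutive boundary positions per training sequence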

'image18_demo' model missing

I tried running the demo, following the steps in GETTING_STARTED.md, and when I run python run.py config/demo.py from the lgss directory I get this error:

ValueError: => No checkpoint found at '../run/image18_demo/model_best.pth.tar'

I checked the run directory in Google Drive and there is no image18_demo directory.

Can I run the demo with the image50 model? Could you provide the image18_demo model please?

Thank you, this project looks promising

How can I get the videos of the scene segmentation dataset

There are only the annotations of the scene segmentation dataset in the Google Drive. I want to run experiments on scene segmentation using the dataset, but the website movienet.site shows the videos have not yet been released. How can I get the full data of the dataset, or when will I be able to download it?

Availability of MovieScenes Dataset

The CVPR 2020 paper mentions the 'MovieScenes' dataset, but it has not been included with Movienet.site. The site has shot-level annotations, but there are no movie scene transition annotations available. Will this dataset be publicly released? @AnyiRao

Dynamic programming

Your open-source code does not seem to include the dynamic programming described in the paper. Also, if I use MovieNet to extract cast and action information, should I retrain LGSS from scratch? The dimensions of the features don't match...

No checkpoint found at '../run/image50/model_best.pth.tar'

When I run python run.py config/demo.py I am getting the following error:

Traceback (most recent call last):
  File "run.py", line 210, in <module>
    main()
  File "run.py", line 186, in main
    osp.join(cfg.logger.logs_dir, 'model_best.pth.tar'))
  File "/Users/X/Downloads/SceneSeg-master-2/lgss/utilis/torch_utilis.py", line 35, in load_checkpoint
    raise ValueError("=> No checkpoint found at '{}'".format(fpath))
ValueError: => No checkpoint found at '../run/image50/model_best.pth.tar'

If I have my own video, is the code currently able to return the scene frame indexes (when each scene starts)? What part should I run, the demo?

Thank you for making this open source and for all the help.

Clarification on how data is obtained

Hi, thanks a lot for the great repo! One question: I notice that in all.py you have the following lines,

shotid_tmp = 0
for shotid in shotid_list:
    if int(shotid) < shotid_tmp + seq_len_half:
        continue

Does that mean you only train / run inference on shots that can be fully covered by half the window length? (I.e., the 1st and 2nd shots of a movie would never be trained on with a window size of 4 and a half-window size of 2.) If that is the case, what about the shots at the very end (the last and second-to-last shots, with the same window sizes as above)? I am trying to understand how the data is fed into the model. Thanks!
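
A toy run of the quoted loop makes this reading concrete (illustrative values only; seq_len_half = 2 corresponds to a window size of 4):

seq_len_half = 2
shotid_tmp = 0
kept = []
for shotid in ["0000", "0001", "0002", "0003", "0004", "0005"]:
    if int(shotid) < shotid_tmp + seq_len_half:
        continue  # shots 0000 and 0001 are skipped
    kept.append(shotid)
print(kept)  # ['0002', '0003', '0004', '0005']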

Place Features missing from Google Drive

I was trying to predict with the all configuration, using all the features, and I got an error stating that the files in the data/scene318/place_feat directory are missing.

Once again, I checked the Google Drive directory and they are not there.

Could these features be provided in the Drive directory, or at least could I be pointed to where I can find them?

Thank you

Config file

Hi,

To run "run.py" inside "lgss", there seems to a "Config" file:


parser.add_argument('config', help='config file path')

Can you kindly point us to its path, if it is somewhere in the current repository? Or can you kindly provide a template?

Thanks so much and have a good day!
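
For anyone else looking: other issues on this page pass config/demo.py as that positional argument, so the templates appear to live under lgss/config/. A minimal sketch of how such an argument is typically consumed (assuming an mmcv-style config loader, which is an assumption about run.py, not a confirmed detail):

import argparse
from mmcv import Config  # assumption: an mmcv-style Config is used

parser = argparse.ArgumentParser()
parser.add_argument('config', help='config file path')
args = parser.parse_args()
cfg = Config.fromfile(args.config)  # e.g. python run.py config/demo.py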

Evaluation Metrics

Your paper uses 3 metrics: AP, mIoU, and Recall.

I wonder whether the best results for these three metrics are achieved by the same pretrained LGSS model, or by different pretrained models.

And could you provide the pretrained models?

How to solve NoneType error when running demo

When I run the demo code, I get this 'NoneType' error. Could you give me any suggestions?

(1) run code:

cd pre
python demodownload.py ## Download a YouTube video with pytube
python ShotDetect/shotdetect.py --print_result --save_keyf --save_keyf_txt ## Cut shot 
cd ../lgss
python run.py config/demo.py ## Cut scene 

(2) the printed log is as follows:

Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /usr/local/app/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth
100%|██████████| 97.8M/97.8M [00:30<00:00, 3.41MB/s]
...data and model loaded
...test with saved model
=> Loaded checkpoint '../run/image50/model_best.pth.tar'
AP: 1.000
mAP: 1.000
Average loss: 1.7054, Accuracy: 1/10 (10%)
Accuracy1: 1/10 (10%), Accuracy0: 0/0 (0%)
gts = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], preds = [0.05882452055811882, 0.043319448828697205, 0.14641502499580383, 0.15955811738967896, 0.1980414241552353, 0.1355811506509781, 0.48238664865493774, 0.48149824142456055, 0.17736411094665527, 0.5957018733024597]
Miou:  0.18324022346368712
Recall:  0.1
Recall_at_3:  0.0999999000001
...visualize scene video in demo mode, the above quantitive metrics are invalid
{'0335': 0, '0336': 0, '0337': 0, '0338': 0, '0339': 0, '0340': 0, '0341': 0, '0342': 0, '0343': 0, '0344': 0}
Traceback (most recent call last):
  File "run.py", line 208, in <module>
    main()
  File "run.py", line 202, in main
    scene_dict, scene_list = pred2scene(cfg, threshold=0.8)
  File "SceneSeg/lgss/utilis/dataset_utilis.py", line 167, in pred2scene
    scene_list, pair_list = get_demo_scene_list(cfg, pred_list)
  File "SceneSeg/lgss/utilis/dataset_utilis.py", line 56, in get_demo_scene_list
    for pair in pair_list:
TypeError: 'NoneType' object is not iterable

image50 vs place

Is the ResNet-50 in 'image50/model_best.pth.tar' fine-tuned? Its results seem better than the place model's.

How to get the 150 videos

On MovieNet's website, there are 318 movies with scene boundary annotations. Where can I find the 150 videos used in your experiments?

Share the model checkpoints?

Has anybody trained this and saved the model checkpoints so that they can be loaded and used? If so, can you please share them? It would make this model easily usable for everyone.

Missing the code for global optimal grouping?

Hello,
The code runs successfully; it is very good work. But I cannot find the global optimal grouping (dynamic programming) in this repository. Is it missing, or am I just not finding that part of the code?

Looking forward to your reply, thanks.

pytube issue

Hi,

There seem to be two issues related to the usage of "pytube" in this project:

  1. It looks like "pytube" should be installed with "pip3 install pytube3";
  2. When running "demodownload.py" inside the "pre" directory, there seems to be an error related to a key (see below):

formats = json.loads(stream_data["player_response"])["streamingData"][
KeyError: 'streamingData'

Please kindly advise.

Thanks so much and have a good day!

Can this work handle a continuous video sequence?

Great job! I am currently interested in processing a continuous video sequence (i.e., a video captured continuously by a camera), for example transitioning from indoor to outdoor environments, or from textured to textureless scenes. Can this work directly take such videos as input and segment them into these different scenes?

Global Optimal Grouping at Movie Level

Hi,

Thanks for publishing your helpful code. I was wondering whether you have released the code for section 4.4 of the paper, which is about the global optimization?

Thanks!

Meaning of -1 for scene boundary label

In the first 10 lines of the label file tt0047396.txt, there is a -1.
I think both 1 and -1 mean a boundary.
So I wonder why you use -1, and what the difference between 1 and -1 is?

0000 0
0001 0
0002 0
0003 1
0004 -1
0005 -1
0006 0
0007 1
0008 0
0009 1
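
A hypothetical parsing sketch for files in this format; whether -1 marks an uncertain boundary that is ignored during training is exactly the open question here, so the comment below is an assumption:

labels = {}
with open("tt0047396.txt") as f:
    for line in f:
        if not line.strip():
            continue
        shot_id, label = line.split()
        labels[shot_id] = int(label)

boundaries = [s for s, v in labels.items() if v == 1]
unconfirmed = [s for s, v in labels.items() if v == -1]  # assumed meaning of -1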

Model is missing

Hi guys, I'm trying to launch the demo, but I can't find your trained models. I run
python run.py config/demo.py
and get this: ValueError: => No checkpoint found at '../run/image50/model_best.pth.tar'
Can you help me? Thanks.

How to extract Cast and Activity features?

Now that there are new models uploaded to Google Drive, I am trying to process a video with all 4 modes, but I do not see how to extract the features for the Cast and Activity modes.

Could you give me any pointers on this?

Thank you
