
LiveBot

This repository contains the code and datasets for the paper: LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts.

What are Live Video Comments?

Live video commenting, also called "video barrage" ("弹幕" in Chinese, "danmaku" in Japanese), is an emerging feature on online video sites that allows real-time comments from viewers to fly across the screen like bullets or roll on the right side of the screen.

Requirements

  • Ubuntu 16.04
  • Python 3.5
  • PyTorch 0.4.1
  • scikit-learn >= 0.19.1
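
The Python packages can be installed with pip, for example (assuming a PyTorch 0.4.1 build compatible with your platform is still available):

    pip3 install torch==0.4.1 "scikit-learn>=0.19.1"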

Datasets

  • The processed dataset can be used directly with our code to reproduce the results reported in the paper. Download it from Google Drive or Baidu Pan and put it in the folder /data.

  • The raw dataset consists of the videos and the corresponding live comments downloaded directly from the Bilibili video website. It is available on Google Drive or Baidu Pan. After processing with the scripts in the folder /data, it can be transformed into the processed dataset above.

Livebot Model

  • Step 1: Download the processed dataset above

  • Step 2: Train a model

    python3 codes/transformer.py -mode train -dir CKPT_DIR
    
  • Step 3: Restore the checkpoint and evaluate the model

    python3 codes/transformer.py -mode test -restore CKPT_DIR/checkpoint.pt -dir CKPT_DIR
    

Process a raw dataset (Optional)

  • Step 1: Extract the frames from the videos and the comments from the .ass files (a rough sketch of both extraction tasks follows these steps).
    python3 data/extract.py
    
  • Step 2: Convert the extracted images and text into the format required by our model.
    python3 data/preprocess.py
    
  • Step 3: Construct the candidate set for the evaluation of the model.
    python3 data/add_candidate.py
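
For orientation, here is a rough sketch of the two extraction tasks in Step 1. It is an assumption about the approach (OpenCV for frames, the standard .ass Dialogue field layout for comments), not the actual contents of data/extract.py:

    import os
    import re
    import cv2

    def extract_frames(video_path, out_dir, every_n_seconds=1):
        # Save one frame every `every_n_seconds` seconds of video.
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
        step = max(1, int(round(fps * every_n_seconds)))
        idx = saved = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if idx % step == 0:
                cv2.imwrite(os.path.join(out_dir, '%06d.jpg' % saved), frame)
                saved += 1
            idx += 1
        cap.release()

    def parse_ass_comments(path):
        # Parse Dialogue lines of an .ass subtitle file into
        # (start_seconds, text) pairs. Field layout:
        # Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text
        start_re = re.compile(r'^Dialogue:\s*[^,]*,(\d+):(\d+):(\d+(?:\.\d+)?),')
        comments = []
        with open(path, encoding='utf-8') as f:
            for line in f:
                m = start_re.match(line)
                if not m:
                    continue
                h, mnt, sec = int(m.group(1)), int(m.group(2)), float(m.group(3))
                text = line.rstrip('\n').split(',', 9)[-1]
                comments.append((h * 3600 + mnt * 60 + sec, text))
        return comments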
    

Note

  • More details regarding the model and the dataset can be found in our paper.

  • The code is currently non-deterministic due to various GPU ops, so you are likely to end up with slightly better or worse evaluation scores from run to run; a seeding sketch that narrows the variance follows below.
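
A minimal seeding sketch, assuming PyTorch, that narrows (but does not eliminate) this variance:

    import random
    import numpy as np
    import torch

    def set_seed(seed=42):
        # Seed every RNG that affects training.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Prefer deterministic cuDNN kernels: slower, and some GPU ops
        # remain non-deterministic regardless.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False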

Citation

We hope the code and datasets are useful for future research. If you use them in your research, please cite our paper:

@inproceedings{livebot,
  author    = {Shuming Ma and
               Lei Cui and
               Damai Dai and
               Furu Wei and
               Xu Sun},
  title     = {LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts},
  booktitle = {{AAAI} 2019},
  year      = {2019}
}


Issues

Problem in reproducing the paper results

Hello, I have read your paper and trained the transformer model on the processed dataset, but I cannot reproduce the results reported in the paper.

The results reported in the paper are shown below:
[screenshot: results table from the paper]
We can see that the model trained with both video and comment performs best.

I downloaded the code and the processed dataset, trained the model on a single GPU (about 2.5 days for 50 epochs), and tested the model at the final epoch, following the instructions in README.md (I also tested other checkpoints, and the results are close). I set n_img to 0 and n_com to 0 respectively and trained two more models to get the Comment_Only and Video_Only results.
[screenshot: my reproduced results]
The values are much higher than the results in your paper, and the Video_Only model shows extremely high performance, which is counter-intuitive.

So I checked the source code and found something strange (maybe a bug?). Line 155 of transformer.py is shown below:
[screenshot: transformer.py, line 155]
It returns the loss-based ranking of the 100 candidate comments for each test item, and the evaluation metrics are calculated from that ranking. The list should be ranked by the candidates' log-likelihood scores in descending order. However, cross-entropy loss is inversely related to log-likelihood: a lower loss means a higher log-likelihood score. So I think the losses should be sorted in ascending order. I fixed that line and ran the test again.
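
To make the fix concrete, here is a hypothetical sketch of the ranking step (the variable names are illustrative, not the repo's actual code):

    import torch

    def rank_candidates(losses):
        # losses: 1-D tensor with one cross-entropy loss per candidate comment.
        # Lower loss means higher log-likelihood, so the most likely candidate
        # must come first: sort the losses in ascending order.
        _, ranking = torch.sort(losses, descending=False)
        return ranking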
[screenshot: results after the fix]
This time, the model trained with both video and comments performs best, as expected. However, the values are much lower than the results in your paper.

So I have some questions and look forward to your reply:

  1. Is there really a bug at transformer.py line 155, or is it just my misunderstanding?
  2. Why are the results so different from the paper (too high and strange with the original code, too low with my fixed code)? Is the processed dataset the one you used in the paper? Or is there something else wrong in the code?
  3. What results do you get after running the code in this repository? I have trained twice and got the close results shown above.

Several issues I encountered while replicating the paper

Hi,

I am very interested in your work. To understand it better, I have made several attempts to reproduce the baseline results reported in the paper.

When using the processed dataset provided in this repo, I get scores similar to those reported in the existing issue above. These results are summarised below:
[screenshot: results on the provided processed dataset]

However, when I process the raw dataset myself and train the same model on it, the performance drops significantly:

[screenshot: results on the re-processed raw dataset]

I tried different partitions of the data, but the results vary very little. To understand the reason for these differences, I examined the provided dataset and found a number of instances of identical videos, assigned different video ids, that appear in both the training and test sets. I think this situation may have arisen from crawling the same video multiple times when creating the dataset.

There is clearly a very large performance gap between my results and the baselines in the paper, and I wonder whether these repeated videos are responsible for the significant differences. Could you please tell me:

  • Is the provided processed dataset the exact same one used to produce the baselines in the paper?
  • Is the repeated data the only cause of the performance difference, or are there further reasons you are aware of that could explain these differences?
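
For anyone checking their own copy of the data: one rough way to detect this kind of train/test leakage is to fingerprint each video by hashing a few of its extracted frames and then look for collisions across splits. The per-video frame directories below are an assumption for illustration:

    import hashlib
    import os

    def video_fingerprint(frame_dir, sample=5):
        # Hash the first few frame files; identical videos crawled twice
        # should collide even when their video ids differ.
        h = hashlib.md5()
        for name in sorted(os.listdir(frame_dir))[:sample]:
            with open(os.path.join(frame_dir, name), 'rb') as f:
                h.update(f.read())
        return h.hexdigest()

    def find_leakage(train_dirs, test_dirs):
        # Return test-set videos whose fingerprint also appears in training.
        train_prints = {video_fingerprint(d) for d in train_dirs}
        return [d for d in test_dirs if video_fingerprint(d) in train_prints]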
