
Scenescoop

Scenescoop is a tool for finding semantically similar scenes across a pair of videos: you input a scene from one video and get back a scene with a similar meaning from another video. You can run it as a Python script or as a web app.


How it works

Scenescoop uses the im2txt TensorFlow model to analyze videos frame by frame and generate a text description of each frame's content. Frames that share the same description are grouped together to form a sequence, or scene.
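
A minimal sketch of that grouping step (the data layout and helper name here are illustrative assumptions, not Scenescoop's actual internals):

from collections import defaultdict

def group_frames_by_caption(captions):
    # captions: list of (frame_index, caption) pairs, one per analyzed frame;
    # frames that share a caption become one scene
    scenes = defaultdict(list)
    for frame_index, caption in captions:
        scenes[caption].append(frame_index)
    return dict(scenes)

frames = [(2265, "a man sitting in the back of a pickup truck"),
          (2266, "a man sitting in the back of a pickup truck"),
          (2470, "a man standing next to a tree holding a surfboard")]
print(group_frames_by_caption(frames))
# {'a man sitting in the back of a pickup truck': [2265, 2266], ...}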

Scene descriptions are then analyzed with spaCy, which parses each sentence and represents it as the average of its built-in word vectors.
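
In spaCy this averaging comes for free: a Doc's .vector attribute is the mean of its token vectors, provided you load a model that ships with word vectors (en_core_web_md is assumed below):

import spacy

nlp = spacy.load("en_core_web_md")  # a model with real word vectors

doc = nlp("a man sitting on a bench in front of a building")
print(doc.vector.shape)  # (300,) -- the average of the token vectors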

Finally, Annoy is used to build an index over those sentence vectors for fast nearest-neighbor lookup (based on @aparrish's Plot to Poem).
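
A self-contained sketch of that lookup (the index parameters are illustrative, not the ones Scenescoop uses):

import spacy
from annoy import AnnoyIndex

nlp = spacy.load("en_core_web_md")
captions = [
    "a man sitting at a table with a plate of food",
    "a group of people walking down the street",
    "a man sitting on a bench in front of a building",
]

index = AnnoyIndex(nlp.vocab.vectors_length, "angular")
for i, caption in enumerate(captions):
    index.add_item(i, nlp(caption).vector)
index.build(10)  # 10 trees; more trees trade build time for accuracy

query = nlp("people strolling along a road").vector
nearest, = index.get_nns_by_vector(query, 1)
print(captions[nearest])  # likely the "people walking" caption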

This project is inspired by Thingscoop.

Video Demos

A man sitting at a table with a plate of food

A group of people walking down the street

Usage

To run this you'll need to install a few dependencies. You can follow the original repository's instructions or the ones Edouard Fouché wrote. (I plan to write a step-by-step guide on how to install everything.)

You can also get the pretrained model I'm using here.

Once everything is installed, clone the repo and install the project dependencies:

git clone https://github.com/cvalenzuela/scenescoop.git
cd scenescoop
pip install -r requirements.txt

You can then run Scenescoop in two modes:

1) Frame Analysis Mode

Given a video file --video (.mp4, .avi, .mkv or .mov), this mode analyzes the file frame by frame and outputs a .json file containing the descriptions of those frames. The --name argument sets the output name of the transcript.

Example:

python scenescoop.py --video videos/moonrisekingdom.mp4 --name moonrisekingdom

The .json file should look something like this:

{ 
...
"a person is taking a picture of themselves in a mirror ": [4834], 
"a man sitting in the back of a pickup truck ": [2265, 2266], 
"a man sitting on a bench in front of a building ": [1935, 1937, 
1938, 3950, 3951, 3952, 3953, 3960, 4072, 4073, 4074, 4075, 
4077, 4079, 4080, 4082, 4115, 4467], 
"a man standing next to a tree holding a surfboard ": [2470]
...
}
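
Since the transcript stores frame numbers, turning a scene back into timestamps only needs the video's frame rate. A small sketch of reading a transcript this way (the output path and the moviepy usage are my assumptions, not necessarily how Scenescoop does it):

import json
from moviepy.editor import VideoFileClip

fps = VideoFileClip("videos/moonrisekingdom.mp4").fps

with open("transcripts/moonrisekingdom.json") as f:
    transcript = json.load(f)

for caption, frames in transcript.items():
    # first and last frame of the scene, converted to seconds
    print(f"{min(frames) / fps:7.1f}s-{max(frames) / fps:7.1f}s  {caption.strip()}")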

2) Transfer Mode

This mode requires two videos, each with a corresponding transcript .json file created in Frame Analysis Mode.

The --input_data argument should be the .json file containing the data for the input video, and --transform_data the .json file for the transfer video. --input_seconds is the time range of the input video to transfer, given as start,end in seconds (e.g. 0,5), and --transform_src is the video source file of the transfer video.

Example:

python scenescoop.py --input_data transcripts/street.json --input_seconds 0,5 --transform_src videos/her.avi --transform_data transcripts/her.json
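
Conceptually, the transfer boils down to: take the captions whose frames fall inside --input_seconds, embed them, and find their nearest neighbors among the transfer video's captions. A simplified sketch of that flow (the input video's frame rate, the matching rule, and the unused scene export are assumptions):

import json
import spacy
from annoy import AnnoyIndex
from moviepy.editor import VideoFileClip

nlp = spacy.load("en_core_web_md")

input_data = json.load(open("transcripts/street.json"))
transform_data = json.load(open("transcripts/her.json"))
src = VideoFileClip("videos/her.avi")

# index every caption of the transfer video
captions = list(transform_data)
index = AnnoyIndex(nlp.vocab.vectors_length, "angular")
for i, caption in enumerate(captions):
    index.add_item(i, nlp(caption).vector)
index.build(10)

# captions of the input video that fall inside seconds 0-5
start, end = 0, 5
in_fps = 24  # assumed frame rate of the input video
wanted = [c for c, frames in input_data.items()
          if any(start * in_fps <= fr <= end * in_fps for fr in frames)]

# cut the best-matching scene out of the transfer video for each caption
for caption in wanted:
    i, = index.get_nns_by_vector(nlp(caption).vector, 1)
    frames = transform_data[captions[i]]
    scene = src.subclip(min(frames) / src.fps, max(frames) / src.fps)
    print(caption.strip(), "->", captions[i].strip())
    # scene.write_videofile(...) would export the matched clip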

You can print all options with python scenescoop.py -h:

usage: scenescoop.py [-h] [--video VIDEO] [--name NAME]
                     [--input_data INPUT_DATA] [--input_seconds INPUT_SECONDS]
                     [--transform_src TRANSFORM_SRC]
                     [--transform_data TRANSFORM_DATA] [--api API]

Storiescoop

optional arguments:
  -h, --help            show this help message and exit
  --video VIDEO         Video Source to transform
  --name NAME           Name of the video
  --input_data INPUT_DATA
                        Input Video. Must be a json file.
  --input_seconds INPUT_SECONDS
                        Input Video Seconds to create transformation. Example:
                        1,30
  --transform_src TRANSFORM_SRC
                        Transform Video Source.
  --transform_data TRANSFORM_DATA
                        Transform Video Data. Must be a json file.
  --api API             API Request

Web App

You can also launch an interactive web app, served with Flask, to run Frame Analysis Mode and Transfer Mode from a webpage. You'll still need all the dependencies installed.
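
A bare-bones sketch of what such a Flask wrapper looks like (the route and the helper are hypothetical stand-ins, not the project's actual endpoints):

from flask import Flask, jsonify, request

app = Flask(__name__)

def run_frame_analysis(video_file):
    # hypothetical stand-in for the im2txt pass over the uploaded video
    return {"a man sitting on a bench in front of a building ": [1935, 1937]}

@app.route("/analyze", methods=["POST"])
def analyze():
    # Frame Analysis Mode over HTTP: video in, transcript JSON out
    return jsonify(run_frame_analysis(request.files["video"]))

if __name__ == "__main__":
    app.run(port=8080)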


To run the app in a local server:

python server.py

Then visit localhost:8080.

To modify the source code:

cd static
yarn watch

MMS

Local development of the MMS application:

Start ngrok:

./ngrok http 7676

Configure the URL in Twilio and set it as NGROK_URL in the server.

Start the Redis server:

redis-server

Start the Celery worker:

celery -A server.celery worker

Finally, start the server:

python server.py
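
For orientation, a bare-bones sketch of how those pieces typically fit together (the route, task body, and broker URL are assumptions; server.py has the real wiring):

from celery import Celery
from flask import Flask, request
from twilio.twiml.messaging_response import MessagingResponse

app = Flask(__name__)
celery = Celery(app.name, broker="redis://localhost:6379/0")

@celery.task
def build_scene(media_url, sender):
    # the slow part (frame analysis + transfer) runs outside the request cycle
    pass

@app.route("/mms", methods=["POST"])
def incoming_mms():
    # Twilio POSTs incoming messages here through the ngrok URL configured above
    build_scene.delay(request.form.get("MediaUrl0"), request.form.get("From"))
    reply = MessagingResponse()
    reply.message("Working on your scene...")
    return str(reply)

if __name__ == "__main__":
    app.run(port=7676)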

License

MIT


scenescoop's Issues

Minor path issues

I got this to work, but a couple of minor path-oriented things were issues.

  1. It crashes because the directory "temp" doesn't exist. Making it manually fixes this.

  2. I had to change the paths in run_model.py to the im2txt checkpoint and vocab files, because the code was moved recently: it now lives in models/research/im2txt instead of models/im2txt.

Error in using the checkpoint (trained model)

Regardless of whether I use the version of TensorFlow specified in the requirements file or upgrade to 1.10.0, I get an error about the checkpoint (running locally on my Mac). Any ideas on using a different pretrained model?

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Tensor name "lstm/basic_lstm_cell/bias" not found in checkpoint files /Users/silverbox/Dropbox/video_manipulation/scenescoop/models/im2txt/checkpoints/model.ckpt-3000000
	 [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]


ImportError: cannot import name account_sid

When trying to run the server, I get:

Traceback (most recent call last):
  File "server.py", line 24, in <module>
    from credentials import account_sid, auth_token, number, movies
ImportError: cannot import name account_sid

I ran pip install credentials to pull that library down, but that doesn't seem to fix it.

The checkpoint files in the README seem to mismatch the TensorFlow version in requirements.txt

I tried to test the im2txt model but got stuck executing the command below from the tutorial:

bazel-bin/im2txt/run_inference \
  --checkpoint_path=${CHECKPOINT_DIR} \
  --vocab_file=${VOCAB_FILE} \
  --input_files=${IMAGE_FILE}

which fails with the error:

tensorflow.python.framework.errors_impl.NotFoundError: Tensor name "lstm/basic_lstm_cell/weights" not found in checkpoint files /home/ubuntu/im2txt/model.ckpt-3000000

This seems to be a mismatch between the checkpoint files and the TensorFlow version. But I installed exactly the version in requirements.txt (which is 1.1.0), and I downloaded the pretrained model from the link in the README.
