The API expects POST requests to http://34.67.150.35/story_to_video, with the following keys passed in the JSON request body:
story_images: []
- A list of story image URLs.
callback_url: str
- A URL where the result will be sent. Use https://webhook.site/ if you need a quick setup.
voice: Optional[str]
- A voice name or URL supported by play.ht. A list of voices can be found here: https://docs.play.ht/reference/api-list-ultra-realistic-voices
prompt: Optional[str]
- The prompt to generate the text from.
is_toontube: bool
- If provided, the API will look for the first (and only) URL in story_images and use the story images found there.
- Note: the toontube URL is from the READER, e.g.:
- https://toontube.co/reader/6400809b651f9fc369319f43/653672907b605c94f734f89a?page=1
background_music: str
- A key or phrase to generate background music from.
Note that this is a proof of concept (POC), so not everything will work correctly. When debugging, refer to the "id" key of the response JSON.
Usage example:
curl -X 'POST' \
'http://34.67.150.35/story_to_video' \
-H 'Content-Type: application/json' \
-d '{
"story_images": [
"https://storage.googleapis.com/public_stories/abc/007.jpg",
"https://storage.googleapis.com/public_stories/abc/011.jpg",
"https://storage.googleapis.com/public_stories/abc/018.jpg",
"https://storage.googleapis.com/public_stories/abc/028.jpg"
],
"callback_url": "https://webhook.site/f3fcb54c-24ce-47e6-9a65-663770829de1"
}'
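The same request can be sketched in Python. This is a minimal illustration, not part of the project: `build_payload` and `submit` are hypothetical helper names, the check that `is_toontube` requires exactly one URL is a client-side assumption based on the parameter description above, and treating `is_toontube` and `background_music` as optional is inferred from the curl example, which omits them. The response is assumed to be a JSON object.

```python
import json
import urllib.request
from typing import List, Optional

API_URL = "http://34.67.150.35/story_to_video"  # endpoint from this README


def build_payload(
    story_images: List[str],
    callback_url: str,
    voice: Optional[str] = None,
    prompt: Optional[str] = None,
    is_toontube: Optional[bool] = None,
    background_music: Optional[str] = None,
) -> dict:
    """Assemble the request body, leaving out optional keys that were not set."""
    if not story_images:
        raise ValueError("story_images must contain at least one URL")
    # Assumption: with is_toontube, the list holds a single reader URL.
    if is_toontube and len(story_images) != 1:
        raise ValueError("with is_toontube, story_images must hold exactly one URL")
    payload = {"story_images": story_images, "callback_url": callback_url}
    optional = {
        "voice": voice,
        "prompt": prompt,
        "is_toontube": is_toontube,
        "background_music": background_music,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


def submit(payload: dict, url: str = API_URL, timeout: float = 30.0) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (performs a real network call, so it is left commented out):
# body = build_payload(
#     ["https://storage.googleapis.com/public_stories/abc/007.jpg"],
#     "https://webhook.site/f3fcb54c-24ce-47e6-9a65-663770829de1",
# )
# print(submit(body).get("id"))  # keep the "id" for debugging
```

Building the payload separately from sending it makes it easy to inspect the body before any request leaves the machine.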
- Uses ChatGPT for narration
- Uses ElevenLabs for audio
- Uses ffmpeg for video creation
- Uses Freesound for background music
- Uses X for captions
A demo output video generated by this project can be found here:
- On Vimeo: https://vimeo.com/855961539
- In this repository: demo_output/video.mp4
- Install ffmpeg: https://ffmpeg.org/download.html
  (or install it as a package: brew install ffmpeg, apt install ffmpeg, etc.)
- Install the pipenv environment:
  pipenv install && pipenv shell
- Install the Gemfile:
  sudo gem install bundler -v '2.4.21' && sudo bundle install
- Set the OpenAI API key:
  export OPENAI_API_KEY=<KEY>
- Set the ElevenLabs API key:
  export ELEVEN_API_KEY=<KEY>
- In ./src/main.py (bottom): set the channel name, the desired topic, the voice name, and the destination dir.
- Run ./src/main.py
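Before running, it can help to confirm that the prerequisites from the steps above are in place. The following is a small sketch, not part of the repository: `missing_prerequisites` is a hypothetical helper that only checks that the two environment variables are set and that an ffmpeg executable is on PATH. It does not verify that the keys are valid.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional

REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "ELEVEN_API_KEY")  # names from the setup steps
REQUIRED_TOOLS = ("ffmpeg",)


def missing_prerequisites(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of problems; an empty list means the basic setup looks complete."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_ENV_VARS:
        # An empty string counts as unset.
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current environment; the `env` and `which` parameters exist only so the check can be exercised without touching the real system.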
Missing from pip:
- speech_recognition?
- requests.requests_oauthlib?
- git+https://github.com/MTG/freesound-python
- QuickTime Player on macOS plays the audio incorrectly after a few seconds of playback. There is no problem with the video or its audio; the issue is specific to this player. The video can be uploaded to YouTube without issues and plays correctly in other players.
Feel free to fork, suggest ideas, report issues, and give general constructive feedback.
^(;,;)^