Giter Club home page Giter Club logo

autosub-with-baidu-deepspeech2's Introduction

Autosub-with-Baidu-DeepSpeech2

autosub是目前比较好的语音识别的工具,但是它的问题是,由于受到qiang的限制,我们需要找到合适的vpn才能使用这个开源的工具。针对中文语音的识别,我们结合了百度的deepspeech模型,结合使用autosub和百度的模型来进行识别,这样就很好的解决了上述问题,本项目将这种技术运用在了新闻联播节目的语音识别,来为新闻联播节目加上字幕,取得了一定的效果,因为是直接用了百度训练好的baidu_cn1.2k模型,所以效果并没有特别好,有需要的小伙伴可以自己训练适合于特定项目的模型。

环境需求及安装配置

操作系统:ubuntu16.04 python环境:使用python2.7

环境配置: sudo apt-get install -y pkg-config libflac-dev libogg-dev libvorbis-dev libboost-dev swig

'google-api-python-client>=1.4.2',
'requests>=2.3.0',
'pysrt>=1.0.1',
'progressbar2>=3.34.3',
'six>=1.11.0',
'progressbar',
'scipy>=0.19.0',
'paddlepaddle'->pip install paddlepaddle-gpu==1.1.0.post87,
运行 sh setup.sh

数据及模型下载

测试数据下载:链接:https://pan.baidu.com/s/1xr4YXN3g30fx2pQEKrgx1Q 提取码:r09s 将数据下载到/data/cctv/路径下 下载好数据后,我已将该视频生成的srt文件放在了/data/cctv/目录下,打开视频添加srt文件即可看到测试视频的效果。

训练好的中文模型baidu_cn1.2k下载地址: 链接:https://pan.baidu.com/s/1JI1Qh4x9UT9fdkT6TBYp0Q 提取码:t29d 将模型下载到models/baidu_cn1.2k路径下解压

代码运行

运行代码:终端调用接口task1_interface.py: from task1_interface import extractSubtitlefromVideo(./data/cctv/CCTV.mp4)

识别结果

result1
result2
result3

视频+字幕(windows上用迅雷影音播放的结果,有的播放器会乱码):

注:

1)配置环境需要cuda8.0+cudnn7,建议使用conda install cudnn=7.0.5进行安装
2)若出现带有interpn.so类似字样的错误,建议使用pip卸载resampy,再重新进行安装,并将scipy升级至0.19.0以上
3)大概要使用6GB左右的GPU内存空间

autosub-with-baidu-deepspeech2's People

Contributors

lizhaokun avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.