
video-to-text-ocr-demo's People

Contributors

henrylulu

video-to-text-ocr-demo's Issues

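Error

I built it with VS, and the `import cv2` line reports "unresolved import".
When I open the script directly with Python, the window closes immediately. This is my first time using GitHub and I have never worked with Python before, but following the steps in the README really does not work. (sob~)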

Error

root@:~# python3 index.py
Traceback (most recent call last):
File "index.py", line 1, in <module>
import getframe
File "/root/getframe.py", line 31
print c
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(c)?

This error occurs when running under the Ubuntu subsystem on Windows. A Python environment set up directly on Windows reports the same error.
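The traceback points at `print c`, which is Python 2 statement syntax. Under `python3`, `print` is a function and needs parentheses. A minimal sketch of the difference follows; the helper name `compiles` is made up for illustration and is not part of the project:

```python
def compiles(source):
    """Return True if `source` is valid syntax for the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

old_style = "print c"    # Python 2 print statement, as on line 31 of getframe.py
new_style = "print(c)"   # Python 3 function call, as the error message suggests

old_ok = compiles(old_style)   # False on Python 3
new_ok = compiles(new_style)   # True on Python 3
```

The rest of the script may contain other Python 2 constructs, so fixing this one line may only move the error to the next one.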

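How do I use this?

Hello,
I would like to try using your program to extract hard-coded subtitles from Korean dramas.
How do I use it?
Could you write a short tutorial?
Thank you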

Why not take a look at extract-subtitles next door?

https://github.com/duangsuse-valid-projects/extract-subtitles

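This is my modified version. The original author takes a rather academic approach, so there is no fixed rate or anything like that; it mainly uses absdiff(m1, m2) and scipy.signal.argrelextrema to extract key frames.

In other words, it does not use this kind of algorithm: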

import cv2
# video is a cv2.VideoCapture; cfg.step is the fixed sampling interval in frames
for frame_no in range(0, int(video.get(cv2.CAP_PROP_FRAME_COUNT)), cfg.step):
  video.set(cv2.CAP_PROP_POS_FRAMES, frame_no)
  cv2.imwrite(f"{frame_no}.png", video.read()[1])
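In contrast to the fixed-step loop above, the key-frame idea is to measure how much each frame differs from the previous one and keep the frames at local peaks of that difference. The following is only a pure-Python sketch of that idea, not the code from extract-subtitles: `mean_abs_diff` stands in for `cv2.absdiff`, `local_maxima` stands in for `scipy.signal.argrelextrema` with a "greater than" comparator, and the frames are toy lists of pixel values.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def local_maxima(values):
    """Indices whose value is strictly greater than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > values[i - 1] and values[i] > values[i + 1]]

def key_frame_indices(frames):
    """Indices of frames that come right after a local peak in inter-frame change."""
    diffs = [mean_abs_diff(frames[i], frames[i + 1])
             for i in range(len(frames) - 1)]
    # diffs[i] measures the change from frame i to frame i + 1,
    # so a peak at i marks frame i + 1 as the start of new content.
    return [i + 1 for i in local_maxima(diffs)]

# Toy "video": flattened 4-pixel grayscale frames with two abrupt changes.
frames = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [9, 9, 9, 9],   # abrupt change
    [9, 9, 9, 8],
    [9, 9, 8, 8],
    [0, 0, 0, 0],   # abrupt change
    [0, 0, 1, 0],
]
keys = key_frame_indices(frames)
```

Only the frames listed in `keys` would then be passed to OCR, rather than every `step`-th frame.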

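OCR is of course PyTesseract, a local OCR engine. Deduplication is not done on the subtitle-region images; instead, an edit-distance algorithm is applied to the recognized text.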
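Deduplicating on text with edit distance can be sketched as below. This is an illustration under assumptions, not the code from extract-subtitles: the function names, the `max_ratio` threshold, and the rule of comparing each line only with the last kept line are all made up for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def dedupe_lines(lines, max_ratio=0.3):
    """Drop a line if it is nearly identical to the last kept line.

    "Nearly identical" means edit distance divided by the longer length
    is at most max_ratio (a hypothetical threshold).
    """
    kept = []
    for line in lines:
        if kept:
            longest = max(len(line), len(kept[-1])) or 1
            if edit_distance(line, kept[-1]) / longest <= max_ratio:
                continue
        kept.append(line)
    return kept

# Consecutive frames often yield the same subtitle with small OCR differences.
ocr_lines = ["Hello world", "Hello worId", "Hello world", "Goodbye", "Goodbye!"]
unique = dedupe_lines(ocr_lines)
```

Comparing text rather than image regions tolerates small OCR misreads such as `l` versus `I`, which would otherwise produce duplicate subtitle lines.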

For the results, see https://t.me/dsuset/7167

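Also, this description of yours... is a bit inaccurate, isn't it?

"First, it prevents duplicate results; second, it can converge text at fixed positions (such as a station logo), avoiding subtitle-positioning errors."

What does "converge" mean here? The term seems to be used mostly in machine learning. There is also another approach: crop the image in advance and then run OCR. Videos with complicated subtitle placement usually contain vertical subtitles as well, so even extracting everything in one pass does not give great results.

A more automated approach like this one, which recognizes the whole image and then treats the largest group among the sets of text with matching y positions as the subtitles, also works. If you want to optimize it, you can refer to the key-frame detection algorithm in extract-subtitles: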

https://github.com/duangsuse-valid-projects/extract-subtitles/blob/64e1d1da376b1ec23740b6645c1b31f52620048d/extract_subtitles.py#L113-L143
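The "largest group sharing a y position" heuristic mentioned above can be sketched as follows. This is an assumption-laden illustration, not code from either project: the `(text, y)` box format, the function name, and the `tolerance` bucket size are all hypothetical.

```python
from collections import defaultdict

def pick_subtitle_boxes(boxes, tolerance=10):
    """Group OCR boxes by vertical position and return the largest group.

    Each box is a (text, y) pair, where y is the top edge in pixels.
    Boxes whose y values fall into the same `tolerance`-pixel bucket are
    treated as sharing a position. Two boxes just either side of a bucket
    boundary end up in different groups, which a real implementation
    would need to handle.
    """
    groups = defaultdict(list)
    for text, y in boxes:
        groups[y // tolerance].append((text, y))
    if not groups:
        return []
    return max(groups.values(), key=len)

# Boxes collected over several frames: a logo, a sign in the scene, and subtitles.
boxes = [
    ("CCTV", 20),
    ("first line", 652),
    ("CCTV", 21),
    ("second line", 655),
    ("third line", 650),
    ("sign in scene", 300),
]
subtitle_texts = [text for text, _ in pick_subtitle_boxes(boxes)]
```

A logo that is recognized in every frame could outnumber the subtitles, so repeated identical text would likely have to be collapsed before grouping.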

The code has a problem and fails at runtime

----------Subtitle Analysis----------
Start subtitle analysis
Traceback (most recent call last):
File "index.py", line 91, in <module>
main()
File "index.py", line 77, in main
if not getsubtitle.main():
File "/Users/MonKong/Downloads/video-subtitle-recognize-master/getsubtitle.py", line 82, in main
if float(word['probility']) < probability:
KeyError: 'probility'
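The traceback shows that the `word` dictionary has no `'probility'` key; the report does not say which keys it does contain. One defensive option is to look the value up under more than one spelling and fall back to a default. The sketch below is hypothetical throughout: the alternative key `'probability'`, the assumption that the value converts with `float`, and the choice to drop words with unknown confidence are not confirmed by the source.

```python
def word_confidence(word, keys=("probility", "probability"), default=None):
    """Return the first confidence value found under any of `keys`, else `default`."""
    for key in keys:
        if key in word:
            # Assumes the stored value is a number or a numeric string.
            return float(word[key])
    return default

def filter_words(words, threshold):
    """Keep words whose confidence is known and at least `threshold`."""
    kept = []
    for word in words:
        confidence = word_confidence(word)
        if confidence is not None and confidence >= threshold:
            kept.append(word)
    return kept

# Hypothetical OCR results with differently spelled (or missing) confidence keys.
words = [
    {"words": "a", "probability": 0.9},
    {"words": "b", "probility": "0.4"},
    {"words": "c"},
]
confident = filter_words(words, 0.5)
```

Printing `word.keys()` just before line 82 of getsubtitle.py would show which key the OCR response actually uses.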
