lecture-video-to-pdf's People

Contributors

ekarton

lecture-video-to-pdf's Issues

IndexError: string index out of range

I'm getting this error. My subtitle file doesn't contain any periods (.), which I think is the problem.

This is the output:

Number of frames: 103
Getting subtitles for each frame
Traceback (most recent call last):
  File "src/main.py", line 87, in <module>
    runner.run(sys.argv[1:])
  File "src/main.py", line 51, in run
    self.__run__(
  File "src/main.py", line 70, in __run__
    segments = segment_finder.get_subtitle_segments(subtitle_breaks)
  File "D:\Study\Sixth-Sem\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 42, in get_subtitle_segments
    pos = self.__get_part_position_of_time_break__(time_break)
  File "D:\Study\Sixth-Sem\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 122, in __get_part_position_of_time_break__
    if self.parts[right_part_index].text[right_part_char_index] == ".":
IndexError: string index out of range
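
The traceback points at an unchecked character lookup, `text[right_part_char_index]`. Below is a minimal sketch of a bounds-checked version of that test, assuming the index can run past the end of the caption text when a subtitle file contains no periods. `is_sentence_end` is a hypothetical helper for illustration, not a function from the repository.

```python
def is_sentence_end(text: str, char_index: int) -> bool:
    """Return True only if char_index is a valid position holding a period."""
    # Guard the lookup: an index outside the string means "no period here"
    # rather than an IndexError.
    if char_index < 0 or char_index >= len(text):
        return False
    return text[char_index] == "."


# The first index equals len(text), which would raise without the guard.
print(is_sentence_end("no dots here", 12))
print(is_sentence_end("Done.", 4))
```

With a guard like this, a caption without any period would simply never register as a sentence boundary; how the surrounding search should then terminate is left open here.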

Type Error when running the script

Hi,
Thank you for sharing the code. I tried running the script as described in the README. It works fine with the example subtitle files (subtitles_1, subtitles_2, etc.), but with my own subtitle file and video the program throws the following error (link to the subtitle file that was used: https://github.com/docstar1/Lecture-Video-to-PDF/blob/2daed0bc3426db515ad8da3ef3611834789c23db/tests/subtitles/subtitles_3.vtt):
Traceback (most recent call last):
  File "src/main.py", line 77, in <module>
    runner.run(sys.argv[1:])
  File "src/main.py", line 47, in run
    video_segment_finder, video_filepath, subtitle_parser, output_filepath
  File "src/main.py", line 61, in run
    segments = segment_finder.get_subtitle_segments(subtitle_breaks)
  File "..\PycharmProjects\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 96, in get_subtitle_segments
    pos = self.get_part_position_of_time_break(time_break)
  File "..\PycharmProjects\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 152, in get_part_position_of_time_break
    part = self.parts[part_index]
TypeError: list indices must be integers or slices, not NoneType
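
The `TypeError` means `part_index` was still `None` when it was used as a list index, which suggests the lookup found no subtitle part for that time break (an assumption, since the lookup code is not shown in the report). One way to make such a lookup always return an integer is a clamped binary search over the part start times. The sketch below uses a hypothetical `find_part_index` helper and a plain list of start times in seconds; neither is taken from the repository.

```python
from bisect import bisect_right
from typing import Sequence


def find_part_index(start_times: Sequence[float], time_break: float) -> int:
    """Return the index of the last part that starts at or before time_break."""
    if not start_times:
        raise ValueError("subtitle file contains no parts")
    # bisect_right counts the parts whose start time is <= time_break.
    index = bisect_right(start_times, time_break) - 1
    # Clamp so a break that precedes the first caption maps to part 0
    # instead of producing an invalid index.
    return max(index, 0)


starts = [1.0, 5.0, 9.0]  # hypothetical, sorted start times
print(find_part_index(starts, 0.5), find_part_index(starts, 5.0), find_part_index(starts, 100.0))
```

Raising a clear `ValueError` for an empty subtitle file is a design choice for this sketch; it surfaces the real problem earlier than a `None` index would.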

SRT files

Hi, I got

Getting subtitles for each frame
Traceback (most recent call last):
  File "src/main.py", line 80, in <module>
    runner.run(sys.argv[1:])
  File "src/main.py", line 46, in run
    self.__run__(
  File "src/main.py", line 61, in __run__
    segment_finder = SubtitleSegmentFinder(subtitle_parser.get_subtitle_parts())
  File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\src\subtitle_segment_finder.py", line 52, in get_subtitle_parts
    for caption in webvtt.read(self.input_file):
  File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\lib\site-packages\webvtt\webvtt.py", line 60, in read
    parser = WebVTTParser().read(file)
  File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\lib\site-packages\webvtt\parsers.py", line 25, in read
    self._validate(content)
  File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\lib\site-packages\webvtt\parsers.py", line 258, in _validate
    raise MalformedFileError('The file does not have a valid format')
webvtt.errors.MalformedFileError: The file does not have a valid format

when trying to use a .srt subtitle file. Is it possible to add support for that?
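
Until SRT is supported directly, a workaround is to convert the file to WebVTT first. The two formats are close: WebVTT needs a `WEBVTT` header line and uses a dot instead of a comma before the milliseconds. The sketch below is a minimal stdlib-only converter under those assumptions; `srt_to_vtt` is a hypothetical helper, and it does not handle SRT styling tags. The webvtt-py README also describes a `webvtt.from_srt` reader, which may be the simpler route if that package is already installed.

```python
import re

# Matches SRT timestamps such as 00:36:26,960 and captures both sides of the comma.
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT subtitle text to WebVTT text."""
    # Drop a leading byte-order mark and normalise Windows line endings.
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n")
    # WebVTT uses a dot before the milliseconds. Caveat: this also rewrites
    # timestamp-like strings that happen to appear inside caption text.
    body = _SRT_TIMESTAMP.sub(r"\1.\2", body)
    # Numeric cue identifiers from SRT are kept; WebVTT allows cue identifiers.
    return "WEBVTT\n\n" + body.strip() + "\n"


sample = "1\n00:00:01,000 --> 00:00:02,500\nHello\n"
print(srt_to_vtt(sample))
```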

video type

Apart from MP4, can it handle other video types, such as MKV or WMV?

Feature Request: Extract slides without subtitles

Hi! I like the idea of your project but it's not what I actually want.

I have a few lecture videos without subtitle files, and I'm not interested in having subtitles in the PDF document either.
I just want to obtain a document with the slides in landscape orientation.

Please consider adding the option to achieve the mentioned result.


Side note: I noticed that your program depends on CPU power. Have you ever thought about utilizing the GPU instead?

Cropped slides in output.pdf

python.exe src/main.py "D:\Downloads\vid1.m4v" -s "D:\Downloads\subt1.vtt" -o output.pdf

C:\Users\User\AppData\Local\Programs\Python\Python38>python.exe src/main.py "D:\Downloads\vid1.m4v" -s "D:\Downloads\subt1.vtt" -o output.pdf
Getting selected frames
Getting subtitles for each frame
Merging frames and subtitles
C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\fpdf\fpdf.py:710: UserWarning: Substitutting Arial by core font Helvetica
  warnings.warn("Substitutting Arial by core font Helvetica")

This resulted in cropped slides in output.pdf, like this:
[screenshot]

Also, the subtitles are more or less split into two major parts. They were converted from .srt to .vtt via https://www.happyscribe.com and have the following example format:

1090
00:36:26.960 --> 00:36:29.660
inzwischen wird das war schon ein
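
The report does not establish why the slides are cropped. One possible cause, stated here purely as an assumption, is that a frame is placed on the page at a size larger than the printable area, for example when the video's aspect ratio differs from the one the layout expects. The sketch below shows the aspect-preserving fit arithmetic that avoids this; `fit_within` and the page dimensions are hypothetical and not taken from the repository.

```python
from typing import Tuple


def fit_within(img_w: float, img_h: float, box_w: float, box_h: float) -> Tuple[float, float]:
    """Scale (img_w, img_h) to the largest size that fits inside the box."""
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    # Use the smaller of the two ratios so neither side overflows the box.
    scale = min(box_w / img_w, box_h / img_h)
    return img_w * scale, img_h * scale


# Hypothetical example: a 1920x1080 frame on a 297x210 mm landscape page.
width, height = fit_within(1920, 1080, 297, 210)
print(width, height)
```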
