ekarton / lecture-video-to-pdf
Making lecture videos readable
License: GNU General Public License v3.0
I'm getting the error below.
My subtitle file doesn't contain any periods (.), and I think that is the problem:
Number of frames: 103
Getting subtitles for each frame
Traceback (most recent call last):
File "src/main.py", line 87, in <module>
runner.run(sys.argv[1:])
File "src/main.py", line 51, in run
self.__run__(
File "src/main.py", line 70, in __run__
segments = segment_finder.get_subtitle_segments(subtitle_breaks)
File "D:\Study\Sixth-Sem\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 42, in get_subtitle_segments
pos = self.__get_part_position_of_time_break__(time_break)
File "D:\Study\Sixth-Sem\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 122, in __get_part_position_of_time_break__
if self.parts[right_part_index].text[right_part_char_index] == ".":
IndexError: string index out of range
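The traceback suggests the scan for a "." runs past the end of the subtitle text when no period is present. A minimal sketch of a bounds-safe search follows. The helper name `find_sentence_end` and its fallback to the end of the text are assumptions for illustration only, not the project's actual code:

```python
def find_sentence_end(text: str, start: int) -> int:
    """Return the index just past the next "." at or after `start`.

    Hypothetical helper: if the text has no further period, fall back to
    len(text) instead of indexing beyond the end of the string.
    """
    # Clamp start so an out-of-range position cannot cause an IndexError.
    start = max(0, min(start, len(text)))
    # str.find returns -1 when there is no match, so it never raises.
    dot = text.find(".", start)
    return len(text) if dot == -1 else dot + 1
```

With a subtitle part that has no periods, this returns the length of the text, so the whole part would be treated as one sentence.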
Hi
Thank you for sharing the code. I tried running the script as described in the readme file. It works fine with the example subtitle files (subtitles_1, subtitles_2, etc.), but when I use my own subtitle file with the video, the program throws the following error (link to the subtitle file that was used: https://github.com/docstar1/Lecture-Video-to-PDF/blob/2daed0bc3426db515ad8da3ef3611834789c23db/tests/subtitles/subtitles_3.vtt):
Traceback (most recent call last):
File "src/main.py", line 77, in
runner.run(sys.argv[1:])
File "src/main.py", line 47, in run
video_segment_finder, video_filepath, subtitle_parser, output_filepath
File "src/main.py", line 61, in run
segments = segment_finder.get_subtitle_segments(subtitle_breaks)
File "..\PycharmProjects\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 96, in get_subtitle_segments
pos = self.get_part_position_of_time_break(time_break)
File "..\PycharmProjects\Lecture-Video-to-PDF\src\subtitle_segment_finder.py", line 152, in get_part_position_of_time_break
part = self.parts[part_index]
TypeError: list indices must be integers or slices, not NoneType
Hi, I got
Getting subtitles for each frame
Traceback (most recent call last):
File "src/main.py", line 80, in <module>
runner.run(sys.argv[1:])
File "src/main.py", line 46, in run
self.__run__(
File "src/main.py", line 61, in __run__
segment_finder = SubtitleSegmentFinder(subtitle_parser.get_subtitle_parts())
File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\src\subtitle_segment_finder.py", line 52, in get_subtitle_parts
for caption in webvtt.read(self.input_file):
File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\lib\site-packages\webvtt\webvtt.py", line 60, in read
parser = WebVTTParser().read(file)
File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\lib\site-packages\webvtt\parsers.py", line 25, in read
self._validate(content)
File "C:\Users\Tilman\AppData\Local\Programs\Python\Python38\lib\site-packages\webvtt\parsers.py", line 258, in _validate
raise MalformedFileError('The file does not have a valid format')
webvtt.errors.MalformedFileError: The file does not have a valid format
when trying to use a .srt subtitle file. Is it possible to add support for that?
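Until .srt is supported directly, one possible workaround is to convert the file to WebVTT first. The sketch below assumes a well-formed SRT file. It adds the `WEBVTT` header and switches the millisecond separator in timing lines from a comma to a period. The function name `srt_to_vtt` is hypothetical and not part of the project or of the webvtt package:

```python
import re

# Matches SRT timestamps such as 00:36:26,960 (comma before milliseconds).
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT subtitle text to WebVTT text (hypothetical helper)."""
    # Normalise line endings and drop a byte order mark if present.
    body = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    lines = []
    for line in body.split("\n"):
        # Only rewrite timing lines, so caption text is left untouched.
        if "-->" in line:
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    # WebVTT files start with a "WEBVTT" header followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"
```

The numeric cue counters from the SRT file are kept, since WebVTT allows an identifier line before each cue's timing line.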
Apart from mp4, can it handle other video formats, such as mkv or wmv?
Hi! I like the idea of your project but it's not what I actually want.
I have a few lecture videos without subtitle files, and I'm not interested in having subtitles in the PDF document either.
I just want to obtain a document with the slides in landscape orientation.
Please consider adding the option to achieve the mentioned result.
Side note: I noticed that your program depends on CPU power. Have you ever thought about utilizing the GPU instead?
C:\Users\User\AppData\Local\Programs\Python\Python38>python.exe src/main.py "D:\Downloads\vid1.m4v" -s "D:\Downloads\subt1.vtt" -o output.pdf
Getting selected frames
Getting subtitles for each frame
Merging frames and subtitles
C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\fpdf\fpdf.py:710: UserWarning: Substitutting Arial by core font Helvetica
warnings.warn("Substitutting Arial by core font Helvetica")
This resulted in cropped slides in output.pdf.
Also, the subtitles are more or less split into two major parts. They were converted from .srt to .vtt via https://www.happyscribe.com and have the following format, for example:
1090
00:36:26.960 --> 00:36:29.660
inzwischen wird das war schon ein