
Comments (8)

rstrahan avatar rstrahan commented on June 27, 2024 1

@aejuice-github
OK, we'll look into (1) and post back. Thanks for letting us know.

from aws-kendra-transcribe-media-search.

rstrahan avatar rstrahan commented on June 27, 2024


aejuice-github avatar aejuice-github commented on June 27, 2024

@rstrahan 1. Incorrect. Your playlist works because the results are cached. According to your documentation, you do not parse videos twice. It does not work with any other playlist; I've tried plenty. In fact, we've debugged why it does not work. Your app uses the pytube package, which is outdated. YouTube has changed its link structure, so pytube cannot retrieve a video. You can find more info and the error message at https://stackoverflow.com/questions/68945080/pytube-exceptions-regexmatcherror-get-throttling-function-name-could-not-find

Here is a link to the playlist https://www.youtube.com/playlist?list=PLr7J3R1sT1C5pcB_xuhH1cTV9W19z1iDk

I hope you'll be able to resolve it. We would be very interested in using the service.

  1. You're correct. The issue is resolved.


rstrahan avatar rstrahan commented on June 27, 2024

Confirming that I can easily repro the problem.


Referring to colleague who implemented this feature. Tx.


roshansthomas avatar roshansthomas commented on June 27, 2024

Regarding issue (1), after investigation:
The solution does not cache a playlist. When the playlist is changed, the solution indexes the videos in the new playlist (if they have not been indexed previously).
The issue currently causing the stack to fail is with pytube version 15.0.0 (see pytube/pytube#1707).
We are working on an interim fix while a permanent fix is made to the pytube main branch.

Also, if you do not want to index the YT media, you can leave the playlist empty and specify only the S3 bucket source where your media is stored. The stack then deploys successfully and indexes the media from S3.
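
To illustrate the "index only what has not been indexed prior" behavior described above, here is a minimal sketch. It is not the solution's actual code: the function name `select_new_videos` and the idea of tracking indexed video IDs in a set are assumptions made purely for illustration.

```python
def select_new_videos(playlist_ids, indexed_ids):
    """Return playlist video IDs that are not yet indexed, in playlist order.

    Hypothetical helper: `playlist_ids` is the list of video IDs in the
    currently configured playlist, `indexed_ids` is any iterable of IDs
    that were indexed on a previous run.
    """
    seen = set(indexed_ids)
    new_ids = []
    for video_id in playlist_ids:
        if video_id not in seen:
            seen.add(video_id)  # also guards against duplicates in the playlist
            new_ids.append(video_id)
    return new_ids


if __name__ == "__main__":
    # Changing the playlist only adds work for videos not seen before.
    print(select_new_videos(["a", "b", "c"], {"a"}))
```

Under this model, switching to a different playlist triggers indexing only for the videos that are new to the index, which matches the behavior described in the comment rather than any playlist-level caching.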


roshansthomas avatar roshansthomas commented on June 27, 2024

Tested v0.3.1, which contains the fix for the pytube 15.0.0 issue.
I am now able to index YouTube videos from the YT playlist provided as the default value of the CFN template parameter. I was also able to change the playlist to the one quoted in the issue above, https://www.youtube.com/playlist?list=PLr7J3R1sT1C5pcB_xuhH1cTV9W19z1iDk, and the indexer picks up the new videos as well.
This issue is fixed and can be closed.


rstrahan avatar rstrahan commented on June 27, 2024

Release v0.3.1 based on your PR @roshansthomas addresses this issue. Updated artifacts published to the public S3 bucket.
(The fix is temporary and applies only to the current version of pytube, 15.0.0. We expect that the next release of pytube will address the issue officially.)
@aejuice-github please deploy again and report back if you encounter any remaining issues. Thanks again for letting us know about the problem.


roshansthomas avatar roshansthomas commented on June 27, 2024

Closing issue as the solution now uses the yt_dlp package.
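
For readers curious what listing a playlist with yt_dlp can look like, here is a rough sketch. It is an assumption-laden illustration, not the code used in this solution: the helper names `flat_playlist_options`, `video_ids_from_info`, and `list_playlist_video_ids` are hypothetical, and the sketch assumes the third-party `yt_dlp` package is installed and that network access is available when the playlist is fetched.

```python
def flat_playlist_options():
    """Options asking yt_dlp for playlist metadata only (no media download)."""
    return {
        "extract_flat": True,   # list playlist entries without resolving each video
        "quiet": True,          # suppress console output
        "skip_download": True,  # never download media
    }


def video_ids_from_info(info):
    """Collect video IDs from a playlist info dict, skipping unusable entries."""
    entries = info.get("entries") or []
    return [entry["id"] for entry in entries if entry and entry.get("id")]


def list_playlist_video_ids(playlist_url):
    """Fetch a playlist's video IDs (requires yt_dlp and network access)."""
    import yt_dlp  # third-party; imported lazily so the helpers above work without it

    with yt_dlp.YoutubeDL(flat_playlist_options()) as ydl:
        info = ydl.extract_info(playlist_url, download=False)
    return video_ids_from_info(info)
```

Calling `list_playlist_video_ids` with the playlist URL quoted earlier in this thread should return that playlist's video IDs, provided YouTube and yt_dlp remain compatible at the time of the call.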
