scrape-yt's People

Contributors

alverated, dependabot[bot], suspiciouslookingowl

scrape-yt's Issues

Search breaks on keywords with "typo" [BUG]

Describe the bug
Search works for most keywords, but when the search terms trigger YouTube's "Showing results for xxxxxxx. Search instead for xxxxxxx" suggestion, the search() function fails with:

(node:8684) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'videoId' of undefined
at Object.parseSearch (C:\Users\xxxxx\Projects\Music\node_modules\scrape-yt\dist\common\parser.js:142:30)

To Reproduce
Call search(keywords) with any misspelled ("typoed") keywords as the search query.
Some example keyword combinations that trigger the error:

  • pompei nightcore
  • helllo
  • pewdiepye

Expected behavior
Should return search results as usual, without errors.

Screenshots
N/A

Additional context
N/A

Thumbnail on search() sometimes return object

I want to store the thumbnail URL (which I presume is a string) in my database, and the field where I store the thumbnail is of type String.

However, I receive this error:
image
image

Interestingly, some songs are able to store the thumbnail URL as a string (coverImage field):
image
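
One way to cope with a thumbnail that is sometimes a string and sometimes an object is to normalize it before saving. The sketch below is only an illustration: `thumbnailUrl` is a hypothetical helper, and the object shapes it handles (`{ url }` and `{ thumbnails: [...] }`) are assumptions, not scrape-yt's documented output.

```javascript
// Hypothetical helper: coerce a thumbnail value to a URL string, or null.
// The object shapes handled here are assumptions for illustration only.
function thumbnailUrl(thumbnail) {
  if (typeof thumbnail === "string") return thumbnail;
  if (thumbnail && typeof thumbnail === "object") {
    // Assumed shape 1: { url: "https://..." }
    if (typeof thumbnail.url === "string") return thumbnail.url;
    // Assumed shape 2: { thumbnails: [{ url }, ...] }; take the last entry
    const list = Array.isArray(thumbnail.thumbnails) ? thumbnail.thumbnails : [];
    const last = list[list.length - 1];
    if (last && typeof last.url === "string") return last.url;
  }
  return null; // nothing usable; the caller decides how to store this
}
```

Storing `thumbnailUrl(video.thumbnail)` instead of the raw value would keep a String field from receiving an object, whatever the underlying cause turns out to be.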

Scraping advertisements by mistake

Hey! Great web scraper btw, it's a great alternative to the YouTube search API v3. I mainly use this web scraper to get the video IDs for a certain search query (such as getting the video ID of a music video).

I am making a web app that displays the music videos of old songs, and so far, your scraper allows me to successfully retrieve the video id. For example:
image
So, url is basically the id of the first index of the results I get back from your module

However, the web scraper sometimes returns the URL of an advertisement, because it takes the first video on the results page and assumes that every video listed there is related to the query. Thus, I get something like this:
image

Is there a way to filter through the advertisements so that the results can strictly have an array of the related videos? Thanks!

[BUG] error 302

Describe the bug
Getting a 302 error after running a test script with a single video lookup on an AWS instance.

To Reproduce
Create an AWS EC2 instance and run:

scrapeYt.getVideo("84N7ZOnXB6").then(videos => {
    console.log(videos);
}).catch(error => {
    console.log(error);
});

Expected behavior
should return the video information

Screenshots
If applicable, add screenshots to help explain your problem.
image

Additional context
It runs fine on my Windows PC, but on my EC2 instance in AWS I get this error.

Enhancement : Thumbnail Url on getVideo()

Hey, would it be possible to get the URL of the video thumbnail from getVideo(), as with search()?

scrapeYt.getVideo('u1P1nkbuz6k').then((videoInfo: Video) => {
    console.log(videoInfo);
});
{
  id: 'u1P1nkbuz6k',
  title: 'SICKEST Mario Party RAP!! - ANIMATED MUSIC VIDEO (animated by Gregzilla)',
  duration: 215,
  description: 'Get the album NOW ► http://starbomb.com\n' +
    'MERCH!! ► http://www.starbomb.com/merch\n' +
    'Click for MORE VIDS! ► http://www.youtube.com/subscription_c...\n' +
    '\n' +
    'Animated by ► https://www.youtube.com/channel/UC5X_...\n' +
    '\n' +
    '\n' +
    '-----------------------------------------------------------------------------\n' +
    '\n' +
    'Starbomb Facebook: https://www.facebook.com/starbombband\n' +
    '\n' +
    'Starbomb Twitter: https://twitter.com/starbomb\n' +
    '\n' +
    'http://www.starbomb.com',
  channel: {
    id: 'UC0gEw6pgNkLkkzMwzX4UtHA',
    name: 'Egoraptor',
    thumbnail: 'https:https://yt3.ggpht.com/a/AATXAJw3OtXq_3S0KSyS3kJ7o1ZXzzKWGxSJf-aQXg=s176-c-k-c0xffffffff-no-rj-mo',
    url: 'https://www.youtube.com/channel/UC0gEw6pgNkLkkzMwzX4UtHA'
  },
  uploadDate: 'Published on Apr 19, 2019',
  viewCount: 5163653,
  likeCount: 185816,
  dislikeCount: 3154,
  tags: []
}

.search() error for video without publish date

image
The YouTube scraper was working perfectly fine before, and the maxRetryCount logic worked fine as well. But now, with the addition of simpleText, it doesn't work at all. Could you please revert it?

Enhancement : Duration on the Video object from a video search

Would it be possible to return the duration when using
.getVideo(videoID)?

I am currently chaining calls to get the duration because .getVideo doesn't return the duration:

scrapeYt.getVideo(videoId).then((videoMain: Video) => {
    scrapeYt.search(videoMain.title).then((videoInfos: Video[]) => {
...

would be awesome if this is possible!!!!

[BUG] getPlaylist not working on repl.it

SyntaxError: Unexpected token } in JSON at position 10674
at JSON.parse (<anonymous>)
at Object.parseGetPlaylist (/home/runner/Bot/node_modules/scrape-yt/dist/common/parser.js:274:36)
at /home/runner/Discord-MusicBot/node_modules/scrape-yt/dist/index.js:162:64
at step (/home/runner/Discord-MusicBot/node_modules/scrape-yt/dist/index.js:54:23)

.search() sometimes return empty array without error

Hello again, just want to let you know that after a few test runs, it seems as if the ads are filtered out now. Thank you so much for this fix!

I mentioned earlier that I get an "UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'id' of undefined at addSong"
image

addSong is basically my function when I take the video id and store it onto my MongoDB database. Like I mentioned before, I loop through another npm module about billboard music chart data, and then call addSong

image

addSong does something like this:
image

This is really a simple call to your module using the name of the artist and title of the songs as the keyword search query. However, the await makes the code behave synchronously, waiting for the videoId to come back as a resolved Promise. However, as shown previously, I get this:
image
So, I am able to populate my database with only 216 video IDs even though I made 220 calls. On average, about 10 video IDs are missing, which means about 10 Promises didn't resolve with a result and I'm about 10 songs short at times. Is this an issue with the way you structured your Promise code? Or are my function calls too frequent for the asynchronous behavior to handle? (I plan on making these function calls every Sunday, not every day, so a maximum of 220 calls a week.)

Thanks for the swift modifications and cooperation. I hope you can figure something out :)
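
Until the cause of the empty results is known, the caller can at least avoid reading `id` from `undefined` by checking the array and retrying a few times. This is a minimal sketch, not the library's behavior: `firstVideoId`, `retries`, and `delayMs` are hypothetical names, and `search` stands for any function that returns a Promise of an array of videos (for example `scrapeYt.search`).

```javascript
// Hypothetical helper: return the ID of the first search result, retrying
// when the result array comes back empty. `search` is injected so the sketch
// does not depend on any particular library version.
async function firstVideoId(search, query, retries = 2, delayMs = 500) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const videos = await search(query);
    if (Array.isArray(videos) && videos.length > 0 && videos[0].id) {
      return videos[0].id;
    }
    if (attempt < retries) {
      // Wait briefly before asking again
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null; // nothing found; the caller decides whether to skip the song
}
```

In an `addSong`-style function this could be used as `const id = await firstVideoId(scrapeYt.search, keywords);` followed by a `null` check before writing to the database.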

playlistSidebarSecondaryInfoRenderer undefined when getting playlist

When using scrape-yt to scrape YouTube playlists I am getting the following error:

Cannot read property 'playlistSidebarSecondaryInfoRenderer' of undefined.

This error occurred when trying to get the playlist with ID: RDCLAK5uy_mfut9V_o1n9nVG_m5yZ3ztCif29AHUffI .

This was done with the following code running v1.4.5 of scrape-yt:

await scrapeYt.getPlaylist("RDCLAK5uy_mfut9V_o1n9nVG_m5yZ3ztCif29AHUffI")

One other point to add is that I was trying to obtain 10 or so other playlists at the same time, so it might be a throttle kicking in?
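
If throttling is the suspicion, one way to test it is to fetch the playlists one at a time with a pause in between rather than all at once. The sketch below assumes nothing about scrape-yt internals: `getPlaylistsSequentially` and `delayMs` are hypothetical names, and `getPlaylist` is any function that returns a Promise for one playlist ID (for example `scrapeYt.getPlaylist`).

```javascript
// Hypothetical helper: fetch playlists strictly one after another, pausing
// between requests, and record a per-playlist error instead of aborting.
async function getPlaylistsSequentially(getPlaylist, ids, delayMs = 1000) {
  const results = [];
  for (let i = 0; i < ids.length; i++) {
    try {
      results.push({ id: ids[i], playlist: await getPlaylist(ids[i]) });
    } catch (error) {
      results.push({ id: ids[i], error });
    }
    if (i < ids.length - 1) {
      // Pause before the next request
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return results;
}
```

If the error disappears when requests are spaced out, that would point toward rate limiting; if it persists for the same ID, the playlist itself is the more likely trigger.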

BUG

Channel url undefined

IMG_20210220_092535

[ANNOUNCEMENT] No more new features, only bug fixes.

This is not an issue, but rather an announcement.

I will stop adding new features to this library and will only do bug fixes.

I was working on v2 of this library, but I figured it would be better to make a new library instead, because v2 doesn't scrape data from the YouTube page anymore. Instead, it sends requests to (what I assume to be) YouTube's internal API for the YouTube web app.

It's called Youtubei. It's still at an early stage but should be stable. It should also be faster than scrape-yt. Go check it out 😁

[BUG] getVideo & getPlaylist

Bug Description
The getVideo and getPlaylist functions are returning an empty object.

To Reproduce
Add a video or playlist id to the getVideo or getPlaylist function.

Expected Behavior
Returns an empty object, no error message.

Additional Context
I'm not sure if it's also happening with getRelated or getUpNext. Since I don't use those functions, no testing happened. It works fine for the search function, though.

[BUG]

Describe the bug
getVideo isn't working with an ID that contains a "-".

To Reproduce
Take this piece of code

var scrapeYt = require("scrape-yt");
scrapeYt.getVideo("DEXTLC-rPI4").then(videos => {
    console.log(videos);
    console.log(videos.title);
});

and run it. The id is valid but it doesn't work because it has a - in it.

Expected behavior
I expected everything to work normally.
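
When debugging reports like this one, it can help to rule out a malformed ID first. The check below is a sketch under an assumption: that video IDs are 11 characters drawn from letters, digits, "-" and "_". That matches the IDs seen in practice but is not a documented guarantee, and `isLikelyVideoId` is a hypothetical helper, not part of scrape-yt.

```javascript
// Assumption: a video ID is 11 characters from A-Z, a-z, 0-9, "-" and "_".
const VIDEO_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

// Hypothetical helper: true when the value has the assumed video ID shape.
function isLikelyVideoId(id) {
  return typeof id === "string" && VIDEO_ID_PATTERN.test(id);
}

console.log(isLikelyVideoId("DEXTLC-rPI4")); // true: the "-" is an allowed character
```

Under this assumption the ID above is well formed, so a failure on it would have to come from how the ID is handled downstream rather than from the ID itself.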

[BUG] Unexpected token } in JSON at position 10688

Describe the bug
Not sure exactly when it started, but after updating to 1.4.6 this happens with every playlist ID.

To Reproduce
Steps to reproduce the behavior:
Use any playlist ID. I tested the one in the documentation "PLx65qkgCWNJIgVrndMrhsedBz1VDp0kfm"
And call getPlaylist(playlistId)

Expected behavior
You should get the following error:
Unexpected token } in JSON at position 10688
at JSON.parse (<anonymous>)
at Object.parseGetPlaylist (...\node_modules\scrape-yt\dist\common\parser.js:274:36)
at ...\node_modules\scrape-yt\dist\index.js:162:64
at step (...\node_modules\scrape-yt\dist\index.js:54:23)
at Object.next (...\node_modules\scrape-yt\dist\index.js:35:53)
at fulfilled (...\node_modules\scrape-yt\dist\index.js:26:58)
at processTicksAndRejections (internal/process/task_queues.js:97:5)

Screenshots
Here is a test code that I used

duration returns null on videos over 1 hour long

Searching by name for a video that is over 1 hour long returns null as the video's duration.

Code:

const scrapeYt = require("scrape-yt").scrapeYt;

scrapeYt.search("1 hour Trance Mix With Mister Anderson - Transmission 006 [HD]").then(videos => {
  console.log(videos[0]);
});

Console output:

{
  id: 'L_LUpnjgPso',
  title: 'Fireplace 10 hours full HD',
  duration: null,
  thumbnail: 'https://i.ytimg.com/vi/L_LUpnjgPso/hqdefault.jpg?sqp=-oaymwEjCPYBEIoBSFryq4qpAxUIARUAAAAAGAElAADIQj0AgKJDeAE=&rs=AOn4CLAgH7jbYsBHthSpHJIQlHT5fOCnkA',
  channel: {
    id: 'UCR5hpffFzzlEJoAqY2xotcg',
    name: 'Fireplace 10 hours',
    url: 'https://www.youtube.com/channel/UCR5hpffFzzlEJoAqY2xotcg'
  },
  uploadDate: '3 years ago',
  viewCount: 21224450
}
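
A null duration for long videos would be consistent with a parser that only understands "M:SS" and not "H:MM:SS", although the issue does not confirm that. As an illustration only, the sketch below converts a colon-separated duration label to seconds for any number of parts up to three. `parseDuration` is a hypothetical helper, and the label format is an assumption about what the results page shows.

```javascript
// Hypothetical helper: convert "SS", "M:SS" or "H:MM:SS" to seconds.
// Returns null when the text does not look like a duration label.
function parseDuration(text) {
  if (typeof text !== "string") return null;
  const trimmed = text.trim();
  if (!/^\d+(:\d{1,2}){0,2}$/.test(trimmed)) return null;
  // Fold left to right: each new part shifts the running total by a factor of 60
  return trimmed
    .split(":")
    .reduce((total, part) => total * 60 + Number(part), 0);
}

console.log(parseDuration("3:35"));    // 215
console.log(parseDuration("1:02:03")); // 3723
```

Folding over the parts avoids a separate branch per format, so hour-long videos go through the same code path as short ones.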

simpleText undefined when trying to get viewCount

image
I am looping through movies from the Movie Database API, and an error is occurring when scraping viewCount. I don't know if this affects the videoId.

The query in the parameter is the title of the movie + " trailer", so basically ${title_of_movie} trailer
This is the list of movies I am looping through:

Movie List

Action Jackson
Buena Vista Social Club
The Red Violin
Rosetta
The Brave Little Toaster to the Rescue
The Bird People in China
The Love Letter
8 ½ Women
Besieged
Finding Graceland
Dr. Quinn Medicine Woman: The Movie
A Lesson Before Dying
You Can Thank Me Later
Alien Arsenal
Black Light
The Unexpected Mrs. Pollifax
The Jesse Ventura Story
Teenage Space Vampires
Bruce Almighty
Dogville
Ichi the Killer
Scorched
The Ghost of Lord Farquaad
The In-Laws
Shade
The Barbarian Invasions
Down
Deep Blue
Yossi & Jagger
Lupin the Third: Voyage to Danger
Blackball
Seducing Doctor Lewis
The Rage in Placid Lake
Martha, Inc.: The Story of Martha Stewart
Valentin
The Era of Vampires
Ice Bound - A Woman's Survival at the South Pole
Stealing Rembrandt
Die Hard: With a Vengeance
Little Odessa
Forget Paris
Amateur
Mosquito
Cyborg Cop III
The Granny
Play Time
White Dwarf
Spenser: A Savage Place
The Black Bomber
Gramps
Life 101
Ghosts of Gettysburg
The Show Formerly Known as the Martin Short Show
Evander Holyfield vs. Ray Mercer
A Night to Die For
Destination Vegas
TypeError: Cannot read property 'title' of undefined
at /Users/kareem/Downloads/Spring 2020/CSCI 39548 (Web Development)/CSCI-39548-NostalgiaNow/nostalgia-master/index.js:606:56
at processTicksAndRejections (internal/process/task_queues.js:97:5)
Addicted to Love
Brassed Off
Poison Ivy: The New Seduction
David Blaine: Street Magic
Changing Habits
The Designated Mourner
Behind Enemy Lines
Weapons of Mass Distraction
Anastasia
Invisible Mom
Kounterfeit
The Lost World: Jurassic Park
Blackwater Trail
The Apocalypse
Patton: A Tribute to Franklin J. Schaffner
Rampage
ECW Buffalo Invasion
Psycho Diver: Soul Siren
Hostage High
Demon Fighter Kocho
Mission: Impossible II
In the Mood for Love
Dinosaur
Road Trip
Small Time Crooks
Our Lips Are Sealed
Sailor Moon S the Movie: Hearts in Ice
Cheaters
Nico and Dani
Nang Nak
Growing Up Brady
Harlan County War
Mermaid
Murder, She Wrote: A Story to Die For
Sleepy Hollow: Behind the Legend
The Linda McCartney Story
The Debut
Lady Audley's Secret
Luminous Motion
Eating Air
Terminator Salvation
Night at the Museum: Battle of the Smithsonian
Mega Shark vs. Giant Octopus
Dance Flick
The Warlords
The Girlfriend Experience
Eden Log
How Bruce Lee Changed the World
Spring Breakdown
Vengeance
Tormented
Everyman's War
The Fear Chamber
Nick Swardson: Seriously, Who Farted?
Steppin: The Movie
Copyright Criminals
Eric Clapton and Steve Winwood - Live from Madison Square Garden
Russell Brand in New York City
Together: The Hendrick Motorsports Story
Jo Koy: Don't Make Him Angry
Legend
About a Boy
The Importance of Being Earnest
Igby Goes Down
Derailed
Brewster's Millions
Sweet Sixteen
Respiro
Ten
Ararat
Deserter
Ten Minutes Older: The Trumpet
Close to Leo
MAS*H: 30th Anniversary Reunion
Night At The Golden Eagle
Late Marriage
Washington Heights
Five Aces
Scared Silent
All Around The Town
Mission: Impossible
Ashes of Time
Flipper
Electra
Norma Jean & Marilyn
Heaven's Prisoners
Naked Souls
Earth
Power 98
A Friend's Betrayal
Over the Wire
The Siege at Ruby Ridge
A Boy Called Hate
For Which He Stands
Josh Kirby... Time Warrior: Last Battle for the Universe
Too Fast Too Young
Prem Granth
Portraits of a Killer
Thrill
Divididos: En vivo en MTV
Indiana Jones and the Kingdom of the Crystal Skull
Face/Off
Æon Flux
Changeling
Wendy and Lucy
Son of Rambow
The Flock
The Edge of Heaven
A Christmas Tale
Heavy Metal in Baghdad
Ocean Flame
Deadwater
The Devil's Ground
Dr. Jekyll and Mr. Hyde
Automaton Transfusion
Cloud 9
Johnny Mad Dog
Grindstone Road
Mystery of the Crystal Skulls
Better Things
Godzilla
As Good as It Gets
Fear and Loathing in Las Vegas
Tron
The Brave Little Toaster Goes to Mars
The Opposite of Sex
April
Gargantua
Under the Skin
Storm Chasers: Revenge of the Twister
Tainted
Provocateur
Broadway Damage
Night Time
Dirty Little Secret
Urban Legends
Natalie Merchant: Ophelia
Perfect Lies
Horseshoe
The Exotic Time Machine
Shrek the Third
Stomp the Yard
Sicko
WΔZ
Water Lilies
Stuck
Jesse Stone: Sea Change
The Mad
Joe Strummer: The Future Is Unwritten
Chop Shop
Big Man Japan
Have Dreams, Will Travel
And Along Come Tourists
Magicians
Madame Tutli-Putli
Careless
The Apocalypse
'Til Lies Do Us Part
Jellyfish
Shrek
Pearl Harbor
Human Nature
Truth or Consequences, N.M.
Conspiracy
New Alcatraz
Battle Queen 2020
R.O.D - Read or Die
The Perfect Wife
Submerged
Lan Yu
Ram Dass: Fierce Grace
All Access: Front Row. Backstage. Live!
Like Mother Like Son: The Strange Story of Sante and Kenny Kimes
Memories from the Sweet Sue's
The Making of Jaws 2
PTKHO
Memories, Dreams & Addictions
Nevermore: [2001] San Jose, California
Angel Dust: [2001] Live in San Jose, CA
Undisputed III: Redemption
MacGruber
Fair Game
You Will Meet a Tall Dark Stranger
Kites
Family Guy Presents: Something, Something, Something, Dark Side
Holy Rollers
Virginia
Witchville
Perrier's Bounty
Fairfield Road
Father of My Children
Mutant Vampire Zombies from the 'Hood!
Cultures of Resistance
Kooky
The Truth
Sneeze Me Away
Two in the Wave
Blood Junkie
Anderson's Cross
The Da Vinci Code
Over the Hedge
The Break-Up
Hollow Man II
Tristan & Isolde
Southland Tales
See No Evil
Severance
Desperation
Mulberry Street
Red Road
Brave Story
White Skin
Zidane: A 21st Century Portrait
The Witches Hammer
It Waits
Salvador (Puig Antich)
The Perfect Marriage
Sinful
Pineapple
Star Wars: Episode III - Revenge of the Sith
The Longest Yard
Dominion: Prequel to the Exorcist
The Muppets' Wizard of Oz
Into the Sun
The Last Drop
The God Who Wasn't There
Inuyasha the Movie: Affections Touching Across Time
6ixtynin9
Dinotopia: Quest for the Ruby Sunstone
Don't Come Knocking
Three Times
Two Drifters
Our Fathers
McLibel
Descent
Mondovino
The Circle
Ghost Lake
Erkan & Stefan 3
Shrek 2
The Assassination of Richard Nixon
Mysterious Skin
2046
Modigliani
Stateside
Marebito
Our Music
The Lion in Winter
Exiles
Omagh
Cat Stevens: Majikat
Star Spangled to Death
Reversible Errors
Thunderstruck
The Voice Behind the Mouse
Pearl Harbor: A Day of Infamy
Larry the Cable Guy: Git-R-Done
Party Animalz
Live Forever: The Rise and Fall of Brit Pop
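
A "simpleText of undefined" error suggests that some result has no view count text at all. The sketch below shows a defensive way to read such a field with optional chaining. The object shape (`viewCountText.simpleText` holding text like "1,234 views") is an assumption about the page data, and `parseViewCount` is a hypothetical helper rather than scrape-yt's actual parser.

```javascript
// Hypothetical helper: read a view count from an assumed renderer shape,
// { viewCountText: { simpleText: "1,234 views" } }, without throwing.
function parseViewCount(renderer) {
  const text = renderer?.viewCountText?.simpleText;
  if (typeof text !== "string") return null;
  // Keep only the digits, dropping separators and the trailing label
  const digits = text.replace(/[^0-9]/g, "");
  return digits ? Number(digits) : null;
}

console.log(parseViewCount({ viewCountText: { simpleText: "5,163,653 views" } })); // 5163653
console.log(parseViewCount({})); // null
```

Returning null for a missing count lets one odd result degrade gracefully instead of rejecting the whole search.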

[BUG] scrapeYt.getVideo(videoID); throws an error when trying to get a video with no description

Thank you so much for making this repo possible. This repo has everything I needed.

Describe the bug
Here is the url to the video that scrapeYt.getVideo(videoID) throws the error.

The error
TypeError: Cannot read property 'runs' of undefined at Object.parseGetVideo (node_modules\scrape-yt\dist\common\parser.js:364:31) at node_modules\scrape-yt\dist\index.js:196:64 at step (node_modules\scrape-yt\dist\index.js:44:23) at Object.next (node_modules\scrape-yt\dist\index.js:25:53) at fulfilled (node_modules\scrape-yt\dist\index.js:16:58) at processTicksAndRejections (internal/process/task_queues.js:97:5)

To Reproduce
use scrapeYt.getVideo(videoID) where videoID is a video without a description like this one I found.

Expected behavior
A clear and concise description of what you expected to happen.

Additional context
I think it's because the code always expects a description on every video. I could be wrong.
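
If that guess is right, treating a missing description as an empty string would avoid the "runs of undefined" error. The sketch below illustrates the idea only: the shape `{ description: { runs: [{ text }] } }` is inferred from the error message, and `readDescription` is a hypothetical helper, not the code at parser.js:364.

```javascript
// Hypothetical helper: join description text runs, or return "" when the
// video has no description. The input shape is an assumption.
function readDescription(details) {
  const runs = details?.description?.runs;
  if (!Array.isArray(runs)) return "";
  return runs.map((run) => run?.text ?? "").join("");
}

console.log(readDescription({ description: { runs: [{ text: "Hello " }, { text: "world" }] } })); // Hello world
console.log(readDescription({}) === ""); // true
```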

getPlaylist(): TypeError: Cannot read property 'shortBylineText' of undefined

const results = await scrapeYt.getPlaylist("PLaYtGORLdHDIOZVNovduRmZBuoTGVOGAS");

but using the playlist id from your example works fine.
const results = await scrapeYt.getPlaylist("PLx65qkgCWNJIgVrndMrhsedBz1VDp0kfm");

so I was thinking that this will happen if the playlist contains Deleted/Private Videos?

[ENHANCEMENT] Performance impact when getting playlist with high videos count

Describe the bug
Hey, I'm seeing a problem when I call

scrapeYt.getPlaylist(playlistMatch[1]).then(playlist => {
    const playlistVideos: Video[] = playlist['videos'];

    console.log(playlistVideos);
});

where playlistMatch[1] = PLqFSI8ggBE67YFjGyLykEqeh0c37uC8fi

The audio I'm playing on Discord has a gap whenever this is called.

To Reproduce

  • Should be reproducible on any playlist with 30+ videos; if not at 30+, then try 100 or more
  • Smaller playlists don't show the issue as prominently as bigger ones

Expected behavior
No gap in audio when calling this method

Screenshots
N/A

Additional context

  • this is with a discord bot
  • Seeing it in all 3 of my environments (Mac - local, Windows - local, Ubuntu - AWS)
  • I'm using TypeScript

Sorry for opening so many issues, man :/

I tried moving the call into a different method that isn't connected to the player and calling getPlaylist while something is playing, and got the same issue. I don't see the issue with getVideo(), and I'm not seeing it with search() either.

Search without passing type as a query param

Applying the filter can sometimes remove videos from the results, I'm unsure if this is a YouTube issue or not.

Take the following links as examples:

This first link will bring up the correct video/song that I want:
https://www.youtube.com/results?search_query=official+audio+Stand+High+Patrol+Commando

But when you apply the video filter (which is correct as it is a video...), you will see it disappears:
https://www.youtube.com/results?search_query=official+audio+Stand+High+Patrol+Commando&sp=EgIQAQ%253D%253D

I've looked at the code and can see you're filtering the results by the type param as well as passing it in the query. It would be useful if we could pass a type but not have it included in the query.

As a test I commented out 4 lines in the dist code and it works as intended for me now.

https://i.imgur.com/4Yp7ZP0.png
