lyricsgenius's People

Contributors

adamspannbauer, allerter, danielcliu, darreldonald, disonds, eavelardev, eeishaan, gal20, hhkarimi, hotgiardiniera, jiafi, johnwmillr, kyu, ludehon, nickreiher, npmccord, thedustyrover, uhlikfil, vitominheere, vkurenkov

lyricsgenius's Issues

The "save lyrics" methods should be Song and Artist class methods

It'd make sense to at least have the option to do the following:

# Save lyrics for a single song
song = api.search_song("Hello, Goodbye", "The Beatles")
song.save_lyrics()

# Save all lyrics from a given artist
artist = api.search_artist("The Beatles")
artist.save_lyrics()

Currently you save lyrics by calling api.save_artist_lyrics(artist).

Error message

Hi, I get an error message while using your code:

import lyricsgenius as genius
api = genius.Genius('----my api code ---')
artist = api.search_artist('Andy Shauf', max_songs=3)

Error message:

Traceback (most recent call last):
File "C:/Users/Chris/AppData/Local/Programs/Python/Python37-32/top2000/181208 top2000.py", line 3, in
artist = api.search_artist('Andy Shauf', max_songs=3)
File "C:\Users\Chris\AppData\Local\Programs\Python\Python37-32\lib\site-packages\lyricsgenius\api.py", line 283, in search_artist
found_name = artist_info['artist']['name']
TypeError: 'NoneType' object is not subscriptable

Can you help me with this?
Many thanks!

Feature request: add support for the Genius annotations

If we're using the Genius API we really should allow the user to access the lyric annotations, not just the lyrics themselves. It'd take some thought to figure how to properly organize and structure the lyrics, but those decisions may be guided by how Genius already formats their API responses.

Would the lyrics be keys in a dictionary corresponding to the annotation? Would the annotations just be stored sequentially in a list? What's the best format?
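One way to weigh the two options, as a minimal sketch rather than a decision: keep the annotations in an ordered list of small records, and derive a dictionary keyed by lyric fragment only when a lookup is needed. The names `Annotation`, `fragment`, `text`, and `annotations_by_fragment` are hypothetical and do not mirror the Genius API response.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Annotation:
    fragment: str  # the annotated piece of lyric text (hypothetical field)
    text: str      # the annotation body (hypothetical field)


def annotations_by_fragment(annotations: List[Annotation]) -> Dict[str, List[str]]:
    """Group annotation texts under the lyric fragment they refer to."""
    grouped: Dict[str, List[str]] = {}
    for note in annotations:
        # A fragment can carry several annotations, so each key maps to a list.
        grouped.setdefault(note.fragment, []).append(note.text)
    return grouped
```

The list preserves the order in which annotations appear, while the derived dictionary answers "what is annotated on this line?" without losing repeated fragments.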

Can't find certain songs

Describe the bug
When searching for certain songs, no songs are returned. Examples include:

  • "Sunflower" by Post Malone and Swae Lee
  • "The Glorious Five" by Logic

Expected behavior
Results should show for songs that are easily searchable using the genius.com UI.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. From the CLI, run the following command: lyricsgenius song "Sunflower" "Post Malone"

Error message associated with the bug:

Searching for "Sunflower" by Post Malone...
Could not find specified song. Check spelling?
Could not find specified song. Check spelling?

Version info

  • 1.0.2
  • OS: macOS

Additional context
Doesn't appear to be an issue with special characters or too many characters (in the song or artist).

Encoding error during saving of lyrics for an artist

I tried fetching the lyrics of the French rapper Nekfeu and saving them in txt format, but I got this error:

Traceback (most recent call last):
  File "lyrics_fetch.py", line 6, in <module>
    artist.save_lyrics(format = "txt")
  File "D:\Anaconda3\lib\site-packages\lyricsgenius\artist.py", line 134, in save_lyrics
    lyrics_file.write(lyrics_to_write)
  File "D:\Anaconda3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 20121: character maps to <undefined>
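The traceback shows the file being written through the Windows default `cp1252` codec. A possible workaround, sketched here under the assumption that the file is opened in text mode, is to request UTF-8 explicitly. `write_lyrics` is a hypothetical helper, not part of lyricsgenius.

```python
import os
import tempfile


def write_lyrics(path, lyrics):
    """Write lyrics as UTF-8 regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as lyrics_file:
        lyrics_file.write(lyrics)


# '\ufeff' is the character named in the traceback above.
sample = "Nekfeu\ufeff lyrics"
path = os.path.join(tempfile.mkdtemp(), "lyrics.txt")
write_lyrics(path, sample)
```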

FileNotFoundError when saving song with a "/"

I was trying to save all the lyrics for songs from an artist, but the save_lyrics() function stopped once it hit a song that has a "/" in the song title.

Here is the error message I received:
FileNotFoundError: [Errno 2] No such file or directory: 'lyrics_arianagrande_blessed/rainbow.json'

To reproduce:
artist_name = "{Ariana Grande}"
artist = api.search_artist(artist_name)
artist.save_lyrics()

(The song is the 32nd song of hers pulled up.)
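The "/" in the title is read as a directory separator, which is why the open call fails. A minimal sketch of one fix is to replace such characters before building the file name. `safe_filename` and its `replacement` parameter are hypothetical names for illustration.

```python
import re


def safe_filename(name, replacement="_"):
    """Replace characters that cannot appear in a file name on common systems."""
    # "/" separates directories on POSIX; the others are rejected by Windows.
    return re.sub(r'[\\/:*?"<>|]', replacement, name)
```

With this helper, `safe_filename("lyrics_arianagrande_blessed/rainbow.json")` yields a single file name instead of a path into a missing directory.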

artist.save_lyrics failing

import lyricsgenius as genius
access_token = 'XXXX'
api = genius.Genius(access_token)
artist = api.search_artist("The Beatles", max_songs=3)
artist.save_lyrics(format_='json', filename='out.json')


.\python\lyrics>py -3 ./genius.py
Searching for songs by The Beatles...

Song 1: "12-Bar Original"
Song 2: "1822!"
"1 [Booklet]" is not valid. Skipping.
"20 Greatest Hits - Art and Tracklist" is not valid. Skipping.
Song 3: ""Abbey Road" side two"

Reached user-specified song limit (3).
Done. Found 3 songs.
Traceback (most recent call last):
File "./genius.py", line 19, in
artist.save_lyrics(format_='json', filename='out.json')
File "C:\Python3\lib\site-packages\lyricsgenius\artist.py", line 129, in save_lyrics
lyrics_to_write['songs'][-1]['album'] = song.album
File "C:\Python3\lib\site-packages\lyricsgenius\song.py", line 45, in album
if 'album' in self._body and 'name' in self._body['album']:
TypeError: argument of type 'NoneType' is not iterable

.\python\lyrics>py -3 --version
Python 3.6.2

Originally posted by @robot3498712 in #71 (comment)

Song search needs titles check

Describe the bug
The search_song method doesn't check that it's returning the correct song.

Expected behavior
Searching for "99 problems" returns a Drake song, "All Me", instead of the expected "99 Problems" by Jay-Z.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. song = api.search_song("99 problems")
  2. print(song)

Additional context
I should probably add a check in to make sure we're not missing a search result that actually matches the song name.
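Such a check could look roughly like the sketch below. It assumes the search results arrive as a list of dictionaries with a `"title"` key; `pick_matching_hit` is a hypothetical helper, not the package's actual code.

```python
def pick_matching_hit(hits, requested_title):
    """Return the first hit whose title equals the request, ignoring case."""
    wanted = requested_title.casefold()
    for hit in hits:
        if hit["title"].casefold() == wanted:
            return hit
    # No title matched: fall back to the top result, or None if there are no hits.
    return hits[0] if hits else None
```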

Is it possible to add a timeout parameter for api.search_song()?

I'm scraping lyrics for a list of songs and got a "Read timed out" error. Is it possible to change the timeout parameter from 5 to 30?

error message:
ReadTimeout: HTTPSConnectionPool(host='api.genius.com', port=443): Read timed out. (read timeout=5)

Version info

  • Package version: 0.9.5
  • OS: MacOS Mojave 10.14

Speed issues

Hello! I've been attempting to use this wrapper (thank you for putting this up!), but I've been noticing that a lot of times the search_artist function slows to a crawl and takes quite a long time to return any results. Is this to avoid some sort of rate limiting? Is there anything that I can do on my end to improve the speed at which lyrics are returned? Thanks again!

EDIT: I think the speed issues were a result of some of the first songs not having any lyrics. Those results seem to take a lot longer than results with lyrics.

Is there a way to just return a JSON object with save_lyrics and not actually download the file?

Song titles and artist names need to be too exact when searching

Describe the bug
Searching for a song by a given artist only succeeds when both the song title and the artist name are very close to the exact title and name used on Genius.

Expected behavior
Searching for "problems" by "jay z" should get the song "99 problems", but the search fails, even though the search works on Genius.com. Searching for "99 problems" without an artist argument does find the correct song.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. song = api.search_song("99 problems", "jay z")
  2. song is None, but it should have found the song.

Additional context
The lyricsgenius search should be just as flexible as the Genius.com search.

_result_is_lyrics customization

🏷 Enhancement

I agree with most of the filters being applied to reject songs, but having the ability to pass in a list of extra lyric filters or customize the existing criteria could provide additional value to users.

def _result_is_lyrics(self, song_title):
    """Returns False if result from Genius is not actually song lyrics"""
    regex = re.compile(
        r"(tracklist)|(track list)|(album art(work)?)|(liner notes)|(booklet)|(credits)|(remix)|(interview)|(skit)", re.IGNORECASE)
    return not regex.search(song_title)
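One possible shape for that customization, offered as a sketch only: build the filter from a list of terms so callers can extend or replace the defaults. `build_lyrics_filter`, `DEFAULT_TERMS`, `extra_terms`, and `default_terms` are hypothetical names; the default terms are copied from the regex above.

```python
import re

DEFAULT_TERMS = [
    "tracklist", "track list", "album art(work)?", "liner notes",
    "booklet", "credits", "remix", "interview", "skit",
]


def build_lyrics_filter(extra_terms=(), default_terms=DEFAULT_TERMS):
    """Return a predicate that is False for titles matching any excluded term."""
    # Extra terms are escaped so callers can pass plain strings safely.
    patterns = list(default_terms) + [re.escape(term) for term in extra_terms]
    regex = re.compile("|".join("({})".format(p) for p in patterns), re.IGNORECASE)
    return lambda song_title: not regex.search(song_title)
```

Passing `default_terms=[...]` replaces the built-in list, so a user who wants remixes kept can drop that term without touching the rest.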

Add more usage examples and documentation

Is your feature request related to a problem? Please describe.
Users aren't aware of what features are available in lyricsgenius and are requesting features that are already a part of the package.

Describe the solution you'd like
Add documentation to the README that includes examples for more use cases. Eventually it would be nice to have a dedicated documentation site.

Artist.save_lyrics failing

Describe the bug
Using the code artist.save_lyrics() I am given an error when running the script

Expected behavior
I expected the lyrics of a chosen song to be saved to a file

To Reproduce
Describe the steps required to reproduce the behavior.
Use the following code:

import lyricsgenius as genius

api = genius.Genius("MY TOKEN") # Replaced my api token with "MY TOKEN"
artist = api.search_artist("Ariana Grande", max_songs=1)
song = api.search_song("thank u, next", artist.name)
artist.add_song(song)
artist.save_lyrics()

Include the error message associated with the bug.

Traceback (most recent call last):
  File "C:\Users\sebfa\PycharmProjects\TTS\main.py", line 8, in <module>
    artist.save_lyrics()
  File "C:\Users\sebfa\PycharmProjects\TTS\venv\lib\site-packages\lyricsgenius\artist.py", line 109, in save_lyrics
    filename = "Lyrics_{}.{}".format(self.artist.replace(" ", ""), format_)
AttributeError: 'Artist' object has no attribute 'artist'

Version info

  • Package version: 1.0.0
  • OS: Windows 10

Write more tests

There really need to be more unit tests for this package. Help wanted!

The package currently runs with continuous integration on Travis-CI.

JSON is not well formatted

When viewing any of the JSON files exported by any of the save() functions in a Quicklook preview or trying to open the file in Sublime, I get a warning: JSON is not well formatted: Unexpected EOF. The JSON files can still be read into Python just fine using the json module, but I should figure out why I get this warning.

UnicodeEncodeError when parsing Genius.com search results

Occasionally my code barfs when it encounters a character the ascii codec can't encode.

python genius/genius.py --search_song "Begin Again"

Searching for "Begin Again"...
Traceback (most recent call last):
    File "genius/genius.py", line 397, in <module>
        song = G.search_song(sys.argv[2])                                
    File "genius/genius.py", line 147, in search_song
        found_title  = str(search_hit['title']).translate(None,' ').lower()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u200b' in position 0: ordinal not in range(128)

I assume there is a standard easy fix to this issue. So, I should fix it.

Skipping songs taking longer than fetching one

First of all, thanks for the nice program, seems to work well for the most part.
I'm trying to build a corpus of lyrics for a project at my university, so I try to fetch all the songs of the artists I want to incorporate.
Once the program fetched most of the songs, it seems to find many duplicates and attempts to skip, but skipping takes way longer than fetching a song.
Is there any way to speed up the skipping process?
Best regards.

Song.save_lyrics doesn't include song title in default file name

Describe the bug
The Song.save_lyrics method saves a file name with artist name but not song title, potentially overwriting different songs by the same artist.

Expected behavior
Default file name should be f"Lyrics_{song.title}_{song.artist}.txt".

To Reproduce
Describe the steps required to reproduce the behavior.

  1. song = api.search_song("99 problems Jay-z")
  2. song.save_lyrics()

Additional context
Problem is an issue if saving multiple songs individually, potentially by the same artist. I should provide a save_songs method that accepts a list of songs.

Character not recognized

Describe the bug
When I try to scrape lyrics of the top 10 popular Kanye songs, it doesn't recognize one character.

Expected behavior
return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u0150' in position 2716: character maps to <undefined>

This error will pop up, and I think this means it encountered the character u0150.

To Reproduce
Describe the steps required to reproduce the behavior.

if scrape_mode is True:
    artist = genius.search_artist("Kanye West", max_songs=10, sort="popularity")
    lyrics = ''

    for i in range(10):
        with open('Kanye.txt', 'a') as file:
            file.write(artist.songs[i].lyrics)

Package won't install

Describe the bug
When I try to install globally using pip install lyricsgenius, I get the following output:

Collecting lyricsgenius
  Using cached https://files.pythonhosted.org/packages/9d/4e/8cd3ff464d5c08e745bfae7c8ea96e64a3584e248ed8b57b9c2d102150d1/lyricsgenius-1.0.0.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-kJMjH9/lyricsgenius/setup.py", line 21, in <module>
        with open(path.join(this_directory, 'README.md'), encoding='utf-8') as f:
    TypeError: 'encoding' is an invalid keyword argument for this function
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-kJMjH9/lyricsgenius/

Expected behavior
A global pip install would work without errors.

To Reproduce
Describe the steps required to reproduce the behavior.

  1. Open terminal
  2. pip install lyricsgenius

Include the error message associated with the bug.

TypeError: 'encoding' is an invalid keyword argument for this function
    
    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-kJMjH9/lyricsgenius/

Version info

  • Package version: Latest
  • OS: macOS

Additional context
I'm coming from a Node background, so this could easily be something I'm doing wrong. However, I tried this with pipenv, virtualenv, a global pip install, and an AWS Cloud9 instance (to make sure my global pip isn't muddied) and got similar results each time, so I think there could be an issue at play.
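The `'encoding' is an invalid keyword argument` message is what Python 2's built-in `open()` raises, so pip is probably running under Python 2 here. If the package is meant to install there, one option for `setup.py` is `io.open`, which accepts `encoding` on both Python 2 and 3. The sketch below uses a stand-in file; `read_text` is a hypothetical helper.

```python
import io
import os
import tempfile


def read_text(path):
    """Read a UTF-8 text file; io.open accepts `encoding` on Python 2 and 3."""
    with io.open(path, encoding="utf-8") as f:
        return f.read()


# Stand-in for the README.md that setup.py reads.
readme = os.path.join(tempfile.mkdtemp(), "README.md")
with io.open(readme, "w", encoding="utf-8") as f:
    f.write(u"# lyricsgenius\n")
```

If Python 2 is not meant to be supported, installing with a Python 3 pip avoids the problem without any code change.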

Artist search fails on "Tupac"

Artist search fails when searching for "Tupac" because Genius.com lists him as "2Pac".

The artist page for 2Pac has an AKA section that includes "Tupac". It would probably be possible to check if the user's search term is included in the AKA section of the first artist search result, continuing with the search if a match is found.
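The proposed check could be sketched as follows. How the AKA names are obtained is left out; `matches_artist` and its `alternate_names` parameter are hypothetical and only illustrate the comparison.

```python
def matches_artist(search_term, artist_name, alternate_names=()):
    """True if the term equals the artist's name or one of their aliases."""
    wanted = search_term.strip().casefold()
    candidates = [artist_name, *alternate_names]
    return any(wanted == name.strip().casefold() for name in candidates)
```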

How to avoid "SKIPPING `song name` (already found in artist collection)"?

Hi, great wrapper!

I'm trying to grab all the Radiohead lyrics from Genius to do some analysis on them. When I try and save all the songs to the json file I get the message

SKIPPING song name (already found in artist collection)

In my case it's

SKIPPING "Morning Bell/Amnesiac" (already found in artist collection)
SKIPPING "Hunting Bears" (already found in artist collection)
SKIPPING "Feral" (already found in artist collection)

How can I avoid this? I need the data for these three songs.

Thanks!

Duplicative Effort?

1 def songsAreSame(s1, s2):
2     from difflib import SequenceMatcher as sm # For comparing similarity of lyrics
3     seqA = sm(None, s1.lyrics, s2['lyrics'])
4     seqB = sm(None, s2['lyrics'], s1.lyrics)
5     return seqA.ratio() > 0.5 or seqB.ratio() > 0.5

I'm curious about the purpose of the second SM on line 4 (line 80 in artist.py). Wouldn't this be one possible cause of the bottleneck that occurs during the JSON writing (line 101 in artist.py), the one mentioned in the comment above that line? If the second SM is necessary, I believe using a permutation approach to lyric checks could reduce the time to write to file.

E.g - A temp list would be created and "Song A" would be compared with "B" and "C", then "A" would be removed from the temp list and "B" would be compared with only "C"
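The pairing scheme described above (A with B and C, then B with C) is what `itertools.combinations` produces, so a minimal sketch needs no temporary list. `find_duplicate_pairs` and its dictionary input are hypothetical; the 0.5 threshold is taken from the snippet. Note that the `difflib` documentation cautions that `ratio()` may depend on argument order, which may be why the original compares in both directions; this sketch compares each pair once.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_duplicate_pairs(lyrics_by_title, threshold=0.5):
    """Compare every unordered pair of songs once and return the similar ones."""
    duplicates = []
    for (title_a, text_a), (title_b, text_b) in combinations(lyrics_by_title.items(), 2):
        if SequenceMatcher(None, text_a, text_b).ratio() > threshold:
            duplicates.append((title_a, title_b))
    return duplicates
```

For n songs this performs n(n-1)/2 comparisons instead of comparing every song against every other song twice.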

Search is case sensitive

Search is case sensitive, but shouldn't be.
For example:
song = api.search_song('lose yourself', 'Eminem')
returns no results, whereas if you search by url:
https://genius.com/search?q=lose%20yourself
it returns the correct result.
To fix, edit _clean() function:

def _clean(self, s):
    return s.translate(str.maketrans('','',punctuation)).replace('\u200b', " ").strip().lower()

I.e. just add .lower()

Searching for song or artist name requires exact match

My code in the search_song() and search_artist() functions requires an exact match between the user's query and the result returned from the Genius.com search.

Here's an example of the issue:

python genius.py --search_song "Hello Goodbye" "The Beatles"
    Searching for "Hello Goodbye" by The Beatles...
    Specified song was not first result :(

search_song() didn't find "Hello Goodbye" because the top result from Genius.com was "Hello, Goodbye" (note the comma).

Whereas this works:

python genius.py --search_song "Hello, Goodbye" "The Beatles"
   Searching for "Hello, Goodbye" by The Beatles...
   Done.

      "Hello, Goodbye" by The Beatles:
      You say yes, I say no
      You say stop and I say go go go, oh no
      You say goodbye and I say hello
      Hello h...

One simple fix would be stripping any punctuation and capitalization from both the user's search term and the Genius.com search results.
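A minimal sketch of that normalization, assuming it is applied to both the query and each result title before comparing them. `normalize` is a hypothetical helper name.

```python
from string import punctuation


def normalize(text):
    """Lowercase the text and drop punctuation so near-identical titles compare equal."""
    stripped = text.translate(str.maketrans("", "", punctuation))
    # Collapse runs of whitespace left behind by removed characters.
    return " ".join(stripped.lower().split())
```

With this in place, "Hello Goodbye" and "Hello, Goodbye" normalize to the same string.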

Installation - AUR package link

I created a package of LyricsGenius for Arch Linux and published it to AUR.
Maybe you could put Arch Linux installation instructions under "Installation" like this:

Install the AUR package for Arch Linux manually:

curl -L -O https://aur.archlinux.org/cgit/aur.git/snapshot/python-lyricsgenius.tar.gz
tar -xvf python-lyricsgenius.tar.gz
cd python-lyricsgenius
makepkg -si

set up as a pypi module?

If this is actually considered, this will require

  • adding a setup.py
  • moving config from config to inside python code

It's not an issue, but..

I got a little question here. Is there a node.js version of this, or can somebody convert it?

I'm making a project that gets lyrics, and this is exactly what I need - but it's Python :(

Thanks!

Trouble with cyrillic

TRY:

import lyricsgenius as genius
api = genius.Genius('token')
song = api.search_song('Возможно')

print(song.lyrics)

Possible Solution:

api.py

110: lyrics = html.find("div", class_="lyrics").get_text().encode('ascii','ignore').decode('ascii')

change to

110: lyrics = html.find("div", class_="lyrics").get_text()
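The effect of the ASCII round trip can be seen on the search term itself. This short demonstration uses only built-in string methods:

```python
title = "Возможно"

# Encoding to ASCII with errors="ignore" silently drops every non-ASCII character.
ascii_only = title.encode("ascii", "ignore").decode("ascii")

# Leaving the string as-is keeps the Cyrillic text intact.
unchanged = title
```

Every character in the title is non-ASCII, so the round trip leaves nothing behind, which is why removing it is the proposed fix.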

Lyrics only?

Is there a way to only pull down the lyrics?
I'm having to sort through the files to remove year/album/artist/etc, and I got to thinking that there just has to be a better way of doing it.

Genius API returns non-songs masquerading as songs

The Genius API includes entries the site refers to as songs that aren't actually songs.

For example, searching for Taylor Swift will return entries for liner notes and a booklet along with actual song lyrics.

My wrapper needs to be able to identify and reject these non-song entries. From what I can tell, the Genius API does not flag these items as non-songs — their type is still listed as "song" in the JSON object.

Error while searching for all lyrics by Kanye West

From a comment on my blog:

I'm looking to use it to analyze how an artist's lyrics change over different albums. My first thought was just to pull all of the artist's songs, but I believe there is a song in their directory with missing lyrics that is causing the search to quit.

So is there either a.) a way to avoid the search from stopping or b.) a way to pull songs by album instead of by artist?

I got the error when using the search function on Kanye West. The search will run up to "All Falls Down" and it prints this AttributeError: 'NoneType' object has no attribute 'get_text' and stops. Looking on the website, the next song on his list of songs is "All Falls Down (Live)" and says it is "Missing Lyrics" so I assumed this caused the error.

So this probably has to do with calling the get_text() function when there aren't actually lyrics available.
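A guard along these lines would avoid the crash. It is a sketch that assumes `page` is a BeautifulSoup-like object whose `find` returns `None` when the lyrics element is absent; `extract_lyrics` is a hypothetical helper name.

```python
def extract_lyrics(page):
    """Return the lyrics text, or None when the page has no lyrics element."""
    lyrics_div = page.find("div", class_="lyrics")
    if lyrics_div is None:
        # Pages marked "Missing Lyrics" would land here instead of raising.
        return None
    return lyrics_div.get_text()
```

The caller can then skip songs for which the result is `None` and continue with the rest of the artist's list.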

search_artist should use Genius's list songs endpoint from the artist's page

Is your feature request related to a problem? Please describe.
The current Genius.search_artist method relies on a heuristic for finding songs by the requested artist. This method is slow, inefficient, and may miss songs that belong to the artist.

Describe the solution you'd like
Use the same endpoint Genius.com uses when listing songs on an artist's page.

Here is an example of the all songs endpoint for Jay-Z:

Additional context
Not sure if this API endpoint is publicly listed by Genius, but the endpoint returns a 200 when I make a request to it.
