Trying things with music lyrics.
You need to set up a Python environment (tested version: 3.10.5). Then, in your shell, type:
git clone git@github.com:MiguelLaura/lyrics_cloud.git
cd lyrics_cloud
pip install .
To install the Git hooks:
pre-commit install
Before each commit, black and generate_readme.py will run automatically.
To change the README file, edit README.template.md first, then regenerate the README (or let the pre-commit hook do it). Any new function must also be added to DOCS in generate_readme.py.
usage: python -m lyrics_cloud.artists_lyrics [-h] [--method [list,file]] --artists ARTISTS --output-file OUTPUT_FILE --token TOKEN
options:
-h, --help show this help message and exit
--method [list,file]
--artists ARTISTS list of artists or text file with one artist per line
--output-file OUTPUT_FILE
csv file to write the results
--token TOKEN token for the genius API (https://docs.genius.com/)
usage: python -m lyrics_cloud.content_based_recommender [-h] --artist ARTIST --csv-file CSV_FILE --title TITLE
options:
-h, --help show this help message and exit
--artist ARTIST artist singing the song to get recommendations from
--csv-file CSV_FILE csv file containing the lyrics
--title TITLE song title to get recommendations from
usage: python -m lyrics_cloud.word_cloud [-h] --artist ARTIST --csv-file CSV_FILE --title TITLE
options:
-h, --help show this help message and exit
--artist ARTIST artist singing the song to get recommendations from
--csv-file CSV_FILE csv file containing the lyrics
--title TITLE song title to get recommendations from
Function to get artists from a txt file.
Arguments
- file str - name of the input file (txt format).
Yields
str - artist's name.
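As a minimal sketch of what such a generator could look like: the name `read_artists` and the choice to skip blank lines are assumptions made for illustration, not necessarily what the package does.

```python
from typing import Iterator


def read_artists(file: str) -> Iterator[str]:
    """Yield one artist name per non-empty line of a text file."""
    with open(file, encoding="utf-8") as handle:
        for line in handle:
            name = line.strip()
            if name:  # assumption: blank lines are ignored
                yield name
```

Because the function yields names lazily, a long artist file never has to be loaded into memory at once.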
Function to write lyrics from a list of artists into a csv file.
Arguments
- artists list[str] - list of artist names.
- output_file str - name of the output file (csv format).
- token str - token for the genius API (https://docs.genius.com/).
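The CSV-writing half of this step might be sketched as follows. The function name `write_rows`, the column names (`artist`, `title`, `lyrics`) and the shape of `rows` are all assumptions. The call to the Genius API is deliberately left out, so the rows are expected to have been fetched already.

```python
import csv
from typing import Iterable, Tuple


def write_rows(rows: Iterable[Tuple[str, str, str]], output_file: str) -> None:
    """Write (artist, title, lyrics) rows to a CSV file with a header."""
    with open(output_file, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Hypothetical header; the real file may use other column names.
        writer.writerow(["artist", "title", "lyrics"])
        writer.writerows(rows)
```

Opening the file with `newline=""` lets the `csv` module handle line endings itself, which matters because lyrics usually contain line breaks inside a single field.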
Function to load and train the word2vec model on the corpus.
Arguments
- corpus list[list[str]] - corpus of songs.
Returns
model - trained word2vec model.
Function to create the averaged word2vec embeddings.
Arguments
- df_lyrics Series - lyrics.
- model - trained word2vec model.
Returns
list[float] - lyrics averaged word2vec embeddings.
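The averaging step can be illustrated without a trained model. In this sketch a plain dict stands in for the model's word vectors. The name `average_embedding`, the `size` parameter and the zero-vector fallback are assumptions.

```python
from typing import Dict, List, Sequence


def average_embedding(
    tokens: Sequence[str], vectors: Dict[str, List[float]], size: int
) -> List[float]:
    """Average the vectors of the tokens that appear in `vectors`."""
    known = [vectors[token] for token in tokens if token in vectors]
    if not known:
        # Assumption: lyrics without any known word map to the zero vector.
        return [0.0] * size
    # zip(*known) groups the vectors by dimension so each can be averaged.
    return [sum(column) / len(known) for column in zip(*known)]


vectors = {"love": [1.0, 0.0], "night": [0.0, 1.0]}
print(average_embedding(["love", "night", "unknown"], vectors, size=2))  # [0.5, 0.5]
```

Words missing from the vocabulary are skipped rather than counted as zeros, so rare words do not pull the average toward the origin.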
Function to recommend a song based on a title and artist.
Arguments
- idx int - index of the item to get recommendations from.
- embeddings list[float] - averaged word2vec embeddings.
- nb_reco int - number of recommendations to return.
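One common way to rank items from averaged embeddings is cosine similarity. The sketch below assumes that measure, assumes `embeddings` holds one vector per song, and returns indices rather than printing titles. The source specifies none of these details, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(idx: int, embeddings: List[Sequence[float]], nb_reco: int = 5) -> List[int]:
    """Return the indices of the nb_reco items most similar to item idx."""
    target = embeddings[idx]
    scores = [
        (cosine_similarity(target, emb), i)
        for i, emb in enumerate(embeddings)
        if i != idx  # never recommend the query song itself
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scores[:nb_reco]]
```

The returned indices can then be mapped back to rows of the lyrics CSV to display the matching titles and artists.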
Function to visualize the word2vec embeddings.
Arguments
- w2v - trained word2vec model.
Returns
dataframe - PCA dataframe.
Function to plot a text word cloud.
Arguments
- text str - text to output in the word cloud.
Function to clean the lyrics.
Arguments
- lyrics str - lyrics of a song.
Returns
str - cleaned lyrics.
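What "cleaning" covers is not spelled out here. A plausible sketch, assuming the raw lyrics contain bracketed section markers such as "[Chorus]" and stray blank lines, could be (the name `clean_lyrics` is illustrative):

```python
import re


def clean_lyrics(lyrics: str) -> str:
    """Remove bracketed section markers and empty lines from raw lyrics."""
    # Assumption: markers look like "[Verse 1]" or "[Chorus]".
    without_markers = re.sub(r"\[[^\]]*\]", "", lyrics)
    lines = [line.strip() for line in without_markers.splitlines()]
    return "\n".join(line for line in lines if line)
```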
Function to prepare the text (remove stop words, punctuation, etc.).
Arguments
- text str - text to prepare.
Returns
str - cleaned and tokenized text.
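The preparation step can be sketched with the standard library alone. The stop-word list below is a tiny illustrative one, and the name `prepare_text` is an assumption. The actual code may rely on a dedicated NLP library and a fuller list.

```python
import string

# Tiny illustrative stop-word list (assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "i", "you", "to", "of", "in", "is"}


def prepare_text(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and drop stop words."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = [token for token in no_punct.split() if token not in STOP_WORDS]
    return " ".join(tokens)


print(prepare_text("The night is young, and you know it!"))  # night young know it
```

Returning a space-joined string keeps the result easy to store in a CSV column, and calling `.split()` on it recovers the token list needed to train the embeddings.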