A command-line tool that predicts which artist a lyric line comes from:
- You provide as many artist/band names as you want
- Their song lyrics are scraped to build a corpus (you can interrupt the download at any time and resume it later)
- The corpus is preprocessed (tokenization, stop-word removal, TF-IDF vectorization) and then used to train a Naive Bayes classifier
- Once the model is trained, you can enter any word or lyric line
- The tool predicts which of the provided artists the line most likely comes from
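
The preprocessing and classification steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the tool's actual code: the toy corpus, the artist labels, and the `predict_artist` helper are made up for the example, and `MultinomialNB` is an assumed choice of Naive Bayes variant.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus standing in for the scraped lyrics (invented for illustration).
lines = [
    "guitar riff thunder on the highway",
    "loud guitar thunder all night long",
    "thunder and guitar shake the stage",
    "ocean breeze and gentle piano rain",
    "soft piano melody under the rain",
    "gentle rain falls on the quiet ocean",
]
artists = ["Band A"] * 3 + ["Band B"] * 3

# TfidfVectorizer tokenizes, drops English stop words, and computes TF-IDF
# weights; MultinomialNB (an assumed variant) is trained on those vectors.
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(lines, artists)


def predict_artist(line: str) -> str:
    """Return the artist label the model considers most likely for `line`."""
    return str(model.predict([line])[0])


print(predict_artist("thunder guitar"))
```

Wrapping the vectorizer and classifier in one pipeline means the same preprocessing is applied to the training corpus and to the line you enter at prediction time.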
Technologies:
Python | mypy | scikit-learn | pandas | BeautifulSoup