err0neus / santos-discography-analyser
Jupyter Notebook based interface to download and visualise music artists' discography data, statistics and lyrics sentiment
basic charts fed empty dataframe
displays horizontal, overlaps
change to 45 degrees
change ARTIST_ID to DISCOGS_ALBUM_ID (must have) + DISCOGS_ARTIST_ID (nice to have)
it seems the vertical size is constant.... can we automate?
string like
• can't
• ....'s etc.
Sometimes "'" is replaced by "-" and sometimes not. Genius seems to prefer that it is not replaced.
See attached file for examples
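A minimal sketch of one way to build the slug consistently, assuming Genius song URLs drop apostrophes, strip diacritics and hyphenate everything else. `genius_slug` is a hypothetical helper, not something in this repo, and the URL pattern is inferred from observed song URLs, not from a documented Genius rule.

```python
import re
import unicodedata


def genius_slug(artist: str, title: str) -> str:
    """Build a Genius-style URL slug (hypothetical helper, pattern is an assumption)."""
    text = f"{artist} {title}"
    # Decompose accented characters, then drop the combining marks (ü -> u, ç -> c).
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Drop apostrophes (straight, curly, backtick) instead of turning them into "-".
    text = re.sub(r"['\u2018\u2019`]", "", text)
    # Collapse every other run of non-alphanumerics into a single hyphen.
    text = re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")
    # Assumed convention: first letter upper case, the rest lower case.
    return text.capitalize() + "-lyrics"
```

With this rule "can't" becomes "cant" rather than "can-t", which matches the observation that Genius prefers the apostrophe not to be replaced.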
C:\ProgramData\anaconda3\lib\site-packages\statsmodels\tools_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead. import pandas.util.testing as tm
after first load, if a selection of albums is changed, the list of albums in the dropdown is not updated
Visualisations - Sentiment Analysis
some tabs switch when chart buttons are pressed. should remain on the current tab
the chord package seems to limit usage and force an upgrade to a paid version
not sure what the limit is.... do we just need to put a note on the chart?
or do we have any alternatives? [doesn't seem any as good :( ]
https://github.com/shahinrostami/chord
https://github.com/fengwangPhysics/matplotlib-chord-diagram
https://plotly.com/python/v3/filled-chord-diagram/
sntm_scr_ovr_cht_unchta
make sure that chart always includes at least 10 x ticks (no FAT bars)
not scraping Discogs anymore, get ratings using APIs
add descriptions to charts
Getting an error in get_discogs.py:
/Documents/Git/Project_Santos_Mar21/functions/get_discogs.py in get_album_stat(url, df)
274 df_stat_link.loc[df_stat_link["ARTIST_ID"] == row["ARTIST_ID"], "NUM_OF_PPL_WANT"] = num_want.get_text()
275
--> 276 return df_stat_link.drop(['STAT_LINK'], axis=1)
277
Getting the error on the latest pull and on an old version that used to work.
see David Bowie albums
1967 - album name = "David Bowie"
1969 - album name = "David Bowie"
result -> same tracklist !!
not changing ARTIST but only changing selection of ALBUMS resets the content of lyrics and sentiment data to the state of when artist first loaded
fails with DAVID BOWIE album "★ (Blackstar)", [2016] possibly because of the STAR symbol
Anaconda3\lib\site-packages\matplotlib\backends\backend_agg.py:238: RuntimeWarning: Glyph 9733 missing from current font.
NIRVANA NEVERMIND ends up with each track twice...
(with BILLBOARD_ALBUM_RANK having 2 sets of values)
organize, tidy, remove unused etc.
x axis shows years as floats, i.e. 2010.5, 2012.0 etc
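One possible fix, sketched under assumptions: compute whole-year tick positions up front and hand them to the axis. `integer_year_ticks` and `max_ticks` are hypothetical names, not taken from the repo.

```python
import math


def integer_year_ticks(years, max_ticks=10):
    """Return whole-year tick positions covering the data (hypothetical helper)."""
    lo, hi = int(min(years)), int(max(years))
    # Pick a whole-number step so that no more than max_ticks ticks are produced.
    step = max(1, math.ceil((hi - lo + 1) / max_ticks))
    return list(range(lo, hi + 1, step))
```

The result can be passed to `ax.set_xticks(...)`. matplotlib's `MaxNLocator(integer=True)` is an alternative that leaves the tick choice to matplotlib.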
assigning colors doesn't always work
try David Bowie from attached discogs file
(transfer to csv first)
when visualizing BY ALBUM, top and bottom bars/hlines get truncated by the edges of the figure
when loading UI, an empty chart is displayed below the selectors
Case: Red House Painters 1993, 2 albums, same year, same name
Tracklists seem good, but albums are not uniquely identified and are therefore displayed as one in charts etc.
Solution - Year/Album column to uniquely identify duplications
in the process change ARTIST_ID to DISCOGS_ALBUM_ID (must have) + DISCOGS_ARTIST_ID (nice to have)
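A rough sketch of the labelling idea, with assumptions: each album is a dict with `YEAR`, `ALBUM` and `DISCOGS_ALBUM_ID` keys (only `DISCOGS_ALBUM_ID` is named in the issue, the other two are guesses), and `album_labels` is a hypothetical helper. Year/Album alone does not separate two same-name albums from the same year, so the ID is appended only when the Year/Album label collides.

```python
from collections import Counter


def album_labels(albums):
    """Return one display label per album, unique even for same-year, same-name albums."""
    base = [f"{a['YEAR']} / {a['ALBUM']}" for a in albums]
    counts = Counter(base)
    labels = []
    for album, label in zip(albums, base):
        if counts[label] > 1:
            # Year/Album is ambiguous here: fall back to the Discogs album ID.
            label = f"{label} [{album['DISCOGS_ALBUM_ID']}]"
        labels.append(label)
    return labels
```

The two "David Bowie" albums (1967, 1969) are already separated by the year; the two 1993 "Red House Painters" albums need the ID suffix.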
Add charts to Visualisations - Sentiment Analysis
• create new tab (name "Sentiment Over Time")
• 1 chart not split by charted/uncharted... just tracks by sentiment group over time
• + 2 charts splitting the above to Charted and Uncharted
• if possible, manipulate Y axis labels not to show the negative sign (i.e. 10-5-0-5-10)
• adjust bar width to 0.9 [i.e. in ax.bar(x, y_ntr2, width = 0.9, color = '#b2b2b2')]
source notebook: Project_Santos/_support files/DC_dev_file.ipynb
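For the Y axis labels without the negative sign, a small sketch: a tick-label callback that prints the absolute value. `abs_tick_label` is a hypothetical name; its `(value, pos)` signature is chosen to match what matplotlib's `FuncFormatter` expects.

```python
def abs_tick_label(value, pos=None):
    """Format a tick value without its sign, e.g. -5.0 -> '5' (hypothetical helper)."""
    # ':g' drops trailing zeros, so 5.0 is shown as '5' and 2.5 stays '2.5'.
    return f"{abs(value):g}"
```

It could be attached with `ax.yaxis.set_major_formatter(FuncFormatter(abs_tick_label))`, with `FuncFormatter` imported from `matplotlib.ticker`. The bars below zero keep their position; only the printed labels change.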
Nirvana | Nevermind | "Lithium (Türkçe Çeviri)" track seems to be a lyrics translation of "Lithium" track
https://genius.com/Nirvana-lithium-turkce-ceviri-lyrics
Nirvana | Bleach | "Bleach [Liner Notes]" is album notes, not an actual track
https://genius.com/Nirvana-bleach-liner-notes-annotated
graphically they are distinguishable...:
Ideally these should be ignored and not pulled from GENIUS at all.
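A possible filter, sketched under assumptions: skip titles that look like translations or liner notes before their lyrics are pulled. The patterns cover only the two cases reported here plus an English "translation" guess; `is_real_track` is a hypothetical helper and the list would need extending as other cases turn up.

```python
import re

# Lower-case patterns, matched against the case-folded title (assumed, not exhaustive).
_NON_TRACK_PATTERNS = [
    re.compile(r"\[liner notes\]"),
    re.compile(r"\([^)]*(çeviri|translation)[^)]*\)"),
]


def is_real_track(title: str) -> bool:
    """Return False for titles that look like translations or liner notes."""
    folded = title.casefold()
    return not any(p.search(folded) for p in _NON_TRACK_PATTERNS)
```

Applied to a track list, `[t for t in titles if is_real_track(t)]` keeps "Lithium" and drops the two entries above.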
define colour palette in hex codes
GREEN
YELLOW
ORANGE
RED
BLUE
GREY
some colours are not as labelled
sentiment vs charted over time fails the first time a new artist is loaded.
can we have a single custom package that would include all the needed packages?
show_sentiment_score_ovr_time doesn't use album selection
rather than letting it fail with a Python error message, add a notification if a user tries to run visualizations without selecting an artist and/or discography first
explore options for how data is represented on the chart
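A sketch of the guard, assuming the chart callbacks can see the current artist and album selection. `selection_problem`, its arguments and the message texts are all hypothetical; how the message is shown (widget label, output area) is left open.

```python
def selection_problem(artist, albums):
    """Return a user-facing message if the selection is incomplete, else None."""
    if not artist:
        return "Select an artist before running visualisations."
    if not albums:
        return "Select at least one album before running visualisations."
    return None
```

A chart callback would call this first and, when a message comes back, display it and return early instead of building the chart.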
e.g. y-axis labels vs in-bar labels vs legend vs hovering
• test colour scheme (ok if others don't work)
• enlarge labels
add the Discogs page link to the album selector for easy check of discography source