ruuen / exportify
Web app which allows exporting public & private Spotify playlists to a CSV or JSON file, utilising the Spotify API.
License: GNU General Public License v3.0
Will add items below as they are identified:
The useExporter hook should track CSV and JSON export loading state separately so each can be passed to its own button.
Spotify API auth code flow docs: https://developer.spotify.com/documentation/web-api/tutorials/code-flow
OAuth2.0 spec: https://datatracker.ietf.org/doc/html/rfc6749
I need to work out the best way to generate and store the state token for the auth process.
OAuth2.0 spec notes on the state param: https://datatracker.ietf.org/doc/html/rfc6749#section-10.12
The client MUST implement CSRF protection for its redirection URI. This is typically accomplished by requiring any request sent to the redirection URI endpoint to include a value that binds the request to the user-agent's authenticated state (e.g., a hash of the session cookie used to authenticate the user-agent).
Some tips covered in the OWASP CSRF prevention cheatsheet may come in handy.
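One possible way to generate and check the state value is sketched below. It is not the project's implementation: the function names are hypothetical, and it assumes the value is stored somewhere tied to the browser (for example an HttpOnly cookie) before the redirect to Spotify.

```typescript
import { randomBytes, timingSafeEqual } from "crypto";

// Generate an unguessable state value to send with the authorize redirect.
// 32 random bytes encode to a 64-character hex string.
function createStateToken(byteLength: number = 32): string {
  return randomBytes(byteLength).toString("hex");
}

// Compare the state returned on the callback with the value stored for this
// browser. Missing values never match.
function isValidState(expected: string | undefined, received: string | undefined): boolean {
  if (!expected || !received) return false;
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(received, "utf8");
  // timingSafeEqual throws when lengths differ, so check the length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

On the callback, a request whose state fails this check would be rejected before any code-for-token exchange takes place.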
To reduce the number of calls made to the Spotify API, the backend data store should be checked for any active basic access tokens.
If none are active, a new token should be generated -> stored -> used for the request.
If an active token is found, it should be used for the request instead of generating a new one.
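The check-then-generate flow above could look roughly like the sketch below. The `TokenStore` interface, the `generate` callback and the 60-second safety margin are all assumptions made for illustration; the real data store and token request are not shown in these notes.

```typescript
interface StoredToken {
  accessToken: string;
  expiresAt: number; // Unix epoch seconds
}

// Hypothetical storage abstraction; the real backend store may look different.
interface TokenStore {
  getLatest(): Promise<StoredToken | undefined>;
  put(token: StoredToken): Promise<void>;
}

// A token counts as active only if it outlives a small safety margin.
function isActive(token: StoredToken | undefined, nowSeconds: number, marginSeconds: number = 60): boolean {
  return token !== undefined && token.expiresAt - marginSeconds > nowSeconds;
}

// Reuse an active stored token, otherwise generate -> store -> use a new one.
async function getBasicToken(
  store: TokenStore,
  generate: () => Promise<StoredToken>,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): Promise<string> {
  const cached = await store.getLatest();
  if (cached !== undefined && isActive(cached, nowSeconds)) {
    return cached.accessToken;
  }
  const fresh = await generate();
  await store.put(fresh);
  return fresh.accessToken;
}
```

Passing the clock in as a parameter keeps the expiry logic easy to test without waiting for a real token to expire.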
Aside from learning and my own usage, I want to provide this as an ongoing service while keeping the cost to myself low, so I want to get as much as possible out of my free usage quota for functions.
I could have queried the Spotify API only from the client, but I didn't like the security holes and client complexity this introduced.
Doing all Spotify querying through lambdas allowed me to not expose any public access token to the client, and allowed me to move repeated playlist item fetch logic away from the client. It does, however, introduce a hard limit of 10 seconds for full playlist data calls, which isn't enough to cover all potential playlists in the wild.
I have enough in the free function quota that I could probably proxy every individual playlist track request, but I think putting some time into building a smarter full data endpoint would be a better solution.
Ideally it would be able to return the next URL back to the client for easy resumption of the process.
Current performance says I can get ~1000 songs processed in 5-6 seconds.
My largest playlist of ~3300 songs times out after 10s.
The theoretical maximum playlist size is 30,000.
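A resumable endpoint of the kind described above might be built around a loop like the one sketched here. It assumes each page of results carries an `items` array and a `next` URL, as Spotify's paging objects do; `collectWithinBudget`, `fetchPage` and the injected clock are hypothetical names, not part of the project.

```typescript
interface Page<T> {
  items: T[];
  next: string | null; // URL of the following page, or null on the last page
}

interface PartialResult<T> {
  items: T[];
  next: string | null; // non-null means the client should call again with this URL
}

// Fetch pages until the playlist is exhausted or the time budget is spent,
// then hand back whatever was collected plus the URL to resume from.
async function collectWithinBudget<T>(
  startUrl: string,
  fetchPage: (url: string) => Promise<Page<T>>,
  budgetMs: number,
  now: () => number = Date.now,
): Promise<PartialResult<T>> {
  const deadline = now() + budgetMs;
  const items: T[] = [];
  let next: string | null = startUrl;
  while (next !== null && now() < deadline) {
    const page: Page<T> = await fetchPage(next);
    items.push(...page.items);
    next = page.next;
  }
  return { items, next };
}
```

The budget is only checked before each page request, so it would need to sit comfortably below the 10 second function limit to leave room for the last request in flight. The client keeps calling with the returned `next` value until it comes back as null.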
Some math if you exported a max-length playlist:
If I refactor the endpoint to chunk the data:
If I refactor the endpoint to just proxy individual Spotify API requests:
It's easy to see that the extra time/complexity introduced by the chunking approach would be offset by the number of extra users that could be supported each month. These are estimated numbers too; I expect it will process more than 1500 songs in a 9-second window.
On Netlify functions alone, it's the difference between running 416 exports/mo vs 6,250 exports/mo if every single playlist had 30,000 songs in it.
In the AWS Lambda free tier, the difference is ~3,300 exports/mo vs 50,000 exports/mo if every one was max length.
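The figures above can be reproduced with a short calculation. The inputs are assumptions chosen to be consistent with the numbers quoted: 1500 songs per chunked call (the estimate mentioned earlier), 100 songs per proxied Spotify request, and monthly free quotas of 125,000 function calls on Netlify and 1,000,000 on AWS Lambda.

```typescript
const MAX_PLAYLIST_SIZE = 30000;       // theoretical maximum from the notes
const SONGS_PER_CHUNKED_CALL = 1500;   // estimate from the notes
const SONGS_PER_PROXIED_CALL = 100;    // assumption: one Spotify page per call
const NETLIFY_CALLS_PER_MONTH = 125000;   // assumption inferred from the figures
const LAMBDA_CALLS_PER_MONTH = 1000000;   // assumption inferred from the figures

// How many full exports fit in a monthly call quota.
function exportsPerMonth(
  quota: number,
  songsPerCall: number,
  playlistSize: number = MAX_PLAYLIST_SIZE,
): number {
  const callsPerExport = Math.ceil(playlistSize / songsPerCall);
  return Math.floor(quota / callsPerExport);
}

const netlifyProxied = exportsPerMonth(NETLIFY_CALLS_PER_MONTH, SONGS_PER_PROXIED_CALL); // 416
const netlifyChunked = exportsPerMonth(NETLIFY_CALLS_PER_MONTH, SONGS_PER_CHUNKED_CALL); // 6250
const lambdaProxied = exportsPerMonth(LAMBDA_CALLS_PER_MONTH, SONGS_PER_PROXIED_CALL);   // 3333
const lambdaChunked = exportsPerMonth(LAMBDA_CALLS_PER_MONTH, SONGS_PER_CHUNKED_CALL);   // 50000
```

Under these assumptions a max-length export costs 300 calls when proxied and 20 calls when chunked, which is where the roughly 15x difference comes from.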
I already had plans for introducing some rate limiting and abuse prevention when it was on Netlify, but I will definitely be looking at what options I have in-code when I'm doing all the caching work. The plan is still to move this project over to AWS if I start to need a WAF or heavy rate limiting.
The existing getPlaylistTracks endpoint should use a basic Spotify API access token when no access token cookie is provided in the request.
If an access token cookie is provided, the endpoint should not generate a basic access token, and should instead pass the cookie value through to the Spotify API for user-scoped access.
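The branching described above could be isolated in a small helper like this sketch. The cookie name `access_token` and the `SpotifyAuth` type are hypothetical; the notes do not say what the cookie is actually called.

```typescript
type SpotifyAuth =
  | { scope: "user"; accessToken: string } // pass the cookie value through
  | { scope: "basic" };                    // caller obtains a basic token instead

const ACCESS_TOKEN_COOKIE = "access_token"; // hypothetical cookie name

// Pull a single cookie value out of a raw Cookie header.
function readCookie(header: string | undefined, name: string): string | undefined {
  if (!header) return undefined;
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    if (part.slice(0, eq).trim() === name) {
      const value = part.slice(eq + 1).trim();
      return value.length > 0 ? value : undefined;
    }
  }
  return undefined;
}

// Decide which kind of token the request to Spotify should carry.
function resolveAuth(cookieHeader: string | undefined): SpotifyAuth {
  const userToken = readCookie(cookieHeader, ACCESS_TOKEN_COOKIE);
  return userToken !== undefined
    ? { scope: "user", accessToken: userToken }
    : { scope: "basic" };
}
```

For example, `resolveAuth("theme=dark; access_token=abc123")` yields the user-scoped branch, while a request with no Cookie header falls back to the basic branch.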
I was originally going to have a lambda function perform this on schedule, but while reading DynamoDB docs and trying it out I found the TTL function:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html?icmpid=docs_dynamodb_help_panel_hp_ttl
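DynamoDB's TTL feature expects a Number attribute holding a Unix epoch timestamp in seconds. Assuming it is used here to expire stored tokens, an item could be shaped as in the sketch below. The key value and attribute names are hypothetical, and TTL would need to be enabled on the `expiresAt` attribute in the table settings.

```typescript
interface TokenItem {
  pk: string;          // hypothetical partition key
  accessToken: string;
  expiresAt: number;   // Unix epoch seconds; the attribute TTL would be enabled on
}

// Build the item to store, converting "expires in N seconds" into an
// absolute epoch-seconds timestamp.
function buildTokenItem(
  accessToken: string,
  expiresInSeconds: number,
  nowMs: number = Date.now(),
): TokenItem {
  return {
    pk: "spotify-basic-token",
    accessToken,
    expiresAt: Math.floor(nowMs / 1000) + expiresInSeconds,
  };
}
```

TTL deletion is a background process and is not guaranteed to happen at the exact expiry time, so code reading the item would still want to compare `expiresAt` against the current time before trusting the token.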
Requires completion of #5
Requests to Spotify API should retry on a staggered basis until a max number of retries is hit.
Need to ensure that any backend API endpoints that retry requests to the Spotify API terminate gracefully if the function is in danger of timing out before returning an error/partial response.
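A minimal sketch of such a retry loop follows. It interprets "staggered" as exponential backoff, which is an assumption, and the option names and defaults are hypothetical. The `remainingMs` callback stands in for whatever reports the time left; on AWS Lambda that could be `context.getRemainingTimeInMillis()`.

```typescript
interface RetryOptions {
  maxRetries: number;     // retries allowed after the first attempt
  baseDelayMs: number;    // delay before the first retry; doubles each time
  safetyMarginMs: number; // time reserved for returning an error/partial response
}

const DEFAULTS: RetryOptions = { maxRetries: 3, baseDelayMs: 250, safetyMarginMs: 1000 };

// Delay before the next attempt, or null when the caller should give up,
// either because retries are exhausted or too little time remains.
function nextRetryDelay(attempt: number, remainingMs: number, opts: RetryOptions = DEFAULTS): number | null {
  if (attempt >= opts.maxRetries) return null;
  const delay = opts.baseDelayMs * 2 ** attempt;
  return delay + opts.safetyMarginMs < remainingMs ? delay : null;
}

async function withRetry<T>(
  request: () => Promise<T>,
  remainingMs: () => number,
  opts: RetryOptions = DEFAULTS,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      const delay = nextRetryDelay(attempt, remainingMs(), opts);
      if (delay === null) throw err; // let the handler return its error response
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Rethrowing once the budget runs low leaves the endpoint handler free to answer with an error or with whatever partial data it already holds.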
If a basic-scope token expires mid-request, it can be refreshed from within the API. (It won't expire until I implement basic token caching in the backend, as basic tokens are generated per function call right now.)
If a user-scope token expires mid-request, an error should be returned to the client so that manual/auto re-auth can be performed.
I don't want the Public Playlist view to require the user to log in via the Spotify API.
Some users may not have a Spotify account, and others might not be comfortable granting any permissions on their Spotify user via the official API, even though the app only requires the lowest available level of read permissions.
This means that the Public Playlist view uses Spotify's client credentials auth flow, and it can only access public playlist URLs.
If a user chooses to sign in via the Spotify API, the Private Playlist view should allow the user to export their private playlists, any public playlists the user has saved, and any shared/collaborative playlists they have with other users.
They'll see a big list of the exact same playlists that are visible in their Spotify client and can trigger desired export actions from each card.