ruuen / exportify

Web app which allows exporting public & private Spotify playlists to a CSV or JSON file, utilising the Spotify API.

License: GNU General Public License v3.0

JavaScript 1.22% HTML 1.29% TypeScript 97.39% CSS 0.09%

exportify's Issues

Client: Review and improve error states, boundaries and fallbacks

Will add items below as they are identified:

I'd like the useExporter hook to track CSV and JSON export loading state separately, so each state can be passed to its own button.
Make sure all client errors returned from the API are handled gracefully on the client.
Add reset states to any error boundaries.
Review whether additional ErrorBoundaries are needed.
May want to look at building and using an error modal for some things rather than the current error displays.
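
A minimal sketch of how separate CSV and JSON loading states could be modelled, assuming useExporter keeps its state in a reducer. The names ExportState, ExportAction and exportReducer are hypothetical and not taken from the codebase.

```typescript
// Hypothetical types: one status per export format, so each button
// can read only the state that belongs to it.
type ExportFormat = "csv" | "json";
type ExportStatus = "idle" | "loading" | "error";
type ExportState = Record<ExportFormat, ExportStatus>;

type ExportAction =
  | { type: "started"; format: ExportFormat }
  | { type: "succeeded"; format: ExportFormat }
  | { type: "failed"; format: ExportFormat };

const initialExportState: ExportState = { csv: "idle", json: "idle" };

// Pure reducer: only the format named in the action changes.
function exportReducer(state: ExportState, action: ExportAction): ExportState {
  switch (action.type) {
    case "started":
      return { ...state, [action.format]: "loading" };
    case "succeeded":
      return { ...state, [action.format]: "idle" };
    case "failed":
      return { ...state, [action.format]: "error" };
  }
}
```

Inside the hook this could be wired up with React's useReducer, with the CSV button reading state.csv and the JSON button reading state.json.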

Add Spotify authorization code auth flow

Spotify API auth code flow docs: https://developer.spotify.com/documentation/web-api/tutorials/code-flow
OAuth2.0 spec: https://datatracker.ietf.org/doc/html/rfc6749

[Figure: rough diagram of the flow]

I need to work out the best way to generate and store the state token for the auth process.
OAuth2.0 spec notes on the state param: https://datatracker.ietf.org/doc/html/rfc6749#section-10.12

The client MUST implement CSRF protection for its redirection URI. This is typically accomplished by requiring any request sent to the redirection URI endpoint to include a value that binds the request to the user-agent's authenticated state (e.g., a hash of the session cookie used to authenticate the user-agent).

Some tips covered in the OWASP CSRF prevention cheatsheet may come in handy.
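
One possible way to generate and check the state value, sketched with Node's built-in crypto module. This assumes the backend runs on Node and that the stored value is looked up elsewhere (a cookie or a data store); the function names are hypothetical.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// Generate an unguessable value to send as the `state` parameter
// of the authorization request.
function generateStateToken(byteLength = 32): string {
  return randomBytes(byteLength).toString("base64url");
}

// Compare the state returned on the redirect with the stored one.
// timingSafeEqual requires equal-length buffers, so lengths are checked first.
function isStateValid(returned: string | undefined, stored: string | undefined): boolean {
  if (!returned || !stored) return false;
  const a = Buffer.from(returned);
  const b = Buffer.from(stored);
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The callback handler would reject the request whenever isStateValid returns false, before exchanging the authorization code for a token.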

API: Implement basic-scope Spotify token caching

To reduce the number of calls made to the Spotify API, the backend data store should be checked for any active basic access tokens.

If none are active, a new token should be generated -> stored -> used for the request.

If an active token is found, this should be used for the request instead of a new generation.
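
The check-then-generate logic above can be sketched as follows. The TokenStore interface, the NewToken shape and the one-minute safety margin are all assumptions for illustration; the real data store and token request are injected so the sketch stays independent of them.

```typescript
// Hypothetical shapes: none of these names come from the exportify codebase.
interface CachedToken {
  accessToken: string;
  expiresAt: number; // epoch milliseconds
}

interface TokenStore {
  get(): Promise<CachedToken | undefined>;
  put(token: CachedToken): Promise<void>;
}

interface NewToken {
  accessToken: string;
  expiresInSeconds: number;
}

// A token counts as active only if it outlives a small safety margin,
// so it does not expire halfway through the request that uses it.
function isTokenActive(
  token: CachedToken | undefined,
  nowMs: number,
  marginMs = 60_000,
): token is CachedToken {
  return token !== undefined && token.expiresAt - marginMs > nowMs;
}

// Reuse an active cached token, otherwise generate -> store -> use a new one.
async function getBasicToken(
  store: TokenStore,
  requestNewToken: () => Promise<NewToken>,
  nowMs: number = Date.now(),
): Promise<string> {
  const cached = await store.get();
  if (isTokenActive(cached, nowMs)) return cached.accessToken;

  const fresh = await requestNewToken();
  const token: CachedToken = {
    accessToken: fresh.accessToken,
    expiresAt: nowMs + fresh.expiresInSeconds * 1000,
  };
  await store.put(token);
  return token.accessToken;
}
```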

Gracefully handle backend lambda timeouts

Aside from learning and my own usage, I want to provide this as an ongoing service while keeping my own costs down, so I want to get as much as possible out of my free usage quota for functions.

I could have queried the Spotify API only from the client, but I didn't like the security holes and client complexity that introduced.

Doing all Spotify querying through lambdas allowed me to not expose any public access token to the client, and allowed me to move repeated playlist item fetch logic away from the client. It does, however, introduce a hard limit of 10 seconds for full playlist data calls, which isn't enough to cover all potential playlists in the wild.

I have enough in the free function quota that I could probably proxy every individual playlist track request, but I think putting some time into building a smarter full data endpoint would be a better solution.

Ideally it would be able to:

reliably anticipate timeout
terminate gracefully with the amount of tracks it has accumulated at that stage
provide a next URL back to the client for easy resumption of the process.
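
A rough sketch of those three behaviours, assuming a paged source where each page carries an items array and a next URL (as Spotify's paging objects do). The function name, the per-page time estimate and the injected fetchPage and now parameters are assumptions made so the logic can be shown in isolation.

```typescript
interface Page<T> {
  items: T[];
  next: string | null; // URL of the following page, or null on the last page
}

interface PartialResult<T> {
  items: T[];
  next: string | null; // where the client should resume, if anywhere
  complete: boolean;
}

// Keep fetching pages while there is still room for one more page
// before the deadline; otherwise stop and hand back what we have.
async function collectUntilDeadline<T>(
  firstUrl: string,
  fetchPage: (url: string) => Promise<Page<T>>,
  deadlineMs: number, // absolute time (epoch ms) to stop by
  estimatedPageMs = 600, // assumed worst-case cost of one page
  now: () => number = Date.now,
): Promise<PartialResult<T>> {
  const items: T[] = [];
  let next: string | null = firstUrl;

  while (next !== null && now() + estimatedPageMs < deadlineMs) {
    const page: Page<T> = await fetchPage(next);
    items.push(...page.items);
    next = page.next;
  }

  return { items, next, complete: next === null };
}
```

The client could keep calling the endpoint with the returned next URL until complete is true, concatenating the items from each response.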

Performance

Current performance says I can get ~1000 songs processed in 5-6 seconds.
My largest playlist of ~3300 songs times out after 10s.
The theoretical maximum playlist size is 30,000.

Some math if you exported a max-length playlist:

If I refactor the endpoint to chunk the data:

Assume I can safely get 1500 songs processed in 9 seconds.
30000 / 1500 = 20 lambda calls

If I refactor the endpoint to just proxy individual Spotify API requests:

Each request has a max of 100 songs
30000 / 100 = 300 lambda calls

It's easy to see that the extra time and complexity of the chunking approach would be offset by the number of extra users that could be supported each month. These are estimated numbers too; I expect it will process more than 1500 songs in a 9-second window.

On Netlify functions alone, it's the difference between running 416 exports/mo vs 6,250 exports/mo if every single playlist had 30,000 songs in it.
In the AWS Lambda free tier, the difference is ~3,300 exports/mo vs 50,000 exports/mo if every one was max length.
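
The arithmetic behind these figures can be reproduced with a short sketch. The monthly quotas used here (125,000 invocations for Netlify and 1,000,000 for AWS Lambda) are assumptions back-derived from the numbers above, and the function names are hypothetical.

```typescript
const MAX_PLAYLIST_TRACKS = 30_000;

// Number of lambda invocations needed to export one playlist.
function callsPerExport(trackCount: number, tracksPerCall: number): number {
  return Math.ceil(trackCount / tracksPerCall);
}

// Number of whole exports that fit in a monthly invocation quota.
function exportsPerMonth(monthlyInvocations: number, calls: number): number {
  return Math.floor(monthlyInvocations / calls);
}

const chunkedCalls = callsPerExport(MAX_PLAYLIST_TRACKS, 1500); // 20
const proxiedCalls = callsPerExport(MAX_PLAYLIST_TRACKS, 100); // 300

// Assumed free-tier quotas, inferred from the figures in the text.
const NETLIFY_MONTHLY = 125_000;
const AWS_LAMBDA_MONTHLY = 1_000_000;

console.log(exportsPerMonth(NETLIFY_MONTHLY, proxiedCalls)); // 416
console.log(exportsPerMonth(NETLIFY_MONTHLY, chunkedCalls)); // 6250
console.log(exportsPerMonth(AWS_LAMBDA_MONTHLY, proxiedCalls)); // 3333
console.log(exportsPerMonth(AWS_LAMBDA_MONTHLY, chunkedCalls)); // 50000
```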

I already had plans to introduce some rate limiting and abuse prevention when it was on Netlify, but I will definitely be looking at what options I have in-code when I'm doing all the caching work. The plan is still to move this project over to AWS if I start to require a WAF or heavy rate limiting.

Accept auth token cookie in /api/getPlaylistTracks lambda

The existing getPlaylistTracks endpoint should use a basic Spotify API access token when no access token cookie is provided in the request.

If an access token cookie is provided, the endpoint should not generate a basic access token, and should instead pass the cookie value through to the Spotify API for user-scoped access.
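
A minimal sketch of that selection logic. The cookie name access_token and the helper names are assumptions; the basic token generator is passed in so it is only called when no cookie is present.

```typescript
// Hypothetical cookie name; the real one is not stated in the issue.
const ACCESS_TOKEN_COOKIE = "access_token";

// Read a single cookie value out of a raw Cookie request header.
function readCookie(cookieHeader: string | undefined, name: string): string | undefined {
  if (!cookieHeader) return undefined;
  for (const part of cookieHeader.split(";")) {
    const separator = part.indexOf("=");
    if (separator === -1) continue;
    if (part.slice(0, separator).trim() === name) {
      const value = part.slice(separator + 1).trim();
      return value === "" ? undefined : value;
    }
  }
  return undefined;
}

interface ResolvedToken {
  scope: "user" | "basic";
  token: string;
}

// Prefer the user's token from the cookie; fall back to a basic token.
async function resolveAccessToken(
  cookieHeader: string | undefined,
  getBasicToken: () => Promise<string>,
): Promise<ResolvedToken> {
  const userToken = readCookie(cookieHeader, ACCESS_TOKEN_COOKIE);
  if (userToken !== undefined) return { scope: "user", token: userToken };
  return { scope: "basic", token: await getBasicToken() };
}
```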

Use DynamoDB Time To Live (TTL) for clearing old state tokens

I was originally going to have a lambda function perform this on schedule, but while reading DynamoDB docs and trying it out I found the TTL function:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/TTL.html?icmpid=docs_dynamodb_help_panel_hp_ttl

Requires completion of #5
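
DynamoDB TTL expects a Number attribute holding a Unix epoch timestamp in seconds. A small sketch of building such an item follows; the attribute name expiresAt, the ten-minute lifetime and the item shape are assumptions, and the table's TTL setting would need to point at the same attribute.

```typescript
// Hypothetical item shape for a stored state token.
interface StateTokenItem {
  state: string;
  expiresAt: number; // Unix epoch time in seconds, as TTL requires
}

// Build an item that DynamoDB may delete once ttlSeconds have passed.
function buildStateTokenItem(
  state: string,
  ttlSeconds = 600,
  nowMs: number = Date.now(),
): StateTokenItem {
  return { state, expiresAt: Math.floor(nowMs / 1000) + ttlSeconds };
}

// TTL deletion is not immediate, so reads should still treat
// items past their expiry time as invalid.
function isExpired(item: StateTokenItem, nowMs: number = Date.now()): boolean {
  return Math.floor(nowMs / 1000) >= item.expiresAt;
}
```

Because expired items can linger for a while before DynamoDB removes them, the isExpired check on read is what actually enforces the lifetime; TTL only handles the cleanup.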

API: Remove unused token_type field from SpotifyAccessToken type

I intended to use this field for handling basic vs user-scope auth in calls, but I have since found a better solution. It's cluttering things up and is no longer needed.

API: Handle retries if Spotify API returns 429 rate limit error

Requests to Spotify API should retry on a staggered basis until a max number of retries is hit.

Need to ensure that any backend API endpoints that retry requests to the Spotify API terminate gracefully, returning an error or partial response, if the function is in danger of timing out.
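
One way the staggered retry and the timeout guard could fit together, sketched under assumptions: exponential backoff from a base delay, the Retry-After value (in seconds) used when the response carries one, and a deadline after which the last 429 response is returned instead of waiting again. All names are hypothetical.

```typescript
interface SpotifyResponse {
  status: number;
  retryAfterSeconds?: number; // parsed from the Retry-After header, when present
}

interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number;
  deadlineMs: number; // absolute time (epoch ms) by which the function must respond
  now?: () => number;
  sleep?: (ms: number) => Promise<void>;
}

// Prefer the server's Retry-After hint; otherwise back off exponentially.
function retryDelayMs(attempt: number, baseDelayMs: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  return baseDelayMs * 2 ** attempt;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function withRateLimitRetry<T extends SpotifyResponse>(
  request: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const now = options.now ?? Date.now;
  const sleep = options.sleep ?? defaultSleep;

  let response = await request();
  for (let attempt = 0; attempt < options.maxRetries && response.status === 429; attempt++) {
    const delay = retryDelayMs(attempt, options.baseDelayMs, response.retryAfterSeconds);
    // Waiting would run past the deadline: give up and return the 429.
    if (now() + delay >= options.deadlineMs) break;
    await sleep(delay);
    response = await request();
  }
  return response;
}
```

The caller still has to turn a returned 429 into an error or partial response for the client; the sketch only guarantees it gets the chance to do so before the timeout.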

API: Handle re-auth and retry if Spotify API returns 401 token expiry error

If a basic-scope token expires mid-request, it can be refreshed from within the API. (This won't happen until I implement basic token caching in the backend, as basic tokens are currently generated per function call.)

If a user-scope token expires mid-request, an error should be returned to the client so that manual or automatic re-auth can be performed.
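
The two cases can be sketched as a small wrapper around a Spotify call. The ApiResult shape, the reauthRequired flag and the single-retry policy are assumptions for illustration only.

```typescript
type TokenScope = "basic" | "user";

// Hypothetical result shape returned by the backend's Spotify calls.
interface ApiResult<T> {
  status: number;
  body?: T;
  reauthRequired?: boolean;
}

async function callWithAuthRecovery<T>(
  scope: TokenScope,
  token: string,
  call: (token: string) => Promise<ApiResult<T>>,
  refreshBasicToken: () => Promise<string>,
): Promise<ApiResult<T>> {
  const first = await call(token);
  if (first.status !== 401) return first;

  // User-scope tokens cannot be renewed here: tell the client to re-auth.
  if (scope === "user") return { status: 401, reauthRequired: true };

  // Basic-scope tokens can be regenerated server-side; retry exactly once.
  const freshToken = await refreshBasicToken();
  return call(freshToken);
}
```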

Fix hardcoded domain in nonce cookie definition

Nothing like a cruel joke you play on yourself... that was fun to troubleshoot.

I wish my branch deploys and deploy previews worked with the .netlify.app domain redirect! Will need to find a method that works long-term.

Build Private Playlist view

I don't want the Public Playlist view to require that user log in via the Spotify API.

Some users may not have a Spotify account, and others might not be comfortable granting any permissions on their Spotify account via the official API, even though the app only requires the lowest available level of read permissions.

This means that the Public Playlist view uses Spotify's client credentials auth flow, and it can only access public playlist URLs.

If a user chooses to sign in via the Spotify API, the Private Playlist view should allow the user to export their private playlists, any public playlists the user has saved, and any shared/collaborative playlists they have with other users.

They'll see a big list of the exact same playlists that are visible in their Spotify client and can trigger desired export actions from each card.

Pre-requisites

build out support for Spotify's auth code flow, returning the auth token as an http-only cookie (#5)
allow passing the auth token cookie to the lambda backend for use in Spotify API calls instead of the basic auth token (#6)
handle deletion of orphaned nonce/token pairs in DynamoDB using the TTL setting (#7)

Required work

add a getUserPlaylists endpoint that calls the Spotify API /me/playlists endpoint and retrieves a shaped list of the user's playlists.
build out PrivatePlaylistView to call the endpoint and show a playlist card for each item
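
A sketch of what "shaping" the playlist list could look like. The input interface lists only the fields this sketch reads from Spotify's playlist objects (treat the exact field set as an assumption to check against the API reference), and the PlaylistCard output shape is hypothetical.

```typescript
// Subset of a Spotify playlist object, as assumed by this sketch.
interface SpotifyPlaylistItem {
  id: string;
  name: string;
  collaborative: boolean;
  public: boolean | null;
  owner: { display_name: string | null };
  tracks: { total: number };
  images: { url: string }[] | null;
}

// Hypothetical shape returned to the client for each playlist card.
interface PlaylistCard {
  id: string;
  name: string;
  ownerName: string;
  trackCount: number;
  imageUrl: string | null;
  isPrivate: boolean;
  isCollaborative: boolean;
}

function shapePlaylists(items: SpotifyPlaylistItem[]): PlaylistCard[] {
  return items.map((item) => ({
    id: item.id,
    name: item.name,
    ownerName: item.owner.display_name ?? "Unknown",
    trackCount: item.tracks.total,
    imageUrl: item.images?.[0]?.url ?? null,
    // Only an explicit `false` marks a playlist as private here.
    isPrivate: item.public === false,
    isCollaborative: item.collaborative,
  }));
}
```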