martijnboers / blottertrax
/r/listentothis submissions reddit bot
License: GNU General Public License v3.0
Looking at new posts in /r/listentothis
I see a lot of SoundCloud links. It would be nice if this service were also implemented so it can be used to enforce the rules.
A requirement to test the bot is that as a first step it should send modmail instead of deleting submissions when they exceed the threshold.
Should it also not reply with the artist bio?
This will help users understand any false positives that might occur.
python src/main.py
To prevent things such as 'Various Artists' from being flagged for too many Last.fm plays:
https://stackoverflow.com/questions/63051943/logging-with-multiprocessing-in-docker
Find a better way of logging than printing everywhere. Database logging could also be achieved by using Python's logging package.
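A minimal sketch of what switching from print calls to the standard logging package could look like. The logger name "blottertrax" and the format string are assumptions for illustration, not taken from the repository; database logging would need an additional custom logging.Handler subclass, which is not shown here.

```python
import logging
import sys


def configure_logging(level=logging.INFO):
    """Return a logger that writes to stdout, which suits `docker logs`.

    The logger name "blottertrax" is a hypothetical choice.
    """
    logger = logging.getLogger("blottertrax")
    logger.setLevel(level)
    # Only attach a handler once, so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = configure_logging()
log.info("Looking at new posts in /r/listentothis")
```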
Hi,
This is a feature request really: Have you considered including track information?
I was thinking of trying to build up a playlist based on the top daily tracks and I stumbled across your code here. My thinking is I could enhance each of your service objects to return standard info (artist, track, album, streamcount, url, confidence_level). Then you could put the track URLs in the message body. Another script could then scrape these up and build a playlist for each platform.
Have you got any interest in this idea?
The Last.fm summary is licensed under CC BY-SA, which means the bot has to link to the license to be allowed to use it.
The issue can be solved either by having the bot state the license with a link to CC BY-SA or, if possible, by switching to Discogs, as they use CC0.
Ends up with the following contained in its markdown:
[wikipedia](https://en.wikipedia.org/wiki/Caravan_(band))
When in reality it needs to be
[wikipedia](https://en.wikipedia.org/wiki/Caravan_(band\))
We should probably escape any closing parens contained within links so that reddit will properly parse the URL.
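The escaping described above can be sketched as follows. The function names are hypothetical and not taken from the repository. The negative lookbehind skips parentheses that are already escaped, so running the helper twice does not double the backslash.

```python
import re


def escape_markdown_url(url):
    """Backslash-escape every ')' that is not already escaped."""
    return re.sub(r"(?<!\\)\)", r"\\)", url)


def markdown_link(label, url):
    """Build a Markdown link whose URL survives reddit's link parsing."""
    return "[{}]({})".format(label, escape_markdown_url(url))


print(markdown_link("wikipedia", "https://en.wikipedia.org/wiki/Caravan_(band)"))
```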
I set up Dependabot for this repository so dependencies don't get outdated
Right now if we want to change say the YouTube listeners limit we are forced to change source code and recreate the docker image. Instead, maybe keep the defaults in source but allow a way to override it in our configuration file.
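One possible shape for this, as a sketch only: defaults live in source and an optional section of the configuration file overrides them. The section name "thresholds", the key names, and the numeric values are all made up for illustration. The bot's real configuration layout may differ.

```python
import configparser

# Hypothetical defaults; the real limits live in the bot's source.
DEFAULT_THRESHOLDS = {
    "youtube_listeners": 500000,
    "lastfm_plays": 250000,
}


def load_thresholds(config_text):
    """Merge overrides from an INI-style config over the built-in defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    thresholds = dict(DEFAULT_THRESHOLDS)
    if parser.has_section("thresholds"):
        for key in thresholds:
            if parser.has_option("thresholds", key):
                thresholds[key] = parser.getint("thresholds", key)
    return thresholds


print(load_thresholds("[thresholds]\nyoutube_listeners = 100000\n"))
```

With this approach, changing a limit only requires editing the mounted configuration file and restarting the container, not rebuilding the image.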
I'm a GPL kind of guy but this is open to discussion
I want to get some insight into why the bot can't find an artist, so failed attempts should get saved
Example: https://youtu.be/XnZuDiF1w8I
The YouTube service should adapt to this, as it currently fails:
Lines 36 to 39 in 2457a4e
This could possibly be caused by a double newline character, as they get filtered out here
Lines 51 to 52 in 1331c76
Some examples:
Cut off at a weird position; there are actually two bands with this name https://www.reddit.com/r/listentothis/comments/f6hrys/vaz_visiting_hours_noise_rock_2013/fi4t8ml/?context=3
Only shows one artist https://www.reddit.com/r/listentothis/comments/f6dot5/aviators_ghosts_of_our_fathers_dark_alternative/fi4s5ay?utm_source=share&utm_medium=web2x
Same here https://www.reddit.com/r/listentothis/comments/f6bqmf/eggplant_pity_alternative1989/
Don't know what the best approach is, maybe the API exposes when an artist name has multiple records and it can be formatted differently
The function _extract_artist_post_title currently matches based on " -" appearing in the post title. However, this could be within the artist name itself. Look into making this function more robust in the hopes of not hitting false matches.
/r/listentothis currently requires double dashes in the title, but experience tells me many users do not follow this and use a single dash, so we can't necessarily rely on that.
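One way to make the split less fragile, sketched under assumptions: try the double dash first, then fall back to a single dash only when it has whitespace on both sides, so hyphens inside an artist name are left alone. The function name split_title and the example titles are hypothetical. This is not the current behaviour of _extract_artist_post_title.

```python
import re

# Ordered from most to least trustworthy separator.
SEPARATORS = (r"\s+--\s+", r"\s+-\s+")


def split_title(title):
    """Return (artist, rest) or None when no separator is found."""
    for pattern in SEPARATORS:
        parts = re.split(pattern, title, maxsplit=1)
        if len(parts) == 2:
            return parts[0].strip(), parts[1].strip()
    return None


print(split_title("Some-Artist -- Some Song [Rock] (2019)"))
print(split_title("Some-Artist - Some Song [Rock] (2019)"))
```

An artist name that itself contains a dash surrounded by spaces would still be split incorrectly, so this reduces false matches but does not eliminate them.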
Now it will crash because the config isn't set in the workflow
Dependabot couldn't authenticate with https://pypi.python.org/simple/.
You can provide authentication details in your Dependabot dashboard by clicking into the account menu (in the top right) and selecting 'Config variables'.
Ensure all function names describe what they actually do. Preferably add a comment above each function describing its intended purpose.
For instance: perhaps exceeds_threshold should be named something more descriptive, as it returns a full object rather than a boolean. "get_artist_info" or "get_service_info" or similar?
DescriptionProvider uses the track title instead of the album title in the template. Patch included.
diff --git a/blottertrax/description_provider.py b/blottertrax/description_provider.py
index 85e2901..b24800f 100644
--- a/blottertrax/description_provider.py
+++ b/blottertrax/description_provider.py
@@ -33,6 +33,7 @@ class DescriptionProvider:
recording = result['recording-list'][0]
artist = self._get_artist_by_id(recording.get('artist-credit')[0]['artist']['id'])
+ album_title = ArrayUtil.safe_list_get(recording, recording['title'], 'release-list', 0, 'title')
album_release_date = ArrayUtil.safe_list_get(recording, False, 'release-list', 0, 'date')
life_span_begin = ArrayUtil.safe_list_get(artist, '?', 'life-span', 'begin')
life_span_end = ArrayUtil.safe_list_get(artist, 'now', 'life-span', 'end')
@@ -55,7 +56,7 @@ class DescriptionProvider:
return templates.musicbrainz_artist_info.strip().format(
artist['name'],
life_span,
- recording['title'],
+ album_title,
album_release_date,
tags,
'none' if not socials else socials,
Looks like we need to test song_title for null in TitleParser before we try any further ops. Otherwise, we sometimes crash when the parsing fails out.
Traceback (most recent call last):
  File "./main.py", line 108, in daemon
    self._run()
  File "./main.py", line 53, in _run
    parsed_submission = TitleParser.create_parsed_submission_from_submission(submission)
  File "/usr/src/app/blottertrax/helper/title_parser.py", line 46, in create_parsed_submission_from_submission
    song_title = re.sub(r"\(.*\)", "", song_title)
  File "/usr/local/lib/python3.7/re.py", line 192, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "./main.py", line 108, in daemon
    self._run()
  File "./main.py", line 53, in _run
    parsed_submission = TitleParser.create_parsed_submission_from_submission(submission)
  File "/usr/src/app/blottertrax/helper/title_parser.py", line 46, in create_parsed_submission_from_submission
    song_title = re.sub(r"\(.*\)", "", song_title)
  File "/usr/local/lib/python3.7/re.py", line 192, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "./main.py", line 116, in <module>
    BlotterTrax().daemon()
  File "./main.py", line 112, in daemon
    self.daemon()
  File "./main.py", line 112, in daemon
    self.daemon()
  File "./main.py", line 112, in daemon
    self.daemon()
  [Previous line repeated 328 more times]
  File "./main.py", line 110, in daemon
    traceback.print_exc(file=sys.stdout)
  File "/usr/local/lib/python3.7/traceback.py", line 163, in print_exc
    print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
  File "/usr/local/lib/python3.7/traceback.py", line 104, in print_exception
    type(value), value, tb, limit=limit).format(chain=chain):
  File "/usr/local/lib/python3.7/traceback.py", line 497, in __init__
    _seen=_seen)
  File "/usr/local/lib/python3.7/traceback.py", line 497, in __init__
    _seen=_seen)
  File "/usr/local/lib/python3.7/traceback.py", line 497, in __init__
    _seen=_seen)
  [Previous line repeated 328 more times]
  File "/usr/local/lib/python3.7/traceback.py", line 508, in __init__
    capture_locals=capture_locals)
  File "/usr/local/lib/python3.7/traceback.py", line 333, in extract
    limit = getattr(sys, 'tracebacklimit', None)
RecursionError: maximum recursion depth exceeded while calling a Python object
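The traceback points at two separate problems: re.sub receives a song_title that is not a string, and daemon() restarts itself by recursing from its own exception handler until the recursion limit is hit. A hedged sketch of both fixes follows. The helper names strip_parenthesised and run_daemon are hypothetical, and the real daemon would presumably loop with `while True` rather than a fixed iteration count.

```python
import re
import sys
import traceback


def strip_parenthesised(song_title):
    """Remove "(...)" from a title, tolerating a failed parse (None)."""
    if song_title is None:
        return None
    # .strip() is an addition here to drop the space left before "(".
    return re.sub(r"\(.*\)", "", song_title).strip()


def run_daemon(run_once, max_iterations):
    """Retry in a loop instead of recursing, so the stack never grows."""
    failures = 0
    for _ in range(max_iterations):
        try:
            run_once()
        except Exception:
            traceback.print_exc(file=sys.stdout)
            failures += 1
    return failures


print(strip_parenthesised("Motion (2019)"))
print(strip_parenthesised(None))
```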
The artist description pulled from Discogs has a link for lyrics. This link redirects multiple times and ends up on an ad-filled hellscape of a website.
Maybe we should whitelist a handful of sites and remove any links that don't match that whitelist? For instance, the main socials, amazon, wiki, etc.
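A sketch of such a whitelist check, assuming links are filtered by hostname. The domain list is illustrative only and would need to be agreed on. Matching on the parsed hostname, rather than on a substring of the URL, avoids accepting look-alike domains.

```python
from urllib.parse import urlparse

# Hypothetical starting list; not taken from the repository.
ALLOWED_DOMAINS = {
    "facebook.com",
    "twitter.com",
    "instagram.com",
    "bandcamp.com",
    "soundcloud.com",
    "wikipedia.org",
    "amazon.com",
}


def is_allowed(url):
    """True when the URL's host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


links = [
    "https://en.wikipedia.org/wiki/Caravan_(band)",
    "https://lyrics.example.com/some-artist",
]
print([link for link in links if is_allowed(link)])
```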
Basically the same as #61 but for streaming services
Add the MusicBrainz link to ‘Socials’ - although it is down by ‘submit corrections’, for a hot second I couldn’t find the MB link. This would be consistent with the ‘other databases’ links there already.
Disclaimer: I work part-time for the MetaBrainz foundation - I see this tool pop up a lot on my Reddit alerts :)
Feel free to decline the change, of course!
Would like to see us try to detect self-promotion. One simple step would be to take the artist name, remove all spaces, lowercase it, and compare it to the lowercased reddit account name. Perhaps if the artist name appears in the account name we report the post for mods to double-check.
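The comparison described above could look roughly like this. The function name is hypothetical, and the check is intentionally naive: short artist names will produce false positives, which is why the result should only trigger a report for moderators and not a removal.

```python
def looks_like_self_promotion(artist_name, reddit_username):
    """True when the space-stripped, lowercased artist name occurs in the username."""
    normalised_artist = "".join(artist_name.split()).lower()
    # An empty artist name would match every username, so reject it.
    return bool(normalised_artist) and normalised_artist in reddit_username.lower()


print(looks_like_self_promotion("Some Artist", "SomeArtist_Official"))
print(looks_like_self_promotion("Some Artist", "music_fan42"))
```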
Need to determine how we are intending to host the bot.
Hosting provider: Amazon EC2 Micro? Digital Ocean?
Organize login methods for @martijnboers and @Nedlinin for maintenance needs.
Continuous deployment? Docker container desired?
Traceback (most recent call last):
  File "./main.py", line 133, in daemon
    self._run()
  File "./main.py", line 63, in _run
    self._reply_with_sticky_post(submission, self.description_provider.get_reply(parsed_submission))
  File "/usr/src/app/blottertrax/description_provider.py", line 51, in get_reply
    recording['release-list'][0]['date'],
KeyError: 'date'
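The crash comes from indexing a release that has no 'date' key. A defensive lookup avoids it. The helper below is a hypothetical stand-in whose argument order (data, default, then the path) is assumed from the way ArrayUtil.safe_list_get is called in the patch shown earlier. It is not the repository's implementation.

```python
def safe_get(data, default, *path):
    """Walk nested dicts and lists, returning default if any step is missing."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# A recording whose first release carries no date, as in the traceback.
recording = {"title": "Sludge", "release-list": [{"title": "Sludge"}]}
print(safe_get(recording, False, "release-list", 0, "date"))
```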
https://www.reddit.com/r/listentothis/comments/frnzfw/squid_sludge_experimental_rock_2020/flwrnku/
This contains
socials: bandcamp, discogs, free streaming, purchase for download, social network, social network, social network, soundcloud
as the reply. Should be fairly trivial to check the URL for containing facebook, twitter, instagram, etc and replace the text with the actual social network being linked to rather than forcing the user to click/hover each.
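A sketch of the URL check suggested above, assuming each social entry comes with both a generic label and a URL. The mapping and the function name are illustrative only.

```python
from urllib.parse import urlparse

# Hypothetical mapping from domain to the label shown in the reply.
KNOWN_NETWORKS = {
    "facebook.com": "facebook",
    "twitter.com": "twitter",
    "instagram.com": "instagram",
}


def label_for(url, fallback):
    """Return the specific network name for a URL, else the generic label."""
    host = (urlparse(url).hostname or "").lower()
    for domain, label in KNOWN_NETWORKS.items():
        if host == domain or host.endswith("." + domain):
            return label
    return fallback


print(label_for("https://www.facebook.com/example", "social network"))
print(label_for("https://example.org/band", "social network"))
```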
If the repost period has passed for a submission it makes sense to remove data from the database to keep the size small
Users should be able to delete an incorrect artist description by downvoting the comment
Some inspiration https://github.com/MrPowerScripts/reddit-karma-farming-bot/blob/master/src/reddit.py#L180-L208
Found some small issues with artist/song title tests that we might want to try to find a way to pass.
('007Bonez feat Adro - Motion [Hip-Hop / Rap] (2019)', '007Bonez feat Adro', None, 'Motion'),
This should be
('007Bonez feat Adro - Motion [Hip-Hop / Rap] (2019)', '007Bonez', 'Adro', 'Motion'),
But, worse yet, artists that actually have & in the name are detected as a featured artist. For example:
'Simon & Garfunkel - The Sound of Silence'
Would list the main artist as Simon and the featured artist as Garfunkel.
Not sure there will be an easy way to solve that second one though.
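One possible compromise, sketched here as an assumption and not as the parser's current logic: treat only explicit markers such as "feat", "ft" or "featuring" as introducing a featured artist, and never split on "&". This fixes both test cases above, at the cost of no longer detecting collaborations written with "&".

```python
import re

# Explicit feature markers only; "&" is deliberately not included.
FEATURE_PATTERN = re.compile(r"\s+(?:feat\.?|ft\.?|featuring)\s+", re.IGNORECASE)


def split_featured(artist):
    """Return (main_artist, featured_artist or None)."""
    parts = FEATURE_PATTERN.split(artist, maxsplit=1)
    if len(parts) == 2:
        return parts[0].strip(), parts[1].strip()
    return artist.strip(), None


print(split_featured("007Bonez feat Adro"))
print(split_featured("Simon & Garfunkel"))
```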
Now it shows collection albums etc if there's more than one release for the track
https://github.com/martijnboers/BlotterTrax/blob/master/blottertrax/description_provider.py#L36