irishprime / nielsen
Organize TV shows.
License: GNU General Public License v3.0
Some series names do not benefit from the default behavior of title-casing, and some files may be processed without wanting to change the format of the series name at all. Add a --filter/--no-filter option to allow the user to change this behavior at runtime.
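A minimal sketch of how the paired switch could be declared with argparse. The destination name `filter` and the help strings are assumptions for illustration, not Nielsen's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a paired --filter/--no-filter switch (hypothetical)."""
    parser = argparse.ArgumentParser(prog="nielsen")
    # Both flags write to the same destination. The default of None means
    # "not given on the command line", so a config-file value can still apply.
    parser.add_argument("--filter", dest="filter", action="store_true",
                        default=None, help="title-case the series name")
    parser.add_argument("--no-filter", dest="filter", action="store_false",
                        default=None, help="leave the series name untouched")
    return parser
```

Leaving the default as None lets the caller tell "flag omitted" apart from an explicit choice.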
api.process_file() calls files.set_file_ownership() and file.set_file_mode() before checking the DryRun option in the CONFIG object. This obviously results in changes to both file ownership and file permissions when none are intended.
api.organize_file() doesn't check for the DryRun option at all. This results in the attempted creation of directories in the MediaPath directory when none are intended.
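One way to keep a dry run free of side effects is to compute the destination first and return before anything touches the filesystem. The signature below is hypothetical and does not match the real api.organize_file(); the "Season NN" layout is taken from the log output elsewhere in these issues.

```python
import os
import shutil


def organize_file(path: str, media_path: str, series: str, season: int,
                  dry_run: bool = False) -> str:
    """Return the destination for path, moving the file only when not a dry run."""
    dest_dir = os.path.join(media_path, series, f"Season {season:02d}")
    dest = os.path.join(dest_dir, os.path.basename(path))
    if dry_run:
        # Report the would-be destination without creating directories,
        # moving files, or changing ownership or permissions.
        return dest
    os.makedirs(dest_dir, exist_ok=True)
    shutil.move(path, dest)
    return dest
```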
A limitation in os.rename causes failures when the source and destination paths are on different devices. Should revert to shutil.move for file organization.
The project doesn't really need to be containerized, but it's as good an excuse as any to package something up and experiment with Docker.
As Nielsen grows to be a bit more feature rich, putting everything in one file seems worse and worse. Moving the configuration to its own module should allow for more straightforward development of other modules (such as one to handle fetching episode titles or maybe dealing with anime or movies).
Move the CLI to its own module rather than having the functionality and interface so tightly coupled.
Some shows include a year as part of the title (generally to differentiate them from previously aired shows with the same name). Sometimes the year will be recognized as the season and episode number while the actual season and episode number will be used as the title. the.flash.2014.215.hdtv.mp4, for example, will yield The Flash -20.14- 215.mp4.
It's difficult to be certain if a four digit number is the year or the season and episode number when the original name doesn't use the SXXEXX format.
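A possible heuristic, sketched under the assumption that a year only counts as a year when it starts with 19 or 20 and is immediately followed by another season/episode token. The pattern and function name are illustrative and not part of Nielsen; a name containing only one four digit number stays ambiguous.

```python
import re

# Hypothetical pattern: "<series>.<year>.<season+episode>." where the year is
# 19xx or 20xx and the next token is a 3 or 4 digit season/episode number.
YEAR_THEN_EPISODE = re.compile(
    r"^(?P<series>.+?)[. ](?P<year>(?:19|20)\d{2})[. ]"
    r"(?P<season>\d{1,2})(?P<episode>\d{2})(?=[. ])"
)


def split_year(filename: str):
    """Return (series, year, season, episode), or None if the heuristic fails."""
    match = YEAR_THEN_EPISODE.match(filename)
    if match is None:
        return None
    series = match.group("series").replace(".", " ").title()
    return (series, int(match.group("year")),
            int(match.group("season")), int(match.group("episode")))
```

With this sketch, the.flash.2014.215.hdtv.mp4 is read as series "The Flash", year 2014, season 2, episode 15.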
All other options which can be specified in the config file can be overridden or otherwise set on the command line. OrganizeFiles should have a matching flag for argparser.
Multi-episode files will not match any of the current patterns (or will match incorrectly if the input file name is particularly lazy).
Sample filename: Firefly 0101 - Serenity.mkv.
This should yield: Firefly -01.01- Serenity.mkv.
The second two regular expressions in the patterns list used in the get_file_info function have re.IGNORECASE set, but there are no letters in the regular expression itself. There's no reason to specify the re.IGNORECASE flag.
The LogLevel set in the config file gets overwritten by the command line argument. This works as intended. However, since the argument has a default, it's always set and will always overwrite the value in the config file.
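A sketch of one fix: give the argument a default of None so the config value is only replaced when the flag is actually passed. The flag name `--log-level` and the WARNING fallback are assumptions.

```python
import argparse
import configparser


def resolve_log_level(argv, config: configparser.ConfigParser) -> str:
    """Prefer the command line, then the config file, then a fallback."""
    parser = argparse.ArgumentParser()
    # default=None marks "not given", so the config value is not clobbered.
    parser.add_argument("--log-level", default=None)  # hypothetical flag name
    args = parser.parse_args(argv)
    if args.log_level is not None:
        return args.log_level
    return config.get("Options", "LogLevel", fallback="WARNING")
```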
The utilities used to move/organize files do not support moving files between devices.
DEBUG Creating and/or moving to: /mnt/data/Videos/TV/Watchmen/Season 01
ERROR [Errno 18] Invalid cross-device link: 'Watchmen.S01E07.720p.WEB.h264-TBS[rarbg]/Watchmen -01.07- An Almost Religious Awe.mkv' -> '/mnt/data/Videos/TV/Watchmen/Season 01/Watchmen -01.07- An Almost Religious Awe.mkv'
Switching from os.rename to shutil.move should resolve the issue.
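For reference, a minimal sketch of the shutil.move approach. The helper name and signature are hypothetical.

```python
import shutil
from pathlib import Path


def move_file(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir, creating the directory if needed."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    # shutil.move tries a plain rename first and falls back to copying and
    # then removing the source when the rename fails, which covers the
    # cross-device case that os.rename cannot handle.
    return Path(shutil.move(str(src), str(dest)))
```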
Some patterns in the list assume two digit seasons, and therefore zero padding. Should adjust the pattern to match 1 or more digits for the season, and 2 or more digits for the episode number (since episodes are always zero padded).
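A sketch of what the relaxed pattern might look like for the SXXEXX style. The regular expression is illustrative and is not one of Nielsen's actual patterns.

```python
import re

# One or more digits for the season, two or more for the episode.
SEASON_EPISODE = re.compile(
    r"^(?P<series>.+?)[. ]+S(?P<season>\d+)E(?P<episode>\d{2,})",
    re.IGNORECASE,
)


def parse(filename: str):
    """Return (series, season, episode) or None when the pattern does not match."""
    match = SEASON_EPISODE.match(filename)
    if match is None:
        return None
    series = match.group("series").replace(".", " ")
    return series, int(match.group("season")), int(match.group("episode"))
```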
Create a TV class that implements the Media abstract base class. Metadata fields should include:
Any test which depends on the TVmaze API should use a mock and not depend on the actual external service being reachable. Additionally, some of the tests are redundant; try to trim them down to the minimal set which still tests the different types of file and series names.
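A sketch of an offline test using unittest.mock. The function signature, the injected `fetch_json` callable, and the use of the "name" field in the response are assumptions; only the URL shape is taken from the request logged elsewhere in these issues.

```python
from unittest import mock


def get_episode_title(show_id: int, season: int, episode: int, fetch_json) -> str:
    """Look up an episode title; fetch_json takes a URL and returns decoded JSON."""
    url = (f"http://api.tvmaze.com/shows/{show_id}/episodebynumber"
           f"?season={season}&number={episode}")
    return fetch_json(url)["name"]  # assumed response field


def test_get_episode_title_offline():
    # The mock stands in for the network call, so the test never leaves the machine.
    fake_fetch = mock.Mock(return_value={"name": "Dragonstone"})
    assert get_episode_title(82, 7, 1, fake_fetch) == "Dragonstone"
    fake_fetch.assert_called_once_with(
        "http://api.tvmaze.com/shows/82/episodebynumber?season=7&number=1")
```

Passing the fetcher in as an argument is one design choice; patching the HTTP client in place would work as well.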
Show titles such as "DC's Legends of Tomorrow" or "Marvel's Agents of S.H.I.E.L.D." can be problematic or annoying.
An all lowercase file, for example, would yield "Dc'S Legends Of Tomorrow" instead of "DC's Legends of Tomorrow" due to the nature of the str.title() function. Some shows sometimes also include a year with the title (helpful on IMDB, generally less useful for me).
A means of detecting these titles and manipulating them into a form more consistent with their better-named counterparts would be very useful.
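One simple approach is a lookup table of known exceptions that is consulted before falling back to str.title(). The table contents and function name below are illustrative only.

```python
# Hypothetical override table, keyed by the lowercased series name.
SERIES_OVERRIDES = {
    "dc's legends of tomorrow": "DC's Legends of Tomorrow",
}


def format_series(raw: str) -> str:
    """Title-case a series name unless a hand-written override exists."""
    cleaned = " ".join(raw.replace(".", " ").split())
    return SERIES_OVERRIDES.get(cleaned.lower(), cleaned.title())
```

Names that contain meaningful periods, such as S.H.I.E.L.D., would need an override keyed on the raw form, since this sketch treats periods as separators.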
If the config file cannot be loaded, references to CONFIG['Options'] throw a KeyError. The DEFAULT section works properly for missing values, but an Options section must first exist.
WARNING:root:Unable to load config.
Traceback (most recent call last):
File "/home/media/downloads/nielsen/nielsen.py", line 172, in <module>
main()
File "/home/media/downloads/nielsen/nielsen.py", line 157, in main
level=getattr(logging, CONFIG['Options']['LogLevel'], 30))
File "/usr/lib/python3.5/configparser.py", line 956, in __getitem__
raise KeyError(key)
KeyError: 'Options'
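A sketch of a loader that guarantees the Options section exists even when no file could be read. The function name and signature are assumptions.

```python
import configparser


def load_config(paths) -> configparser.ConfigParser:
    """Read the first available config file and make sure [Options] exists."""
    config = configparser.ConfigParser()
    config.read(paths)  # files that cannot be opened are silently skipped
    if not config.has_section("Options"):
        # Without this, config["Options"] raises KeyError on a failed load.
        config.add_section("Options")
    return config
```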
The Wiki has some outdated references from when the project was using the omdb.py module and lacks documentation for the new interactive option. Review and update as needed.
titles.get_series_id prompts the user for an interactive selection if more than one result is returned by the TVmaze API. This interactive action should be optional and/or handled by the caller.
Traceback (most recent call last):
File "/home/irish/.local/bin/nielsen", line 11, in <module>
load_entry_point('Nielsen==0.9.6', 'console_scripts', 'nielsen')()
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/api.py", line 253, in main
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/api.py", line 161, in process_file
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/api.py", line 70, in get_file_info
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/titles.py", line 58, in get_episode_title
File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/api.py", line 51, in imdbid
File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/api.py", line 23, in get
File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/client.py", line 104, in get
File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/client.py", line 60, in request
File "/home/irish/.local/lib/python3.6/site-packages/requests-2.13.0-py3.6.egg/requests/models.py", line 909, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: http://www.omdbapi.com/?i=tt4052886&page=1&Season=02&Episode=04&plot=short&tomatoes=False
If the OMDB API call fails, the HTTPError exception is unhandled and the process fails altogether rather than continuing on with the information it has available.
A file was renamed incorrectly. Game.of.Thrones.S06E07.1080p.HDTV.6CH.ShAaNiG.mkv was handled by the filter (?P<series>.+)\.+S?(?P<season>\d{2})\.?E?(?P<episode>\d{2})\.*(?P<title>.*)?\.+(?P<extension>\w+)$ and was renamed to Game of Thrones S06E07 -10.80- P.mkv.
Due to #53, I changed OrganizeFiles to False in nielsen.ini and processed a large number of files which were already organized, but which were missing episode titles.
# From /srv/tv/<Series>
nielsen **/*
This correctly moved into all the Season * directories and processed the files therein, but then moved them all to the /srv/tv/<Series> directory from which nielsen was called.
If OrganizeFiles is False, the files should be renamed in place.
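A sketch of the intended behavior using pathlib: when organizing is off, only the final path component changes. The function and its parameters are hypothetical.

```python
from pathlib import Path


def destination(src: Path, new_name: str, organize: bool,
                media_path: Path, series: str, season: str) -> Path:
    """Pick the target path for a renamed file."""
    if organize:
        return media_path / series / f"Season {season}" / new_name
    # Rename in place: keep the file's current directory, swap only the name.
    return src.with_name(new_name)
```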
Create a Protocol for MetadataFetcher classes defining how they should be used and associated with the Media classes.
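A rough sketch of what such a Protocol could look like with typing.Protocol. The `fetch` method, its return type, and the stand-in Media dataclass are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Protocol, runtime_checkable


@dataclass
class Media:
    """Stand-in for the real Media class: a file location plus metadata."""
    path: Path
    metadata: dict = field(default_factory=dict)


@runtime_checkable
class MetadataFetcher(Protocol):
    """Anything with a matching fetch() method satisfies the protocol."""

    def fetch(self, media: Media) -> dict:
        ...


class StaticFetcher:
    """Toy fetcher that returns canned metadata, handy in tests."""

    def __init__(self, data: dict):
        self.data = data

    def fetch(self, media: Media) -> dict:
        return dict(self.data)
```

Because the check is structural, fetchers do not need to inherit from anything.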
While determining file information, renaming, and even organizing files should work fine on Windows, chown and chmod are not available and will probably throw OSErrors (or fail in some other way).
I have no idea where Windows will try to load the config file from, or if the xdg package will even exist (probably not).
[HorribleSubs] Drifters - 01v2 [720p].mkv would be better as Drifters -01- Fight Song [HorribleSubs 720p v2].mkv.
Rename the titles module to tv and move all TV-related functions to it (identifying a series, fetching episode titles, etc.).
It feels a little lazy and looks kind of strange to import modules from other parts of the project with an implicit package reference (import .<module>). Besides, The Zen of Python says that explicit is better than implicit, so just import modules with the actual package prefix.
When the files to operate on are within a subdirectory, the containing dirname is included when setting the new file destination.
2017-09-18 19:54:01,177 INFO Processing 'Game.of.Thrones.S07E01.720p.HDTV.x264-AVS[rarbg]/Game.of.Thrones.S07E01.720p.HDTV.x264-AVS.mkv'
2017-09-18 19:54:01,183 INFO Show ID for 'Game of Thrones': 82
2017-09-18 19:54:01,183 INFO Series ID: 82, Season: 07, Episode: 01
2017-09-18 19:54:01,186 DEBUG Starting new HTTP connection (1): api.tvmaze.com
2017-09-18 19:54:01,608 DEBUG http://api.tvmaze.com:80 "GET /shows/82/episodebynumber?season=07&number=01 HTTP/1.1" 200 None
2017-09-18 19:54:01,610 INFO Title: Dragonstone
2017-09-18 19:54:01,610 DEBUG Series: 'Game of Thrones'
2017-09-18 19:54:01,610 DEBUG Season: '07'
2017-09-18 19:54:01,610 DEBUG Episode: '01'
2017-09-18 19:54:01,610 DEBUG Title: 'Dragonstone'
2017-09-18 19:54:01,611 DEBUG Extension: 'mkv'
2017-09-18 19:54:01,611 INFO Rename to: 'Game of Thrones -07.01- Dragonstone.mkv'
2017-09-18 19:54:01,611 DEBUG Creating and/or moving to: /home/irish/Videos/TV/Game of Thrones/Season 07
2017-09-18 19:54:01,615 ERROR [Errno 2] No such file or directory: '/home/irish/Videos/TV/Game of Thrones/Season 07/Game.of.Thrones.S07E01.720p.HDTV.x264-AVS[rarbg]/Game of Thrones -07.01- Dragonstone.mkv'
The DEFAULT section made sense when there was only one section in the ConfigParser, but now that there are three sections, it doesn't really make sense anymore. The DEFAULT section places User, Group, Mode, etc. in every section of the config unnecessarily.
Explicitly set the default options for the Options section and get rid of the DEFAULT section.
Using the --no-organize option raises an error.
Traceback (most recent call last):
File "/home/irish/.local/bin/nielsen", line 11, in <module>
load_entry_point('Nielsen==1.0.1', 'console_scripts', 'nielsen')()
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-1.0.1-py3.6.egg/nielsen/api.py", line 230, in main
File "/usr/lib/python3.6/configparser.py", line 1192, in set
self._validate_value_types(option=option, value=value)
File "/usr/lib/python3.6/configparser.py", line 1177, in _validate_value_types
raise TypeError("option values must be strings")
TypeError: option values must be strings
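ConfigParser.set only accepts string values, so a boolean from argparse has to be converted before it is stored. A minimal sketch, with a hypothetical helper name:

```python
import configparser


def apply_overrides(config: configparser.ConfigParser, section: str,
                    overrides: dict) -> None:
    """Copy command line overrides into the config, converting to strings."""
    for option, value in overrides.items():
        if value is None:
            continue  # option was not given on the command line
        # str(False) is "False", which getboolean() reads back correctly.
        config.set(section, option, str(value))
```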
Almost everything that uses a config file lets you specify a config file when invoked. Add an option to the argparser.
Create a base Media class for other classes to inherit from, extend, or implement.
The class should hold a location of the file it represents as well as metadata (e.g. Series Name, Season, Episode Number, Episode Title for TV; Artist, Album, Track Number, Track Title for Music, etc.).
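A sketch of such a base class using the abc module. The `filename` method and the metadata keys are assumptions; the output format follows the naming shown elsewhere in these issues.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Media(ABC):
    """Base class: the location of a file plus a bag of metadata."""

    def __init__(self, path):
        self.path = Path(path)
        self.metadata = {}

    @abstractmethod
    def filename(self) -> str:
        """Build the new filename from the metadata."""


class TV(Media):
    """A TV episode with series, season, episode, and title metadata."""

    def filename(self) -> str:
        m = self.metadata
        return (f"{m['series']} -{m['season']:02d}.{m['episode']:02d}- "
                f"{m['title']}{self.path.suffix}")
```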
Add all newly selected IMDB IDs to config file.
Use the pathlib module for most file-level operations rather than a combination of os and shutil functions. The current implementation feels hacky and not as portable as it should be.
While I think <Series> -<Season>.<Episode>- <Title>.<Extension> is the best filename format, not everyone agrees. Users should be able to configure how they want to name their files.
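One lightweight option is a str.format template stored in the config. The placeholder names below are assumptions that mirror the fields in the default format.

```python
# Default template matching <Series> -<Season>.<Episode>- <Title>.<Extension>.
DEFAULT_FORMAT = "{series} -{season}.{episode}- {title}.{extension}"


def build_filename(info: dict, template: str = DEFAULT_FORMAT) -> str:
    """Fill a user-supplied template with the parsed file information."""
    return template.format(**info)
```

A user who prefers another layout could then set a template such as "{series} S{season}E{episode}.{extension}".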
Some files are named so poorly that it's either difficult to write a pattern to match them, or simply not worth doing because the odds of encountering them are so low. Add a pattern (or several) which attempts only to match the series name and season/episode numbers. With just that information, the rest can be fetched and the new file name can be constructed.
The project is now (slightly) larger than a single script, and installing with pip will now place nielsen.py, config.py, and titles.py directly in the site-packages directory, which is just bad form and asking for problems.
Create a Nielsen package and reorganize the files slightly to make distribution cleaner.
Add a --dry-run option that outputs the would-be new filename without actually renaming or organizing the files.
Because there are logging calls in load_config(), the logging.basicConfig() call which occurs after the load_config() call does nothing. No logging calls should be made prior to logging.basicConfig().
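Besides reordering the calls, Python 3.8 and later offer `force=True`, which replaces any handlers that an earlier logging call installed implicitly. A sketch, assuming a helper named configure_logging:

```python
import logging


def configure_logging(level_name: str) -> None:
    """Configure the root logger even if something has already logged."""
    level = getattr(logging, level_name.upper(), logging.WARNING)
    # force=True removes handlers added by earlier implicit configuration,
    # so this call takes effect regardless of ordering (Python 3.8+).
    logging.basicConfig(level=level, force=True)
```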
Rather than parsing command line options and overriding options in main, handle all configuration options in the config module.
Some episode titles have characters in them which aren't valid in filenames on some filesystems (e.g. colons and slashes).
Implement a means of filtering episode titles so that troublesome characters or substrings can be replaced when renaming files.
Traceback (most recent call last):
File "/home/irish/.local/bin/nielsen", line 11, in <module>
load_entry_point('Nielsen==1.0.3', 'console_scripts', 'nielsen')()
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-1.0.3-py3.6.egg/nielsen/api.py", line 266, in main
File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-1.0.3-py3.6.egg/nielsen/api.py", line 184, in process_file
FileNotFoundError: [Errno 2] No such file or directory: 'Brooklyn Nine-Nine/Season 02/Brooklyn.Nine-Nine.S02E20.HDTV.x264-ASAP.mp4' -> 'Brooklyn Nine-Nine/Season 02/Brooklyn Nine-Nine -02.20- AC/DC.mp4'
The forward slash in the episode title (AC/DC) causes the file rename operation to fail because a filename cannot have a path separator character in it. Invalid filename characters should be filtered in some way, either by removing them or replacing them with a different character (a hyphen would probably be fine in nearly all circumstances).
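A minimal sketch of such a filter. The set of characters treated as invalid is an assumption that covers the common Windows and POSIX offenders.

```python
import re

# Characters that are invalid or troublesome in filenames on common filesystems.
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def sanitize_title(title: str, replacement: str = "-") -> str:
    """Replace characters that cannot safely appear in a filename."""
    return INVALID_CHARS.sub(replacement, title).strip()
```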
nielsen.py Downton Abbey/Downton.Abbey.S06E01.mp4
INFO:Processing 'Downton Abbey/Downton.Abbey.S06E01.mp4'
INFO:Series: 'Downton Abbey/Downton Abbey'
INFO:Season: '06'
INFO:Episode: '01'
INFO:Title: ''
INFO:Extension: 'mp4'
INFO:Rename to: 'Downton Abbey/Downton Abbey -06.01- .mp4'
DEBUG:Creating and/or moving to: /srv/tv/Downton Abbey/Downton Abbey/Season 06
IMDB episode fetching features are working well, but the user may not always be online, which would cause a failure. Allow the user to enable/disable at runtime.
If a file has already been processed by Nielsen and successfully renamed, Nielsen can no longer recognize the file. Not handling the filename format that the program itself produces seems a major oversight and would prevent the user from organizing files that have already been renamed.
Add multiple patterns for various filename formats and use the first one that matches.
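A sketch of the first-match approach with two illustrative patterns: one for the format Nielsen itself produces and one for the SXXEXX style. The patterns and the function name are assumptions, not the project's actual list.

```python
import re

PATTERNS = [
    # Nielsen's own output: "Series -01.02- Title.ext"
    re.compile(r"^(?P<series>.+) -(?P<season>\d{2})\.(?P<episode>\d{2})- "
               r"(?P<title>.*)\.(?P<extension>\w+)$"),
    # Scene-style names: "Series.S01E02.Title.ext"
    re.compile(r"^(?P<series>.+?)[. ]S(?P<season>\d{2})E(?P<episode>\d{2})"
               r"[. ]?(?P<title>.*)\.(?P<extension>\w+)$", re.IGNORECASE),
]


def match_first(filename: str):
    """Return the groups from the first matching pattern, or None."""
    for pattern in PATTERNS:
        match = pattern.match(filename)
        if match:
            return match.groupdict()
    return None
```

Ordering matters: more specific patterns go first so that looser ones cannot claim a name they would parse incorrectly.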
As one might expect, the process_file function raises a FileNotFoundError in cases where the file isn't found. Handle the exception more gracefully and continue with the rest of the files rather than crashing.
Do not rename or move files if the destination already exists.
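Both behaviors, skipping existing destinations and surviving missing sources, can be sketched in one small helper. The names are hypothetical and the log messages are illustrative.

```python
import logging
import os


def safe_rename(src: str, dest: str) -> bool:
    """Rename src to dest, refusing to overwrite and tolerating missing files."""
    if os.path.exists(dest):
        logging.warning("Destination exists, skipping: %s", dest)
        return False
    try:
        os.rename(src, dest)
    except FileNotFoundError as err:
        # Log and carry on so one bad path does not stop the whole run.
        logging.error("%s", err)
        return False
    return True


def process_all(pairs) -> list:
    """Attempt every (src, dest) pair and report which ones succeeded."""
    return [safe_rename(src, dest) for src, dest in pairs]
```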
Many files don't have an episode title included, but there are plenty of places on the internet where the episode title could be found. Make use of one of them.