
nielsen's People

Contributors

irishprime, rojoraf

Forkers

thomas-holmes

nielsen's Issues

Add (no-)filter option

Some series names do not benefit from the default behavior of title-casing and some files may be processed without wanting to change the format of the series name at all. Add a --filter/--no-filter option to allow the user to change this behavior at runtime.
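A minimal sketch of how such a flag pair could be declared with argparse. The `build_parser` and `format_series` names are hypothetical and not taken from the project; `argparse.BooleanOptionalAction` requires Python 3.9 or newer.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing --filter / --no-filter."""
    parser = argparse.ArgumentParser(prog="nielsen")
    # BooleanOptionalAction generates both --filter and --no-filter.
    parser.add_argument(
        "--filter",
        action=argparse.BooleanOptionalAction,
        default=None,  # None means "not given", so the config file can decide
        help="title-case and filter the series name",
    )
    return parser


def format_series(name, use_filter=True):
    """Title-case the series name unless filtering is disabled."""
    return name.title() if use_filter else name
```

Leaving the default at `None` lets the caller tell "flag not passed" apart from an explicit `--no-filter`.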

Dry-run is not so dry

api.process_file() calls files.set_file_ownership() and files.set_file_mode() before checking the DryRun option in the CONFIG object. As a result, a dry run still changes both file ownership and file permissions.

api.organize_file() doesn't check for the DryRun option at all. This results in the attempted creation of directories in the MediaPath directory when none are intended.
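One way to keep a dry run dry is to check the flag before any call that touches the file system. The sketch below uses a hypothetical `apply_file_changes` helper rather than the project's actual functions.

```python
import os


def apply_file_changes(path, mode, dry_run):
    """Return the planned actions; perform them only when dry_run is False."""
    actions = [f"chmod {mode:o} {path}"]
    if dry_run:
        # Report what would happen, but leave the file untouched.
        return actions
    os.chmod(path, mode)
    return actions
```

The same guard would apply to ownership changes and directory creation.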

Renaming files across devices fails

A limitation in os.rename causes failures when the source and destination paths are on different devices. Should revert to shutil.move for file organization.

Build a Docker Container

The project doesn't really need to be containerized, but it's as good an excuse as any to package something up and experiment with Docker.

Move config code to its own module

As Nielsen grows to be a bit more feature rich, putting everything in one file seems worse and worse. Moving the configuration to its own module should allow for more straightforward development of other modules (such as one to handle fetching episode titles or maybe dealing with anime or movies).

Make CLI its own module

Move the CLI to its own module rather than having the functionality and interface so tightly coupled.

Year sometimes incorrectly used as season and episode number

Some shows include a year as part of the title (generally to differentiate them from previously aired shows with the same name). Sometimes the year will be recognized as the season and episode number while the actual season and episode number will be used as the title.

the.flash.2014.215.hdtv.mp4, for example, will yield The Flash -20.14- 215.mp4.

It's difficult to be certain if a four digit number is the year or the season and episode number when the original name doesn't use the SXXEXX format.

Add command line option to set OrganizeFiles

All other options which can be specified in the config file can be overridden or otherwise set on the command line. OrganizeFiles should have a matching flag in the argparse parser.

re.IGNORECASE set for regular expressions with no letters

The second two regular expressions in the patterns list used in the get_file_info function have re.IGNORECASE set, but there are no letters in the regular expression itself. There's no reason to specify the re.IGNORECASE flag.

LogLevel not set properly from config file

The LogLevel set in the config file gets overwritten by the command line argument. This works as intended. However, since the argument has a default, it's always set and will always overwrite the value in the config file.
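A possible fix is to give the argument no default, so that `None` signals "not passed" and the config value survives. The `--log-level` flag name and the `resolve_log_level` helper are assumptions made for illustration.

```python
import argparse
import configparser


def resolve_log_level(argv, config):
    """Prefer the command line, then the config file, then WARNING."""
    parser = argparse.ArgumentParser()
    # No default: None tells us the flag was not given at all.
    parser.add_argument("--log-level", default=None)
    args = parser.parse_args(argv)
    if args.log_level is not None:
        return args.log_level.upper()
    return config.get("Options", "LogLevel", fallback="WARNING")
```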

Invalid cross-device link

The utilities used to move/organize files do not support moving files between devices.

DEBUG    Creating and/or moving to: /mnt/data/Videos/TV/Watchmen/Season 01
ERROR    [Errno 18] Invalid cross-device link: 'Watchmen.S01E07.720p.WEB.h264-TBS[rarbg]/Watchmen -01.07- An Almost Religious Awe.mkv' -> '/mnt/data/Videos/TV/Watchmen/Season 01/Watchmen -01.07- An Almost Religious Awe.mkv'

Switching from os.rename to shutil.move should resolve the issue.
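A short sketch of the shutil.move approach. The `organize` helper is hypothetical; `shutil.move` itself falls back to copying and then removing the source when a plain rename is not possible, for example across file systems.

```python
import shutil
from pathlib import Path


def organize(source, destination_dir):
    """Move source into destination_dir, creating the directory if needed."""
    source = Path(source)
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the final destination path.
    moved = shutil.move(str(source), str(destination_dir / source.name))
    return Path(moved)
```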

Patterns assume two-digit seasons

Some patterns in the list assume two digit seasons, and therefore zero padding. Should adjust the pattern to match 1 or more digits for the season, and 2 or more digits for the episode number (since episodes are always zero padded).
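As a rough illustration, a pattern along these lines accepts one or more season digits and two or more episode digits. It is a simplified stand-in, not one of the project's actual patterns, and `parse_episode` is a hypothetical helper.

```python
import re

# Simplified stand-in pattern: S<1+ digits>E<2+ digits>.
EPISODE_PATTERN = re.compile(
    r"(?P<series>.+?)\.+S(?P<season>\d+)E(?P<episode>\d{2,})"
    r"\.*(?P<title>.*?)\.+(?P<extension>\w+)$",
    re.IGNORECASE,
)


def parse_episode(filename):
    """Return (season, episode) zero-padded to two digits, or None."""
    match = EPISODE_PATTERN.match(filename)
    if match is None:
        return None
    season = f"{int(match.group('season')):02d}"
    return season, match.group("episode")
```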

Create `TV` Class

Create a TV class that implements the Media abstract base class. Metadata fields should include:

  • Series name
  • Season number
  • Episode number
  • Episode title
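The fields listed above could be sketched as a dataclass. The `Media` base shown here is only a stand-in for the abstract base class the issue refers to, and the `filename` method is an assumed interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Media(ABC):
    """Stand-in abstract base: holds the location of the file it represents."""
    path: Path

    @abstractmethod
    def filename(self) -> str:
        """Return the new file name built from the metadata."""


@dataclass
class TV(Media):
    series: str
    season: int
    episode: int
    title: str = ""

    def filename(self) -> str:
        return (f"{self.series} -{self.season:02d}.{self.episode:02d}- "
                f"{self.title}{self.path.suffix}")
```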

Mock tests for TVmaze API

Any test which depends on the TVmaze API should use a mock and not depend on the actual external service being reachable. Additionally, some of the tests are redundant; trim them down to the minimal set which still tests the different types of file and series names.
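A sketch of what a mocked test could look like with unittest.mock. `TVMazeClient` is a hypothetical wrapper, not the project's code; the request path mirrors the one seen in the project's logs, and reading the title from a `name` field is an assumption about the response.

```python
import json
from unittest import mock
from urllib.request import urlopen


class TVMazeClient:
    """Hypothetical thin wrapper around the TVmaze HTTP API."""
    BASE = "http://api.tvmaze.com"

    def fetch(self, path):
        with urlopen(self.BASE + path) as response:
            return json.load(response)

    def episode_title(self, series_id, season, episode):
        path = f"/shows/{series_id}/episodebynumber?season={season}&number={episode}"
        # Assumption: the episode title is in the "name" field.
        return self.fetch(path)["name"]


def test_episode_title_uses_mock():
    # Patch the network call so the test never leaves the machine.
    with mock.patch.object(TVMazeClient, "fetch",
                           return_value={"name": "Dragonstone"}) as fake:
        assert TVMazeClient().episode_title(82, 7, 1) == "Dragonstone"
        fake.assert_called_once_with(
            "/shows/82/episodebynumber?season=7&number=1")
```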

Add filter for series names

Show titles such as "DC's Legends of Tomorrow" or "Marvel's Agents of S.H.I.E.L.D." can be problematic or annoying.

An all lowercase file, for example, would yield "Dc'S Legends Of Tomorrow" instead of "DC's Legends of Tomorrow" due to the nature of the str.title() function. Some shows sometimes also include a year with the title (helpful on IMDB, generally less useful for me).

A means of detecting these titles and manipulating them into a form more consistent with their better-named counterparts would be very useful.
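To illustrate the problem and one simple remedy, the sketch below checks a lookup table of known troublesome names before falling back to str.title(). The `SERIES_OVERRIDES` table and `filter_series` function are hypothetical.

```python
# Hypothetical table of names that str.title() gets wrong.
SERIES_OVERRIDES = {
    "dc's legends of tomorrow": "DC's Legends of Tomorrow",
    "marvel's agents of s.h.i.e.l.d.": "Marvel's Agents of S.H.I.E.L.D.",
}


def filter_series(name):
    """Return the preferred spelling if known, else a title-cased name."""
    return SERIES_OVERRIDES.get(name.lower(), name.title())


# str.title() capitalizes the letter after the apostrophe:
naive = "dc's legends of tomorrow".title()
```

A table like this could live in the config file so users can add their own entries.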

KeyError if unable to load config

If the config file cannot be loaded, references to CONFIG['Options'] throw a KeyError. The DEFAULT section works properly for missing values, but an Options section must first exist.

WARNING:root:Unable to load config.
Traceback (most recent call last):
  File "/home/media/downloads/nielsen/nielsen.py", line 172, in <module>
    main()
  File "/home/media/downloads/nielsen/nielsen.py", line 157, in main
    level=getattr(logging, CONFIG['Options']['LogLevel'], 30))
  File "/usr/lib/python3.5/configparser.py", line 956, in __getitem__
    raise KeyError(key)
KeyError: 'Options'
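One way to avoid the KeyError is to seed the parser with an Options section before reading any file, so the section exists even when nothing loads. This is a sketch with a hypothetical `load_config`; the WARNING default mirrors the fallback value of 30 in the traceback.

```python
import configparser


def load_config(paths):
    """Always return a parser that has an Options section."""
    config = configparser.ConfigParser()
    # Seed defaults first; values read from files override them.
    config.read_dict({"Options": {"LogLevel": "WARNING"}})
    # read() silently skips files that do not exist.
    config.read(paths)
    return config
```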

Wiki Out of Date

The Wiki has some outdated references from when the project was using the omdb.py module and lacks documentation for the new interactive option. Review and update as needed.

Make interactive series selection optional

titles.get_series_id prompts the user for an interactive selection if more than one result is returned by the TVmaze API. This interactive action should be optional and/or handled by the caller.

Fails to rename file if OMDB API call fails

Traceback (most recent call last):
  File "/home/irish/.local/bin/nielsen", line 11, in <module>
    load_entry_point('Nielsen==0.9.6', 'console_scripts', 'nielsen')()
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/api.py", line 253, in main
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/api.py", line 161, in process_file
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/api.py", line 70, in get_file_info
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-0.9.6-py3.6.egg/nielsen/titles.py", line 58, in get_episode_title
  File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/api.py", line 51, in imdbid
  File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/api.py", line 23, in get
  File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/client.py", line 104, in get
  File "/home/irish/.local/lib/python3.6/site-packages/omdb-0.7.0-py3.6.egg/omdb/client.py", line 60, in request
  File "/home/irish/.local/lib/python3.6/site-packages/requests-2.13.0-py3.6.egg/requests/models.py", line 909, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: http://www.omdbapi.com/?i=tt4052886&page=1&Season=02&Episode=04&plot=short&tomatoes=False

If the OMDB API call fails, the HTTPError exception is unhandled and the process fails altogether rather than continuing on with the information it has available.

Pattern mishandled file

A file was renamed incorrectly. Game.of.Thrones.S06E07.1080p.HDTV.6CH.ShAaNiG.mkv was handled by the filter (?P<series>.+)\.+S?(?P<season>\d{2})\.?E?(?P<episode>\d{2})\.*(?P<title>.*)?\.+(?P<extension>\w+)$ and was renamed to Game of Thrones S06E07 -10.80- P.mkv.
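The behavior can be reproduced directly: the greedy `.+` for the series swallows S06E07, and the optional S and E let 1080 match as season 10, episode 80. The lazy variant below is only one possible direction for a fix, not necessarily what the project adopted.

```python
import re

NAME = "Game.of.Thrones.S06E07.1080p.HDTV.6CH.ShAaNiG.mkv"

# The pattern quoted in the issue, with a greedy series group.
GREEDY = re.compile(
    r"(?P<series>.+)\.+S?(?P<season>\d{2})\.?E?(?P<episode>\d{2})"
    r"\.*(?P<title>.*)?\.+(?P<extension>\w+)$"
)

# Same pattern with a lazy series group, so the first candidate wins.
LAZY = re.compile(
    r"(?P<series>.+?)\.+S?(?P<season>\d{2})\.?E?(?P<episode>\d{2})"
    r"\.*(?P<title>.*)?\.+(?P<extension>\w+)$"
)

greedy = GREEDY.match(NAME).groupdict()
lazy = LAZY.match(NAME).groupdict()
```

Requiring the S and E markers instead of making them optional would be a stricter alternative.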

Files may still be moved even if OrganizeFiles is False

Due to #53, I changed OrganizeFiles to False in nielsen.ini and processed a large number of files which were already organized, but which were missing episode titles.

# From /srv/tv/<Series>
nielsen **/*

This correctly moved into all the Season * directories and processed the files therein, but then moved them all to the /srv/tv/<Series> directory from which nielsen was called.

If OrganizeFiles is False, the files should be renamed in place.
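Renaming in place amounts to keeping the file's parent directory and swapping only the final name. A minimal pathlib sketch, with `rename_in_place` as a hypothetical helper (Path.rename returns the new path on Python 3.8+):

```python
from pathlib import Path


def rename_in_place(path, new_name):
    """Rename a file without moving it out of its current directory."""
    path = Path(path)
    # with_name() keeps the parent directory and replaces the final component.
    return path.rename(path.with_name(new_name))
```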

Add Windows support

While determining file information, renaming, and even organizing files should work fine on Windows, chown and chmod are not available and will probably throw OSErrors (or fail in some other way).

I have no idea where Windows will try to load the config file from, or if the xdg package will even exist (probably not).

Handle anime

[HorribleSubs] Drifters - 01v2 [720p].mkv would be better as
Drifters -01- Fight Song [HorribleSubs 720p v2].mkv.
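A rough sketch of a pattern for this naming style. The regular expression and the `rename_anime` helper are illustrative guesses; the episode title is passed in because it is not part of the original file name.

```python
import re

# Illustrative pattern: [Group] Series - NNvN [Quality].ext
ANIME_PATTERN = re.compile(
    r"^\[(?P<group>[^\]]+)\]\s*(?P<series>.+?)\s+-\s+(?P<episode>\d+)"
    r"(?:v(?P<version>\d+))?\s*\[(?P<quality>[^\]]+)\]\.(?P<extension>\w+)$"
)


def rename_anime(filename, title):
    """Build the new name, or return None if the pattern does not match."""
    match = ANIME_PATTERN.match(filename)
    if match is None:
        return None
    info = match.groupdict()
    tags = [info["group"], info["quality"]]
    if info["version"]:
        tags.append(f"v{info['version']}")
    return (f"{info['series']} -{info['episode']}- {title} "
            f"[{' '.join(tags)}].{info['extension']}")
```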

Rename title module to tv

Rename the title module to tv and move all TV-related functions to it (identifying a series, fetching episode titles, etc.).

Use explicit package names

It feels a little lazy and looks kind of strange to import modules from other parts of the project with an implicit package reference (import .<module>). Besides, The Zen of Python says that explicit is better than implicit, so just import modules with the actual package prefix.

File destination erroneously includes dirname

When the files to operate on are within a subdirectory, the containing dirname is included when setting the new file destination.

2017-09-18 19:54:01,177 INFO     Processing 'Game.of.Thrones.S07E01.720p.HDTV.x264-AVS[rarbg]/Game.of.Thrones.S07E01.720p.HDTV.x264-AVS.mkv'
2017-09-18 19:54:01,183 INFO     Show ID for 'Game of Thrones': 82
2017-09-18 19:54:01,183 INFO     Series ID: 82, Season: 07, Episode: 01
2017-09-18 19:54:01,186 DEBUG    Starting new HTTP connection (1): api.tvmaze.com
2017-09-18 19:54:01,608 DEBUG    http://api.tvmaze.com:80 "GET /shows/82/episodebynumber?season=07&number=01 HTTP/1.1" 200 None
2017-09-18 19:54:01,610 INFO     Title: Dragonstone
2017-09-18 19:54:01,610 DEBUG    Series: 'Game of Thrones'
2017-09-18 19:54:01,610 DEBUG    Season: '07'
2017-09-18 19:54:01,610 DEBUG    Episode: '01'
2017-09-18 19:54:01,610 DEBUG    Title: 'Dragonstone'
2017-09-18 19:54:01,611 DEBUG    Extension: 'mkv'
2017-09-18 19:54:01,611 INFO     Rename to: 'Game of Thrones -07.01- Dragonstone.mkv'
2017-09-18 19:54:01,611 DEBUG    Creating and/or moving to: /home/irish/Videos/TV/Game of Thrones/Season 07
2017-09-18 19:54:01,615 ERROR    [Errno 2] No such file or directory: '/home/irish/Videos/TV/Game of Thrones/Season 07/Game.of.Thrones.S07E01.720p.HDTV.x264-AVS[rarbg]/Game of Thrones -07.01- Dragonstone.mkv'

DEFAULT section in ConfigParser being used incorrectly

The DEFAULT section made sense when there was only one section in the ConfigParser, but now that there are three sections, it doesn't really make sense anymore. The DEFAULT section places User, Group, Mode, etc. in every section of the config unnecessarily.

Explicitly set the default options for the Options section and get rid of the DEFAULT section.
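The leakage is easy to demonstrate. In this sketch the `Filters` section name and the option values are made up; only the DEFAULT behavior itself is standard configparser.

```python
import configparser


def with_default_section():
    config = configparser.ConfigParser()
    config.read_dict({
        "DEFAULT": {"User": "media", "Mode": "644"},
        "Options": {"LogLevel": "INFO"},
        "Filters": {},  # hypothetical second section
    })
    return config


def with_explicit_options():
    config = configparser.ConfigParser()
    config.read_dict({
        "Options": {"User": "media", "Mode": "644", "LogLevel": "INFO"},
        "Filters": {},
    })
    return config
```

With the DEFAULT section, `User` is visible from `Filters` as well; with explicit defaults it stays in `Options` only.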

--no-organize passes invalid type to configparser

Using the --no-organize option raises an error.

Traceback (most recent call last):
  File "/home/irish/.local/bin/nielsen", line 11, in <module>
    load_entry_point('Nielsen==1.0.1', 'console_scripts', 'nielsen')()
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-1.0.1-py3.6.egg/nielsen/api.py", line 230, in main
  File "/usr/lib/python3.6/configparser.py", line 1192, in set
    self._validate_value_types(option=option, value=value)
  File "/usr/lib/python3.6/configparser.py", line 1177, in _validate_value_types
    raise TypeError("option values must be strings")
TypeError: option values must be strings
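ConfigParser.set() only accepts string values, so a boolean from the command line has to be converted first and read back with getboolean(). A minimal sketch, with `set_organize` as a hypothetical helper:

```python
import configparser


def set_organize(config, value):
    """Store a boolean option as a string, as ConfigParser requires."""
    config.set("Options", "OrganizeFiles", str(value))


config = configparser.ConfigParser()
config.add_section("Options")
set_organize(config, False)
# getboolean() understands "False", "no", "0", "off" and their counterparts.
organize_files = config.getboolean("Options", "OrganizeFiles")
```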

Create base Media class

Create a base Media class for other classes to inherit from, extend, or implement.

The class should hold a location of the file it represents as well as metadata (e.g. Series Name, Season, Episode Number, Episode Title for TV; Artist, Album, Track Number, Track Title for Music, etc.).

Use pathlib for file operations

Use the pathlib module for most file-level operations rather than a combination of os and shutil functions. The current implementation feels hacky and not as portable as it should be.

Customize file names

While I think <Series> -<Season>.<Episode>- <Title>.<Extension> is the best filename format, not everyone agrees. Users should be able to configure how they want to name their files.
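One simple way to make the format configurable is a str.format template. The placeholder names and the `build_filename` helper below are assumptions, not an existing option.

```python
# Default template reproducing the current naming scheme.
DEFAULT_FORMAT = "{series} -{season}.{episode}- {title}.{extension}"


def build_filename(info, template=DEFAULT_FORMAT):
    """Fill the template from a dict of file information."""
    return template.format(**info)


info = {
    "series": "Game of Thrones",
    "season": "07",
    "episode": "01",
    "title": "Dragonstone",
    "extension": "mkv",
}
```

A user who prefers another scheme could then supply something like `"{series} S{season}E{episode} {title}.{extension}"`.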

Make a last ditch effort to find a series name and season/episode numbers

Some files are named so poorly that it's either difficult to write a pattern to match them, or simply not worth doing because the odds of encountering them are so low. Add a pattern (or several) which attempts only to match the series name and season/episode numbers. With just that information, the rest can be fetched and the new file name can be constructed.

Create an actual package

The project is now (slightly) larger than a single script, and installing with pip will now place nielsen.py, config.py, and titles.py directly in the site-packages directory, which is just bad form and asking for problems.

Create a Nielsen package and reorganize the files slightly to make distribution cleaner.

Logging is not being configured correctly

Because there are logging calls in load_config(), the logging.basicConfig() call which occurs after the load_config() call does nothing. No logging calls should be made prior to logging.basicConfig().
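One possible arrangement is for the loader to collect its messages and let the caller emit them once logging is configured. Both functions below are hypothetical stand-ins; `force=True` (Python 3.8+) is included so that any handlers installed by earlier logging calls are replaced.

```python
import logging


def load_config(path):
    """Stand-in loader: returns settings plus deferred log messages."""
    messages = [(logging.DEBUG, f"Loaded config from {path}")]
    settings = {"LogLevel": "DEBUG"}  # placeholder for parsed values
    return settings, messages


def main(path="nielsen.ini"):
    settings, messages = load_config(path)
    level = getattr(logging, settings["LogLevel"], logging.WARNING)
    # Configure logging before emitting anything.
    logging.basicConfig(level=level, force=True)
    for message_level, text in messages:
        logging.log(message_level, text)
    return logging.getLogger().level
```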

Filter Episode Titles

Some episode titles have characters in them which aren't valid in filenames on some filesystems (e.g. colons and slashes).

Implement a means of filtering episode titles so that troublesome characters or substrings can be replaced when renaming files.

Invalid filename characters in episode titles cause renaming to fail

Traceback (most recent call last):
  File "/home/irish/.local/bin/nielsen", line 11, in <module>
    load_entry_point('Nielsen==1.0.3', 'console_scripts', 'nielsen')()
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-1.0.3-py3.6.egg/nielsen/api.py", line 266, in main
  File "/home/irish/.local/lib/python3.6/site-packages/Nielsen-1.0.3-py3.6.egg/nielsen/api.py", line 184, in process_file
FileNotFoundError: [Errno 2] No such file or directory: 'Brooklyn Nine-Nine/Season 02/Brooklyn.Nine-Nine.S02E20.HDTV.x264-ASAP.mp4' -> 'Brooklyn Nine-Nine/Season 02/Brooklyn Nine-Nine -02.20- AC/DC.mp4'

The forward slash in the episode title (AC/DC) causes the file rename operation to fail because a filename cannot have a path separator character in it. Invalid filename characters should be filtered in some way, either by removing them or replacing them with a different character (a hyphen would probably be fine in nearly all circumstances).
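A small sketch of the replacement approach. The character set is an assumption that covers path separators plus characters commonly rejected on Windows file systems, and `sanitize_title` is a hypothetical name.

```python
import re

# Assumed set of troublesome characters: \ / : * ? " < > |
INVALID_CHARS = re.compile(r'[\\/:*?"<>|]')


def sanitize_title(title, replacement="-"):
    """Replace characters that are unsafe in file names."""
    return INVALID_CHARS.sub(replacement, title).strip()
```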

Processing files within folders adds the folder to the series name

Command

nielsen.py Downton Abbey/Downton.Abbey.S06E01.mp4

Log

INFO:Processing 'Downton Abbey/Downton.Abbey.S06E01.mp4'
INFO:Series: 'Downton Abbey/Downton Abbey'
INFO:Season: '06'
INFO:Episode: '01'
INFO:Title: ''
INFO:Extension: 'mp4'
INFO:Rename to: 'Downton Abbey/Downton Abbey -06.01- .mp4'
DEBUG:Creating and/or moving to: /srv/tv/Downton Abbey/Downton Abbey/Season 06
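A likely remedy is to match only the final path component. The sketch below uses a simplified pattern and a hypothetical `series_from_path` helper to show the idea.

```python
import re
from pathlib import PurePath

# Simplified pattern for illustration only.
PATTERN = re.compile(
    r"(?P<series>.+?)\.S(?P<season>\d{2})E(?P<episode>\d{2})", re.IGNORECASE
)


def series_from_path(path):
    """Extract the series name from the file name, ignoring directories."""
    name = PurePath(path).name  # drop any containing directories
    match = PATTERN.match(name)
    if match is None:
        return None
    return match.group("series").replace(".", " ")
```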

Add command line options for IMDB

IMDB episode fetching features are working well, but the user may not always be online, which would cause a failure. Allow the user to enable/disable at runtime.

Add multiple patterns for determining episode data

If a file has already been processed by Nielsen and successfully renamed, Nielsen can no longer recognize the file. Not handling the filename format that the program itself produces seems a major oversight and would prevent the user from organizing files that have already been renamed.

Add multiple patterns for various filename formats and use the first one that matches.
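A sketch of the first-match approach, with one pattern for Nielsen's own output format and one for a common release format. Both expressions are simplified illustrations rather than the project's actual patterns.

```python
import re

PATTERNS = [
    # Nielsen's own output: "Series -SS.EE- Title.ext"
    re.compile(
        r"(?P<series>.+)\s+-(?P<season>\d{2})\.(?P<episode>\d{2})-"
        r"\s*(?P<title>.*)\.(?P<extension>\w+)$"
    ),
    # Common release naming: "Series.S01E02.Title.ext"
    re.compile(
        r"(?P<series>.+?)\.+S(?P<season>\d{2})E(?P<episode>\d{2})"
        r"\.*(?P<title>.*?)\.+(?P<extension>\w+)$",
        re.IGNORECASE,
    ),
]


def get_file_info(filename):
    """Return the groups from the first matching pattern, or None."""
    for pattern in PATTERNS:
        match = pattern.match(filename)
        if match:
            return match.groupdict()
    return None
```

Ordering matters here: more specific patterns should come before more permissive ones.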

Retrieve missing episode titles

Many files don't have an episode title included, but there are plenty of places on the internet where the episode title could be found. Make use of one of them.
