
acousticbrainz-client's People

Contributors

alastair, freso, ianmcorvidae, jesseweinstein, jonnyjd, legoktm, mayhem, mineo, rsh7, zas

acousticbrainz-client's Issues

Offline mode

It would be nice if there were an offline mode and a batch-submit option. My internet access randomly drops, causing the script to die, so I currently have it running inside a bash loop.
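A rough sketch of how an offline mode could work (nothing below is existing abzsubmit behaviour, and the queue location is an assumption mirroring the existing ~/.abzsubmit directory): run the extractor as usual, but write each result to a local queue instead of POSTing it, then have a separate batch-submit pass upload whatever has accumulated.

    import json
    import os

    # Hypothetical queue directory for offline mode.
    QUEUE_DIR = os.path.join(os.path.expanduser("~"), ".abzsubmit", "queue")

    def queue_features(mbid, features):
        """Store extracted features locally for a later batch submission."""
        if not os.path.isdir(QUEUE_DIR):
            os.makedirs(QUEUE_DIR)
        with open(os.path.join(QUEUE_DIR, "%s.json" % mbid), "w") as f:
            json.dump(features, f)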

sqlite error: database is locked

When running multiple instances of abzsubmit in parallel, I occasionally hit the following error:

Traceback (most recent call last):
  File "./abzsubmit", line 22, in <module>
    main(args.p)
  File "./abzsubmit", line 14, in main
    acousticbrainz.process(path)
  File "/home/cwalton/Development/Musicbrainz/acousticbrainz-client/abz/acousticbrainz.py", line 143, in process
    process_file(path)
  File "/home/cwalton/Development/Musicbrainz/acousticbrainz-client/abz/acousticbrainz.py", line 117, in process_file
    add_to_filelist(filepath)
  File "/home/cwalton/Development/Musicbrainz/acousticbrainz-client/abz/acousticbrainz.py", line 28, in add_to_filelist
    r = c.execute(query, (filepath.decode("utf-8"), reason))
sqlite3.OperationalError: database is locked
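One possible mitigation, sketched here as a guess rather than a change that exists in the client: open the sqlite connection with a busy timeout so that concurrent abzsubmit instances wait for the lock instead of failing immediately.

    import os
    import sqlite3

    dbfile = os.path.join(os.path.expanduser("~"), ".abzsubmit", "filelog.sqlite")
    # timeout is in seconds; sqlite3 waits up to that long for another process
    # to release the lock before raising "database is locked".
    conn = sqlite3.connect(dbfile, timeout=30)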

Provide pypi package

setup.py already exists, so it would be great if you could take the additional step of pushing the package to PyPI (or, if that's not possible, provide a Docker image with instructions).

connection error in linux

Hey,
I'm running the submitter client on Ubuntu 14.04 and I get connection errors almost constantly.
A typical run would go something like this:

hasty@simplex:~/abzsubmit-0.1$ ./abzsubmit ../Music/

[... ] processing /home/hasty/Music
...
... a bunch of files listed ...
...
[:) done ] /home/hasty/Music/Autechre/Oversteps/14 - Yuop.m4a
[:) done ] /home/hasty/Music/Autechre/Oversteps/10 - d-sho qub.m4a
[:) done ] /home/hasty/Music/Autechre/Oversteps/4 - pt2ph8.m4a
[:) done ] /home/hasty/Music/Autechre/Envane/3 - Laughing Quarter.mp3
[:) done ] /home/hasty/Music/Autechre/Envane/2 - Latent Quarter.mp3
[:) done ] /home/hasty/Music/Autechre/Envane/4 - Draun Quarter.mp3
[:) ] /home/hasty/Music/Autechre/Envane/1 - Goz Quarter.mp3
[:) ] /home/hasty/Music/Autechre/Exai/11 - nodezsh.flac
[:) ] /home/hasty/Music/Autechre/Exai/16 - recks on.flac
[:) ] /home/hasty/Music/Autechre/Exai/12 - runrepik.flac
[...      ] /home/hasty/Music/Autechre/Exai/6 - vekoS.flac
Traceback (most recent call last):
  File "./abzsubmit", line 24, in <module>
    main(sys.argv[1:])
  File "./abzsubmit", line 15, in main
    acousticbrainz.process(path)
  File "/home/hasty/abzsubmit-0.1/abz/acousticbrainz.py", line 165, in process
    process_directory(path)
  File "/home/hasty/abzsubmit-0.1/abz/acousticbrainz.py", line 155, in process_directory
    process_file(os.path.abspath(os.path.join(dirpath, f)))
  File "/home/hasty/abzsubmit-0.1/abz/acousticbrainz.py", line 131, in process_file
    submit_features(recid, features)
  File "/home/hasty/abzsubmit-0.1/abz/acousticbrainz.py", line 91, in submit_features
    r = requests.post(url, data=featstr)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 88, in post
    return request('post', url, data=data, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='acousticbrainz.org', port=80): Max retries exceeded with url: /154585c1-f167-41bc-a134-d9a4f691ba83/low-level (Caused by <class 'socket.error'>: [Errno 32] Broken pipe)

It will generally process up to 20-ish files before this happens. I have checked the actual connection - at least as far as pinging acousticbrainz.org goes, there are no issues. Also, a Windows machine on the same LAN has no problems submitting.
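One workaround worth sketching (this is not the client's actual code; the function below is hypothetical) would be to wrap the requests.post call in submit_features in a small retry loop so that a single broken pipe doesn't abort the whole run:

    import time
    import requests

    def post_with_retries(url, data, attempts=5, delay=5):
        """Retry the submission POST a few times with a growing delay."""
        for attempt in range(1, attempts + 1):
            try:
                return requests.post(url, data=data, timeout=30)
            except requests.exceptions.ConnectionError:
                if attempt == attempts:
                    raise
                time.sleep(delay * attempt)  # simple linear backoff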

Possibly track full status of files in sqlite log

Currently, the sqlite log file essentially stores two states: unprocessed (the file is not in the DB) or processed (it is), with an optional reason why it's marked processed (so, roughly, a third 'failed' state).

This is sufficient for the log, and for keeping track that a particular file failed due to an extractor error (and thus should be retried, though at present it will only be retried if it is manually deleted from the database). However, it might be useful to keep track of a more complete state, possibly along the lines of:

  • check for file in DB; if marked currently processing (perhaps with a PID and/or timestamp to check against for validity) or if marked completed/failed (perhaps including the essentia build sha and possibly a timestamp, to check for things that should be re-tried) go to next file
  • if not, mark the file as processing (by this PID, or at this timestamp). perhaps store a hash of the file so it can be checked if changed in the future, as well
  • check for MBID, store if there was none in the file as a "failed, no MBID" state
  • otherwise process with essentia, when done mark completed but not submitted (maybe including what temporary filename it's in), or mark as failed if applicable
  • submit to server, delete temporary file, mark completed

This would let multiple processes work on the same set of files, for example, since a file marked as currently processing would be skipped by other workers. Storing things like timestamps, PIDs, essentia build hashes, and file hashes could let us do more automatically, such as retrying files that failed because of extractor issues once a new extractor is in use, or re-checking files that are retagged but not renamed (or renamed but otherwise unchanged).
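A rough sketch of what such a state table might look like (the table and column names are illustrative, not the client's actual schema):

    import sqlite3

    conn = sqlite3.connect("filelog.sqlite")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS filelog (
            filename     TEXT PRIMARY KEY,
            filehash     TEXT,           -- detect retagged/renamed files
            state        TEXT NOT NULL,  -- 'processing', 'extracted', 'submitted', 'failed'
            reason       TEXT,           -- e.g. 'nombid' or an extractor error
            pid          INTEGER,        -- worker that claimed the file
            essentia_sha TEXT,           -- extractor build used
            updated      TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    """)
    conn.commit()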

Overkill, useful, somewhere in between?

todo list

  • multicore
  • fingerprint lookup to get mbids for files with no tag
  • make output/logging better
  • better error reporting
  • Compile extractor with small static libav
  • fix lossless detection
  • make the processed file storage more optimal (#7)
  • look for the extractor better, including in $PATH, and with different arch suffixes
  • setup.py, pypi
  • fix the readme

--help for usage

It’s hard to use a tool where you can’t check on the command line how it works.
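A minimal sketch of what this could look like (the argument names are assumptions, not the client's current interface); argparse generates --help automatically, so abzsubmit --help would then print usage and the available options:

    import argparse

    def parse_args():
        parser = argparse.ArgumentParser(
            prog="abzsubmit",
            description="Extract acoustic features from audio files and "
                        "submit them to the AcousticBrainz server.")
        parser.add_argument("paths", nargs="+",
                            help="files or directories to process")
        return parser.parse_args()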

64-bit streaming_extractor_music build doesn't work on Arch Linux

If I try to use https://acousticbrainz.org/static/download/essentia-extractor-v2.1_beta2-linux-x86_64.tar.gz on Arch Linux, I consistently get:

Process step: Read metadata
Process step: Compute md5 audio hash and codec
Process step: Replay gain
Process step: Compute audio features
fish: 'streaming_extractor_music Sofia…' terminated by signal SIGSEGV (Address boundary error)

Any chance for a new (64-bit) build anytime soon? (I'm not sure whether to report this here or for essentia. The erroring code is essentia, but the particular essentia build is the only one supported for abzsubmit.)

`invalid or missing encoding declaration for 'streaming_extractor_music'` (Py3)

It looks like Python 3 is trying to read streaming_extractor_music as Python source? Possibly (well, probably) related to #23.

(venv)freso@koume> python setup.py install
running install
running build
running build_py
running build_scripts
Traceback (most recent call last):
  File "/home/freso/Development/AcousticBrainz/venv/lib/python3.4/tokenize.py", line 375, in find_cookie
    line_string = line.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa8 in position 40: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 23, in <module>
    "Topic :: Scientific/Engineering :: Information Analysis"
  File "/usr/lib64/python3.4/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib64/python3.4/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/lib64/python3.4/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/lib64/python3.4/distutils/command/install.py", line 539, in run
    self.run_command('build')
  File "/usr/lib64/python3.4/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib64/python3.4/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/lib64/python3.4/distutils/command/build.py", line 126, in run
    self.run_command(cmd_name)
  File "/usr/lib64/python3.4/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/usr/lib64/python3.4/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/usr/lib64/python3.4/distutils/command/build_scripts.py", line 50, in run
    self.copy_scripts()
  File "/usr/lib64/python3.4/distutils/command/build_scripts.py", line 82, in copy_scripts
    encoding, lines = tokenize.detect_encoding(f.readline)
  File "/home/freso/Development/AcousticBrainz/venv/lib/python3.4/tokenize.py", line 416, in detect_encoding
    encoding = find_cookie(first)
  File "/home/freso/Development/AcousticBrainz/venv/lib/python3.4/tokenize.py", line 380, in find_cookie
    raise SyntaxError(msg)
SyntaxError: invalid or missing encoding declaration for 'streaming_extractor_music'
(venv)[1] freso@koume> python --version
Python 3.4.2
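The traceback suggests distutils' build_scripts step is trying to tokenize the compiled streaming_extractor_music binary because it is listed in scripts= in setup.py. A hedged sketch of one possible fix (the actual setup.py contents are assumptions here) would be to keep only the Python entry point in scripts= and ship the binary some other way:

    from setuptools import setup

    setup(
        name="abzsubmit",
        version="0.1",
        packages=["abz"],
        package_data={"abz": ["default.conf"]},
        # Only the Python entry point belongs in scripts=; the compiled
        # streaming_extractor_music binary would need to be shipped via
        # data_files (or located at runtime) so build_scripts never tries
        # to parse it as Python source.
        scripts=["abzsubmit"],
    )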

Recognise "Not found" as an error and don't mark it as submitted

> abzsubmit .

[...       ] processing /tmp/freso-tmp
[...       ] /tmp/freso-tmp/Various Artists - Fly Girls_ B‐Boys Beware - Revenge of the Super Female Rappers_ _Disc 2 of 2_ (2009) [FLAC]/10. Roxanne Shanté - Bite This.flac
[:( submit ] /tmp/freso-tmp/Various Artists - Fly Girls_ B‐Boys Beware - Revenge of the Super Female Rappers_ _Disc 2 of 2_ (2009) [FLAC]/10. Roxanne Shanté - Bite This.flac
{
  "message": "Not found"
}

[:)        ] /tmp/freso-tmp/Various Artists - Fly Girls_ B‐Boys Beware - Revenge of the Super Female Rappers_ _Disc 2 of 2_ (2009) [FLAC]/10. Roxanne Shanté - Bite This.flac
[...       ] /tmp/freso-tmp/Various Artists - Fly Girls_ B‐Boys Beware - Revenge of the Super Female Rappers_ _Disc 2 of 2_ (2009) [FLAC]/09. Tina B - Jazzy Sensation.flac
[:( submit ] /tmp/freso-tmp/Various Artists - Fly Girls_ B‐Boys Beware - Revenge of the Super Female Rappers_ _Disc 2 of 2_ (2009) [FLAC]/09. Tina B - Jazzy Sensation.flac
{
  "message": "Not found"
}

[:)        ] /tmp/freso-tmp/Various Artists - Fly Girls_ B‐Boys Beware - Revenge of the Super Female Rappers_ _Disc 2 of 2_ (2009) [FLAC]/09. Tina B - Jazzy Sensation.flac
(…)

The file is saved to the log DB with reason set to NULL even though the submission endpoint returned "Not found".

(See also https://tickets.metabrainz.org/browse/AB-368 )
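A hedged sketch of the kind of check the submission code could do (the function below is illustrative, not the client's actual submit_features): treat any non-200 response as a failure and record the server's message as the reason, instead of marking the file submitted.

    import requests

    def submit_features(url, featstr):
        """Only treat HTTP 200 as a successful submission."""
        r = requests.post(url, data=featstr)
        if r.status_code != 200:
            # A 404 with {"message": "Not found"} should be recorded as a
            # failure reason, not logged as a completed submission.
            try:
                message = r.json().get("message", r.text)
            except ValueError:
                message = r.text
            raise RuntimeError("submission failed: %s" % message)
        return r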

abzsubmit support for checking against server before running essentia

The server doesn't currently keep everything that's submitted, just one entry per MBID -- so a submission only makes a change if you're replacing lossy with lossless or submitting something new. It might speed things up to let abzsubmit parse the tags, check against the server, and only run essentia and resubmit if the result would actually be used.

Might benefit from server support for this check as well, of course.
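A rough sketch of such a pre-check, assuming that a GET on the same /<mbid>/low-level path used for submission returns 404 when the server has nothing for that MBID (that endpoint behaviour is an assumption, and this alone wouldn't cover the lossy-versus-lossless case without extra server support):

    import requests

    def server_has_data(mbid, host="http://acousticbrainz.org"):
        """Return True if the server already stores a submission for this MBID."""
        return requests.get("%s/%s/low-level" % (host, mbid)).status_code == 200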

setup.py install fails if `streaming_extractor_music` not in the same dir

(venv)freso@koume> python setup.py install                                                                ~/Development/AcousticBrainz/acousticbrainz-client
running install
running build
running build_py
creating build
creating build/lib
creating build/lib/abz
copying abz/config.py -> build/lib/abz
copying abz/acousticbrainz.py -> build/lib/abz
copying abz/__init__.py -> build/lib/abz
copying abz/fingerprint.py -> build/lib/abz
copying abz/compat.py -> build/lib/abz
copying abz/default.conf -> build/lib/abz
running build_scripts
creating build/scripts-3.4
copying and adjusting abzsubmit -> build/scripts-3.4
error: file '[...]/acousticbrainz-client/streaming_extractor_music' does not exist
(venv)[1] freso@koume> python --version
Python 3.4.2

Rescan previously failed files

Currently there are a number of reasons a scan can fail, including missing MBIDs and an essentia extractor binary that uses outdated system calls. In many cases these can be fixed without the file name or location changing (for the listed cases: tagging with Picard without moving/renaming, and switching between the 32-bit and 64-bit extractor binary, respectively), but abzsubmit will not reprocess those files without its database first being pruned or removed.
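A hedged sketch of a manual workaround in the meantime, assuming the log DB is ~/.abzsubmit/filelog.sqlite and failed files are the rows with a non-NULL reason (the table and column names here are guesses based on the tracebacks above, not a documented schema):

    import os
    import sqlite3

    dbfile = os.path.join(os.path.expanduser("~"), ".abzsubmit", "filelog.sqlite")
    conn = sqlite3.connect(dbfile)
    # Forget files that were logged with a failure reason so abzsubmit will
    # pick them up again on its next run.
    conn.execute("DELETE FROM filelog WHERE reason IS NOT NULL")
    conn.commit()
    conn.close()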

ValueError processing a file

Not sure what's up here. Log:

Processing file /home/ianmcorvidae/Music/proper-tags/flac/Metallica/Death Magnetic/08 The Judas Kiss.flac
 - has recid b957775f-9c63-4f8d-9b61-250582f2e71a
Process step: Read metadata
Process step: Compute md5 audio hash
Process step: Replay gain
Process step: Compute audio features
Process step: Compute aggregation
All done
Writing results to file /tmp/tmp3Nhrl5
Traceback (most recent call last):
  File "./abzsubmit", line 17, in <module>
    main(args.p)
  File "./abzsubmit", line 9, in main
    acousticbrainz.process(path)
  File "/home/ianmcorvidae/Source/acousticbrainz-client/abz/acousticbrainz.py", line 115, in process
    process_directory(path)
  File "/home/ianmcorvidae/Source/acousticbrainz-client/abz/acousticbrainz.py", line 105, in process_directory
    process_file(os.path.abspath(os.path.join(dirpath, f)))
  File "/home/ianmcorvidae/Source/acousticbrainz-client/abz/acousticbrainz.py", line 88, in process_file
    features = json.load(open(tmpname))
  File "/usr/lib64/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 382, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting : delimiter: line 20 column 11 (char 541)

Will be The_Judas_Kiss.flac in http://ianmcorvidae.net/essentia/ once uploaded.

Consider using sqlite for log file

As it says: consider sqlite as a replacement for fopen() :) It would also mean built-in ACID compliance and concurrency, and indexing/UNIQUE constraints would allow for some niceties that would probably be useful here.

Or maybe I just really like SQL, but I still think it's a good idea.

Use High-level classifier models in Windows for analyzing own music

Hello, this is not a direct question about this project, but I would like to ask this question.

I would like to run the high-level classifier models on my own music on Windows, but with the streaming_extractor_music binary from Essentia it is impossible to extract the high-level features, because Gaia2 is not included in the Windows binary:
Identifier 'GaiaTransform' not found in registry...

I have tried to compile the Windows Essentia binary with Gaia2, but failed because Gaia2 produced a lot of errors when cross-compiling with MinGW on Ubuntu.

I would like to ask how the high-level classifier models were used with acousticbrainz-client on Windows to calculate high-level scores such as those below.

highlevel:
    compute: 1
    svm_models: ['svm_models/danceability.history', 'svm_models/gender.history', 'svm_models/genre_dortmund.history', 'svm_models/genre_electronic.history', 'svm_models/genre_rosamerica.history', 'svm_models/genre_tzanetakis.history', 'svm_models/ismir04_rhythm.history', 'svm_models/moods_mirex.history', 'svm_models/mood_acoustic.history', 'svm_models/mood_aggressive.history', 'svm_models/mood_electronic.history', 'svm_models/mood_happy.history', 'svm_models/mood_party.history', 'svm_models/mood_relaxed.history', 'svm_models/mood_sad.history', 'svm_models/timbre.history', 'svm_models/tonal_atonal.history', 'svm_models/voice_instrumental.history']

A lot of these models were used for the high-level data in AcousticBrainz.

However, I am still puzzled about how AcousticBrainz did this, as the streaming_extractor_music binary shipped with the AcousticBrainz client on Windows also did not have the GaiaTransform identifier.

I would really hope to know how this was done.
Thank you very much.

error output should go to stderr

Currently errors like this:

[:( nombid ] /var/data/music/flac-db/Various Artists/1999: Fritz Hitz: Die beste Musik der Welt, Volume 0,5/00 - Butch Water - Obst.flac
Process step: Read metadata
  Cannot find musicbrainz recording id
Quitting early.

go to stdout, so something like `abzsubmit /var/data/music/flac-db 2> error.log` doesn't work.
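A minimal sketch of the change (the helper name is an assumption; the exact print sites in acousticbrainz.py would need to call it): write error and warning messages to sys.stderr so they can be redirected separately from the progress output.

    from __future__ import print_function
    import sys

    def error(message):
        # Progress lines keep going to stdout; errors go to stderr so that
        # `abzsubmit ... 2> error.log` captures only the failures.
        print(message, file=sys.stderr)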

[Feature Request] Adhere to XDG Base Directory Specification

Proposal:

Store configuration and persistent application data separately, in the directories defined by the ${XDG_*_HOME} environment variables.

In configuring the client, I couldn't help but groan when I saw yet another new hidden directory in my ${HOME} directory. Like many long-time users of UNIX-like operating systems, the amount of time it takes to scroll from one end of ${HOME} to the other is perhaps the most omnipresent reminder (and penultimately insufferable, after joint pains) of how far removed I've become from the young man I still expect to gaze back at me from the mirror. Freedesktop.org, bless their hearts, offers a relatively simple and widely adopted solution in the form of the XDG Base Directory Specification, and it would be lovely if the client were compliant with it. I've summarized the behavior changes I believe would be necessary to arrive at that result, in case someone else who agrees with the proposed changes and has more free time wants to mock up a PR to this effect (a rough sketch of XDG-aware helpers follows the list below).

Behavior changes:

  • Switch from using a single directory for application files at ${HOME}/.abzsubmit to two directories for configuration/profiles and persistent activity logs respectively, both with fallback filepaths located a minimum of one nested level beneath ${HOME}.

    def get_config_dir():
        confdir = os.path.join(os.path.expanduser("~"), ".abzsubmit")
        return confdir

Look for user configuration file abzsubmit.conf and defaults file default.conf in ${XDG_CONFIG_HOME}/abz instead of at ${HOME}/.abzsubmit/abzsubmit.conf. If either the files or directory do not exist, attempt to create them automatically with octal permissions of 0700 and 0755, respectively. If the ${XDG_CONFIG_HOME} environment variable is unset, fall back to using ${HOME}/.config/abz as the configuration directory.

    CONFIG_FILE = "abzsubmit.conf"

    configfile = os.path.join(get_config_dir(), CONFIG_FILE)

Store submissions database file submissions.sqlite in ${XDG_DATA_HOME}/abz, or elsewhere as defined in abzsubmit.conf, instead of at ${HOME}/.abzsubmit/filelog.sqlite. If either the file or directory do not exist, attempt to create them automatically with octal permissions of 0700 and 0755, respectively. If the ${XDG_DATA_HOME} environment variable is unset, fall back to using ${HOME}/.local/share/abz as the persistent data directory.

    def get_sqlite_file():
        dbfile = os.path.join(get_config_dir(), "filelog.sqlite")
        return dbfile

    def load_settings():
        if not os.path.exists(get_config_dir()):
            os.makedirs(get_config_dir())
        dbfile = get_sqlite_file()
        if not os.path.exists(dbfile):
            create_sqlite(dbfile)
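A rough sketch of XDG-aware replacements for the helpers quoted above (function names mirror the existing ones; the fallback paths follow the behaviour described in this issue):

    import os

    def get_config_dir():
        # ${XDG_CONFIG_HOME}/abz, falling back to ~/.config/abz
        base = os.environ.get("XDG_CONFIG_HOME",
                              os.path.join(os.path.expanduser("~"), ".config"))
        return os.path.join(base, "abz")

    def get_data_dir():
        # ${XDG_DATA_HOME}/abz, falling back to ~/.local/share/abz
        base = os.environ.get("XDG_DATA_HOME",
                              os.path.join(os.path.expanduser("~"), ".local", "share"))
        return os.path.join(base, "abz")

    def get_sqlite_file():
        return os.path.join(get_data_dir(), "filelog.sqlite")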
