acoustid-server's Issues

Submission with duration > 32767

Submitting a fingerprint for a duration > 32767 currently fails due to

if p['duration'] <= 0 or p['duration'] > 0x7fff:

Is this a strict limitation because that value gets used elsewhere as a signed integer, or could it be lifted? Could it at least allow values up to the size of an unsigned 32-bit integer? The only real limitation I could immediately see is the schema using smallint, but I guess that could easily be changed to integer.

See the related issue reported against Picard on https://tickets.metabrainz.org/browse/PICARD-1486
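
For context, `0x7fff` is 32767, the largest value a signed 16-bit integer (PostgreSQL `smallint`) can hold, which is roughly 9.1 hours when read as seconds. A minimal sketch of that bound follows; `fits_smallint` and `packs_as_int16` are hypothetical helper names for illustration, not functions from the server code.

```python
import struct

SMALLINT_MAX = 0x7FFF  # 32767, the largest signed 16-bit value


def fits_smallint(duration):
    """Apply the same bounds as the quoted check: 0 < duration <= 0x7fff."""
    return 0 < duration <= SMALLINT_MAX


def packs_as_int16(value):
    """Return True if value can be stored in a signed 16-bit field."""
    try:
        struct.pack("<h", value)
        return True
    except struct.error:
        return False
```

Lifting the limit would presumably need both the check and the column type to change together.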

Compilation errors

Greetings!

I'm trying to compile the PostgreSQL extension. Right now, I'm having the following issues:

gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -fPIC -pie -I/usr/include/mit-krb5 -DLINUX_OOM_SCORE_ADJ=0 -fno-omit-frame-pointer -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -fpic -I. -I./ -I/usr/include/postgresql/9.3/server -I/usr/include/postgresql/internal -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2 -I/usr/include/tcl8.6 -c -o acoustid_compare.o acoustid_compare.c
acoustid_compare.c:87:20: error: unknown type name ‘int4’
match_fingerprints(int4 *a, int asize, int4 *b, int bsize)
^
acoustid_compare.c:87:40: error: unknown type name ‘int4’
match_fingerprints(int4 *a, int asize, int4 *b, int bsize)
^
acoustid_compare.c:119:21: error: unknown type name ‘int4’
match_fingerprints2(int4 *a, int asize, int4 *b, int bsize, int maxoffset)
^
acoustid_compare.c:119:41: error: unknown type name ‘int4’
match_fingerprints2(int4 *a, int asize, int4 *b, int bsize, int maxoffset)
^
acoustid_compare.c: In function ‘acoustid_compare’:
acoustid_compare.c:239:2: warning: implicit declaration of function ‘match_fingerprints’ [-Wimplicit-function-declaration]
result = match_fingerprints(
^
acoustid_compare.c:42:23: error: ‘int4’ undeclared (first use in this function)
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:240:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(a), ARRNELEMS(a),
^
acoustid_compare.c:42:23: note: each undeclared identifier is reported only once for each function it appears in
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:240:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(a), ARRNELEMS(a),
^
acoustid_compare.c:42:29: error: expected expression before ‘)’ token
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:240:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(a), ARRNELEMS(a),
^
acoustid_compare.c:42:29: error: expected expression before ‘)’ token
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:241:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(b), ARRNELEMS(b));
^
acoustid_compare.c: In function ‘acoustid_compare2’:
acoustid_compare.c:259:2: warning: implicit declaration of function ‘match_fingerprints2’ [-Wimplicit-function-declaration]
result = match_fingerprints2(
^
acoustid_compare.c:42:23: error: ‘int4’ undeclared (first use in this function)
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:260:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(a), ARRNELEMS(a),
^
acoustid_compare.c:42:29: error: expected expression before ‘)’ token
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:260:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(a), ARRNELEMS(a),
^
acoustid_compare.c:42:29: error: expected expression before ‘)’ token
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:261:3: note: in expansion of macro ‘ARRPTR’
ARRPTR(b), ARRNELEMS(b),
^
acoustid_compare.c: In function ‘acoustid_extract_query’:
acoustid_compare.c:289:2: error: unknown type name ‘int4’
int4 *orig, *query;
^
acoustid_compare.c:42:23: error: ‘int4’ undeclared (first use in this function)
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:294:9: note: in expansion of macro ‘ARRPTR’
orig = ARRPTR(a);
^
acoustid_compare.c:42:29: error: expected expression before ‘)’ token
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:294:9: note: in expansion of macro ‘ARRPTR’
orig = ARRPTR(a);
^
acoustid_compare.c:42:29: error: expected expression before ‘)’ token
#define ARRPTR(x) ( (int4 *) ARR_DATA_PTR(x) )
^
acoustid_compare.c:308:10: note: in expansion of macro ‘ARRPTR’
query = ARRPTR(q);
^
acoustid_compare.c:311:8: error: expected ‘;’ before ‘x’
int4 x = ACOUSTID_QUERY_STRIP(orig[i]);
^
acoustid_compare.c:316:20: error: ‘x’ undeclared (first use in this function)
if (query[j] == x) {
^
make: *** [acoustid_compare.o] Erro 1

Faster fingerprint comparison

Currently we are trying to find all keys that have less than 2-bit errors. This is both inefficient and too strict. We should estimate the best alignment of the two fingerprints and calculate a simple bit-error-rate. Guessing the alignment is not easy, but when acoustid-index is released, we will get information on which keys matched and use their offsets.
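
As an illustration of the bit-error-rate idea, here is a minimal sketch. It assumes fingerprints are lists of 32-bit integers and that an alignment `offset` has already been estimated some other way; `bit_error_rate` is a hypothetical name, not the server's implementation.

```python
def bit_error_rate(a, b, offset=0):
    """Fraction of differing bits between two fingerprints at one alignment.

    a and b are sequences of 32-bit integers. A positive offset skips the
    first `offset` items of a; a negative offset skips items of b instead.
    """
    if offset >= 0:
        pairs = list(zip(a[offset:], b))
    else:
        pairs = list(zip(a, b[-offset:]))
    if not pairs:
        return 1.0  # nothing overlaps, treat as a complete mismatch
    # XOR leaves a 1 bit wherever the two items differ; mask to 32 bits so
    # negative (signed) values are counted correctly.
    errors = sum(bin((x ^ y) & 0xFFFFFFFF).count("1") for x, y in pairs)
    return errors / (32.0 * len(pairs))
```

A lower value means a closer match; the threshold for calling two fingerprints "the same" is left open here.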

Access-Control-Allow-Origin: http://musicbrainz.org

Could you please add an HTTP header to the web services to allow cross-domain POST requests with XHR2?
Access-Control-Allow-Origin: http://musicbrainz.org ← at least for the musicbrainz.org server
I would use it with list_by_mbid (POST) to stop requesting for each track’s AcoustID, I would batch query the whole release instead.
Thanks very much, Luks ! :)
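
A minimal sketch of how such a header could be added, written as plain WSGI middleware so it has no dependencies. `allow_origin` is a hypothetical wrapper, not part of acoustid-server, and the origin value is taken from the request above.

```python
def allow_origin(app, origin="http://musicbrainz.org"):
    """Wrap a WSGI app so every response carries Access-Control-Allow-Origin."""

    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Append the CORS header to whatever the wrapped app already set.
            headers = list(headers) + [("Access-Control-Allow-Origin", origin)]
            return start_response(status, headers, exc_info)

        return app(environ, cors_start_response)

    return wrapped
```

Depending on the request's content type and headers, browsers may also send a preflight OPTIONS request, which this sketch does not handle.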

Provide more User Submitted Data with AcoustId (TrackNo/AlbumArtist)

AcoustID has 45M unique ids, MusicBrainz has 20M so there are roughly 25M recordings in AcoustID that can not be matched to MusicBrainz, and this disparity looks set to increase over time.

So for these songs the metadata submitted when a fingerprint is added to AcoustId becomes more important. Although this data has to be used carefully as there is no 3rd party verification it can be used to provide some basic identification of songs.

AcoustId currently returns:

Artist
Title
Album

if they were submitted with the fingerprint

But could this be extended with two common fields that would be found in a lot of user-submitted metadata, trackNo and albumArtist?

If AcoustId returned TrackNo then we could construct basic album metadata for a group of songs that could only be matched to AcoustId.

If AcoustId returned AlbumArtist, this would be very helpful for identifying if the album is a various artists compilation release.

Could not resolve 'deb.oxygene.sk'

Executing sudo apt-get update
results in this error:

Err:1 http://deb.oxygene.sk/ubuntu yakkety InRelease
Could not resolve 'deb.oxygene.sk'

(Yes, I have executed echo "deb http://deb.oxygene.sk/ubuntu `lsb_release -c -s` main" | sudo tee /etc/apt/sources.list.d/oxygene.list before)

MusicBrainz metadata caching

MusicBrainz lookups are a big chunk of the time spent in search requests, which really should not happen. Fingerprint search should be the most expensive operation; metadata should be cheap. We need to either cache the metadata or make querying it faster.
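
One way the caching option could look, as a minimal in-process sketch. `TTLCache` and its `loader` callback are hypothetical; a real deployment might prefer a shared cache, and the expiry time here is an arbitrary assumption.

```python
import time


class TTLCache:
    """Cache loader(key) results for ttl seconds."""

    def __init__(self, loader, ttl=3600.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, skip the expensive lookup
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

The loader would be whatever function currently fetches metadata for one MusicBrainz ID; repeated lookups of the same ID within the TTL then cost a dictionary access.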

testcase error in test pg

I noticed the following test case in test_pg.py:

@with_database
def test_match_similar_3(conn):
    query = sql.select([sql.func.acoustid_compare2(TEST_1A_FP_RAW, TEST_1D_FP_RAW, 80)])
    score = conn.execute(query).scalar()
    assert_true(score < const.TRACK_MERGE_THRESHOLD)

If the purpose of this case is to test matching similar fingerprints, the score should be at least const.TRACK_MERGE_THRESHOLD rather than less than it. Is this an error?
Should it be changed to the code below?

@with_database
def test_match_similar_3(conn):
    query = sql.select([sql.func.acoustid_compare2(TEST_1A_FP_RAW, TEST_1D_FP_RAW, 80)])
    score = conn.execute(query).scalar()
    assert_true(score >= const.TRACK_MERGE_THRESHOLD)

Missing 'requirements.txt' in pip install

Executing
(e) acoustid@ubuntu:~$ pip install -r requirements.txt
results in this error:
Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'

Where can we get this missing file?

Replication file missed key information

The replication files contain updates of this type:

<event id="335617599" op="U" table="track_mbid"><keys><column name="track_id">9578103</column></keys><values><column name="submission_count">112</column></values></event>

For the track_mbid table, the event always uses only track_id as the key, but the unique key should be (track_id, mbid), or the primary key id should be used.
Also, one track_id can correspond to many mbids, so this update does not say which (track_id, mbid) row to update.
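
To make the problem concrete, here is a small sketch that parses the event quoted above with the standard library and lists its key columns. `parse_event` is a hypothetical helper, not part of the replication tooling.

```python
import xml.etree.ElementTree as ET

# The update event quoted in this issue, unchanged.
EVENT = (
    '<event id="335617599" op="U" table="track_mbid">'
    '<keys><column name="track_id">9578103</column></keys>'
    '<values><column name="submission_count">112</column></values>'
    '</event>'
)


def parse_event(xml_text):
    """Return (table, op, keys, values) for one replication event."""
    event = ET.fromstring(xml_text)
    keys = {c.get("name"): c.text for c in event.find("keys")}
    values = {c.get("name"): c.text for c in event.find("values")}
    return event.get("table"), event.get("op"), keys, values
```

Parsing `EVENT` yields a key dictionary containing only `track_id`, so a consumer applying it as an `UPDATE ... WHERE track_id = ...` would touch every mbid row for that track.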

Show min/max valid durations for an AcoustID, hilight recordings which are outside that range

As I understand the maximum duration difference system for AcoustID, any given AcoustID can span a duration range of at most 7 seconds. A recording found outside that range would be assigned a new AcoustID, and any linked recordings outside that range are either wrongly linked or have a wrong duration themselves.

It would be helpful if this range were displayed somewhere. Example:
http://acoustid.org/track/12ec629c-dd1c-422a-9e3b-90f12280ea35
would show something like: “max range 3:31–3:39”. Of course for some this range would be wider, up to 14 seconds.

It would also then be helpful to highlight (in red?) the linked recordings which are outside that range.
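
A sketch of how such a range might be computed, under one reading of the description above: every fingerprint attached to an AcoustID must lie within 7 seconds of every other, so the window still open to a new fingerprint runs from the longest duration minus 7 to the shortest duration plus 7. This interpretation, the function names, and the sample durations are all assumptions for illustration.

```python
MAX_DURATION_DIFF = 7  # seconds, per the understanding described above


def valid_duration_range(fingerprint_durations, max_diff=MAX_DURATION_DIFF):
    """Return (lowest, highest) duration still compatible with all fingerprints."""
    lowest = max(fingerprint_durations) - max_diff
    highest = min(fingerprint_durations) + max_diff
    return lowest, highest


def format_duration(seconds):
    """Format whole seconds as m:ss."""
    return "%d:%02d" % divmod(seconds, 60)


def out_of_range(recording_durations, lowest, highest):
    """Return the recording durations that would be highlighted."""
    return [d for d in recording_durations if d < lowest or d > highest]
```

With fingerprints of 212 and 218 seconds this gives 3:31 to 3:39, and with a single fingerprint the window is 14 seconds wide, which matches the figures mentioned above.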

Numeric value out of range

I am unsure if this is the correct repository to post this question.

I've set up a PostgreSQL server with the fingerprint table from your scripts, which contains the integer[] column. When I try inserting the raw fingerprints from fpcalc, I get the error

Numeric value out of range: 7 ERROR: value "3464512762" is out of range for type integer

Am I using fpcalc wrong or is the database wrong?
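
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the raw fingerprint values are being emitted as unsigned 32-bit integers, while PostgreSQL's `integer` type is signed 32-bit and tops out at 2147483647. If so, reinterpreting each value as signed before inserting keeps the same bits. `to_signed32` and `to_unsigned32` are hypothetical helper names.

```python
def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def to_unsigned32(value):
    """Reverse of to_signed32: recover the unsigned 32-bit value."""
    return value & 0xFFFFFFFF
```

For the value in the error message, `to_signed32(3464512762)` gives `-830454534`, which fits in an `integer` column.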

Implement MusicBrainz-like replication

We should use dbmirror to produce replication packets in the same format as MusicBrainz. The mbslave project can be then generalized to be able to import them into the Acoustid database.

unable to create Index & one function

Hello All.

I used the SQL scripts to create a database in PostgreSQL. I am left with only two issues:

1: Unable to create the Index
CREATE INDEX fingerprint_idx_fingerprint ON fingerprint USING gin (acoustid_extract_query(fingerprint) gin__int_ops);

Error:

ERROR: function acoustid_extract_query(integer[]) does not exist
LINE 1: ...erprint_idx_fingerprint ON fingerprint USING gin (acoustid_e...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

********** Error **********

ERROR: function acoustid_extract_query(integer[]) does not exist
SQL state: 42883
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Character: 68

2: Creating Function:
CREATE OR REPLACE FUNCTION fp_hash(int[]) RETURNS bytea
AS $$
SELECT digest($1::text, 'sha1');
$$ LANGUAGE 'SQL' IMMUTABLE STRICT;

Error:

ERROR: function digest(text, unknown) does not exist
LINE 3: SELECT digest($1::text, 'sha1');
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

********** Error **********

ERROR: function digest(text, unknown) does not exist
SQL state: 42883
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Character: 74

Please let me know if there have been any recent changes. I copied the entire SQL from GitHub, so maybe I am making a mistake somewhere.

Thanks
Ajitpal

Add support for multiple hashes per fingerprint

When there is a new submission that is very similar to an existing fingerprint, it would be currently skipped. We should store the hash instead, linking to the existing fingerprint. This requires a database schema change.

Display MB log in errors

The login.html template has a spot for errors.mb, but the website never displays log in errors such as missing username or invalid password.

I tried to figure this out by reading the code, but couldn't see what I needed to fix.

Some acoustid-server replication files corrupted

Hi Lukáš! I'm working on an AcoustID replication script and found an invalid replication dump: http://data.acoustid.org/replication/acoustid-update-4620.xml.bz2

xml.sax.parse fails on this particular replication set:

xmllint --format ./acoustid-update-4620.xml  
./acoustid-update-4620.xml:2: parser error : PCDATA invalid Char value 31
">Séries</column><column name="artist">Television</column><column name="track">
                                                                               ^
./acoustid-update-4620.xml:2: parser error : PCDATA invalid Char value 4
column name="artist">Television</column><column name="track">�xœcpJLOILQHNLKUÀ

It may be a bug in xml.etree.cElementTree (used in export_tables.py), although XML escaping should be handled properly during XML generation, as this sample shows:

>>> r = etree.Element('test')
>>> r.text = u'bla &'
>>> etree.tostring(r, encoding="UTF-8")
"<?xml version='1.0' encoding='UTF-8'?>\n<test>bla &amp;</test>"

Simple repair solution:

tidy -xml  -o ./acoustid-update-4620-fixed.xml ./acoustid-update-4620.xml
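
The parser errors above point at control characters (values 31 and 4), which XML 1.0 does not allow in character data even when `&` and `<` are escaped correctly. A minimal sketch of filtering them out before serialization follows; `clean_xml_text` and `make_column` are hypothetical helpers, not the export script's code, and the pattern only covers the C0 control range.

```python
import re
import xml.etree.ElementTree as ET

# Control characters below 0x20 other than tab, newline and carriage return.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_xml_text(text):
    """Drop control characters that XML 1.0 forbids in character data."""
    return _INVALID_XML_CHARS.sub("", text)


def make_column(name, value):
    """Serialize one <column> element with cleaned, escaped text."""
    column = ET.Element("column", name=name)
    column.text = clean_xml_text(value)
    return ET.tostring(column, encoding="unicode")
```

Whether silently dropping such characters is acceptable for replication data is a separate question; this only shows where a filter could sit.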

Better DB connection management for search requests

We always use multiple DB sessions in the web handlers, but for search specifically we only need one connection at a time. For example, while MusicBrainz lookups are running, we no longer need the fingerprint DB connection and should release it.

Make it possible to run API-only version

The website and API handlers should live in separate Python packages and be imported only when necessary. It should be possible to configure the server to handle only API requests and not need any website-specific dependencies.

Internal server error

Track lookups result in 'internal server error.' Not sure if this is the correct place to post this, or if anyone is actively maintaining, but since I couldn't find clear guidance this seemed like the best spot. The status page shows everything is working correctly but I'm unable to get new fingerprints or look up existing AcoustIDs.

EAGAIN in acoustid/indexclient.py

After updating to commit 82a28f4 I receive the following messages while executing lookup:

ERROR:acoustid.api.v2:Error while handling API request
Traceback (most recent call last):
  File "/usr/share/acoustid-server/acoustid/api/v2/__init__.py", line 113, in handle
    return self._ok(self._handle_internal(params), params.format)
  File "/usr/share/acoustid-server/acoustid/api/v2/__init__.py", line 520, in _handle_internal
    matches = searcher.search(p['fingerprint'], p['duration'])
  File "/usr/share/acoustid-server/acoustid/data/fingerprint.py", line 121, in search
    matches = self._search_index(fp, length)
  File "/usr/share/acoustid-server/acoustid/data/fingerprint.py", line 88, in _search_index
    results = idx.search(fp_query)
  File "/usr/share/acoustid-server/acoustid/indexclient.py", line 108, in search
    self._request('set attribute %s %s' % (name, value))
  File "/usr/share/acoustid-server/acoustid/indexclient.py", line 91, in _request
    return line
  File "/usr/share/acoustid-server/acoustid/indexclient.py", line 74, in _getline
    data = self.sock.recv(1024)
error: [Errno 11] Resource temporarily unavailable
ERROR:acoustid.api.v2:WS error: internal error
INFO:werkzeug:127.0.0.1 - - [04/Jul/2012 18:38:35] "POST /ws/v2/lookup HTTP/1.0" 500 -

Errno 11 corresponds to errno.EAGAIN.
I got the same behaviour while running scripts/import_queued_submissions.py.
The patch below fixed the problem:

$ diff -u acoustid/indexclient.py.bk acoustid/indexclient.py
--- acoustid/indexclient.py.bk  2012-07-04 18:33:02.653948396 +0400
+++ acoustid/indexclient.py     2012-07-04 18:36:15.853936405 +0400
@@ -76,6 +76,10 @@
                         if e.errno == errno.EINTR:
                             continue
                         raise
+                    except socket.error, e:
+                        if e.errno == errno.EAGAIN:
+                            break
+                        raise
                     if not data:
                         break
                     self._buffer += data

Database dumps and replication

Your last commits indicate that the database backend is undergoing a restructure.

Will the database dumps and / or replication still be available then, perhaps in a new format? I'm asking since it seems that http://data.acoustid.org, which is linked at https://acoustid.org/database, is down.

Also, would it be possible that the latest dump of the old database format is still available for download somewhere? This would make sure that any 3rd party implementations which still depend on the old format can obtain a dataset. Thank you very much for your great work!

Change PUID imports to only match tracks with the same name

A large part of the database was built on PUID imports. The algorithm currently only checks the song length. PUIDs are very often mistagged, so we must load all MusicBrainz tracks, find the most common name, and use only tracks with that name. That should help keep PUID errors out of the database. We will need to empty the database and reimport submissions after this.
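
The "most common name" step could be sketched like this. The track dictionaries, the `name` field, and the case-insensitive comparison are assumptions for illustration, and `filter_by_most_common_name` is a hypothetical helper.

```python
from collections import Counter


def _normalize(name):
    """Compare names case-insensitively and ignore surrounding whitespace."""
    return name.strip().lower()


def filter_by_most_common_name(tracks):
    """Keep only the tracks whose name is the most common one in the list."""
    counts = Counter(_normalize(t["name"]) for t in tracks)
    if not counts:
        return []
    top_name, _ = counts.most_common(1)[0]
    return [t for t in tracks if _normalize(t["name"]) == top_name]
```

How ties should be broken, and whether near-identical titles should count as the same name, is left open here.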

custom ip and port for binding run_http

I'm running acoustid-server under high load, and I noticed that run_http uses only one CPU core.

To spread the load across several cores, I patched run_http.py to accept a custom IP and port:

============================
--- run_http.py.bak 2012-07-20 16:33:48.632211768 +0400
+++ run_http.py 2012-07-20 17:02:25.867271849 +0400
@@ -5,6 +5,12 @@
from werkzeug.serving import run_simple
from acoustid.server import make_application

+import argparse
+parser = argparse.ArgumentParser()
+parser.add_argument('--ip', type=str, default='127.0.0.1')
+parser.add_argument('--port', type=int, default=8080)
+args = parser.parse_args()
+
logging.basicConfig(level=logging.DEBUG)

config_path = '/etc/acoustid.conf'
@@ -17,8 +23,8 @@
'/static': static_path,
}

-host = '127.0.0.1'
-port = 8080
+host = args.ip
+port = args.port

run_simple(host, port, application, use_reloader=False, static_files=static_files)

It would be great if you could merge this patch upstream.

Internal server error on AcoustId submission

Documentation not up to date - How to start the server?

I installed the latest acoustid-server version from git and followed the "install" README.

After some problems I got the database setup completed (PostgreSQL has no /contrib/pgcrypto.sql file).

But now I'm lost. How can I start the server?

/opt/acoustid-server/scripts# ./run_http.py
Traceback (most recent call last):
File "./run_http.py", line 6, in <module>
from acoustid.server import make_application
ImportError: No module named acoustid.server

It seems I must install the acoustid package into my Python library path?

IndexClientError: unable to connect to the index server at 127.0.0.1:6080

Calling /api/ws/v2/lookup results in this error:

[ERROR] acoustid.data.fingerprint - Index search error
Traceback (most recent call last):
  File "/home/ubuntu/VideoFingerprint/acoustid-server/acoustid/data/fingerprint.py", line 123, in search
    matches = self._search_index(fp, length)
  File "/home/ubuntu/VideoFingerprint/acoustid-server/acoustid/data/fingerprint.py", line 87, in _search_index
    with closing(self.idx.connect()) as idx:
  File "/home/ubuntu/VideoFingerprint/acoustid-server/acoustid/indexclient.py", line 208, in connect
    client = IndexClient(**self.args)
  File "/home/ubuntu/VideoFingerprint/acoustid-server/acoustid/indexclient.py", line 39, in __init__
    self._connect()
  File "/home/ubuntu/VideoFingerprint/acoustid-server/acoustid/indexclient.py", line 56, in _connect
    raise IndexClientError('unable to connect to the index server at %s:%s' % (self.host, self.port))
IndexClientError: unable to connect to the index server at 127.0.0.1:6080

I don't understand what the index server is.

thanks~

Enforce SNI

We currently still have many clients using HTTPS without SNI, which restricts our options for frontend scalability. We need to make SNI a requirement.

CreateFunctions.sql Error

Postgres 9.1 on Ubuntu 12.04

postgres@Acoustid:/home/datasurfer/acoustid-server$ ./run_psql.sh <sql/CreateFunctions.sql
CREATE FUNCTION
ERROR:  operator does not exist: integer[] - integer
LINE 3:     SELECT uniq(sort(subarray($1 - 627964279,
                                         ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.
CREATE FUNCTION
ERROR:  function digest(text, unknown) does not exist
LINE 3:     SELECT digest($1::text, 'sha1');
                   ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

acoustid-server not working uwsgi

Hello!
I have installed acoustid-server from a clone of the master branch. I'm using virtualenv, as described in README.md.
After all the steps were done, I ran a standalone server for testing.
After connecting to http://myhost_ip:5000, I see a page.

Then I tried to configure it to run via uwsgi. I created a uwsgi configuration file:
/etc/uwsgi/apps-enabled/acoustid.ini
[uwsgi]
project = acoustid
base = /usr/local/acoustid-server

chdir = %(base)/
home = %(base)/e/
venv = %(base)/e/
module = %(project).wsgi:application

module = %(project).test

master = true
workers = 5

env = ACOUSTID_CONFIG=/usr/local/acoustid-server/acoustid.conf
uid = root
gid = root
http = <ip_address_here>:8080
vacuum = true
uwsgi starts correctly, but when I try to connect to the server on port 8080, I receive a 404 Not Found error for every GET.
In the uwsgi logs at that moment I see something like this:

[pid: 14710|app: 0|req: 3/4] ip () {36 vars in 649 bytes} [Fri Oct 14 11:51:10 2016] GET / => generated 233 bytes in 0 msecs (HTTP/1.1 404) 2 headers in 72 bytes (1 switches on core 0)
[pid: 14714|app: 0|req: 2/5] ip () {36 vars in 603 bytes} [Fri Oct 14 11:51:11 2016] GET /favicon.ico => generated 233 bytes in 0 msecs (HTTP/1.1 404) 2 headers in 72 bytes (1 switches on core 0

For testing I created a small script called test.py, which returns the string "Hello world!", and it works fine.
I also tried gunicorn and got the same 404 issue.

The uwsgi version is 2.0.12-debian.
Where should I look for the problem?
Thank you very much!

Update Documentation. Populate new instance of acoustid db.

I ran into a problem initialising the database.

I presume that acoustid-server nowadays uses a three-database configuration: acoustid_app, acoustid_fingerprint, acoustid_ingest. It also requires the musicbrainz database. But README.md says nothing about this. How do I create this (relatively) new database schema, and how do I perform the initial database population with the new data format?

There is also no trace of scripts/acoustid_sync.py, which is called by admin/run-sync-acoustid.sh, so it is unclear how to sync with https://data.acoustid.org/ correctly.
