Currently, the sqlite log file essentially stores two states: unprocessed, by the file

Possibly track full status of files in sqlite log

mtg commented on August 13, 2024

Possibly track full status of files in sqlite log

from acousticbrainz-client.

Comments (3)

alastair commented on August 13, 2024

What is this going to solve in the long term? If we mark additional failures, we need a way of dealing with them as well.
I could probably be convinced to do this, but here are some counters/comments to your specific points:

Your complex pid/sha/timestamp system only seems to fix a potential problem with the current system of running find/parallel/timeout. You shouldn't be getting 2 processes doing the same file because parallel distributes your work for you. If you want to quickly process a whole directory tree, wait for multiprocessing support.
No mbid: Yes, should probably be added
Extractor issues: Once we get bugs fixed, hopefully this won't happen again. We already mark 'failed essentia' files. Storing the filename that the features are in only seems to catch the "we fail to submit" case. I don't think we need to add so much complexity here. Why don't we just retry submitting x times and, if that fails (network disconnected?), just quit?
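The "retry x times, then quit" idea could look roughly like the sketch below. This is not the client's actual code: `submit_with_retries`, the `submit` callable, and the default values are hypothetical names chosen for illustration, and it assumes that network failures surface as `OSError` subclasses (such as `ConnectionError`).

```python
import time


def submit_with_retries(submit, payload, max_attempts=3, delay_seconds=1.0):
    """Call submit(payload) up to max_attempts times.

    Returns True as soon as one call succeeds, or False if every attempt
    raised an OSError. `submit` is a hypothetical callable standing in for
    whatever function actually sends the features to the server.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            submit(payload)
            return True
        except OSError:
            # Assumption: network-level failures are OSError subclasses.
            # Wait before the next attempt, but not after the last one.
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    return False
```

A caller would then quit when the function returns False, instead of recording an extra state in the log.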

I agree we need to think a bit about how to submit when people get a new extractor build. Do we store the old extractor version? Do we tell people to just delete the history file?


ianmcorvidae commented on August 13, 2024

I think I agree on the pid/timestamp system with a 'processing' state, and about the 'pending submission' state. Just seemed like I should sketch out a really complete/complex system so we can pull out the useful bits.
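To make the "useful bits" concrete, here is a minimal sketch of a status column in the sqlite log. The table name `file_status`, its columns, and the exact status strings are assumptions made for illustration; only the 'processing' and 'pending submission' states come from this thread, and the client's real schema may differ.

```python
import sqlite3

# Hypothetical status names; only "processing" and "pending_submission"
# correspond to states mentioned in the discussion.
STATUSES = ("unprocessed", "processing", "pending_submission", "submitted", "failed")


def open_log(path=":memory:"):
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_status ("
        " filename TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " updated_at INTEGER NOT NULL)"
    )
    return conn


def set_status(conn, filename, status, now):
    """Record the latest status for a file, replacing any earlier row."""
    if status not in STATUSES:
        raise ValueError("unknown status: %r" % (status,))
    conn.execute(
        "INSERT OR REPLACE INTO file_status (filename, status, updated_at)"
        " VALUES (?, ?, ?)",
        (filename, status, now),
    )
    conn.commit()


def get_status(conn, filename):
    """Return the stored status, treating unknown files as unprocessed."""
    row = conn.execute(
        "SELECT status FROM file_status WHERE filename = ?", (filename,)
    ).fetchone()
    return row[0] if row else "unprocessed"
```

Keeping one row per filename means a file absent from the table is simply unprocessed, which matches how the log is described as working today.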

One thing that storing a hash of the file would solve, and that isn't solved otherwise, is retagging with newer data from MB (without changing the file's location). Perhaps most notably, this can sometimes change the recording MBID, though it should usually be from a redirected one to a non-redirected one (though at present neither the client nor the server does anything about redirected MBIDs). Maybe we don't care about this and just expect anyone using the data to look things up by recording MBID rather than using the tags section, though?


zas commented on August 13, 2024

"Extractor issues: Once we get bugs fixed, hopefully this won't happen again."

I disagree with this point. Fixing bugs in the extractor is a must, but the client should be able to gracefully handle unexpected failures anyway. Future versions of the extractor will come with their own bugs ;)

+1 for a file hash: it would help track files that were moved without any change, and files that stayed in the same place but had metadata changes.
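As a rough illustration of how a content hash separates those two cases, consider the sketch below. The helper names `file_sha1` and `classify`, the result labels, and the choice of SHA-1 over the whole file (tags included) are assumptions for this example, not something the client does today.

```python
import hashlib


def file_sha1(path, chunk_size=65536):
    """Hash the whole file, tags included, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(path, sha, known):
    """Compare a (path, sha) pair against `known`, a dict of path -> sha.

    `known` is a hypothetical stand-in for what the sqlite log would store.
    """
    if known.get(path) == sha:
        return "unchanged"
    if path in known:
        # Same location, different bytes: e.g. the file was retagged.
        return "changed_in_place"
    if sha in known.values():
        # Same bytes seen before under another path: the file was moved.
        return "moved"
    return "new"
```

Because the hash covers the tags as well as the audio, retagging a file shows up as "changed_in_place", while a plain rename or move shows up as "moved".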

