Comments (3)
What is this going to solve in the long term? If we mark additional failures, we need a way of dealing with them as well.
I could probably be convinced to do this, but here are some counters/comments to your specific points:
- Your complex pid/sha/timestamp system only seems to fix a potential problem with the current system of running find/parallel/timeout. You shouldn't be getting 2 processes doing the same file because parallel distributes your work for you. If you want to quickly process a whole directory tree, wait for multiprocessing support.
- No mbid: Yes, should probably be added
- Extractor issues: Once we get bugs fixed, hopefully this won't happen again. We already mark 'failed essentia' files. Storing the filename that the features are in only seems to catch the "we fail to submit" case. I don't think we need to add so much complexity here. Why don't we just retry the submission x times and, if that fails (network disconnected?), just quit?
I agree we need to think a bit about how to submit when people get a new extractor build. Do we store the old extractor version? Do we tell people to just delete the history file?
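The "retry x times, then quit" idea from the list above could look roughly like the sketch below. This is not the client's actual code: `submit_with_retries`, the `submit` callable, and the default of 3 attempts are all hypothetical, and it assumes network failures surface as `OSError` (which includes the built-in `ConnectionError`).

```python
import sys
import time


def submit_with_retries(submit, max_attempts=3, delay_seconds=0.0):
    """Call submit() until it succeeds or max_attempts is reached.

    `submit` is a hypothetical zero-argument callable that raises OSError
    (e.g. ConnectionError) when the network is unavailable.
    Returns True on success, False once every attempt has failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            submit()
            return True
        except OSError as exc:
            # Report the failure on stderr and try again if attempts remain.
            print(f"submit attempt {attempt}/{max_attempts} failed: {exc}",
                  file=sys.stderr)
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    return False
```

A caller would quit when this returns `False`, leaving the file unmarked so that a later run picks it up again.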
from acousticbrainz-client.
I think I agree on the pid/timestamp system with a 'processing' state, and about the 'pending submission' state. Just seemed like I should sketch out a really complete/complex system so we can pull out the useful bits.
One thing that storing a hash of the file would solve, and that nothing else here does, is retagging with newer data from MB (retagging that doesn't change the file's location). Perhaps most notably, this can sometimes change the recording MBID, though it should usually be from a redirected one to a non-redirected one (though at present neither client nor server does anything about redirected MBIDs). Maybe we don't care about this and just expect anyone using the data to look things up by recording MBID rather than using the tags section, though?
> Extractor issues: Once we get bugs fixed, hopefully this won't happen again.
I disagree with this point. Fixing bugs in the extractor is a must, but the client should be able to handle unexpected failures gracefully anyway. Future versions of the extractor will come with their own bugs ;)
+1 for the file hash: it would help track files that were moved without changes, and files that stayed in the same place but had their metadata changed.
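As a rough illustration of how a stored hash separates those two cases, here is a minimal sketch. The function names, the `history` dict (path to digest), and the choice of SHA-256 are assumptions for the example, not anything the client currently does. Because it hashes the whole file, a retag changes the digest.

```python
import hashlib


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(path, digest, history):
    """Compare a file against `history`, a dict mapping path -> digest.

    Returns one of:
      "unchanged": same path, same content
      "modified":  same path, different content (e.g. retagged in place)
      "moved":     unknown path, but the content was seen elsewhere
      "new":       neither path nor content seen before
    """
    if history.get(path) == digest:
        return "unchanged"
    if path in history:
        return "modified"
    if digest in history.values():
        return "moved"
    return "new"
```

With this, a "modified" result is the retagging case described earlier in the thread, and a "moved" result lets the client skip re-extracting a file it has already submitted.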