Comments (4)
Thanks for your report. I'm not sure if it is a valid archive or not (the standard is not always as clear as I'd like it to be), but it seems there are such archives in the wild and it shouldn't be too difficult to support, so I'll give it a try!
from python-dwca-reader.
I'm a little ambivalent about this issue.
- My initial reaction is to support it as follows: if an archive contains a single directory, treat that directory as the archive root and make things work. I'm not sure it's a technically valid archive, but supporting it would be easy and low-risk, and may help some people. If I remember correctly, some IPT versions produced such archives.
- But the archive submitted above wouldn't work, since it contains two directories at the root (there's also the infamous __MACOSX directory, probably silently created by the Mac OS X Finder when the archive was manually hacked). I don't really see a simple and resilient way for the DwCAReader to open such an archive (entering each subdirectory in the archive root and looking for something that actually looks like a DwCA seems a bit too far-fetched).
To me, it looks like an error on GBIF's side to provide such a sample file for their DwCA validator. I'd be tempted not to fix it here (or to fix only the simple single-directory case) and to report it as an issue to GBIF. What do you think, @nickynicolson?
Thanks @niconoe - I agree.
Re your first point: I've seen a lot of these single sub-directory archives in use - perhaps from IPT instances, but also from the Scratchpads project and emonocot. If we can jump into the subdir when only one subdir exists, that seems like a good solution.
I also agree that the sample DwCA referenced from the GBIF validator should be cleaner.
That's good to know. I'll implement this "single subdir fix" so at least we support those common archives!
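For readers following along, the "single subdir" rule discussed in this thread could be sketched roughly as below. This is only an illustration, not the code that went into python-dwca-reader: the helper name `resolve_archive_root` is hypothetical, and it assumes the archive has already been extracted to a directory on disk.

```python
from pathlib import Path


def resolve_archive_root(extracted_dir):
    """Return the directory to treat as the archive root.

    Hypothetical helper (not part of python-dwca-reader): if the extracted
    archive holds exactly one entry and that entry is a directory, descend
    into it; otherwise keep the extraction directory itself.
    """
    base = Path(extracted_dir)
    entries = list(base.iterdir())
    if len(entries) == 1 and entries[0].is_dir():
        return entries[0]
    return base
```

With an archive like the one reported above (two directories at the root, one of them `__MACOSX`), this sketch returns the extraction directory unchanged, so such an archive would still fail to open, which matches the conclusion reached in the discussion.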