Comments (7)
Hi @paschatz, thank you for bringing this issue to our attention.
I've reviewed your report, and I'm unable to reproduce the error on my system. It seems that read_data_entity_names() is functioning as expected, returning a data frame containing data entity identifiers and names.
> # A slightly modified version of the original script
>
> library(EDIutils)
>
> # A list of data package IDs from which to get data entity names
> package_ids <- c("edi.1.1", "edi.3.1")
>
> # Get data entity identifiers and names
> for (i in 1:length(package_ids)) {
+ data_entity_names <- read_data_entity_names(packageId = package_ids[i])
+ print(data_entity_names) # Print results so we can see them
+ }
entityId entityName
1 cba4645e845957d015008e7bccf4f902 E1 Plant Biomass 6 16.csv
2 482fef41e108b34ad816e96423711470 E1_Plant_Species_composition_6_16_long.csv
entityId entityName
1 76d277e7bcc9c97f2daa4fdfd55ef11f SBCMBON integrated benthic cover
>
Based on the comment in your script, "Download all of the data entities for the package IDs in the list," I understand that your objective may extend beyond merely retrieving data entity IDs and names. You may be interested in downloading these data entities and potentially parsing them into the R environment.
To achieve this, you'll need to iterate through each data entity ID, pass it to an appropriate parser, and store the results in variables within the R environment. For a practical example, please refer to the vignette on data access.
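As a rough illustration of that loop, here is a minimal sketch. It assumes every entity in the package is a CSV file that base R's read.csv() can parse, which will not hold for all packages. `read_entities` and its `fetch_names`/`fetch_entity` arguments are hypothetical names introduced for this example; they default to the EDIutils functions discussed in this thread.

```r
# Hypothetical helper (not part of EDIutils): read every entity of one
# data package into a named list of data frames.
# Assumption: each entity is a CSV that read.csv() can parse.
read_entities <- function(package_id,
                          fetch_names = EDIutils::read_data_entity_names,
                          fetch_entity = EDIutils::read_data_entity) {
  # Data frame with one row per entity (columns entityId and entityName)
  entities <- fetch_names(packageId = package_id)

  # Download each entity as raw bytes and parse it as CSV text
  tables <- lapply(entities$entityId, function(entity_id) {
    raw <- fetch_entity(packageId = package_id, entityId = entity_id)
    read.csv(text = rawToChar(raw))
  })

  # Label each table with its entity name
  names(tables) <- entities$entityName
  tables
}

# Usage (requires network access):
# tables <- read_entities("edi.1.1")
```

Non-CSV entities would need a different parser in place of read.csv().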
We acknowledge that this process may not be the most straightforward way to access data entities, and we are actively working on a more streamlined solution.
Please don't hesitate to reach out if you have any further questions or if I'm misinterpreting your use case.
from ediutils.
Yes, my objective is to download the data, and I thought I first needed to get the entityId.
But I reproduced your code and it works for me too... so I did some digging and noticed that I was applying the function to a vector containing NAs. Once I removed the NAs, my code ran smoothly. But I still get the entityId only for the first entry. I will keep working on it.
As a recommendation for improvement, I would (respectfully) suggest introducing a warning message:
"hey dummy, your vector has NAs, drop them and re-try". (or something else :P )
Cheers and thank you for your response,
Paschalis.
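The NA check suggested above could be run on the caller's side before the loop. This is only a sketch: `drop_missing_ids` is a hypothetical helper, not an EDIutils function, and the warning text is illustrative.

```r
# Hypothetical helper (not part of EDIutils): drop missing package IDs
# and warn about how many were removed.
drop_missing_ids <- function(package_ids) {
  missing <- is.na(package_ids)
  if (any(missing)) {
    warning(sum(missing), " package ID(s) were NA and have been dropped.",
            call. = FALSE)
  }
  package_ids[!missing]
}

# Emits a warning and keeps only the two valid IDs
package_ids <- drop_missing_ids(c("edi.1.1", NA, "edi.3.1"))
```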
Hey,
So I worked around it and fixed the loop, but the downloaded zip files seem to be broken.
I ran an example outside the loop to test whether the problem is the loop or my code:
try_read <- read_data_entity(packageId = "knb-lter-cdr.444.8",
                             entityId = "aa6271cfbaa0a63c092733fb8ae6c543")
transaction <- create_data_package_archive("knb-lter-cdr.444.8")
try_download <- read_data_package_archive("knb-lter-cdr.444.8",
                                          transaction,
                                          path = "data_cleaning/11_Cedar_Creek/")
It seems that even if I download a single file, I get the same problem.
A colleague tried remotely and has the same issue.
I tried downloading the data package from the website and it works perfectly.
Do you have any idea if there is a problem with the compressed files?
The error I get on my computer is:
"unable to expand 'file_name.zip'. It is an unsupported format." (tried on both Mac and Windows).
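One quick way to tell whether a downloaded file is a ZIP archive at all is to look at its first four bytes. The sketch below uses only base R; `looks_like_zip` is a hypothetical helper, and the path in the usage comment is a placeholder.

```r
# Hypothetical helper (not part of EDIutils): TRUE if the file starts with
# the local-file-header signature of a non-empty ZIP archive ("PK\003\004").
# An empty archive starts with a different signature and returns FALSE.
looks_like_zip <- function(path) {
  signature <- readBin(path, what = "raw", n = 4)
  identical(signature, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Usage (placeholder path):
# looks_like_zip("path/to/file_name.zip")
```

If this returns FALSE, the file is not a regular ZIP archive, whatever its extension says.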
Thanks,
Paschalis
Thanks for this additional information @paschatz.
I am able to reproduce the error on my machine and will look into it now.
My session info:
R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] EDIutils_1.0.2
loaded via a namespace (and not attached):
[1] httr_1.4.6 compiler_4.3.0 R6_2.5.1 tools_4.3.0 curl_5.0.1
This issue is occurring due to a change in the repository API 'Read Data Package Archive' method. The corresponding read_data_package_archive function has been updated and is available for immediate use by installing EDIutils with:
devtools::install_github(
  repo = "rOpenSci/EDIutils",
  ref = "refactor-read-data-package-archive"
)
The function no longer uses a transaction identifier, so the new call becomes:
try_download <- read_data_package_archive(
  packageId = "knb-lter-cdr.444.8",
  path = "data_cleaning/11_Cedar_Creek/"
)
Next steps are to update the docs, tests, and release into the development and main branches.
@paschatz, does this fix the issue? Is there anything else? Thanks again for reporting it!
Hey @clnsmth,
It now works smoothly!! 💯
I appreciate your support.
Best,
Paschalis
Happy to help @paschatz!
I'm going to reopen this issue, to serve as a reminder, until I get the fix released into the main branch.