Hey! I need a hand here. I have a list with the latest package I

read_data_entity_names of multiple package_ids about ediutils HOT 7 CLOSED

ropensci commented on July 29, 2024

read_data_entity_names of multiple package_ids

from ediutils.

Comments (7)

clnsmth commented on July 29, 2024

Hi @paschatz, thank you for bringing this issue to our attention.

I've reviewed your report, and I'm unable to reproduce the error on my system. It seems that read_data_entity_names() is functioning as expected, returning a data frame containing data entity identifiers and names.

> # A slightly modified version of the original script
> 
> library(EDIutils)
> 
> # A list of data package IDs from which to get data entity names
> package_ids <- c("edi.1.1", "edi.3.1")
> 
> # Get data entity identifiers and names
> for (i in 1:length(package_ids)) {
+   data_entity_names <- read_data_entity_names(packageId = package_ids[i])
+   print(data_entity_names) # Print results so we can see them
+ }
                          entityId                                 entityName
1 cba4645e845957d015008e7bccf4f902                  E1 Plant Biomass 6 16.csv
2 482fef41e108b34ad816e96423711470 E1_Plant_Species_composition_6_16_long.csv
                          entityId                       entityName
1 76d277e7bcc9c97f2daa4fdfd55ef11f SBCMBON integrated benthic cover
>

Based on the comment in your script, "Download all of the data entities for the package IDs in the list," I understand that your objective may extend beyond merely retrieving data entity IDs and names. You may be interested in downloading these data entities and potentially parsing them into the R environment.

To achieve this, you'll need to iterate through each data entity ID, pass it to an appropriate parser, and store the results in variables within the R environment. For a practical example, please refer to the vignette on data access.
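The iteration described above might look roughly like the following. This is a minimal sketch, not the vignette's code: it assumes every data entity is a CSV file that `readr::read_csv()` can parse, that the readr package is installed, and that the repository is reachable. The `tables` list and its `packageId/entityName` keys are illustrative names only.

```r
library(EDIutils)

# Data package IDs to process (the same IDs used in the example above)
package_ids <- c("edi.1.1", "edi.3.1")

# Collect parsed tables in a named list so that each iteration adds a
# result instead of overwriting a single variable
tables <- list()

for (package_id in package_ids) {
  # Data frame with columns entityId and entityName
  entities <- read_data_entity_names(packageId = package_id)

  for (j in seq_len(nrow(entities))) {
    # read_data_entity() returns the entity as a raw vector
    raw <- read_data_entity(
      packageId = package_id,
      entityId = entities$entityId[j]
    )

    # Assumption: the entity is a CSV file; other formats need another parser
    key <- paste(package_id, entities$entityName[j], sep = "/")
    tables[[key]] <- readr::read_csv(file = raw)
  }
}
```

After the loop, `names(tables)` lists what was read, and an individual table can be pulled out with `tables[["<packageId>/<entityName>"]]`.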

We acknowledge that this process may not be the most straightforward way to access data entities, and we are actively working on a more streamlined solution.

Please don't hesitate to reach out if you have any further questions or if I'm misinterpreting your use case.

paschatz commented on July 29, 2024

Yes, my objective is to download the data, and I thought I first needed to get the entityId.
But I reproduced your code and it works for me too... so I did some digging and noticed that I was trying to apply the function to a vector with NAs. Once I removed the NAs, my code ran smoothly. But I still get the entityId only for the first entry. I will keep working on it.

As a recommendation for improvement, I would (respectfully) suggest introducing a warning message:
"hey dummy, your vector has NAs, drop them and re-try". (or something else :P )

Cheers and thank you for your response,
Paschalis.
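Until something like the suggested warning exists, the NA check can be done on the user side before calling any EDIutils function. The sketch below is one possible way to do it; `drop_missing_ids()` is a hypothetical helper written for this example and is not part of EDIutils.

```r
# Hypothetical helper (not part of EDIutils): remove NA package IDs and
# warn about how many were dropped
drop_missing_ids <- function(package_ids) {
  missing <- is.na(package_ids)
  if (any(missing)) {
    warning(
      sum(missing), " package ID(s) were NA and have been dropped.",
      call. = FALSE
    )
  }
  package_ids[!missing]
}

# Example input with one missing value
package_ids <- c("edi.1.1", NA, "edi.3.1")

# Emits a warning and keeps only the usable IDs
clean_ids <- drop_missing_ids(package_ids)
```

The cleaned vector can then be passed to the loop in place of the original one.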

paschatz commented on July 29, 2024

Hey,
So I worked around it and fixed the loop, but the downloaded zip files seem to be broken.
I ran an example outside the loop to test whether it is the loop or my code:

try_read <- read_data_entity(packageId = "knb-lter-cdr.444.8",
                             entityId =  "aa6271cfbaa0a63c092733fb8ae6c543")

transaction <- create_data_package_archive("knb-lter-cdr.444.8")

try_download <- read_data_package_archive("knb-lter-cdr.444.8",
                                          transaction,
                                          path = "data_cleaning/11_Cedar_Creek/")

Seems that even if I download a single file, I get the same problem.
A colleague tried remotely and she has the same issue.
I tried to download the data package from the website and it works perfectly.
Do you have any idea if there is a problem with the compressed files?

The error I get on my computer is:
"unable to expand 'file_name.zip'. It is an unsupported format." (tried on both Mac and Windows).

Thanks,
Paschalis
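When an archive will not expand, a quick check of the file's first bytes can show whether it is a ZIP file at all. The sketch below is a generic diagnostic, not something EDIutils provides, and it does not identify the cause of the problem reported here. `looks_like_zip()` is a hypothetical helper; it tests only for the usual local file header signature (`PK` followed by bytes 0x03 0x04), so an empty archive, which starts with a different signature, would be reported as FALSE.

```r
# Hypothetical helper (not part of EDIutils): TRUE if the file begins with
# the ZIP local file header signature 0x50 0x4B 0x03 0x04 ("PK\003\004")
looks_like_zip <- function(path) {
  header <- readBin(path, what = "raw", n = 4)
  identical(header, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Self-contained demo with temporary files instead of a real download
zip_like <- tempfile(fileext = ".zip")
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04, 0x00)), zip_like)

not_zip <- tempfile(fileext = ".zip")
writeLines("this is not a zip archive", not_zip)

zip_result <- looks_like_zip(zip_like)
text_result <- looks_like_zip(not_zip)
```

For a real download, the path of the saved file would be passed to the helper instead of the temporary files.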

clnsmth commented on July 29, 2024

Thanks for this additional information @paschatz.

I am able to reproduce the error on my machine and will look into it now.

My session info:

R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5.2

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] EDIutils_1.0.2

loaded via a namespace (and not attached):
[1] httr_1.4.6     compiler_4.3.0 R6_2.5.1       tools_4.3.0    curl_5.0.1

clnsmth commented on July 29, 2024

This issue is occurring due to a change in the repository API 'Read Data Package Archive' method. The corresponding read_data_package_archive function has been updated and is available for immediate use by installing EDIutils with:

devtools::install_github(
  repo = "rOpenSci/EDIutils", 
  ref = "refactor-read-data-package-archive"
)

The function no longer uses a transaction identifier, so the new call becomes:

try_download <- read_data_package_archive(
    packageId = "knb-lter-cdr.444.8", 
    path = "data_cleaning/11_Cedar_Creek/"
)

Next steps are to update the docs, tests, and release into the development and main branches.

@paschatz, does this fix the issue? Is there anything else? Thanks again for reporting it!

paschatz commented on July 29, 2024

Hey @clnsmth,

Now it works smoothly!! 💯

I appreciate your support.

Best,
Paschalis

clnsmth commented on July 29, 2024

Happy to help @paschatz!

I'm going to reopen this issue, to serve as a reminder, until I get the fix released into the main branch.
