
Comments (7)

clnsmth commented on July 29, 2024

Hi @paschatz, thank you for bringing this issue to our attention.

I've reviewed your report, and I'm unable to reproduce the error on my system. It seems that read_data_entity_names() is functioning as expected, returning a data frame containing data entity identifiers and names.

> # A slightly modified version of the original script
> 
> library(EDIutils)
> 
> # A list of data package IDs from which to get data entity names
> package_ids <- c("edi.1.1", "edi.3.1")
> 
> # Get data entity identifiers and names
> for (i in 1:length(package_ids)) {
+   data_entity_names <- read_data_entity_names(packageId = package_ids[i])
+   print(data_entity_names) # Print results so we can see them
+ }
                          entityId                                 entityName
1 cba4645e845957d015008e7bccf4f902                  E1 Plant Biomass 6 16.csv
2 482fef41e108b34ad816e96423711470 E1_Plant_Species_composition_6_16_long.csv
                          entityId                       entityName
1 76d277e7bcc9c97f2daa4fdfd55ef11f SBCMBON integrated benthic cover
>

Based on the comment in your script, "Download all of the data entities for the package IDs in the list," I understand that your objective may extend beyond merely retrieving data entity IDs and names. You may be interested in downloading these data entities and potentially parsing them into the R environment.

To achieve this, you'll need to iterate through each data entity ID, pass it to an appropriate parser, and store the results in variables within the R environment. For a practical example, please refer to the vignette on data access.
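As a rough illustration of that loop (a sketch only, not the vignette's code): the helper name `read_all_entities` is hypothetical, and it assumes every entity in the package is a comma-delimited text file that `readr::read_csv()` can parse. Entities in other formats would need a different parser.

```r
# Hypothetical helper: download every data entity in one package and parse
# each as CSV. Assumes all entities are comma-delimited text files.
read_all_entities <- function(package_id) {
  # Data frame with columns entityId and entityName
  entities <- EDIutils::read_data_entity_names(packageId = package_id)

  tables <- lapply(entities$entityId, function(entity_id) {
    # read_data_entity() returns the entity as a raw vector
    raw <- EDIutils::read_data_entity(packageId = package_id, entityId = entity_id)
    readr::read_csv(file = raw)
  })

  # Name each parsed table after its entity so results are easy to look up
  names(tables) <- entities$entityName
  tables
}

# Usage (requires network access and the EDIutils and readr packages):
# tables <- read_all_entities("edi.1.1")
```

The result is a named list, so a single table can be pulled out by its entity name.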

We acknowledge that this process may not be the most straightforward way to access data entities, and we are actively working on a more streamlined solution.

Please don't hesitate to reach out if you have any further questions or if I'm misinterpreting your use case.

from ediutils.

paschatz commented on July 29, 2024

Yes, my objective is to download the data, and I thought I first needed to get the entityId.
But I reproduced your code and it works for me too... so I did some digging and noticed that I was applying the function to a vector containing NAs. Once I removed the NAs, my code ran smoothly. But I still get the entityId only for the first entry. I will keep working on it.

As a recommendation for improvement, I would (respectfully) suggest introducing a warning message:
"hey dummy, your vector has NAs, drop them and re-try". (or something else :P )
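Until something like that exists in the package, a user-side guard could look roughly like this. It is only a sketch: `drop_missing_ids` is a hypothetical helper, not part of EDIutils, and the warning text is made up.

```r
# Hypothetical helper (not part of EDIutils): drop NA package IDs and warn
# about how many were removed before passing the vector to EDIutils calls.
drop_missing_ids <- function(package_ids) {
  missing <- is.na(package_ids)
  if (any(missing)) {
    warning(sum(missing), " missing package ID(s) dropped before querying.")
  }
  package_ids[!missing]
}

# Emits a warning and keeps only the two valid IDs
package_ids <- drop_missing_ids(c("edi.1.1", NA, "edi.3.1"))
```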

Cheers and thank you for your response,
Paschalis.


paschatz commented on July 29, 2024

Hey,
So I worked around it and fixed the loop, but the downloaded zip files seem to be broken.
I ran an example outside the loop to test whether the problem is the loop or my code:

try_read <- read_data_entity(packageId = "knb-lter-cdr.444.8",
                             entityId =  "aa6271cfbaa0a63c092733fb8ae6c543")

transaction <- create_data_package_archive("knb-lter-cdr.444.8")

try_download <- read_data_package_archive("knb-lter-cdr.444.8",
                                          transaction,
                                          path = "data_cleaning/11_Cedar_Creek/")

It seems that even if I download a single file, I get the same problem.
A colleague tried remotely and she has the same issue.
I tried downloading the data package from the website and it works perfectly.
Do you have any idea whether there is a problem with the compressed files?

The error I get on my computer is:
"unable to expand 'file_name.zip'. It is an unsupported format." (tried on both Mac and Windows).
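One quick way to tell whether a downloaded file is a ZIP archive at all is to look at its first four bytes. This is a minimal sketch: `looks_like_zip` is a hypothetical helper, and it only checks for the standard local file header signature ("PK" followed by bytes 03 04), so it would reject an empty archive and says nothing about whether the rest of the file is intact.

```r
# Hypothetical helper: TRUE if the file starts with the ZIP local file
# header signature 0x50 0x4B 0x03 0x04 ("PK\003\004").
looks_like_zip <- function(path) {
  signature <- readBin(path, what = "raw", n = 4)
  identical(signature, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Usage, with an illustrative file name:
# looks_like_zip("data_cleaning/11_Cedar_Creek/file_name.zip")
```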

Thanks,
Paschalis


clnsmth commented on July 29, 2024

Thanks for this additional information @paschatz.

I am able to reproduce the error on my machine and will look into it now.

My session info:

R version 4.3.0 (2023-04-21)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.5.2

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] EDIutils_1.0.2

loaded via a namespace (and not attached):
[1] httr_1.4.6     compiler_4.3.0 R6_2.5.1       tools_4.3.0    curl_5.0.1


clnsmth commented on July 29, 2024

This issue is occurring due to a change in the repository API 'Read Data Package Archive' method. The corresponding read_data_package_archive function has been updated and is available for immediate use by installing EDIutils with:

devtools::install_github(
  repo = "rOpenSci/EDIutils", 
  ref = "refactor-read-data-package-archive"
)

The function no longer uses a transaction identifier, so the new call becomes:

try_download <- read_data_package_archive(
    packageId = "knb-lter-cdr.444.8", 
    path = "data_cleaning/11_Cedar_Creek/"
)

Next steps are to update the docs, tests, and release into the development and main branches.

@paschatz, does this fix the issue? Is there anything else? Thanks again for reporting it!


paschatz commented on July 29, 2024

Hey @clnsmth,

Now it works smoothly!! 💯

I appreciate your support.

Best,
Paschalis


clnsmth commented on July 29, 2024

Happy to help @paschatz!

I'm going to reopen this issue, to serve as a reminder, until I get the fix released into the main branch.

