dataset-json-sas's People

Contributors

lexjansen

dataset-json-sas's Issues

Question about length of date/time datatype variable.

I have a question about the length of date/time datatype variables.

According to Define-XML, the length attribute must be empty when the datatype is not integer, float, or text (Rule ID: DD0068).
However, according to the "CDISC_COSA_webinar_20231005_dataset-json_SAS" slides, ISO 8601 formatted data in the JSON file has the string datatype and is assigned a length.
Is the JSON file meant to be closer to the dataset than to define.xml?
I would like to know why the length should be assigned in the JSON file.
As you know, it has to be taken from the dataset, which requires more work.

Question about macro variable in nested code

Hi,
I run the code using call execute. The first run produces errors because the dset_id variable has the value 0, but the second run has no issues.
How can I fix it?

Thank you in advance.
MinJi
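
One possible cause, assuming the macro creates a dataset and then opens it with %sysfunc(open()): when call execute receives a macro call, the macro statements execute immediately, while the DATA and PROC steps the macro generates are queued until the calling DATA step ends. On the first run the dataset does not exist yet when open() is evaluated, so it returns 0; on the second run the dataset left over from the first run is found. Wrapping the call in %nrstr defers the whole macro until after the calling step. The macro %demo and the dataset names below are hypothetical and not part of the package.

```sas
/* Hypothetical macro: creates a dataset, then opens it with %sysfunc(open()) */
%macro demo(dsname=);
  %local dset_id rc;
  data work.&dsname;
    x = 1;
  run;
  %let dset_id = %sysfunc(open(work.&dsname));
  %put NOTE: dsname=&dsname dset_id=&dset_id;
  %if &dset_id %then %let rc = %sysfunc(close(&dset_id));
%mend demo;

data _null_;
  /* Macro statements run right away, before the generated DATA step:  */
  /* open() is evaluated while work.test1 does not exist yet.          */
  call execute('%demo(dsname=test1)');

  /* %nrstr delays macro execution until this DATA step has finished,  */
  /* so the generated DATA step runs before open() is evaluated.       */
  call execute('%nrstr(%demo(dsname=test2))');
run;
```

In a fresh session the first call should report dset_id=0, while the %nrstr call should report a non-zero identifier.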

Format needs a dot if it is missing.

In the write_datasetjson macro:

228: if formatl gt 0 then displayFormat=cats(displayFormat, put(formatl, best.), ".");
229: if formatd gt 0 then displayFormat=cats(displayFormat, put(formatd, best.));
230: if index(displayFormat,'.')=0 then displayFormat=strip(displayFormat)||'.'; * put a dot on the end of format if we are still missing it ;

Line 230, the last one, needs to be added.

Get dataset and variable labels plus dataset keys when using versions of Define-XML v1.0

Unless I missed it in the slides or presentation, the dataset-json-sas package doesn't explicitly indicate the Define-XML versions supported.

When I tested with Define-XML v1.0, using older real data packages that were available to me, the dataset-json-sas package did not retrieve dataset and variable labels or dataset keys.

I made the following adjustments:

  1. read_definexml.lua: use a different XPath for dataset and variable labels and dataset keys, so the values are retrieved without having to specify the Define-XML version.
  2. 02_create_metadata_from_definexml.sas: manually populate the item key from the keys retrieved for the item group, if the item keys are not already set.

Rationale for requesting the adjustment:
I assume testers have older real data packages they could use; adjusting the dataset-json-sas package for this purpose would allow more testing (without the need to up-version the Define-XML document).
If the use case is considered useful, I'll post the revisions after cross-checking with another set of data.

Issue when length attribute not included in Dataset-JSON

I tested two scenarios related to length attributes:

  1. In the first scenario, I converted a dataset-JSON to a SAS dataset with the length attributes completely missing. In this case, some errors appeared in the log.
  2. In the second scenario, I specified the length attribute for only one variable. The macro ran successfully, but an extra variable for ITEMGROUPDATASEQ was present in the SAS dataset.

Scenario 1: Error in the log
[screenshots]

Scenario 2: Extra variable in the dataset

[screenshot]

Add support for SEND

I am using this code to convert SEND datasets (with a define file) to Dataset-JSON. I am not a SAS programmer, but I was able to copy, paste, and modify the SDTM code to support SEND. Do you think I should merge in the code with SEND support? The changes would affect the SAS files under the programs folder and add folders for SEND. Thanks.

Requirements for JSON file?

I think this code enforces some requirements on the JSON structure, which limits the scope of JSON files that the code can parse and translate to a SAS data set.

Fine, since the name "dataset-json" makes this at least implicit. It would be helpful if the top-level README makes this explicit. If I understand correctly, JSON is a generic container, and one JSON could include multiple, complex, nested, etc. data structures. I don't think that such flexibility is the intention of this "dataset-json" project, and not surprisingly the code fails for out-of-scope JSON structures that represent more than a single SAS table structure. All fair enough, and similar to our restricted typical use of "xpt" containers for a single SAS data set.

Rather than "a SAS implementation for converting Dataset-JSON files to and from SAS datasets", does this project have a narrower scope - specifically with a single SAS dataset in mind:

  • a SAS implementation to convert a SAS dataset to and from a Dataset-JSON file, single SAS data set structure.

?

This would more clearly limit the scope to JSON structure directly translatable to/from a SAS data structure.

Do not try to create a variable label when not defined in Dataset-JSON

Having no label in Dataset-JSON results in:

proc datasets library=work noprint nolist nodetails;
change IG_CLASS_ITEMDATA = CLASS;
modify CLASS (label = "Student Data");
rename element1=ITEMGROUPDATASEQ element2=Name element3=Sex element4=Age element5=Height element6=Weight;
label ITEMGROUPDATASEQ= "Record Identifier" Name= "" Sex= "" Age= "" Height= "" Weight= "";
quit;

The correct code is:

proc datasets library=work noprint nolist nodetails;
change IG_CLASS_ITEMDATA = CLASS;
modify CLASS (label = "Student Data");
rename element1=ITEMGROUPDATASEQ element2=Name element3=Sex element4=Age element5=Height element6=Weight;
label ITEMGROUPDATASEQ= "Record Identifier";
quit;
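
A minimal sketch of one way to build the label statement only from columns that actually have a label. The table work.column_metadata and the macro variable label_stmt are hypothetical stand-ins, not the names used in read_datasetjson.

```sas
/* Hypothetical column metadata, as it might be read from a Dataset-JSON file */
data work.column_metadata;
  length name $32 label $256;
  infile datalines dsd dlm="," truncover;
  input name $ label $;
  datalines;
ITEMGROUPDATASEQ,Record Identifier
Name,
Sex,
Age,
;
run;

/* Collect name="label" pairs only for columns with a non-missing label */
%let label_stmt=;
proc sql noprint;
  select catx("=", name, quote(strip(label)))
    into :label_stmt separated by " "
    from work.column_metadata
    where not missing(label);
quit;

%put NOTE: label &label_stmt;
```

The resulting &label_stmt could then be emitted inside proc datasets as `label &label_stmt;`, and the statement skipped entirely when the macro variable is empty.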

Format missing for numeric dates when displayFormat attributes not used

I tested a scenario where a dataset-JSON is created from an R dataframe without providing any metadata. In this case, display formats are not included in the dataset-JSON. When converting this dataset-JSON to a SAS dataset, there is no format applied to numeric date values. Would it be possible to implement a default format for date, time, and datetime for numeric date(time) values, as these formats may not always be present in the dataset-JSON?

[screenshots]
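
A rough sketch of what such a default could look like, assuming the column metadata carries a dataType of date, datetime, or time for numeric date(time) values. The table work.column_metadata, its column names, and the choice of the E8601 formats as defaults are assumptions for illustration, not the package's behavior.

```sas
/* Hypothetical column metadata without display formats, e.g. written from R */
data work.column_metadata;
  length name $32 dataType $16 displayFormat $32;
  name = "BRTHDT"; dataType = "date";     displayFormat = ""; output;
  name = "VISDTM"; dataType = "datetime"; displayFormat = ""; output;
  name = "AGE";    dataType = "integer";  displayFormat = ""; output;
run;

/* Fill in a default ISO 8601 format only where none was provided */
data work.column_metadata;
  set work.column_metadata;
  if missing(displayFormat) then do;
    select (dataType);
      when ("date")     displayFormat = "E8601DA.";
      when ("datetime") displayFormat = "E8601DT.";
      when ("time")     displayFormat = "E8601TM.";
      otherwise; /* leave other data types without a format */
    end;
  end;
run;
```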

Return all errors from schema validation

Currently only the first schema error is reported. It would be good to report all errors in a JSON file.
A limit on the number of errors should be set as an option.

Value truncated for non-ASCII character

I tested the read_datasetjson macro with a dataset-JSON containing non-ASCII characters. In the dataset-JSON, the length was defined as the number of characters. In the resulting SAS dataset, the value was truncated.

[screenshots]

Log
[screenshot]

Dataset
[screenshot]
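
SAS character lengths are counted in bytes, so a length given in characters can be too short for multi-byte data. A minimal sketch of the effect, assuming a UTF-8 session encoding and a program file saved as UTF-8; the variable names are illustrative only.

```sas
data _null_;
  length value $40 truncated $4;
  value = "café";            /* 4 characters, but 5 bytes in UTF-8 */
  truncated = value;         /* $4 reserves 4 bytes, not 4 characters */
  n_chars = klength(value);  /* length in characters */
  n_bytes = length(value);   /* length in bytes */
  put n_chars= n_bytes= truncated=;
run;
```

If the Dataset-JSON length is a character count, one option would be to allocate more bytes per character when creating the SAS variable in a multi-byte session.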

Get missing Define-XML lengths (optionally) from datasets when creating Dataset-JSON

Right now I'm hardcoding the missing lengths when creating metadata tables from Define-XML.
This is NOT a big issue, since data is not truncated,
but taking the lengths from the datasets will make comparisons between datasets in a round trip easier.

/* Create metadata from Define-XML for SDTM */
%CreateMetadataFromDefineXML(
  definexml=&project_folder/data/sdtm_xpt/define.xml,
  metadatalib=metasdtm
);

/* Some manual data type updates */
data metasdtm.metadata_columns;
  set metasdtm.metadata_columns;
  if xml_datatype='float' and name ne 'LBSTRESN'
    then json_datatype='decimal';
  if missing(length) then do;
    if xml_datatype="date" then length=10;
    if xml_datatype="partialDate" then length=10;
    if xml_datatype="partialDatetime" then length=19;
    if xml_datatype="durationDatetime" then length=19;
    if xml_datatype="datetime" then length=19;
  end;
run;
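
As a sketch of the requested behavior, missing lengths could be looked up in dictionary.columns instead of being hardcoded. The tables below are small stand-ins created in WORK so the example runs on its own; the column dataset_name and the layout of the metadata table are assumptions. Note that this picks up the allocated variable length, not the longest value actually stored.

```sas
/* Hypothetical stand-in for an SDTM dataset */
data work.dm;
  length USUBJID $20 RFSTDTC $10;
  USUBJID = "001"; RFSTDTC = "2023-10-05";
run;

/* Hypothetical stand-in for the metadata table, with one missing length */
data work.metadata_columns;
  length dataset_name $32 name $32 xml_datatype $18;
  dataset_name = "DM"; name = "USUBJID"; xml_datatype = "text"; length = 20; output;
  dataset_name = "DM"; name = "RFSTDTC"; xml_datatype = "date"; length = .;  output;
run;

/* Fill missing lengths from the variable attributes of the datasets */
proc sql;
  update work.metadata_columns as m
    set length = (select c.length
                    from dictionary.columns as c
                    where c.libname = "WORK"
                      and c.memname = upcase(m.dataset_name)
                      and upcase(c.name) = upcase(m.name))
    where missing(m.length);
quit;
```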

Fix signatures and sample program names

Change the signatures of the read_datasetjson.sas and write_datasetjson.sas macros to be in sync with the Dataset-JSON keys.
In particular, remove leading underscores.

Rename 03_test_write_json.sas and 04_test_read_json.sas to 03_test_write_datasetjson.sas and
04_test_read_datasetjson.sas.
