lexjansen / dataset-json-sas
Dataset-JSON SAS Implementation
License: MIT License
The 06_validate_datasetjson.sas program (or the underlying code) appears to truncate the message to 250 characters:
'1' is not of type 'number' Failed validating 'type' in schema['properties']['clinicalData']['properties']['itemGroupData']['additionalProperties']['properties']['itemData']['items']['items'][0]: {'description': 'The first item in the data array need
I have a question about the length of date/time variables.
According to Define-XML, the length attribute must be empty when the datatype is not integer, float, or text (Rule ID: DD0068).
However, according to the "CDISC_COSA_webinar_20231005_dataset-json_SAS" slides, ISO 8601 formatted data in the JSON file has the string datatype and is assigned a length.
Is the JSON file meant to be closer to the dataset than to Define-XML?
I would like to know why a length should be assigned in the JSON file.
The length has to be taken from the dataset, which requires extra work.
There already is a Python script in the repo (scripts/json_validate.py), but I want to be able to call that from SAS.
This can be done with PROC FCMP.
See:
Using PROC FCMP Python Objects
Configuring SAS to Run the Python Language
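A minimal sketch of what the Python side of such a call might look like. PROC FCMP Python objects, as I understand them, expect the function's first statement to be a string literal naming the output keys ("Output: ..."); the function name, the argument names, and the two output keys below are hypothetical and are not taken from scripts/json_validate.py. The sketch assumes the third-party jsonschema package is installed in the Python environment that SAS is configured to use.

```python
import json


def validate_datasetjson(json_path, schema_path):
    "Output: is_valid, message"
    # Hypothetical helper: returns (1, "OK") when the file validates,
    # otherwise (0, <short reason>).
    try:
        with open(json_path, encoding="utf-8") as f:
            instance = json.load(f)
        with open(schema_path, encoding="utf-8") as f:
            schema = json.load(f)
    except (OSError, ValueError) as exc:
        # ValueError also covers json.JSONDecodeError.
        return 0, "Could not read input: %s" % exc

    try:
        import jsonschema  # third-party package, assumed to be available
    except ImportError:
        return 0, "jsonschema package is not installed"

    try:
        jsonschema.validate(instance=instance, schema=schema)
    except jsonschema.ValidationError as exc:
        # exc.message is the short one-line reason, without the schema dump.
        return 0, exc.message
    except jsonschema.SchemaError as exc:
        return 0, "Invalid schema: %s" % exc.message
    return 1, "OK"
```

Returning only the short `message` attribute, rather than the full string form of the exception, keeps the text compact, which may help if the receiving SAS character variable has a limited length.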
Hi,
I run the code using call execute. The first run produces errors because the dset_id variable has the value 0, but the second run has no issues.
How can I fix this?
Thank you in advance.
MinJi
In write_datasetjson macro:
228: if formatl gt 0 then displayFormat=cats(displayFormat, put(formatl, best.), ".");
229: if formatd gt 0 then displayFormat=cats(displayFormat, put(formatd, best.));
230: if index(displayFormat,'.')=0 then displayFormat=strip(displayFormat)||'.'; * put a dot on the end of format if we are still missing it ;
The last line (230) needs to be added.
When saving the metadata:
Unless I missed it in the slides or presentation, the dataset-json-sas package doesn't explicitly indicate the Define-XML versions supported.
Testing with Define-XML v1.0 for old actual data packages that I could use, the dataset-json-sas package was not retrieving dataset and variable labels plus dataset keys.
I made the following adjustments:
Rationale for requesting the adjustment:
I assume there are old actual data packages that testers could use and adjusting the dataset-json-sas package for this purpose will allow for more testing (without the need to up-version the Define-XML document).
If the use case is considered convenient, I'll post the revisions after cross-checking with another set of data.
I tested two scenarios related to length attributes:
Scenario 2: Extra variable in the dataset
I am using this code to convert SEND datasets (with a define file) to Dataset-JSON. I am not a SAS programmer, but I was able to copy, paste, and modify the SDTM code to support SEND. Do you think I should merge in the code with SEND support? The changes would affect the SAS files under the programs folder and add folders for SEND. Thanks.
I think this code enforces some requirements for JSON structure, limiting scope of JSON files that the code can parse and translate to a SAS data set.
Fine, since the name "dataset-json" makes this at least implicit. It would be helpful if the top-level README makes this explicit. If I understand correctly, JSON is a generic container, and one JSON could include multiple, complex, nested, etc. data structures. I don't think that such flexibility is the intention of this "dataset-json" project, and not surprisingly the code fails for out-of-scope JSON structures that represent more than a single SAS table structure. All fair enough, and similar to our restricted typical use of "xpt" containers for a single SAS data set.
Rather than "a SAS implementation for converting Dataset-JSON files to and from SAS datasets", does this project have a narrower scope - specifically with a single SAS dataset in mind:
?
This would more clearly limit the scope to JSON structure directly translatable to/from a SAS data structure.
Suggestion/wish: combine some SAS code for both ADaM and SDTM as macros instead of two different pieces of code.
Refs: 02_create_metadata_from_definexml.sas and 05_compare_data.sas
Instead of PROC COPY, use the %XPT2LOC macro.
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/movefile/p1tp8gighlgeifn173i6kzw2w3bu.htm
Having no label in Dataset-JSON results in:
proc datasets library=work noprint nolist nodetails;
change IG_CLASS_ITEMDATA = CLASS;
modify CLASS (label = "Student Data");
rename element1=ITEMGROUPDATASEQ element2=Name element3=Sex element4=Age element5=Height element6=Weight;
label ITEMGROUPDATASEQ= "Record Identifier" Name= "" Sex= "" Age= "" Height= "" Weight= "";
quit;
The correct code is:
proc datasets library=work noprint nolist nodetails;
change IG_CLASS_ITEMDATA = CLASS;
modify CLASS (label = "Student Data");
rename element1=ITEMGROUPDATASEQ element2=Name element3=Sex element4=Age element5=Height element6=Weight;
label ITEMGROUPDATASEQ= "Record Identifier";
quit;
I tested a scenario where a dataset-JSON is created from an R dataframe without providing any metadata. In this case, display formats are not included in the dataset-JSON. When converting this dataset-JSON to a SAS dataset, there is no format applied to numeric date values. Would it be possible to implement a default format for date, time, and datetime for numeric date(time) values, as these formats may not always be present in the dataset-JSON?
When a dataset already contains a character variable named ITEMGROUPDATASEQ, the write_datasetjson macro ends in an error condition:
ERROR: Variable ITEMGROUPDATASEQ has been defined as both character and numeric.
Currently only the first schema error is reported. It would be good to report all errors in a JSON file.
A limit on the number of errors should be set as an option.
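One way this could work on the Python side, as a sketch only: iterate over all validation errors instead of stopping at the first, and cap the number collected. The function name `collect_errors` and the `max_errors` parameter are hypothetical. The function assumes it is given a validator object exposing `iter_errors(instance)`, whose errors carry `absolute_path` and `message` attributes, which is the interface offered by validators in the third-party jsonschema package.

```python
from itertools import islice


def collect_errors(validator, instance, max_errors=10):
    """Return up to max_errors (path, message) tuples for an instance.

    `validator` is assumed to provide iter_errors(instance), as
    jsonschema validator objects do.
    """
    errors = []
    # islice stops pulling from the error iterator once the cap is reached.
    for err in islice(validator.iter_errors(instance), max_errors):
        # absolute_path holds the keys and indexes leading to the bad value.
        path = "/".join(str(part) for part in err.absolute_path)
        errors.append((path, err.message))
    return errors
```

With jsonschema installed, a suitable validator could be built with `jsonschema.validators.validator_for(schema)(schema)`, and the resulting list could be written to a file or table for SAS to read back.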
The CreateMetadataFromDefineXML code does not work if the Define-XML file version is 2.0.
Right now I'm hardcoding the lengths of date/time variables when creating metadata tables from Define-XML.
This is NOT a big issue, since data is not truncated,
but it will make compares between datasets in roundtrip better.
/* Create metadata from Define-XML for SDTM */
%CreateMetadataFromDefineXML(
definexml=&project_folder/data/sdtm_xpt/define.xml,
metadatalib=metasdtm
);
/* Some manual data type updates */
data metasdtm.metadata_columns;
set metasdtm.metadata_columns;
if xml_datatype='float' and name ne 'LBSTRESN'
then json_datatype='decimal';
if missing(length) then do;
if xml_datatype="date" then length=10;
if xml_datatype="partialDate" then length=10;
if xml_datatype="partialDatetime" then length=19;
if xml_datatype="durationDatetime" then length=19;
if xml_datatype="datetime" then length=19;
end;
run;
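The hardcoded lengths 10 and 19 for date and datetime match the lengths of complete ISO 8601 values in extended format. A small check, assuming complete values without fractional seconds or a time zone offset (the example values themselves are arbitrary):

```python
from datetime import date, datetime

# A complete ISO 8601 date: YYYY-MM-DD
example_date = date(2023, 10, 5).isoformat()

# A complete ISO 8601 datetime without fractional seconds or offset:
# YYYY-MM-DDTHH:MM:SS
example_datetime = datetime(2023, 10, 5, 14, 30, 0).isoformat()

print(example_date, len(example_date))
print(example_datetime, len(example_datetime))
```

Partial dates can be shorter than these lengths, and values with fractional seconds or a time zone offset can be longer, so deriving the length from the actual data remains the safer choice where it is available.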
Change the signatures of the read_datasetjson.sas and write_datasetjson.sas macros to be in sync with the Dataset-JSON keys.
In particular, remove leading underscores.
Rename 03_test_write_json.sas and 04_test_read_json.sas to 03_test_write_datasetjson.sas and 04_test_read_datasetjson.sas.
Right now they expect file paths to JSON files or XPT files.