Giter Club home page Giter Club logo

ctsa-metrics's People

Contributors

apeshansky avatar darrenwhenderson avatar empfff avatar jake-gillberg avatar jordan-donovan avatar kengersing avatar madhukarmosali avatar mgkahn avatar mim18 avatar reidsimon avatar rfollett avatar rosemonde2 avatar rtmill avatar tjmierzwa avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ctsa-metrics's Issues

Missing denominators?

These appear in document but not in spreadsheet
C2 denominator: Total # lab tests
D3 denominator: Total # medication records
E3 denominator: Total # diagnoses
F4 denominator: Total # unique diagnoses
G3 denominator: Total #procedures
G4 denominator: Total #unique procedures
L1 denominator: total # unique patients over age 12

For some of these (F3, F4, G3, G4 I3, I4), the numerators also are not in spreadsheet. Did I miss a version of the document that removed these?

Issues Found in InformaticsMetricACTPostgres.sql (Latest commit 9a91cbb on Apr 20, 2021)

Please see below issues found in the InformaticsMetricACTPostgres.sql (latest commit 9a91cbb on Apr 20, 2021):

  1. Postgres does not support use of DUAL like other DBMS.
INSERT INTO ctsa_clic_metric
SELECT
	'nlp_any' as variable_name
	, '0' as one_year
	, '0' as five_year
FROM dual;

Change this sql to following instead:

INSERT INTO ctsa_clic_metric
SELECT
	'nlp_any' as variable_name
	, '0' as one_year
	, '0' as five_year
;
  1. Throughout the script wherever there is a LIKE operation on string values that contain \, it needs to be escaped.
    For example, the following query will not find any matching concept_cd:
    SELECT CONCEPT_CD FROM i2b2demodata.CONCEPT_DIMENSION WHERE CONCEPT_PATH LIKE '\ACT\Procedures\ICD9\%'
    However, the properly escaped string value in the query as follows will find many matching concept_cds:
    SELECT CONCEPT_CD FROM i2b2demodata.CONCEPT_DIMENSION WHERE CONCEPT_PATH LIKE '\\ACT\\Procedures\\ICD9\\%'

This applies to all the metric components where a 'concept_path like <some string containing a \ in the value>' like search is being performed.

  1. Lastly, its not an issue but more of a suggestion, can the dates be parameterized so that an end-user only has to specify them one time at the beginning of the script instead of having to replace them in every metric component in the script (about 28x2 instances).
    i.e. these will need to be changed every year to a different date range.
  • OBS.START_DATE BETWEEN '2020/01/01' AND '2020/12/31'
  • OBS.START_DATE BETWEEN '2016/01/01' AND '2020/12/31'

Issues found in "Informatics Metric i2b2-ACT MSSQL.sql"

There are two issues with the script that I had to modify in order to make it run:

*Line 127 - incorrect syntax for commenting (i.e., should replace '//' with '--')
*Line 144 - incorrect table name (i.e., it should be NCATS_DEMOGRAPHICS from the metadata table rather than PCORNET_DEMO. The PCORNET_DEMO would be a correct table name for PCORNET script, but not for i2b2-ACT script.)
 
There is one more issue, which didn't stop me from executing the script, but it doesn't do anything useful:
*Line 88 - the sql command requres 'lower(c_operator) in (''in'', ''='')'

i2b2-ACT MSSQL error

Hello. I ran "Informatics Metric i2b2-ACT MSSQL.sql" and got the following results:

(9 row(s) affected)
2722917

(1 row(s) affected)

(3 row(s) affected)
Msg 208, Level 16, State 1, Line 128
Invalid object name 'NCATS_DEMOGRAPHICS'.
Msg 4902, Level 16, State 1, Procedure CTSA_QIMCOUNT, Line 160
Cannot find the object "ontInOperator" because it does not exist or you do not have permissions.

(9 row(s) affected)

Incorrect use of table alias in OMOP Postgres SQL script

A couple of column names use table alias that are not appropriate. Please see details below. This is with reference to https://github.com/ncats/CTSA-Metrics/blob/master/Informatics%20Metric%20OMOP%20PostgreSQL.sql

Issue instance #1:
line# 219 ----> WHERE DATE_PART('year', per.birth_datetime) < 1900
description of issue: non-existing table alias 'per' in 'per.birth_datetime'

Issue instance #2:
line# 240 ----> WHERE DATE_PART('year', per.birth_datetime) < 1900
description of issue: non-existing table alias 'per' in 'per.birth_datetime'

Syntax errors in 'Informatics Metric PCORnet MSSQL.sql'

When attempting to execute the code, there appear to be some syntax errors related to missing aliases for subqueries.

Msg 102, Level 15, State 1, Line 459
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 465
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 477
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 483
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 494
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 500
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 511
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 517
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 528
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 534
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 545
Incorrect syntax near ')'.
Msg 102, Level 15, State 1, Line 551
Incorrect syntax near ')'.

Adding an alias (e.g. as subq) at each of these points in the code allows the query to run.

OMOP Oracle Date issues

The date ranges are not formatted right for oracle. ex), cond.condition_start_date BETWEEN '01-01-2021' AND '12-31-2021' for Oracle. Something like TO_DATE('21-12-31', 'RR-MM-DD')

dth.death_date and per.birth_datetime are in date format so don’t need to cast. CAST doesn’t work.
AND (EXTRACT(YEAR FROM CAST(dth.death_date AS DATE)) - EXTRACT(YEAR FROM CAST(per.birth_datetime AS DATE))) > 120
 replace with: and extract(year from dth.death_date) - extract(year from per.birth_datetime) > 120

If you hard code ’12-31-2022’ to extract the year, can you simply subtract from 2022?
WHERE (EXTRACT(YEAR FROM CAST('12-31-2022' AS DATE)) - EXTRACT(YEAR FROM CAST(birth_datetime AS DATE))) > 12
 replace with: where 2022 - extract(year from birth_datetime) > 120
 or if you want to use the date format, then
replace with: where extract(year from to_date('2022-12-31', 'YYYY-MM-DD')) - extract(year from birth_datetime) > 120

Hard-coded 'UNIQUE_TOTAL_PATIENTS' value in Informatics Metric PCORnet MSSQL

Line 45 of the following file contains two hard-coded values for UNIQUE_TOTAL_PATIENTS (10):

https://github.com/ncats/CTSA-Metrics/blob/master/Informatics%20Metric%20PCORnet%20MSSQL

Select 'Diagnosis as ICD/SNOMED' as domain, num.cnt as Patients_with_Standards, 10 as UNIQUE_TOTAL_PATIENTS, 100.0 * num.cnt/10 as Percent_Standards, 'Not Applicable' as Values_Present

should be changed to

Select 'Diagnosis as ICD/SNOMED' as domain, num.cnt as Patients_with_Standards, den.cnt as UNIQUE_TOTAL_PATIENTS, 100.0 * num.cnt/den.cnt as Percent_Standards, 'Not Applicable' as Values_Present

C1: Time constraints

C1: Document header says "at any time" but spreadsheet says "in time frame". Should the search for a lab test be limited to those tests done within one-year/five-year windows or not?

Currently I am only limiting the cohort of patients or encounters by the one-year/five-year windows but I am NOT limiting the search for labs by the same time windows. Should I?

This would change the meaning of C1 to be: Patients seen in one year who have at least one lab test (from any coding system) that occurred in one year. My current logic is patients seen in one year who have at least one lab test (from any coding system) that occurred at any time. (limiting the "one-year" only to the patient cohort, not the lab test).

InformaticsMetricACTSQLServer.sql - Datetime compared to Date strings - decreases results erroneously

START_DATE in OBSERVATION_FACT is datetime (timestamp) and is it compared using the between operator in several places throughout the script to DATE literals. This will not implicitly cast START_DATE to date prior to the comparison. The code as written is missing all encounters beyond midnight on Dec 31 (12/31/2020 00:00:00). You can run the following two statements and see the counts increase by a small amount. It is minimal in our warehouse, but inaccurate nonetheless.

Demo:
SELECT COUNT(DISTINCT OBS.ENCOUNTER_NUM) CNT
FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '01-01-2016' AND '12-31-2020'

Counts increase when explicitly converting datetime to date prior to between:
SELECT COUNT(DISTINCT OBS.ENCOUNTER_NUM) CNT
FROM OBSERVATION_FACT OBS WHERE CONVERT(DATE,OBS.START_DATE) BETWEEN '01-01-2016' AND '12-31-2020'

Below are a few references:

FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '2020/01/01' AND '2020/12/31'

FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '01-01-2016' AND '12-31-2020'

FROM OBSERVATION_FACT OBS WHERE OBS.START_DATE BETWEEN '01-01-2020' AND '12-31-2020'

Simple fix, find-replace OBS.START_DATE BETWEEN WITH CONVERT(DATE,OBS.START_DATE) BETWEEN

D1 inconsistency

Numerator D1:

Informatics Script writing topics document dated 2/10/2021: # patients
Spreadsheet: #encounters

Based on pattern used in E thru J, I'm assuming D1 should be #encounters (spreadsheet).

Datetime Issue in Query for Demo Age/DOB Measure

Since birth_datetime is an optional field in the current OMOP version, I would suggest modifying the this line to have an 'OR' statement that includes the commented lines of code.

So the query would look like this:

		SELECT 'Demo Age/DOB' AS Domain,
		CAST(COUNT(DISTINCT D.person_id) as Float) AS 'Patients with Standards'
		FROM Person D
		-- We may want to alter this to be only Year of birth present at this time Year, Month and Day are required in order to count
		Where D.birth_datetime  is NOT NULL OR (D.Year_of_Birth  is NOT NULL
			and  D.month_of_Birth is NOT NULL
			and  D.Day_of_Birth  is NOT NULL)

ACT/i2b2 scripts are missing ACT in their name

Many apologies Reid.
Would you mind renaming the i2b2/ACT scripts?
CLICMetric2019Oracle.sql -> CLICMetric2019ACTi2b2Oracle.sql
CLICMetric2019MSSql.sql -> CLICMetric2019ACTi2b2MSSql.sql

i2b2/ACT script

  1. It does not identify or count categories of data that have not been incorporated into the ACT related work. This is fully expected, but limits the utility to only describe how a site SHRINE node will respond, and not what is the overall availability of data in i2b2. Some of these are really different domains of data, but some are different structure for the same category of data (ie. our Cancer Registry ontology, which mixes diagnoses, procedures, and other facts or our Microbiology is a more detailed structure for a type of "labs")

  2. It assumes that the site ETL and specifically the concept paths adhere to the ACT Paths. This is expected also, given the ACT context, but there are legitimate reasons that a site would not fit this expectation. Most notably, this approach does not incorporate any expectation that a sight might be using an adapter mapping file for their ACT implementation. In our case, we have a pre-existing i2b2 infrastructure and a view-based model for handling various i2b2 and shrine instances. This allows us to avoid replication of the observation_fact table for each i2b2/SHRINE instance. Our ACT/SHRINE activity includes the use of an adapter mapping file in order to maintain our own internal structures (not ACT specific) and still be able to respond to the SHRINE instances. Thus, our internal paths differ from the paths in the scripts. There are a few other nuanced differences such as out use of multiple ontology tables rather than a single table.

  3. The concept of "Notes" is just a simple fact and not knowing what exactly that means makes it pretty useless.

  4. Also, There were some syntax problems that cause compiler or execution errors with this version: https://github.com/ncats/CTSA-Metrics/blob/master/Informatics%20Metric%20i2b2-ACT%20Oracle.sql

Line 166: Contains a fatal error. Oracle comments begin with --, not //
// Save the summary level of the counts -- assumes there are no categorical overlaps in dimcode defn!
Once you fix that comment, the procedure compiles without error.

But when you try to run it, line 135 contains a run-time error ‘missing right parentheses’. The set of parenthese after sysdate() have to be removed. Sysdate() is a MySQL function, Sysdate is the Oracle version
Bad version in their script is:
select count(distinct patient_num) into total_dob from PATIENT_DIMENSION where birth_date is not null and extract(year from birth_date)>1900 and extract(year from birth_date)<extract(year from sysdate());
The correct version is:
select count(distinct patient_num) into total_dob from PATIENT_DIMENSION where birth_date is not null and extract(year from birth_date)>1900 and extract(year from birth_date)<extract(year from sysdate);

K1: Vital signs

K1: Is there a list of terms for allowable vital signs?
Each CDM will need to convert to their own coding system. But a list such as (pretty standard):

Any one of (by any measurement method or position):

  • BP systolic
  • BP diastolic
  • BP systolic/diastolic (combined)
  • Heart or pulse rate
  • Respiratory rate
  • Temperature

Pushing a bit more, could include

  • Height or length
  • Weight

C1 inconsistency

C1 in Informatics 2/4/2021 document does not restrict lab test to LOINC whereas the C1 label in spreadsheet restricts to LOINC

C2 in Informatics 2/4/2021 document restricts to LOINC whereas there is no C2 in spreadsheet.

Has C1 in document been deleted so that C2 in document is C1 in spreadsheet?

Vitals counts - OMOP

While testing OMOP-Oracle script on our data, I discovered that it doesn't return any counts of Vitals encounters, because we do not populate encounter_id (visit_occurrence_id) for the vitals. (NB: this is a limitation of our source system for pre-Epic data). OMOP lists visit_occurrence_id as optional in the measurement table, so this is allowed by CDM. How do you propose we deal with it for other sites that may have the same situation? Or are we unique?

i2b2 scripts - add Drop table at start of script or rename ctsa_clic_metric

To be consistent with other scripts and not drive people nuts that they already have the table from last year:

in the i2b2 scripts please add

drop table ctsa_clic_metric ;
commit ;

or rename table ctsa_clic_metric throughout script to ctsa_clic_metric_22? This would allow institutions to keep the table year after year if needed. Either option works.

Issue with Column Names for Observations Present - PCORnet Oracle Code

Running the latest PCORnet Oracle code, 44b05d4, provided me with errors. I updated the column names as follows and then it ran successfully:
select 'Observations Present' AS domain, null as Patients_with_Standards, null as UNIQUE_TOTAL_PATIENTS, null as Percent_Standards , Case When Count(*) = 0 then 'No Observation' else 'Observations Present' end as values_present From Vital

Inconsistent date literal on line 78 will cause script maintenance issues in the future

FROM OBSERVATION_FACT OBS WHERE CONVERT(DATE,OBS.START_DATE) BETWEEN CAST('2020/01/01' AS DATE) AND CAST('12-31-2020' AS DATE)

The date literal '2020/01/01' on line 78 is the only time the floor for the one-year census period is written in yyyy/mm/dd format in the entire script. It will make find-replace in future maintenance of the script fail.

Vitals doesn't include many applicable SNOMED concepts

For our site, we use the attached list of concepts/ancestors for all our vitals records. And out of 117 unique ancestors only one (4245997) is included in the NCATS vitals inclusion list (https://github.com/ncats/CTSA-Metrics/blob/master/Informatics%20Metric%20OMOP%20PostgreSQL.sql):

/* 
Spec: At least one vital: height, weight, blood pressure, BMI, or temperature
height
3036277
, weight
3025315
, blood pressure
45876174
, BMI
1002813
4245997
, temperature
1004059
4178505
# concepts exist in both meas and obs
*/

As a result our metric shows as if we have (almost) no vital signs data in our repository. Please advise how we should proceed for this particular item.

VitalsAncestorConceptId_WUSTL.txt

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.