
codeditr's People

Contributors

ernestguevarra

codeditr's Issues

Typing mistakes or incomplete code

ICD10:

  • A minimum of 4 characters is considered the gold standard
  • The number of 3-character codes is reported (but these are not considered incorrect)
  • The number of 4-character codes is reported
  • 2-character codes are counted and are considered incorrect

First character	        Chapter
A & B	        Infectious and Parasitic Diseases
C	        Neoplasms
D	        Neoplasms, Blood, Blood-forming Organs
E	        Endocrine, Nutritional, Metabolic
F	        Mental and Behavioral Disorders
G	        Nervous System
H	        Eye and Adnexa, Ear and Mastoid Process
I	        Circulatory System
J	        Respiratory System
K	        Digestive System
L	        Skin and Subcutaneous Tissue
M	        Musculoskeletal and Connective Tissue
N	        Genitourinary System
O	        Pregnancy, Childbirth and the Puerperium
P	        Certain Conditions Originating in the Perinatal Period
Q	        Congenital Malformations, Deformations and Chromosomal Abnormalities
R	        Symptoms, Signs and Abnormal Clinical and Lab Findings
S	        Injury, Poisoning, Certain Other Consequences of External Causes
T	        Injury, Poisoning, Certain Other Consequences of External Causes
U	        no codes listed, will be used for emergency code additions
V, W, X, Y	External Causes of Morbidity (homecare will only have to code how patient was hurt; other settings will also code where injury occurred, what activity patient was doing)
Z	        Factors Influencing Health Status and Contact with Health Services (similar to current "V-codes")
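
The length rules listed at the top of this issue could be sketched in base R roughly as follows. The function name `classify_code_length()` and its labels are hypothetical. The sketch assumes that a dot is not counted as a character and that anything shorter than 3 characters is treated like a 2-character code.

```r
# Hypothetical helper: classify ICD-10 codes by their number of characters.
# Assumption: the dot is ignored, so "J18.9" counts as 4 characters.
classify_code_length <- function(code) {
  n <- nchar(gsub(".", "", trimws(code), fixed = TRUE))
  out <- rep("4 or more characters", length(code))
  out[which(n == 3)] <- "3 characters"
  out[which(n < 3)] <- "fewer than 3 characters (incorrect)"
  out
}

codes <- c("J1", "J18", "J18.9", "E149")

# Tabulate how many codes fall into each class
table(classify_code_length(codes))
```

Returning a label per code (rather than only the counts) keeps the function vectorised, so the counts can be derived with `table()` when needed.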

Retrieve database from `.accdb` files provided by WHO

CoDEdit v2.0 uses Microsoft Access (latest version).

The database can be accessed programmatically through R, but only on a Windows computer (because the Access DB drivers can only be installed on a Windows machine).

The plan is to retrieve the data on a Windows machine, save them as CSV, and add them to this package as reference data. If the DB turns out to be too big (it doesn't seem to be), we can put it on Dolthub and maintain it as needed, especially if WHO comes up with updates.
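
One possible way to do the retrieval step, as a sketch only: it assumes the `DBI` and `odbc` packages and the Microsoft Access ODBC driver are installed on the Windows machine. The function name `export_accdb_to_csv()` and the output folder are hypothetical, and the table names inside the WHO file are not known here, so the sketch simply exports every non-system table.

```r
# Hypothetical helper (Windows only): dump every table of an .accdb file to CSV.
# Assumes the DBI and odbc packages and the Access ODBC driver are installed.
export_accdb_to_csv <- function(accdb_path, out_dir) {
  con <- DBI::dbConnect(
    odbc::odbc(),
    .connection_string = paste0(
      "Driver={Microsoft Access Driver (*.mdb, *.accdb)};",
      "Dbq=", normalizePath(accdb_path), ";"
    )
  )
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Skip Access system tables, whose names start with "MSys"
  tables <- grep("^MSys", DBI::dbListTables(con), invert = TRUE, value = TRUE)

  for (tbl in tables) {
    write.csv(
      DBI::dbReadTable(con, tbl),
      file.path(out_dir, paste0(tbl, ".csv")),
      row.names = FALSE
    )
  }
  invisible(tables)
}
```

Because this would only be run once to produce the CSV files, the extra packages would not need to become dependencies of the package itself.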

create vignette on data quality checks performed by the package

@AnitaMakori, in this issue for the codeditr package I will be writing a vignette that describes the different quality checks that the package performs and what these types of errors mean or signify.

I think this information will be useful for your methods section. I aim to get this completed before next Monday. Once done, I will tag you; feel free to draw liberally from this vignette for your thesis.

Notifiable diseases

Not sure that we really need this from a quality-check perspective, but if our goal is high fidelity in porting the CoDEdit software, we should take this into account.

list out rules/checks performed by CoDEdit tool and convert to R functions

The following are the types of checks performed by CoDEdit, based on this document.

For causes that are specific to one sex, the tool flags an error when the combination of sex and cause is wrong. For example, a female death from prostate cancer is an error.
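
A minimal base R sketch of this check, under stated assumptions: the function name `check_sex_cause()` is hypothetical, sex is assumed to be recorded as "male"/"female", and the sex-specific patterns are a small illustrative subset (C61, prostate cancer; C53, cervical cancer; chapter O, pregnancy and childbirth), not CoDEdit's actual reference list.

```r
# Hypothetical check: TRUE where the sex/cause combination is inconsistent.
# The patterns below are an illustrative subset only.
check_sex_cause <- function(code, sex) {
  code <- toupper(trimws(code))
  sex <- tolower(trimws(sex))

  male_only <- grepl("^C61", code)        # malignant neoplasm of prostate
  female_only <- grepl("^(C53|O)", code)  # cervix uteri; pregnancy chapter

  (male_only & sex == "female") | (female_only & sex == "male")
}

check_sex_cause(
  code = c("C61", "C61", "O72.1", "J18.9"),
  sex = c("female", "male", "male", "female")
)
```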

For causes that are specific to certain ages, the tool flags an error when the combination of age and cause is doubtful, for example a maternal death in a female aged 5 years or a death from senility at age 15 years.
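
The age check could be expressed as a small rule table plus a vectorised function, as sketched below. The names `age_rules` and `check_age_cause()` are hypothetical, and the age limits (10 to 60 years for maternal causes, 60 years and over for senility, R54) are placeholders chosen for illustration; the limits CoDEdit actually uses are not stated in this issue.

```r
# Illustrative rules only; the age limits are assumptions, not CoDEdit's.
age_rules <- data.frame(
  pattern = c("^O", "^R54"),  # maternal causes; senility
  min_age = c(10, 60),
  max_age = c(60, Inf)
)

# Hypothetical check: TRUE where age (in years) is outside the plausible range
check_age_cause <- function(code, age_years, rules = age_rules) {
  code <- toupper(trimws(code))
  flagged <- rep(FALSE, length(code))
  for (i in seq_len(nrow(rules))) {
    hit <- grepl(rules$pattern[i], code)
    outside <- age_years < rules$min_age[i] | age_years > rules$max_age[i]
    flagged <- flagged | (hit & outside)
  }
  flagged
}

check_age_cause(c("O72.1", "R54", "O72.1", "J18.9"), c(5, 15, 30, 2))
```

Keeping the limits in a data frame means they can be replaced by the real reference values without touching the function.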

Diseases that are usually notifiable in countries, such as yellow fever, cholera and plague, are flagged if any deaths are attributed to them. Any death from smallpox is also flagged, as the disease is now considered eradicated.
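
A sketch of how such a flag might look in base R. The lookup `notifiable` and the function `flag_notifiable()` are hypothetical names, and the lookup covers only the four diseases named above, matched on their 3-character ICD-10 categories (A00, A20, A95, B03).

```r
# Illustrative lookup: 3-character category -> disease named in the text above
notifiable <- c(
  A00 = "cholera",
  A20 = "plague",
  A95 = "yellow fever",
  B03 = "smallpox (eradicated)"
)

# Hypothetical check: returns the disease name, or NA if the code is not flagged
flag_notifiable <- function(code) {
  category <- substr(toupper(trimws(code)), 1, 3)
  unname(notifiable[category])
}

flag_notifiable(c("A00.9", "J18.9", "B03"))
```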

If a code is typed as “J18A”, for example, it is flagged as an error because the 4th character cannot be the letter “A”. If coding is generally done at the 4-character level of the ICD-10, a code such as “E10” is flagged because it is missing a 4th character: it should be E100, E101, … or E109.
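
The structure rule described here can be approximated with two regular expressions. This is a sketch, not the CoDEdit implementation: the function name `check_icd10_structure()`, its `four_character` argument, and the status labels are hypothetical, and the pattern assumes a code is one letter followed by two digits and an optional fourth digit, with any dot ignored.

```r
# Hypothetical check: classify the structure of ICD-10 codes.
# four_character = TRUE means coding is expected at the 4-character level.
check_icd10_structure <- function(code, four_character = TRUE) {
  x <- toupper(gsub(".", "", trimws(code), fixed = TRUE))
  status <- rep("invalid structure", length(x))

  three <- grepl("^[A-Z][0-9]{2}$", x, perl = TRUE)
  four <- grepl("^[A-Z][0-9]{3}$", x, perl = TRUE)

  status[three] <- if (four_character) "missing 4th character" else "ok"
  status[four] <- "ok"
  status
}

check_icd10_structure(c("J18A", "E10", "E109", "J18.9"))
```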

A common mistake is the use of “asterisk” codes. Under the ICD-10 rules, “asterisk” codes should not be used for the underlying cause of death; the “dagger” codes should be used instead. In addition to the “asterisk” codes, ICD-10 Volume 2, section 4.1.12 lists all the codes that should not be used for the underlying cause of death. If any of those codes are used, they are flagged as errors. The CoDEdit tool flags each of those errors and provides an alternative code for correcting it. Note that the alternative code provided by CoDEdit is only a suggestion; to accept it, click on it. In principle, data producers should review the death certificate and, after discussion with the certifier or the relevant responsible officer, may decide to use another valid ICD-10 code.
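
In R, this check could be a lookup against a reference table of codes that are not valid for the underlying cause of death, returning a suggested alternative where one exists. The sketch below uses hypothetical names (`not_for_ucod`, `suggest_replacement()`), and its two rows are placeholders for illustration only (G01 and G63.2 are asterisk codes; the E14.4 suggestion is an assumption), not entries copied from CoDEdit's own table.

```r
# Placeholder lookup; NOT the CoDEdit reference table.
# Codes are stored without the dot.
not_for_ucod <- data.frame(
  code = c("G01", "G632"),
  suggestion = c(NA, "E144")
)

# Hypothetical check: flag codes found in the lookup and return a suggestion
suggest_replacement <- function(code, lookup = not_for_ucod) {
  x <- toupper(gsub(".", "", trimws(code), fixed = TRUE))
  i <- match(x, lookup$code)
  data.frame(
    code = code,
    flagged = !is.na(i),
    suggestion = lookup$suggestion[i]
  )
}

suggest_replacement(c("G63.2", "J18.9", "G01"))
```

As in CoDEdit, the `suggestion` column would only be a proposal; a flagged code with no suggestion still needs a review of the death certificate.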

The ICD-10 is regularly updated following proposals from users and advances in knowledge. The updates are made available on the WHO website, indicating the year in which they become operational (https://www.who.int/standards/classifications/classification-of-diseases/list-of-official-icd-10-updates). In practice, implementing such changes requires considerable investment at the country level, so many countries do not implement the updates. Among the countries that do, some implement them correctly and others do not. This results in non-comparability of data across countries and over time. CoDEdit takes this situation into account: a box allows data producers to indicate whether they are using ICD-10 updates. If they are, their data are verified to ensure that the updates are implemented correctly. If they are not, their data are checked against the initial (first edition) ICD-10 codes.

These are:

Code	Category or subcategory
I46.1	Sudden cardiac death
I46.9	Cardiac arrest, unspecified
(I50.-)	Acute heart failure in I50.-
I95.9	Hypotension, unspecified
I99	Circulatory disease, unspecified
J96.0	Acute respiratory failure
J96.9	Respiratory failure, unspecified
P28.5	Respiratory failure of newborn
R00-R57.1, R57.8-R64, R65.2-R65.3, R68.0-R94, R96-R99	Symptoms, signs and abnormal laboratory findings
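
Because the table above mixes single codes with code ranges, a membership check needs to handle both. One way to do this, sketched in base R with hypothetical names (`code_key()`, `listed_ranges`, `in_listed_ranges()`), is to convert each code to an integer key and compare against the range bounds. The sketch assumes that a 3-character bound such as R64 covers all of its 4-character subcategories, and that a 3-character input only counts as listed when the whole category falls inside a range.

```r
# Convert codes to integer keys: letter position * 1000 + three digits.
# Short codes are padded to 4 characters with `pad` ("0" or "9").
code_key <- function(x, pad) {
  x <- toupper(gsub(".", "", trimws(x), fixed = TRUE))
  x <- paste0(x, strrep(pad, pmax(0L, 4L - nchar(x))))
  key <- rep(NA_integer_, length(x))
  ok <- grepl("^[A-Z][0-9]{3}$", x, perl = TRUE)
  key[ok] <- match(substr(x[ok], 1, 1), LETTERS) * 1000L +
    as.integer(substr(x[ok], 2, 4))
  key
}

# The codes and ranges from the table above (single codes have lower == upper)
listed_ranges <- data.frame(
  lower = c("I46.1", "I46.9", "I50", "I95.9", "I99", "J96.0", "J96.9",
            "P28.5", "R00", "R57.8", "R65.2", "R68.0", "R96"),
  upper = c("I46.1", "I46.9", "I50", "I95.9", "I99", "J96.0", "J96.9",
            "P28.5", "R57.1", "R64", "R65.3", "R94", "R99")
)

# Hypothetical check: TRUE where a code falls entirely within a listed range
in_listed_ranges <- function(code, ranges = listed_ranges) {
  lo <- code_key(code, "0")
  hi <- code_key(code, "9")
  r_lo <- code_key(ranges$lower, "0")
  r_hi <- code_key(ranges$upper, "9")
  vapply(
    seq_along(code),
    function(i) isTRUE(any(lo[i] >= r_lo & hi[i] <= r_hi)),
    logical(1)
  )
}

in_listed_ranges(c("I46.1", "R57.1", "R57.2", "R99", "I50.0", "J18.9"))
```

Integer keys are used instead of comparing strings directly because string comparison in R depends on the locale's collation order.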

check code structure

for ICD10 see https://www.healthnetworksolutions.net/index.php/understanding-the-icd-10-code-structure

for ICD11 see https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-021-01539-1

1.2.4.1 Code structure
The codes of the ICD–11 are alphanumeric and cover the range from 1A00.00 to ZZ9Z.ZZ. These are referred to as stem codes. The structure of stem codes is described below:

  • The first character of the code always relates to the chapter number. It may be a number or a letter.
  • Codes starting with ‘X’ indicate an extension code (see [2.9](https://icdcdn.who.int/icd11referenceguide/en/html/index.html#extension-codes)).
  • There is always a letter in the second position to differentiate ICD-11 codes from the codes in ICD–10.
  • The inclusion of a forced number at the third character position prevents spelling ‘undesirable words’.
  • The letters ‘O’ and ‘I’ are omitted to prevent confusion with the numbers ‘0’ and ‘1’.

For example, [1A00](https://icd.who.int/browse11/l-m/en/#/http%3A%2F%2Fid.who.int%2Ficd%2Fentity%2F257068234) is a code in Chapter 01, and [BA00](https://icd.who.int/browse11/l-m/en/#/http%3A%2F%2Fid.who.int%2Ficd%2Fentity%2F761947693) is a code in Chapter 11.

For example: ED1E.EE

  • E corresponds to a ‘base 34 number’ (0-9 and A-Z; excluding O, I);
  • D corresponds to a ‘base 24 number’ (A-Z; excluding O, I); and
  • 1 corresponds to the ‘base 10 integers’ (0-9).
  • The first E starts with ‘1’ and is allocated for the chapter (i.e. 1 is for the first chapter, 2: chapter 02, … A: chapter 10, etc.).

The terminal letter Y is reserved for the residual category ‘other specified’ and the terminal letter ‘Z’ is reserved for the residual category ‘unspecified’. For the chapters that have more than 240 blocks, ‘F’ (‘other specified’) and ‘G’ (‘unspecified’) are also used to indicate residual categories (due to limitations in the coding space).
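
The ED1E.EE structure quoted above translates fairly directly into a regular expression. The following is a sketch under stated assumptions: the function name `icd11_code_type()` is hypothetical, the part after the dot is assumed to be optional and 1 or 2 characters long, and extension codes are recognised only by their leading ‘X’ without validating the rest of the code.

```r
# Hypothetical check: classify ICD-11 codes as "stem", "extension" or "invalid"
icd11_code_type <- function(code) {
  x <- toupper(trimws(code))

  b34 <- "[0-9A-HJ-NP-Z]"  # base 34: digits and letters, excluding I and O
  b24 <- "[A-HJ-NP-Z]"     # base 24: letters, excluding I and O

  # Chapter character (starting at 1), letter, digit, base-34 character,
  # then an optional dot followed by one or two base-34 characters
  stem <- paste0("^[1-9A-HJ-NP-Z]", b24, "[0-9]", b34, "(\\.", b34, "{1,2})?$")

  out <- rep("invalid", length(x))
  out[grepl(stem, x, perl = TRUE)] <- "stem"
  out[which(startsWith(x, "X"))] <- "extension"  # prefix check only
  out
}

icd11_code_type(c("1A00", "BA00", "ED1E.EE", "J18.9", "1O00", "XK9K"))
```

The ICD-10 code "J18.9" is rejected because its second character is a digit, which is the differentiation rule described above.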

design principles

Design principles are:

  • Use base R functions as much as possible, with as few nice-to-have dependencies as possible;
  • Use the base pipe operator (|>), which requires depending on R >= 4.1.0, the version that introduced it;
  • Think through genuinely useful helper/utility functions that increase efficiency;
  • Ensure that functions are vectorised;
  • Ensure that functions are pipe-friendly.
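
As an illustration of the last two principles, a check can be split into a vectorised function that works on a character vector and a thin wrapper that takes a data frame as its first argument and returns a data frame, so that it chains with `|>`. The names `is_short_code()`, `add_short_flag()` and the `deaths` data are hypothetical.

```r
# Vectorised: takes a character vector of codes, returns a logical vector.
# Hypothetical check: TRUE for codes with fewer than 4 characters (dot ignored).
is_short_code <- function(code) {
  nchar(gsub(".", "", code, fixed = TRUE)) < 4
}

# Pipe-friendly: data frame in the first position, data frame out
add_short_flag <- function(df, code_col = "code") {
  df$short_code <- is_short_code(df[[code_col]])
  df
}

deaths <- data.frame(id = 1:3, code = c("J18", "J18.9", "E109"))

# Requires R >= 4.1.0 for the native pipe
flagged <- deaths |>
  add_short_flag() |>
  subset(short_code)

flagged
```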
