--- title: "PCCC: An Example Using the Center for Disease Control's Multiple Cause of Death Dataset" author: - James Feinstein - Seth Russell - Tell Bennett date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{pccc-example} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>") ``` # Introduction This vignette provides an example using publicly available death certificate data to illustrate how the `pccc` package generates the Complex Chronic Condition (CCC) categories from ICD-9-CM and ICD-10-CM codes. For an overview of the CCC classification system, see [pccc-overview](pccc-overview.html). To evaluate the code chunks in this example you will need to load the following R packages. ```{r, message = FALSE} library(pccc) library(dplyr) ``` # Accessing the Data The Center for Disease Control maintains vital statistics including death certificate data. The publicly available death certificate data, known as the Multiple Cause of Death (MCD) file, contain ICD diagnostic codes specifying the diseases and conditions leading to each decedent's death. In particular, the 1996 MCD data contain both ICD-9-CM and ICD-10 codes, making it an ideal example to demonstrate how the PCCC software categorizes ICD codes. Please note that because of the way ICD-9-CM codes are mapped to ICD-10-CM codes (https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html), the calculated frequencies of CCCs may differ between corresponding ICD-9-CM and ICD-10-CM diagnosis codes for a decedent. The data documentation and instructions for direct download are available at: ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/comparability/icd9_icd10/ICD9_ICD10_comparability_file_documentation.pdf # Preparing the Data For this illustrative example, we have provided just 2 columns of the data for decedents <=21 years old: the ICD-9-CM underlying cause of death diagnosis code and the ICD-10-CM underlying cause of death diagnosis code. If you wish to recreate the data yourself from the direct download site, you will need to utilize column positions 142-145 (ICD-9-CM) and 444-447 (ICD-10) and restrict the data to records with age <=21 years (column positions 64 - 66). Here's a sample of how the file could be read and processed: ```{r, eval = FALSE} # download and unzip file from ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/comparability/icd9_icd10/ICD9_ICD10_comparability_public_use_ASCII.ZIP # columns of interest # start end width description # 64 - 64 1 Age Code # 65 - 66 2 Age Value # Code Value Description # 0 01-99 Years less than 100 # 1 00-99 Years 100 or more # 2 01-11,99 Months # 3 01-03,99 Weeks # 4 01-27,99 Days # 5 01-23, 99 Hours # 6 01-59, 99 Minutes # 9 99 Age not stated # 142 - 145 4 ICD Code 9th Revision (Underlying Cause of Death) # 444 - 447 4 ICD-10 Underlying Cause Code library(readr) mcod <- readr::read_fwf("ICD9_ICD10_comparability_public_use_ASCII.dat", readr::fwf_positions( start = c(64, 65, 142, 444), end = c(64, 66, 145, 447), col_names = c('age_code', 'age', 'icd9', 'icd10')), col_types = 'iicc') mcod <- mcod[ (mcod$age_code == 0 & mcod$age <= 21) | (mcod$age_code %in% c(2, 3, 4, 5, 6)) , ] mcod <- dplyr::mutate(mcod, id = seq_along(age)) mcod <- mcod[c("id", "icd9", "icd10")] ``` # Running the PCCC Software Within the example data, there are 2 string variables for ICD-9-CM and ICD-10 codes. If you inspect the first 10 rows of the codes, you will notice they conform to the formatting guidelines outlined in the PCCC overview file [pccc-overview](pccc-overview.html). ```{r} # Show data head(pccc::comparability, 10) ``` To run the PCCC classification on the ICD-9-CM codes: ```{r} # Run PCCC on ICD-9-CM codes ccc_result_icd9 <- ccc(pccc::comparability, # get id, dx, and pc columns id = id, dx_cols = icd9, pc_cols = , icdv = 09) # review results head(ccc_result_icd9) # view number of patients with each CCC sum_results <- dplyr::summarize_at(ccc_result_icd9, vars(-id), sum) %>% print.data.frame # view percent of total population with each CCC dplyr::summarize_at(ccc_result_icd9, vars(-id), mean) %>% print.data.frame ``` To run the PCCC classification on the ICD-10-CM codes: ```{r} # Run PCCC on ICD-10-CM codes ccc_result_icd10 <- ccc(pccc::comparability, # get id, dx, and pc columns id = id, dx_cols = icd10, pc_cols = , icdv = 10) # review results head(ccc_result_icd10) # view number of patients with each CCC sum_results <- dplyr::summarize_at(ccc_result_icd10, vars(-id), sum) %>% print.data.frame # view percent of total population with each CCC dplyr::summarize_at(ccc_result_icd10, vars(-id), mean) %>% print.data.frame ``` # References * Feudtner C, et al. Pediatric complex chronic conditions classification system version 2: updated for ICD-10 and complex medical technology dependence and transplantation, BMC Pediatrics, 2014, 14:199, DOI: 10.1186/1471-2431-14-199 * Feudtner C, et al. Pediatric deaths attributable to complex chronic conditions: a population-based study of Washington State, 1980-1997. Pediatrics. 2000;106(1 Pt 2):205-209.