Overview of the phecodeX data structure and its differences from phecode version 1.2 (v1.2). Panel (A) shows that phecode v1.2 was designed around ICD-9 codes; ICD-10 codes were integrated later using the ICD crosswalk. PhecodeX instead adapted the overall structure of ICD-10, and ICD-9 and ICD-10 codes were mapped simultaneously to optimize the phecode structure for both ICD versions. Panel (B) gives an example of the expanded phecode tree structure. The new system adds a tertiary level, allowing up to three digits after the decimal point for increased phecode specificity. This example also shows the differences in phecode labels and strings, including the introduction of a two-character category descriptor (e.g., CV for the Cardiovascular category) and the use of * after the code description to denote phecodes that map only to ICD-10 billing codes. Panel (C) compares the size of phecodeX with that of v1.2, stratified by phecode category. Finally, panel (D) shows how new phecodes introduced in phecodeX improve coverage of phenotypes relevant to both complex and Mendelian disease. In this example, two phecodes in v1.2 are expanded into five phecodes in phecodeX. The column labeled Mendelian gives the number of Mendelian disease genes linked to the phenotypes through the Human Phenotype Ontology. The column labeled GWAS gives the number of unique genetic variants present in the GWAS Catalog. This figure was created with BioRender.com.
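The string layout described for panel (B) can be illustrated with a small parser. This is only a sketch under stated assumptions, not the authors' implementation. The caption does not specify the separator between the category descriptor and the numeric code, so the underscore is an assumption. The rule that a parent is obtained by dropping the last decimal digit is also an assumption, and the codes and labels used with it are made-up placeholders rather than real phecodeX entries. The names `parse_phecode` and `parent_code` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: two uppercase letters, an underscore (assumption), an
# integer root, and an optional decimal part of one to three digits.
PHECODE_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d{1,3}))?$")


class Phecode(NamedTuple):
    category: str      # two-character category descriptor, e.g. "CV"
    root: str          # digits before the decimal point
    decimals: str      # zero to three digits after the decimal point
    description: str   # label with any trailing "*" removed
    icd10_only: bool   # True when the label ended with "*"


def parse_phecode(code: str, description: str = "") -> Phecode:
    """Split a phecodeX-style string and its label into components."""
    match = PHECODE_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not a recognized phecode string: {code!r}")
    category, root, decimals = match.groups()
    text = description.strip()
    # The caption states that "*" after the description marks phecodes
    # that map only to ICD-10 billing codes.
    icd10_only = text.endswith("*")
    return Phecode(category, root, decimals or "",
                   text.rstrip("*").strip(), icd10_only)


def parent_code(code: str) -> Optional[str]:
    """Return the parent string, assuming each decimal digit adds one level."""
    parsed = parse_phecode(code)
    if not parsed.decimals:
        return None  # a root code has no parent in this sketch
    shorter = parsed.decimals[:-1]
    suffix = f".{shorter}" if shorter else ""
    return f"{parsed.category}_{parsed.root}{suffix}"
```

Under these assumptions, `parent_code` walks a hypothetical tertiary-level string such as `CV_401.123` up one level at a time until it reaches the root, and `parse_phecode` separates the ICD-10-only marker from the label text.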