Reproducing sirrice/icd9

icdcodex reproduces the functionality of sirrice/icd9, a similar library that is somewhat dated and does not support ICD-10.

import networkx as nx
from icdcodex import hierarchy
G, codes = hierarchy.icd9()

A simple demonstration

From the sirrice/icd9 README:

The library encodes ICD9 codes in their natural hierarchy. For example, “Cholera due to vibrio cholerae” has the ICD9 code 001.0, and is categorized as a type of Cholera, which in turn is a type of Intestinal Infectious Disease.

We can recover this hierarchy with the networkx shortest_path function. Codes are stored in the graph without the dot, so 001.0 becomes 0010:

cholerae_icd_code = "001.0".replace(".", "")
root_node, *natural_hierarchy = nx.shortest_path(G, source="root", target=cholerae_icd_code)
natural_hierarchy
['Infectious And Parasitic Diseases',
 'Intestinal Infectious Diseases',
 'Cholera',
 '0010']
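The lookup above can be wrapped in a small helper. What follows is a minimal sketch, not part of icdcodex: `hierarchy_of` is a hypothetical name, and the toy graph only mimics the cholera branch shown in the output above so that the snippet runs without the real ICD-9 graph. It assumes a directed graph whose edges point from parent to child.

```python
import networkx as nx

# Toy stand-in for the graph returned by hierarchy.icd9() (assumption:
# a directed graph whose edges point from parent to child).
toy = nx.DiGraph()
nx.add_path(
    toy,
    [
        "root",
        "Infectious And Parasitic Diseases",
        "Intestinal Infectious Diseases",
        "Cholera",
        "0010",
    ],
)
toy.add_edge("Cholera", "0011")


def hierarchy_of(graph, code, root="root"):
    """Return the path from just below `root` down to `code`.

    Hypothetical helper: dots are stripped because the code nodes
    are stored without them (e.g. "001.0" -> "0010").
    """
    node = code.replace(".", "")
    _, *path = nx.shortest_path(graph, source=root, target=node)
    return path


cholera_path = hierarchy_of(toy, "001.0")
print(cholera_path)
```

With the real graph, the same helper would be called as `hierarchy_of(G, "001.0")`.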

Using the library

Find top-level codes

To find the top-level codes, we can do a breadth-first traversal from the root, limited to a depth of one. Note that the result includes the root node itself.

from networkx.algorithms.traversal.breadth_first_search import bfs_tree
top_level_nodes = bfs_tree(G, source="root", depth_limit=1)
top_level_nodes.nodes()
NodeView(('root', 'Infectious And Parasitic Diseases', 'Neoplasms', 'Endocrine, Nutritional And Metabolic Diseases, And Immunity Disorders', 'Diseases Of The Blood And Blood-Forming Organs', 'Mental Disorders', 'Diseases Of The Nervous System And Sense Organs', 'Diseases Of The Circulatory System', 'Diseases Of The Respiratory System', 'Diseases Of The Digestive System', 'Diseases Of The Genitourinary System', 'Complications Of Pregnancy, Childbirth, And The Puerperium', 'Diseases Of The Skin And Subcutaneous Tissue', 'Diseases Of The Musculoskeletal System And Connective Tissue', 'Congenital Anomalies', 'Certain Conditions Originating In The Perinatal Period', 'Symptoms, Signs, And Ill-Defined Conditions', 'Injury And Poisoning', 'Supplementary Classification Of External Causes Of Injury And Poisoning', 'Supplementary Classification Of Factors Influencing Health Status And Contact With Health Services'))

The nodes beneath any other node are obtained the same way, by starting the traversal at that node and omitting depth_limit:

intestinal_infectious_disease_nodes = bfs_tree(G, source="Intestinal Infectious Diseases").nodes()
intestinal_infectious_disease_nodes
NodeView(('Intestinal Infectious Diseases', 'Cholera', 'Typhoid and paratyphoid fevers', 'Other salmonella infections', 'Shigellosis', 'Other food poisoning (bacterial)', 'Amebiasis', 'Other protozoal intestinal diseases', 'Intestinal infections due to other organisms', 'Ill-defined intestinal infections', '0010', '0011', '0019', '0020', '0021', '0022', '0023', '0029', '0030', '0031', '00320', '00321', '00322', '00323', '00324', '00329', '0038', '0039', '0040', '0041', '0042', '0043', '0048', '0049', '0050', '0051', '0052', '0053', '0054', '00581', '00589', '0059', '0060', '0061', '0062', '0063', '0064', '0065', '0066', '0068', '0069', '0070', '0071', '0072', '0073', '0074', '0075', '0078', '0079', '00800', '00801', '00802', '00803', '00804', '00809', '0081', '0082', '0083', '00841', '00842', '00843', '00844', '00845', '00846', '00847', '00849', '0085', '00861', '00862', '00863', '00864', '00865', '00866', '00867', '00869', '0088', '0090', '0091', '0092', '0093'))

Find all nodes by a search criterion

[n for n in G.nodes() if n.startswith("001")]
['0010', '0011', '0019']

Find all codes (i.e., leaf nodes) by a search criterion

cholerae_nodes = bfs_tree(G, source="Cholera").nodes()
[n for n in cholerae_nodes if G.degree[n] == 1]
['0010', '0011', '0019']
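The degree check above works because a leaf in this tree has exactly one edge, the one coming from its parent. Assuming the graph is directed with edges pointing from parent to child, a leaf can also be identified as a node with no outgoing edges, which states the intent more directly. The sketch below uses a toy graph and a hypothetical helper name, `leaf_codes`; neither comes from icdcodex.

```python
import networkx as nx

# Toy stand-in for the cholera branch (assumption: parent -> child edges).
toy = nx.DiGraph()
toy.add_edges_from(
    [
        ("root", "Cholera"),
        ("Cholera", "0010"),
        ("Cholera", "0011"),
        ("Cholera", "0019"),
    ]
)


def leaf_codes(graph, source):
    """Hypothetical helper: descendants of `source` that have no children."""
    return sorted(
        n for n in nx.descendants(graph, source) if graph.out_degree(n) == 0
    )


cholera_leaves = leaf_codes(toy, "Cholera")
print(cholera_leaves)
```

Unlike the total-degree test, `out_degree(n) == 0` does not misclassify a root that happens to have a single child.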

Get the description of a code

G.nodes()["0010"]
{'description': 'Cholera due to vibrio cholerae'}
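The description attribute can also drive a search. The following is a minimal sketch under stated assumptions: `search_descriptions` is a hypothetical helper, the toy nodes and their descriptions are illustrative, and category nodes are assumed to possibly lack a description, hence the `.get` with a default.

```python
import networkx as nx

# Toy stand-in for a few nodes of the ICD-9 graph (illustrative data).
toy = nx.DiGraph()
toy.add_node("Cholera")  # category node, assumed to carry no description
toy.add_node("0010", description="Cholera due to vibrio cholerae")
toy.add_node("0011", description="Cholera due to vibrio cholerae el tor")
toy.add_node("0020", description="Typhoid fever")


def search_descriptions(graph, term):
    """Hypothetical helper: nodes whose description contains `term`.

    The match is case-insensitive; nodes without a description are skipped.
    """
    term = term.lower()
    return [
        node
        for node, data in graph.nodes(data=True)
        if term in data.get("description", "").lower()
    ]


matches = search_descriptions(toy, "cholera")
print(matches)
```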

Get a node's parent and siblings

parent, = G.predecessors("0010")
print(f"parent: {parent}, siblings: {G[parent]}")
parent: Cholera, siblings: {'0010': {}, '0011': {}, '0019': {}}
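Note that `G[parent]` lists every child of the parent, so the node itself appears among its "siblings". If that is not wanted, the node can be filtered out. The sketch below uses a toy graph and a hypothetical helper, `parent_and_siblings`, and assumes each node has exactly one parent.

```python
import networkx as nx

# Toy stand-in for the cholera branch (assumption: parent -> child edges).
toy = nx.DiGraph()
toy.add_edges_from(
    [("Cholera", "0010"), ("Cholera", "0011"), ("Cholera", "0019")]
)


def parent_and_siblings(graph, node):
    """Hypothetical helper: return (parent, siblings excluding `node`)."""
    # Unpacking raises ValueError unless the node has exactly one parent.
    (parent,) = graph.predecessors(node)
    siblings = [n for n in graph.successors(parent) if n != node]
    return parent, siblings


parent, siblings = parent_and_siblings(toy, "0010")
print(f"parent: {parent}, siblings: {siblings}")
```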