icdcodex package
Subpackages
Submodules
icdcodex.datacleaning module
Preprocess the ICD-10 hierarchy into a graph structure that node2vec can use.
icdcodex.datacleaning.build_icd10_hierarchy(xml_root: untangle.Element, codes: List[str], root_name: Optional[str] = None, prune_extra_codes: bool = True)
Build the ICD-10 hierarchy.
Some codes are marked as invalid only in plain text, so they are pruned by comparing the hierarchy against a specified set of codes.
- Parameters
xml_root (untangle.Element) – root element of the code table XML
codes (List[str]) – list of ICD codes
root_name (str, optional) – arbitrary name for the root of the hierarchy. Defaults to “root”.
prune_extra_codes (bool) – If True, remove any leaf node not specified in codes
- Returns
icd10 hierarchy and ICD-10-CM codes
- Return type
Tuple[nx.Graph, List[str]]
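The pruning step controlled by prune_extra_codes can be pictured with a small, standard-library-only sketch. It is not the library's implementation: the helper prune_extra_leaves, the dict-of-children tree and the label "A00.x" are hypothetical, and the sketch makes a single pass over the leaves.

```python
from typing import Dict, List, Set


def prune_extra_leaves(
    children: Dict[str, List[str]], valid_codes: Set[str]
) -> Dict[str, List[str]]:
    """Return a copy of the tree without leaves that are absent from valid_codes."""
    # A leaf is a node with no children.
    leaves = {node for node, kids in children.items() if not kids}
    drop = leaves - valid_codes
    return {
        node: [kid for kid in kids if kid not in drop]
        for node, kids in children.items()
        if node not in drop
    }


# Toy hierarchy: "A00.x" stands in for a code that appears in the XML
# but is not in the list of valid codes.
toy = {
    "root": ["A00"],
    "A00": ["A00.0", "A00.1", "A00.x"],
    "A00.0": [],
    "A00.1": [],
    "A00.x": [],
}
pruned = prune_extra_leaves(toy, {"A00.0", "A00.1"})
```

Interior nodes such as "A00" are kept even though they are not in the code list, because only leaves are compared against it.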
icdcodex.datacleaning.build_icd10_hierarchy_from_url(code_desc_url, code_table_url, root_name: Optional[str] = None, return_intermediates=False)
Build the ICD-10 hierarchy by downloading from cms.gov.
- Parameters
code_desc_url (str) – url to the “Code Descriptions in Tabular Order (ZIP)” file
code_table_url (str) – url to the “Code Tables and Index (ZIP)” file
root_name (str, optional) – arbitrary name for the root of the hierarchy. Defaults to “root”.
return_intermediates (bool) – If True, return the untangle element and codes. Defaults to False.
- Returns
icd10 hierarchy and ICD-10-CM codes
- Return type
Tuple[nx.Graph, List[str]]
icdcodex.datacleaning.build_icd10cm_hierarchy_from_zip(code_desc_zip_fp, code_table_zip_fp, root_name: Optional[str] = None, return_intermediates=False)
Build the ICD-10 hierarchy from zip files downloaded from cms.gov.
- Parameters
code_desc_zip_fp (Pathlike) – file path to the “Code Descriptions in Tabular Order (ZIP)” file
code_table_zip_fp (Pathlike) – file path to the “Code Tables and Index (ZIP)” file
root_name (str, optional) – arbitrary name for the root of the hierarchy. Defaults to “root”.
return_intermediates (bool) – If True, return the untangle element and codes. Defaults to False.
- Returns
icd10 hierarchy and ICD-10-CM codes
- Return type
Tuple[nx.Graph, List[str]]
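As a rough illustration of what reading a code list out of a zip archive involves, the sketch below uses only the standard library. The helper read_codes_from_zip, the member name "codes.txt" and the assumed line layout (code, whitespace, description) are assumptions for the example, not details confirmed by this page.

```python
import io
import zipfile
from typing import BinaryIO, List, Union


def read_codes_from_zip(zip_file: Union[str, BinaryIO], member: str) -> List[str]:
    """Return the first whitespace-delimited token of each non-empty line."""
    with zipfile.ZipFile(zip_file) as zf:
        with zf.open(member) as fh:
            text = io.TextIOWrapper(fh, encoding="utf-8").read()
    return [line.split(maxsplit=1)[0] for line in text.splitlines() if line.strip()]


# Build a tiny in-memory archive so the example runs without downloads.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(
        "codes.txt",
        "A000    Cholera due to Vibrio cholerae 01, biovar cholerae\n"
        "A001    Cholera due to Vibrio cholerae 01, biovar eltor\n",
    )
buf.seek(0)
codes = read_codes_from_zip(buf, "codes.txt")
```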
icdcodex.datacleaning.build_icd9_hierarchy(fp, root_name=None)
Build the ICD-9 hierarchy.
- Parameters
fp (Pathlike) – Path to hierarchy spec, available at https://github.com/kshedden/icd9/blob/master/icd9/resources/icd9Hierarchy.json
root_name (str, optional) – arbitrary name for the root of the hierarchy. Defaults to “root”.
- Returns
icd-9 hierarchy (nx.Graph) and ICD9 codes (List[str])
icdcodex.datacleaning.build_icd9_hierarchy_from_url(url='https://github.com/kshedden/icd9/blob/master/icd9/resources/icd9Hierarchy.json', root_name=None)
Build the ICD-9 hierarchy by downloading the hierarchy files.
- Parameters
url (str, optional) – url to hierarchy spec. Defaults to “https://github.com/kshedden/icd9/blob/master/icd9/resources/icd9Hierarchy.json”.
root_name (str, optional) – arbitrary name for the root of the hierarchy. Defaults to “root”.
- Returns
icd-9 hierarchy (nx.Graph) and ICD9 codes (List[str])
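To make the idea of turning a flat hierarchy spec into a graph concrete, here is a minimal sketch that derives parent/child edges from JSON records. The field names ("chapter", "subchapter", "major", "icd9") are assumptions about the spec's layout rather than something this page documents, and edges_from_records is a hypothetical helper, not the library's code.

```python
import json
from typing import Dict, List, Set, Tuple


def edges_from_records(
    records: List[Dict[str, str]], root_name: str = "root"
) -> Tuple[Set[Tuple[str, str]], List[str]]:
    """Collect (parent, child) edges and leaf codes from flat records."""
    edges: Set[Tuple[str, str]] = set()
    codes: List[str] = []
    for rec in records:
        # Assumed nesting: root > chapter > subchapter > major > code.
        path = [root_name, rec["chapter"], rec["subchapter"], rec["major"], rec["icd9"]]
        for parent, child in zip(path, path[1:]):
            if parent != child:  # skip levels that repeat the same label
                edges.add((parent, child))
        codes.append(rec["icd9"])
    return edges, codes


# Toy input in the assumed layout.
toy_json = """[
  {"chapter": "Infectious And Parasitic Diseases",
   "subchapter": "Intestinal Infectious Diseases",
   "major": "Cholera", "icd9": "0010"},
  {"chapter": "Infectious And Parasitic Diseases",
   "subchapter": "Intestinal Infectious Diseases",
   "major": "Cholera", "icd9": "0011"}
]"""
edges, codes = edges_from_records(json.loads(toy_json))
```

The resulting edge set could then be handed to a graph library; the documented functions return an nx.Graph together with the list of codes.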
icdcodex.datacleaning.traverse_diag(G, parent, untangle_elem, extensions=None)
Traverse the diagnosis subtrees, adding extensions as appropriate.
Seventh-character extensions may be specified as a child, sibling or uncle/aunt. Also, some diagnoses are non-billable because they are parents to more specific sub-diagnoses.
- Parameters
G (nx.Graph) – ICD hierarchy to mutate
parent (str) – parent node
untangle_elem (untangle.Element) – XML element, from untangle API
extensions (List[Tuple[str,str]], optional) – Seventh character extensions and related descriptions. Defaults to None.
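The effect of applying seventh-character extensions can be sketched as follows. This follows the general ICD-10-CM convention of padding shorter codes with the placeholder "X" before the seventh character; whether traverse_diag formats codes exactly this way (for example, with or without the dot) is an assumption, and both helper names are hypothetical.

```python
from typing import List, Tuple


def add_seventh_character(code: str, extension: str, placeholder: str = "X") -> str:
    """Append a seventh character, padding the stem to six characters first."""
    stem = code.replace(".", "")
    full = stem.ljust(6, placeholder) + extension
    return full[:3] + "." + full[3:]


def expand_with_extensions(
    code: str, extensions: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Pair each extended code with the description of its extension."""
    return [(add_seventh_character(code, char), desc) for char, desc in extensions]


# Extensions in the documented (character, description) form.
extensions = [("A", "initial encounter"), ("D", "subsequent encounter")]
expanded = expand_with_extensions("W21.00", extensions)
```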
icdcodex.hierarchy module
Deserialize ICD hierarchies computed in datacleaning.py.
icdcodex.hierarchy.icd10cm(version: Optional[str] = None) → Tuple[networkx.classes.graph.Graph, Sequence[str]]
Deserialize the ICD-10-CM hierarchy.
- Parameters
version (str, optional) – ICD-10-CM version, given as a year; the available versions include 2019 and 2020. If None, use the system year. Defaults to None.
- Returns
ICD-10-CM hierarchy and codes
- Return type
Tuple[nx.Graph, Sequence[str]]
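The documented default ("If None, use the system year") could be resolved along these lines. resolve_version is a hypothetical helper written for illustration, not a function of the package.

```python
import datetime
from typing import Optional


def resolve_version(version: Optional[str] = None) -> str:
    """Return the requested version, falling back to the current year."""
    if version is None:
        return str(datetime.date.today().year)
    return str(version)


explicit = resolve_version("2019")
fallback = resolve_version()
```

Because the fallback depends on the clock, passing an explicit version is the more reproducible choice when results need to be compared across years.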
icdcodex.icd2vec module
Build a vector embedding from a NetworkX representation of the ICD hierarchy.
class icdcodex.icd2vec.Icd2Vec(num_embedding_dimensions: int = 128, num_walks: int = 10, walk_length: int = 10, window: int = 4, workers=1, **kwargs)
Bases: object
fit(icd_hierarchy: networkx.classes.graph.Graph, icd_codes: Sequence[str], **kwargs)
Construct a vector embedding of all ICD codes.
- Parameters
icd_hierarchy (nx.Graph) – Graph of ICD hierarchy
icd_codes (Sequence[str]) – ICD codes contained in the hierarchy
kwargs – keyword arguments passed to Node2Vec.fit
to_code(vecs: Union[Sequence[Sequence], numpy.ndarray]) → Sequence[str]
Decode continuous representation of ICD code(s) into the code itself.
- Parameters
vecs (Union[Sequence[Sequence], np.ndarray]) – continuous representation of ICD code(s)
- Returns
ICD code(s)
- Return type
Sequence[str]
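One plausible way to decode a vector back into a code is a nearest-neighbour lookup over the learned embedding. The sketch below shows that idea with the standard library only; it is an assumption about the general technique, not a description of how to_code is implemented, and nearest_codes and the toy embedding are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def nearest_codes(
    vecs: Sequence[Sequence[float]], embedding: Dict[str, Sequence[float]]
) -> List[str]:
    """Map each vector to the code whose embedding is closest (Euclidean)."""
    decoded = []
    for vec in vecs:
        closest = min(embedding, key=lambda code: math.dist(embedding[code], vec))
        decoded.append(closest)
    return decoded


# Toy two-dimensional embedding with made-up coordinates.
toy_embedding = {"A00": [0.0, 0.0], "B00": [1.0, 1.0]}
decoded = nearest_codes([[0.1, -0.1], [0.9, 1.2]], toy_embedding)
```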
to_vec(icd_codes: Sequence[str]) → numpy.ndarray
Encode ICD code(s) into a matrix of continuously-valued representations of shape m x n, where m = self.num_embedding_dimensions and n = len(icd_codes).
- Parameters
icd_codes (Sequence[str]) – list of icd code(s)
- Raises
ValueError – If model is not fit beforehand
- Returns
continuously-valued representations of ICD codes
- Return type
np.ndarray
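Putting the pieces on this page together, a round trip from codes to vectors and back might look like the sketch below. It uses only names documented above (hierarchy.icd10cm, Icd2Vec, fit, to_vec, to_code); the helper name round_trip, the version "2020" and the choice of 64 dimensions are illustrative assumptions. The import is deferred into the function so that defining the helper neither requires the package nor triggers training.

```python
from typing import Sequence


def round_trip(codes: Sequence[str], version: str = "2020"):
    """Embed ICD-10-CM codes and decode them back (hypothetical helper)."""
    # Imported lazily: calling this requires the icdcodex package.
    from icdcodex import hierarchy, icd2vec

    graph, all_codes = hierarchy.icd10cm(version)
    model = icd2vec.Icd2Vec(num_embedding_dimensions=64, workers=1)
    # Fit first: to_vec is documented to raise ValueError on an unfitted model.
    model.fit(graph, all_codes)
    vecs = model.to_vec(codes)
    return vecs, model.to_code(vecs)
```

When calling it, pass codes in the same format as the list returned by hierarchy.icd10cm, since that list is what the model is fitted on.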