by Daniel Eklund

The CMS division of HHS produces a freely available algorithm for determining beneficiaries' diagnosis-based risk of cost. The problem with this freely available algorithm is that it is written in SAS, a technology that is not freely usable.1

We at Algorex Health have reimplemented the HCC algorithm in Python with two aims in mind. First, we want to promote truly free algorithms for value-based analytics, contributing to a community ethos of open-source population health libraries. Second, we want to use the HCC library as a testbed for technologies that try to tame the inherent complexity of healthcare.

To accomplish this second goal, we have embedded PyDatalog (a datalog with strong Python bindings) into our library to capture the indicator-based rules that trigger evaluating an individual within a linear predictive model.

The following post hopes to motivate the choice of this mixture of technologies, as well as to provide basic instructions for any developer in a health system wanting to use this library.

But first, we need to explain what the HCC risk adjustment algorithm is.

# What is Risk, Risk Adjustment, and the HCC?

Within the healthcare analytics world, the term risk refers to a unitless numeric score of how likely a patient (or member of a plan or accountable care organization) is to have some outcome. Risk adjustment is then the process by which large tranches of populations are adjusted/normalized against a larger population so that providers and plans that carry an inherently riskier population can be compared fairly.

Any risk calculation is therefore a mathematical model that takes in a set of variables and produces this numeric value.

Risk models are built for different purposes, from hospital readmission to mortality to cost; they use many different techniques and creatively choose which inputs to include or drop based on the relevant use-case and data availability.2

The HCC algorithm for Medicare patients has been created by CMS to measure the risk of Medicare cost, and is based on the previous year’s cost per person. Loosely speaking, the HCC algorithm is a function of a single person’s demographics (age, sex) and that person’s set of morbidities (a term of art that can be understood as the set of all diagnosed diseases).

We can represent this mathematically/programmatically as:

```
hcc(person) → risk
```

or more deeply

```
hcc(demographics,[disease]) → risk
```

where risk is a nonnegative real number that can exceed 1, with 1 representing average risk.
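A minimal Python sketch can make this function shape concrete. Note that the category names and coefficient values below are invented for illustration; they are not taken from the actual CMS tables:

```python
# Hypothetical coefficient table: a demographic cell and a condition
# category each carry a weight learned by the CMS regression.
# (Names and values here are invented, not real CMS coefficients.)
COEFFICIENTS = {
    "F65_69": 0.323,                    # female, aged 65-69 (invented)
    "Diabetes_no_complication": 0.105,  # condition category (invented)
}

def hcc(demographics, diseases):
    """Toy version of hcc(demographics, [disease]) -> risk."""
    risk = COEFFICIENTS.get(demographics, 0.0)
    for disease in diseases:
        risk += COEFFICIENTS.get(disease, 0.0)
    return risk

risk = hcc("F65_69", ["Diabetes_no_complication"])  # 0.323 + 0.105
```

The real model is richer than a flat dictionary lookup, but the essential shape — demographics plus a set of morbidities, mapped through learned weights to a single number — is the same.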

# What is HCC (deeper-dive)?

The HCC is a set of models created by a linear regression that assigns coefficients (or weights) to demographic and morbidity (and comorbidity) input variables and sums them into a final score. Here is a visual showing this at work:

An important point to emphasize here is that creating the HCC model is a separate process from using the model. CMS creates the HCC risk model from the entire population of Medicare recipients; the rest of us use the model to score a patient or patients.

Another point worth emphasizing is that the precise statement that HCC is “a set of models” means that there are multiple functions (a mathematical model is the same thing as a function) and therefore multiple risk scores per person. HCC will have, year to year, different models for different macroscopic contexts (institutional scores versus community scores versus new enrollee scores), and it is up to the user of the HCC model-set to understand why a particular model is more relevant.

HCC uses the concept of a condition category (the CC in HCC) to aggregate a set of diagnoses into a larger-grain causal variable. This effectively reduces the dimensionality of the regression and reflects the prior expert knowledge of the physician community. Examples of condition categories are large-grain morbidities like “Protein-Calorie Malnutrition”, “Dementia With Complications”, “Cystic Fibrosis”, or “Major Head Injury”.

The H in HCC stands for hierarchical and refers to the fact that some condition categories are related to each other along a spectrum from less severe to more severe. When condition categories of similar morbidity but different levels of severity are detected, only the most severe is considered and the lesser condition categories are dropped. As an example, if a set of diagnoses rolls up to both “Diabetes without Complication” and “Diabetes with Acute Complications”, then only “Diabetes with Acute Complications” will remain, as it represents the more severe of these two related condition categories. Choosing the more severe condition category over the lesser is hierarchization.
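The diabetes example above can be sketched in a few lines of Python. The overrides mapping here is invented for illustration; the real severity relationships come from the CMS model tables:

```python
# Hypothetical severity hierarchy: each condition category maps to the
# set of less severe categories it overrides. (Invented mapping; the
# real one comes from the CMS model tables.)
OVERRIDES = {
    "Diabetes_with_Acute_Complications": {"Diabetes_without_Complication"},
}

def hierarchize(condition_categories):
    """Keep only the most severe of related condition categories."""
    dropped = set()
    for cc in condition_categories:
        dropped |= OVERRIDES.get(cc, set())
    return condition_categories - dropped

ccs = {"Diabetes_without_Complication", "Diabetes_with_Acute_Complications"}
hccs = hierarchize(ccs)  # the lesser diabetes category is dropped
```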

# Decoding the SAS Program

Not knowing the details of how the linear regression model was created, we had to carefully read the SAS code so that we could translate it to Python. Our investigation into the program taught us a couple of things:

1. SAS is an older language and does not reflect the semantics (or intent) of the problem domain as easily as more modern languages can
2. The linear model is fairly simple in that it uses indicator variables that either contribute their coefficient or not. In other words, there are no complex interactions of variables to scale the coefficients; each variable contributes either zero or 1 times its coefficient to the overall sum

This last point was of particular concern to us going in, as SAS is a fairly numeric-heavy technology and floating-point differences might crop up as we translated the algorithm into another language. However, as we got deeper we were able to prove to ourselves that the great bulk of numeric calculations in the SAS code were to effectively recreate boolean algebra, by overloading zero to mean false and non-zero integers to mean true.
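This boolean-algebra-by-arithmetic pattern is easy to illustrate. The following toy snippet (not the actual SAS source) shows the 0/1-integer style next to its plain-boolean equivalent:

```python
# SAS-style boolean algebra: indicators are 0/1 integers, "and" is
# multiplication, and "not" is subtraction from 1.
# (Toy illustration, not a transcription of the CMS SAS code.)
on_medicaid, is_disabled, is_male = 1, 0, 1
triggered_sas = on_medicaid * (1 - is_disabled) * is_male  # 1 means "triggered"

# The same test expressed with real Python booleans:
triggered_py = bool(on_medicaid) and not is_disabled and bool(is_male)

assert bool(triggered_sas) == triggered_py
```

Because the arithmetic only ever encodes true/false, translating it into another language carries no floating-point risk.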

You can watch a video (if you want) where we discuss the code in painstaking detail:

In the end, the overall algorithm for scoring a particular person X with diagnoses D is:

1. Edit this X person’s diagnoses {D} using a set of edit functions {EDIT} to clean them up according to some logic based on this X person’s demographics DEM.
2. From these cleaned-up diagnoses {CleanD} create all the condition categories {CC} that generalize specific diagnoses into conditions.
3. Create hierarchies {HCC} out of these condition categories {CC} (i.e. zero-out the less severe condition categories if more than one is present).
4. For each risk model (institutional, community, new-enrollee) {SCORE}:
• Choose the appropriate set of input indicator functions {IND} and their associated coefficients {COEF}
• For each indicator function in this particular model:
  • Evaluate whether this function is triggered (i.e. returns true or 1)
  • If it is triggered, add the associated coefficient COEF to the overall score SCORE
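The steps above can be sketched as plain Python. Every helper and datum below is a stand-in for illustration — the real edit logic, ICD-to-CC mappings, and coefficients live in the CMS tables, and this is not the library's actual API:

```python
from collections import namedtuple

Person = namedtuple("Person", "demographics diagnoses")
Model = namedtuple("Model", "name terms")  # terms: [(indicator_fn, coefficient)]

# --- Stand-in helpers; the real logic lives in the CMS tables. ---
def edit_diagnoses(diagnoses, demographics):          # {EDIT}
    return set(diagnoses)  # no-op edit step in this sketch

CC_MAP = {"E11.9": "Diabetes_without_Complication"}   # invented ICD -> CC

def condition_categories(diagnosis):                  # {CC}
    return {CC_MAP[diagnosis]} if diagnosis in CC_MAP else set()

def hierarchize(ccs):                                 # {HCC}
    return ccs  # no hierarchy conflicts in this tiny example

def score(person, models):                            # {SCORE}
    clean = edit_diagnoses(person.diagnoses, person.demographics)
    ccs = set().union(*(condition_categories(d) for d in clean)) if clean else set()
    hccs = hierarchize(ccs)
    scores = {}
    for model in models:
        total = 0.0
        for indicator, coefficient in model.terms:
            if indicator(person.demographics, hccs):  # triggered?
                total += coefficient                  # add COEF to SCORE
        scores[model.name] = total
    return scores

community = Model("community", [
    (lambda demo, hccs: "Diabetes_without_Complication" in hccs, 0.105),
])
person = Person("F65_69", ["E11.9"])
result = score(person, [community])
```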

This algorithm seems relatively simple in this description, but the devil was in the details of the {EDIT} and {IND} functions. Not only were these functions numerous, but we felt their implementations would be highly susceptible to drift as CMS updated, overhauled, and refactored its model(s) in the future.

For this reason we chose to use Datalog.3

# Python and PyDatalog Implementation

We have used a package called `pyDatalog` to capture the rules that may trigger an indicator function to contribute its coefficient to the ongoing risk score. For instance, consider the following code:

```
indicator(B,'MCAID_Male_Aged') <= medicaid(B) & ~disabled(B) & male(B)
indicator(B,'COPD_ASP_SPEC_BACT_PNEUM') <= ben_hcc(B,CC) & ben_hcc(B,CC2) & copd_asp_spec_bact_pneum(CC,CC2)
```

These two lines of pure Python code are also an embeddable DSL for datalog that captures the logical rules relating a beneficiary to these indicators.

You may read these rules as follows (in order):

• the MCAID_Male_Aged indicator is true for this beneficiary B `indicator(B,'MCAID_Male_Aged')` if `<=` it is true that this beneficiary is on medicaid `medicaid(B)` and that this beneficiary is not disabled `~disabled(B)` and this beneficiary is male `male(B)`
• the COPD_ASP_SPEC_BACT_PNEUM indicator is true if this beneficiary has two condition categories (CC and CC2) that are related to each other by the sub-rule `copd_asp_spec_bact_pneum(CC,CC2)` (i.e. this sub-rule returns true)
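Read imperatively, the same two rules might look roughly like the plain Python below. The function names mirror the datalog predicates above, but the data structures and condition-category labels are invented for illustration:

```python
# Toy facts about one beneficiary "b1" (invented data; in the library
# these would come from claims and eligibility files).
MEDICAID = {"b1"}
DISABLED = set()
MALE = {"b1"}
BEN_HCC = {"b1": {"CC_A", "CC_B"}}   # beneficiary -> condition categories
COPD_PAIRS = {("CC_A", "CC_B")}      # CC pairs satisfying the sub-rule

def mcaid_male_aged(b):
    # indicator(B,'MCAID_Male_Aged') <= medicaid(B) & ~disabled(B) & male(B)
    return b in MEDICAID and b not in DISABLED and b in MALE

def copd_asp_spec_bact_pneum(b):
    # indicator(B,'COPD_ASP_SPEC_BACT_PNEUM') <= ben_hcc(B,CC) & ben_hcc(B,CC2)
    #                                           & copd_asp_spec_bact_pneum(CC,CC2)
    ccs = BEN_HCC.get(b, set())
    return any((cc, cc2) in COPD_PAIRS for cc in ccs for cc2 in ccs)

assert mcaid_male_aged("b1") and copd_asp_spec_bact_pneum("b1")
```

The datalog version wins as the rule count grows: each rule stays a one-liner, and the engine, not the programmer, decides how to search the facts.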

For programmers familiar with standard imperative (or even functional) techniques, this might seem new: it declaratively encapsulates the *what* of the logic and leaves the imperative *how* to the datalog engine.

This effective severing of knowledge from implementation can yield surprisingly small code with better maintenance characteristics. Consider these rules, which effectively capture the notion of hierarchization (the ‘H’ in HCC):

```
beneficiary_icd(B,ICD,Type) <= (Diag.beneficiary[D] == B) & (Diag.icdcode[D]==ICD) & (Diag.codetype[D]==Type)

beneficiary_has_cc(B,CC) <= beneficiary_icd(B,ICD,Type) & edit(ICD,Type,B,CC) & ~(excised(ICD,Type,B))
beneficiary_has_cc(B,CC) <= beneficiary_icd(B,ICD,Type) & \
    cc(ICD,CC,Type) & ~(edit(ICD,Type,B,CC2)) & ~(excised(ICD,Type,B))

has_cc_that_overrides_this_one(B,CC) <= beneficiary_has_cc(B,OT) & overrides(OT,CC)
beneficiary_has_hcc(B,CC) <= beneficiary_has_cc(B,CC) & ~(has_cc_that_overrides_this_one(B,CC))
```

Though these rules depend on facts (cc, Beneficiary, and Diagnosis) and other rules (excised, edit, overrides), these few lines capture all the logic relating beneficiaries to ICDs, condition categories, and hierarchical condition categories.

In the end, we are left with a database (or knowledgebase) of facts and rules that formally encapsulates our problem domain. These facts and rules operate to answer queries on our data.