Patient attribution 101: tips for setting up a dynamic model

Many healthcare analyses involve the concept of “patient attribution”—in other words, linking a patient to the provider responsible for their care. For example, you might want to know which primary care physician or clinic has been managing a patient’s health to properly assign the rewards when that patient has a good outcome (e.g., good medication adherence, or lower-than-expected spending over some period) or the responsibility for a bad outcome (e.g., a readmission, or a preventable emergency room visit).

Here, we:

Describe the basic elements of a patient attribution exercise
Work through a detailed example
Discuss how one might implement such a system
Identify extensions of this concept to other healthcare analytic problems beyond the patient-provider relationship

Basic elements of patient attribution

The fundamental challenge of attributing a patient to a provider arises from the possibility that there might be multiple candidate “attributee” providers. So we must sort through the evidence in favor of each to choose just one. At its core, then, attribution represents a ranking system: a series of comparisons, resolved in order, until a clear winner is determined.

These ranking systems can become complicated as they seek to reach an appropriate conclusion across the breadth of real-world scenarios involving patients, providers, and their interactions. For example, there is often a tension between a model’s ability to quickly pivot to a new provider when appropriate and its capacity to be patient and avoid inappropriate pivots. In addition, it can be difficult to extend the algorithm to handle new corner cases while not disturbing its carefully calibrated handling of previously considered scenarios.

Finally, patient and provider circumstances change over time in ways that influence attribution. That is, when done properly, the “winning” provider in an attribution model is likely to change over time, and so we will want a way to model this relationship dynamically.

An example

An example will help illustrate these challenges and provide an opportunity to discuss solutions.

Consider an exercise to attribute patients to a single primary care provider (PCP) at any given moment in time. An initial sketch of the algorithm might look like this:

Step 1: For each patient, identify all their primary care visits in the last two years; if they have no visits, the patient is not attributed to any provider.
Step 2: If the visits are all with the same provider, attribute the patient to that provider.
Step 3: If the patient has had visits with more than one provider, attribute the patient to the provider with the most recent visit.
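
As a rough illustration, the three steps above could be sketched in Python as follows. The function name, the `(provider_id, visit_date)` tuple layout, and the 730-day approximation of "two years" are assumptions made for this sketch, not part of any particular system.

```python
from datetime import date, timedelta


def attribute_most_recent(visits, as_of, lookback_days=730):
    """Naive rule: attribute to the provider of the most recent visit.

    visits: list of (provider_id, visit_date) tuples for one patient
            (hypothetical layout).
    as_of:  the date on which attribution is being determined.
    """
    window_start = as_of - timedelta(days=lookback_days)
    recent = [(p, d) for p, d in visits if window_start <= d <= as_of]

    # Step 1: no primary care visits in the window, so no attribution.
    if not recent:
        return None

    # Steps 2 and 3 collapse into one comparison: if every visit is with
    # the same provider, that provider also has the most recent visit.
    provider, _ = max(recent, key=lambda v: v[1])
    return provider
```

Calling `attribute_most_recent(visits, date(2020, 12, 31))` returns the provider seen last within the window, or `None` for a patient with no visits.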

This seems sensible so far. But what if a patient’s regular provider is temporarily unavailable, so a colleague sees the patient instead? The rule above would switch the patient’s attribution after just one visit with a new provider, which is probably not what we want.

Therefore, we might want to add some inertia to a prior attribution so the patient isn’t bouncing between providers. A refined rule might incorporate the number of visits with each candidate provider as the first ranking criterion, then use the date of the most recent visit with each only as a tie-breaker.
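
One way to express this refined rule is as a two-part sort key: visit count first, most recent visit date second. The sketch below assumes the visits have already been limited to the lookback window; the names are illustrative.

```python
from datetime import date


def attribute_by_count(visits):
    """Refined rule: most visits wins; most recent visit breaks ties.

    visits: list of (provider_id, visit_date) tuples, assumed to be
            already filtered to the two-year lookback window.
    """
    stats = {}  # provider_id -> (visit_count, most_recent_visit_date)
    for provider, visit_date in visits:
        count, last = stats.get(provider, (0, date.min))
        stats[provider] = (count + 1, max(last, visit_date))

    if not stats:
        return None

    # Tuples compare element by element, so the date only matters
    # when two providers have the same visit count.
    return max(stats, key=lambda p: stats[p])
```

With this rule, a single visit to a covering colleague no longer outweighs several visits with the regular provider.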

But maybe this rule creates too much inertia. Consider a scenario in which the patient has a regular primary care physician, Dr. X:

In 2019, the patient sees Dr. X four times.
In 2020, the patient starts seeing a new physician, Dr. Y, visiting them three times.

By the end of 2020, the refined rule would still attribute the patient to Dr. X, which is likely not the desired interpretation.

We can get the correct result in this scenario by capping the count of visits used in the ranking logic. For example, with a cap of two visits, by the end of 2020 the patient would have accumulated two countable visits with each of Dr. X and Dr. Y, and the tie-breaker of most recent visit would then attribute the patient to Dr. Y. In fact, this switch would happen immediately after the second visit with Dr. Y in 2020.
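
To check this behavior, the Dr. X / Dr. Y scenario can be replayed visit by visit. This is a minimal sketch: the specific visit dates are invented for illustration (the example only gives the years), and the function name is an assumption.

```python
from datetime import date


def attribute_capped(visits, cap=2):
    """Rank by visit count capped at `cap`, then by most recent visit."""
    stats = {}  # provider -> (visit_count, most_recent_visit_date)
    for provider, visit_date in visits:
        count, last = stats.get(provider, (0, date.min))
        stats[provider] = (count + 1, max(last, visit_date))
    if not stats:
        return None
    return max(stats, key=lambda p: (min(stats[p][0], cap), stats[p][1]))


# Hypothetical dates: four visits with Dr. X in 2019, three with Dr. Y in 2020.
visits = [
    ("Dr. X", date(2019, 2, 1)),
    ("Dr. X", date(2019, 5, 1)),
    ("Dr. X", date(2019, 8, 1)),
    ("Dr. X", date(2019, 11, 1)),
    ("Dr. Y", date(2020, 3, 1)),
    ("Dr. Y", date(2020, 6, 1)),
    ("Dr. Y", date(2020, 9, 1)),
]

# Attribution as it stands immediately after each visit.
history = [
    (visits[i][1], attribute_capped(visits[: i + 1]))
    for i in range(len(visits))
]
```

In `history`, the patient stays with Dr. X after the first Dr. Y visit (capped counts of two versus one) and moves to Dr. Y after the second, when the capped counts tie and recency decides.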

Now, if we had access to a registry of actively practicing PCPs in the local market, we would likely want to be even more aggressive in switching attribution away from departed providers. If the patient in our example stopped seeing Dr. X because Dr. X moved out of state, for example, we might want to dis-attribute the patient from Dr. X immediately upon the patient’s first visit with Dr. Y, rather than wait for a second one.

Health plans may also assign patients to providers, a practice that is particularly common in primary care. Depending on the circumstances, this administrative assignment can be anywhere from very weak to very solid evidence that the assigned provider is truly the appropriate attributee.

Let’s say that the administrative assignment of the patient in our example is purely pro forma—e.g., patients are randomly assigned a PCP at the time of their enrollment and face no financial penalty for seeing a different PCP—and so should carry very little weight in our attribution model. We’ll use it as a default attribution in the case where the patient has no visits with any providers.

Considering all of the above, the final rule might look something like the following:

Step 1: For each patient, identify all their primary care visits in the last two years; if the patient has no such visits, attribute the patient to the provider administratively assigned by their plan.
Step 2: If there is exactly one provider with one or more visits, attribute the patient to that provider.
Step 3: If the patient has one or more visits to multiple providers and exactly one provider is still actively practicing in the area, attribute the patient to that provider.
Step 4: If multiple providers are still practicing in the area and exactly one of them has the highest number of visits (with each provider’s count capped at two), attribute the patient to that provider.
Step 5: If two or more providers tie for the highest capped visit count, attribute the patient to the one among them with the most recent primary care visit.
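
The five steps can also be written out procedurally, one branch per step. The sketch below is one possible reading of the rule, not a definitive implementation. In particular, the rule does not say what to do when none of the visited providers is still active; here that case falls back to comparing all visited providers, which is an assumption. All names are illustrative.

```python
from datetime import date


def attribute_final(visits, assigned_pcp, active_providers, cap=2):
    """visits: (provider_id, visit_date) tuples already limited to the
    two-year window. active_providers: set of provider ids from a
    (hypothetical) registry of actively practicing PCPs."""
    # Step 1: no visits, so default to the plan's administrative assignment.
    if not visits:
        return assigned_pcp

    counts, last_seen = {}, {}
    for provider, visit_date in visits:
        counts[provider] = counts.get(provider, 0) + 1
        last_seen[provider] = max(last_seen.get(provider, date.min), visit_date)
    candidates = list(counts)

    # Step 2: exactly one provider has visits.
    if len(candidates) == 1:
        return candidates[0]

    # Step 3: exactly one visited provider is still actively practicing.
    active = [p for p in candidates if p in active_providers]
    if len(active) == 1:
        return active[0]

    # Assumption: if no visited provider is active, compare all of them.
    pool = active or candidates

    # Step 4: a single provider has the highest capped visit count.
    capped = {p: min(counts[p], cap) for p in pool}
    best = max(capped.values())
    leaders = [p for p in pool if capped[p] == best]
    if len(leaders) == 1:
        return leaders[0]

    # Step 5: tie on capped count, so the most recent visit decides.
    return max(leaders, key=lambda p: last_seen[p])
```

Writing the branches explicitly makes each step easy to audit, at the cost of more code than a single sort key would need.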

Using timelines to handle dynamic data

An important characteristic of attribution is that it changes over time. In the above example, the patient was under the care of Dr. X for all of 2019. Any retrospective analysis of performance over 2019—even if that analysis is performed in 2021—should assign performance success or failure in 2019 to the patient’s physician at the time, Dr. X, and not to their “current day” physician, Dr. Y. All this makes the timeline data structure an excellent choice to represent patient attribution.

The following timeline structure represents the inputs and conclusions from the example above:

[Figure: patient attribution timeline]
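
In code, one simple way to hold such a timeline is as a list of non-overlapping intervals, each carrying the attributed provider. The sketch below collapses a series of point-in-time attribution results into intervals and then looks up the attributed provider for any date. The snapshot layout and function names are assumptions, and it assumes at most one snapshot per date.

```python
from datetime import date


def collapse_snapshots(snapshots):
    """Turn (as_of_date, provider) snapshots into a timeline of
    (start, end, provider) intervals. `end` is exclusive; None means
    the interval is still open."""
    timeline = []
    for as_of, provider in sorted(snapshots, key=lambda s: s[0]):
        if timeline and timeline[-1][2] == provider:
            continue  # attribution unchanged, so extend the current interval
        if timeline:
            start, _, previous = timeline[-1]
            timeline[-1] = (start, as_of, previous)  # close the prior interval
        timeline.append((as_of, None, provider))
    return timeline


def attributed_on(timeline, day):
    """Return the provider attributed on `day`, or None if not covered."""
    for start, end, provider in timeline:
        if start <= day and (end is None or day < end):
            return provider
    return None
```

A retrospective analysis of 2019 would then call `attributed_on(timeline, some_2019_date)` rather than looking at the patient's current attribution.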

Implementation considerations

To implement an attribution rule like this—e.g., in a SQL query—it’s useful to think of three distinct categories of inputs:

Features of the patient (e.g., age, chronic disease status)
Features of the candidate provider (e.g., active vs. inactive status, specialty)
Interactions between the patient and the candidate provider (e.g., visits, periods of administrative assignment)

To start, for any given patient, at any desired moment in time, the set of interactions between that patient and providers can be used to generate the list of candidate providers.

Then, for each candidate provider, a record (e.g., a database table record) can be constructed with the individual pieces of information used in the attribution logic. So, sticking with our example above, we might want a record for each candidate provider with the following fields:

Is the count of primary care visits with the patient in the last two years > 0 (used in steps 1, 2 and 3)?
Is the provider the patient’s administratively assigned PCP (used in step 1)?
Is the provider in active practice in the patient’s market (used in step 3)?
Count of primary care visits with the patient in the last two years, capped at two (used in step 4)
Date of the most recent primary care visit with the patient in the last two years (used in step 5)

(Note that these records would be generated for each patient-provider-date triple, for each date on which we wanted to determine the patient’s attribution status.)

Once these records are collected, all that remains is to sort them in the correct order based on the attribution algorithm and select the top-ranked record as the “winning” provider.
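
The record-then-sort approach might look like the following in Python (the same shape carries over to a SQL query with a window function). This is a sketch under stated assumptions: the record type, field names, 730-day window, and the exact ordering of the sort key are illustrative choices. In particular, ranking active status ahead of visit count means an inactive provider never outranks an active one, which is one reading of steps 3 and 4.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class CandidateRecord:
    provider_id: str
    has_visits: bool            # any primary care visits in the window?
    is_assigned_pcp: bool       # administratively assigned by the plan?
    is_active: bool             # in active practice in the patient's market?
    capped_visit_count: int     # visits in the window, capped
    last_visit: Optional[date]  # most recent visit in the window


def build_records(visits, assigned_pcp, active_providers, as_of,
                  cap=2, lookback_days=730):
    """One record per candidate provider for one patient on one date."""
    window_start = as_of - timedelta(days=lookback_days)
    by_provider = {}
    for provider, visit_date in visits:
        if window_start <= visit_date <= as_of:
            by_provider.setdefault(provider, []).append(visit_date)

    # Candidates come from interactions: visits plus the assignment.
    candidates = set(by_provider)
    if assigned_pcp is not None:
        candidates.add(assigned_pcp)

    records = []
    for provider in sorted(candidates):
        dates = by_provider.get(provider, [])
        records.append(CandidateRecord(
            provider_id=provider,
            has_visits=bool(dates),
            is_assigned_pcp=(provider == assigned_pcp),
            is_active=(provider in active_providers),
            capped_visit_count=min(len(dates), cap),
            last_visit=max(dates) if dates else None,
        ))
    return records


def rank_key(rec):
    # Earlier elements dominate later ones, mirroring the step order.
    return (rec.has_visits, rec.is_active, rec.capped_visit_count,
            rec.last_visit or date.min, rec.is_assigned_pcp)


def pick_winner(records):
    ranked = sorted(records, key=rank_key, reverse=True)
    return ranked[0].provider_id if ranked else None
```

Because the whole rule lives in `rank_key`, adding a new criterion means adding one field to the record and one element to the key, rather than threading a new branch through the logic.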

Extending attribution logic for other uses

Finally, the process of attributing patients to exactly one of a set of candidate physicians can be generalized in two ways:

The attributed entity need not be a patient.
The candidate attributee need not be a physician.

The essential property is that the relationship is exclusive (i.e., there can be only one at any given time); beyond that anything can be swapped into the roles of attributed entity and candidate attributee. For example:

Determination of a patient’s stage of disease based on potentially conflicting diagnosis codes and/or lab values (attributed entity = patient, candidate attributee = stage of disease)
Determination of a physician’s “home” practice location (attributed entity = provider, candidate attributee = location)
Determination of a PCP group’s favored cardiology practice for referral (attributed entity = PCP group, candidate attributee = cardiology practice)
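
Seen this way, the mechanism reduces to "rank the candidates, take the top one," with the domain knowledge confined to the ranking key. A minimal generic sketch follows; the function name and the sample evidence for a physician's "home" location are hypothetical.

```python
from datetime import date
from typing import Callable, Iterable, Optional, TypeVar

C = TypeVar("C")


def attribute(candidates: Iterable[C],
              rank_key: Callable[[C], tuple]) -> Optional[C]:
    """Return the single top-ranked candidate, or None if there are none."""
    ranked = sorted(candidates, key=rank_key, reverse=True)
    return ranked[0] if ranked else None


# Hypothetical evidence for one physician:
# (location, clinic sessions in the period, most recent session date).
locations = [
    ("Main St Clinic", 12, date(2020, 11, 30)),
    ("Riverside Office", 12, date(2020, 12, 18)),
    ("Hospital Annex", 3, date(2020, 12, 21)),
]

# Rank by session count, then by recency as the tie-breaker.
home = attribute(locations, rank_key=lambda c: (c[1], c[2]))
```

Here `home` is the Riverside Office entry: it ties Main St Clinic on session count and wins on recency.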

In other words, if you can frame the question as “Out of many possible options, which, at this moment, is the single best one?” there’s a good chance you’ve got a problem that an attribution mechanism can solve.