- 09 Nov 2022
Determine the sequence of Local Transform objects needed for each Natural object
Background and Strategy
In the normal course of processing, after passing through semantic mapping and data mastering, data proceed to the Local Transform Layer. While the process of semantic mapping was focused on manipulating field-level data, the central purpose of Local Transform objects is to manipulate the grain size of the data. In particular, the grain size of the source data objects must be conformed to the grain size of the Natural objects receiving them.
This preparation can often be accomplished with just a single step – that is, a single Local Transform object bridging the path from one or more Semantic Mapping objects to each Natural object ultimately populated. But it is also not unusual to find that a sequence of two or more Local Transform objects is the optimal solution. (For example, breaking the transformation logic into distinct stages may allow processing to be more easily shared among branching data flows that feed different Natural objects.)
This task discusses the process of identifying that optimal sequence of one or more Local Transform objects for each terminal Natural object.
Planning should start with an enumeration of those terminal Natural objects. This list should already have been generated at an earlier stage of the integration. (See Identify the Natural objects impacted by each source data object.)
It can be useful at this point to divide these Natural objects into three groups:
1. Objects representing institutional, professional, or pharmacy claims or billing concepts
These often require complicated transformations to reconcile transactional financial data to final action status, and, consequently, a more elaborate sequence of Local Transform objects. The hierarchical structure of institutional and professional claims and bills adds to the complexity.
It is also common for professional and institutional claims to be combined in the source data into a single table. A common pattern for integrating transactional medical claims data is therefore to deal with the reconciliation to final action status first, and only then split the institutional claims from the professional claims, resulting in a sequence of multiple Local Transform objects. (See Example 1 below for an illustration of this.)
2. Timeline objects, such as Patient-Plan Timelines of Plan Membership, Patient-Source Timelines of Data Coverage, etc.
These will require some special care in terms of cleaning and reshaping to ensure that the resulting data conforms to the timeline structure.
In brief, a timeline is a data structure that captures the features of an entity over time. Each record in a timeline has an entity identifier and a start and end date, and each record's field values describe the status of the entity at any point during the period bookended by those two dates. (For a deep dive into timelines, see How to track patient features over time for population health analytics.)
For our current purposes, the key quality of timelines that must be guaranteed by the Local Transform Layer logic is that the periods of the timeline are non-overlapping within the same entity. That is, a patient timeline cannot contain a record whose start date falls within the period covered by a different timeline record for that same patient (a record may, however, start on the date the previous record ends). For example, the following is allowed:
Patient ID | Period Start Date | Period End Date | Is Enrolled |
---|---|---|---|
100 | 1/1/2022 | 2/1/2022 | 0 |
100 | 2/1/2022 | 3/1/2022 | 1 |
...but this is not:
Patient ID | Period Start Date | Period End Date | Is Enrolled |
---|---|---|---|
100 | 1/1/2022 | 2/1/2022 | 0 |
100 | 1/15/2022 | 3/1/2022 | 1 |
(Note that the period between 1/15/2022 and 2/1/2022 is "double-booked", allowing for the possibility of contradictory features covering the same time period; indeed, we see that happening here, with the patient being documented as both enrolled and not enrolled in the second half of January.)
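The non-overlapping rule can be checked mechanically. The following is a minimal Python sketch, not Ursa Studio functionality: the function name `find_overlaps` and the tuple layout `(entity_id, start, end)` are assumptions made for illustration, and period end dates are treated as exclusive, so a record may start on the day the previous one ends.

```python
from datetime import date
from itertools import groupby

def find_overlaps(records):
    """Return adjacent (earlier, later) record pairs that overlap.

    Each record is a hypothetical (entity_id, start, end) tuple with an
    exclusive end date. Records are sorted by start date within each
    entity; if any two periods of an entity overlap, at least one
    adjacent pair in that ordering overlaps, so checking neighbors is
    enough to detect a violation (it does not list every overlapping pair).
    """
    overlaps = []
    ordered = sorted(records)
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        for earlier, later in zip(group, group[1:]):
            if later[1] < earlier[2]:  # next period starts before this one ends
                overlaps.append((earlier, later))
    return overlaps

allowed = [
    (100, date(2022, 1, 1), date(2022, 2, 1)),
    (100, date(2022, 2, 1), date(2022, 3, 1)),
]
not_allowed = [
    (100, date(2022, 1, 1), date(2022, 2, 1)),
    (100, date(2022, 1, 15), date(2022, 3, 1)),
]

print(find_overlaps(allowed))      # no violations for the first table
print(find_overlaps(not_allowed))  # one violating pair for the second table
```

A check like this is only a diagnostic; it reports the double-booked periods but does not decide how to resolve them.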
Ursa Studio has some tooling -- in particular, the Simple Timeline object type -- which should typically be used as part of a Local Transform sequence to ensure the non-overlapping requirement is met before pushing the data downstream into Natural Object Layer timelines.
3. Everything else: encounters, labs, patient communications, referrals, appointments, problem list entries, medication orders, etc.
Source data representing these concepts tend not to need much coercion to match the grain size of the terminal Natural objects.
But note that even in these simpler cases it is a best practice to create at least one Local Transform object for each Natural object. This helps to keep source-specific logic out of the Natural object layer and provides a degree of implementation consistency that can be helpful for those tasked with maintaining the integration logic. (If database space is a concern, these objects can be set as views.)
Finally, it is not uncommon to find, halfway through the Local Transform Layer development work, that a different approach would have been better. (For example, if you find yourself repeating the same logic in multiple objects within the same source system integration, that's a good indicator that there might be a more efficient approach.) Don't be afraid to pivot to a better approach. Doing this difficult work properly will likely pay dividends over the long run in reduced risk of bugs and faster runtimes.
Detailed Implementation Guidance
(It is difficult to provide anything other than high-level, strategic advice on this task because source data packages can vary so widely. To help provide some more detailed guidance, a number of simple example cases are presented below that represent idealized scenarios that may serve as a useful starting point, with the caveat that real-life integrations tend to be messier, and will likely require extensions or variations on these solutions.)
Examples
Example 1: Transactional medical claims
Consider a source data file for transactional medical claims that has undergone semantic mapping. Coming out of semantic mapping, the object data might look like the following:
Source ID | Source Local Patient ID | Source Local Claim ID | Source Local Service Line Number | Is Claim Class Institutional | Is Claim Class Professional | HCPCS Code | Increase to Plan Paid Amount | Source Local Transaction Header ID | Source Local Transaction Service Line Item ID | Transaction Type Operational ID | Is Reversal Transaction | Transaction Sequence Number | Diagnosis 1 ICD-10-CM Code | Diagnosis 2 ICD-10-CM Code | Diagnosis 3 ICD-10-CM Code |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DS | MRN|1 | 10 | 1 | 0 | 1 | 99123 | 80 | 1001 | 100101 | O | 0 | 1 | A01.0 | Z00.01 | |
DS | MRN|1 | 10 | 1 | 0 | 1 | 99123 | -80 | 1002 | 100102 | R | 1 | 2 | A01.0 | ||
DS | MRN|1 | 10 | 1 | 0 | 1 | 99123 | 100 | 1003 | 100103 | A | 0 | 3 | A01.0 | ||
DS | MRN|1 | 10 | 2 | 0 | 1 | 76543 | 50 | 1001 | 100201 | O | 0 | 1 | A01.0 | Z00.01 | |
DS | MRN|1 | 10 | 2 | 0 | 1 | 76543 | -50 | 1002 | 100202 | R | 1 | 2 | A01.0 | ||
DS | MRN|1 | 10 | 2 | 0 | 1 | 76543 | 50 | 1003 | 100203 | A | 0 | 3 | A01.0 | ||
DS | MRN|1 | 10 | 3 | 0 | 1 | 12345 | 0 | 1001 | 100301 | O | 0 | 1 | A01.0 | Z00.01 | |
DS | MRN|1 | 10 | 3 | 0 | 1 | 12345 | 0 | 1002 | 100302 | R | 1 | 2 | A01.0 | ||
DS | MRN|1 | 10 | 3 | 0 | 1 | 12345 | 0 | 1003 | 100303 | A | 0 | 3 | A01.0 | ||
DS | MRN|1 | 11 | 1 | 1 | 0 | 42586 | 34 | 1101 | 110101 | O | 0 | 1 | A01.09 | ||
DS | MRN|1 | 11 | 2 | 1 | 0 | 87654 | 50 | 1101 | 110201 | O | 0 | 1 | A01.09 | ||
DS | MRN|1 | 11 | 1 | 1 | 0 | 42586 | -34 | 1102 | 110102 | R | 1 | 2 | A01.09 | ||
DS | MRN|1 | 11 | 2 | 1 | 0 | 87654 | -50 | 1102 | 110202 | R | 1 | 2 | A01.09 | ||
DS | MRN|1 | 11 | 1 | 1 | 0 | 42586 | 34 | 1103 | 110103 | A | 0 | 3 | A01.09 |
Starting here, a reasonable sequence of Local Transform Layer objects is presented below. The objects are organized into "tranches" of objects that are independent of each other within a tranche but downstream of objects in earlier tranches.
Tranche 1.
- Source DS Final Action Medical Claim Service Line Items (grain size = one record per medical claim service line item in final action status).
Patient ID | Claim ID | Claim Service Line Item ID | Source ID | Source Local Patient ID | Source Local Claim ID | Service Line Number | Is Claim Class Institutional | Is Claim Class Professional | HCPCS Code | Claim Service Line Plan Paid Amount | Claim Plan Paid Amount | Is Final Transaction Reversal | Diagnosis 1 ICD-10-CM Code | Diagnosis 2 ICD-10-CM Code | Diagnosis 3 ICD-10-CM Code |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DS|MRN|1 | DS|10 | DS|10|1 | DS | MRN|1 | 10 | 1 | 0 | 1 | 99123 | 100 | 150 | 0 | A01.0 | Z00.01 | |
DS|MRN|1 | DS|10 | DS|10|2 | DS | MRN|1 | 10 | 2 | 0 | 1 | 76543 | 50 | 150 | 0 | A01.0 | ||
DS|MRN|1 | DS|10 | DS|10|3 | DS | MRN|1 | 10 | 3 | 0 | 1 | 12345 | 0 | 150 | 0 | A01.0 | ||
DS|MRN|1 | DS|11 | DS|11|1 | DS | MRN|1 | 11 | 1 | 1 | 0 | 42586 | 34 | 34 | 0 | A01.09 | ||
DS|MRN|1 | DS|11 | DS|11|2 | DS | MRN|1 | 11 | 2 | 1 | 0 | 87654 | 0 | 34 | 1 | A01.09 |
Even though the institutional and professional claims are destined for two different sets of Natural objects, keeping them merged together in this object for now allows us to implement the potentially complicated final action reconciliation logic just once, since, most likely, that reconciliation logic is the same for both institutional and professional claims. (NB: an assumption like this should always be confirmed in a real implementation.)
Similarly, we would also use this object to get some other typical Local Transform Layer activities out of the way, e.g., looking up the mastered Patient ID, generating other data model keys from the source local keys, summing final action line-level amounts to obtain the claim-level amount, etc. (The next task in this workflow goes into detail on these kinds of items.) Implementing logic at this stage saves us from needing to do it multiple times later in different downstream Local Transform objects.
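To make the Tranche 1 logic concrete, here is a minimal Python sketch of one possible reconciliation rule. It is an illustration under stated assumptions, not the logic Ursa Studio applies: it assumes that the final paid amount of a service line is the sum of its "Increase to Plan Paid Amount" values, and that a line's final action status is taken from its transaction with the highest sequence number. The function name, tuple layout, and field names are hypothetical; real source systems may need different rules.

```python
from collections import defaultdict

# Trimmed-down transaction rows from the example:
# (source local claim ID, service line number, increase to plan paid amount,
#  is reversal transaction, transaction sequence number)
transactions = [
    ("10", 1, 80, 0, 1), ("10", 1, -80, 1, 2), ("10", 1, 100, 0, 3),
    ("10", 2, 50, 0, 1), ("10", 2, -50, 1, 2), ("10", 2, 50, 0, 3),
    ("10", 3, 0, 0, 1), ("10", 3, 0, 1, 2), ("10", 3, 0, 0, 3),
    ("11", 1, 34, 0, 1), ("11", 2, 50, 0, 1),
    ("11", 1, -34, 1, 2), ("11", 2, -50, 1, 2),
    ("11", 1, 34, 0, 3),
]

def final_action_lines(rows, source_id="DS"):
    """Collapse transactions to one record per service line item."""
    by_line = defaultdict(list)
    for claim, line_no, amount, is_reversal, seq in rows:
        by_line[(claim, line_no)].append((seq, amount, is_reversal))

    lines = []
    claim_totals = defaultdict(int)
    for (claim, line_no), txns in sorted(by_line.items()):
        txns.sort()  # order by transaction sequence number
        line_paid = sum(amount for _, amount, _ in txns)
        claim_totals[claim] += line_paid
        lines.append({
            "source_local_claim_id": claim,
            "claim_id": f"{source_id}|{claim}",  # data model keys built from local keys
            "line_item_id": f"{source_id}|{claim}|{line_no}",
            "line_paid": line_paid,
            "is_final_reversal": txns[-1][2],  # status of the last transaction
        })
    # Attach the claim-level amount (sum of final line-level amounts).
    for line in lines:
        line["claim_paid"] = claim_totals[line["source_local_claim_id"]]
    return lines

for line in final_action_lines(transactions):
    print(line["line_item_id"], line["line_paid"], line["claim_paid"], line["is_final_reversal"])
```

Under these assumptions the sketch reproduces the paid amounts and reversal flags shown in the Tranche 1 example data, including the reversed second line of claim 11.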
Tranche 2.
- Source DS Institutional Claim Headers (grain size = one record per non-reversed final action institutional claim header originating in source DS)
Restricts to institutional claims; restricts to claims with at least one non-reversed service line item; collapses to header grain; publishes only header-level fields (removing, e.g., Service Line Number, HCPCS Code, etc.). Example data:
Patient ID | Claim ID | Source ID | Source Local Patient ID | Source Local Claim ID | Claim Plan Paid Amount | Diagnosis 1 ICD-10-CM Code | Diagnosis 2 ICD-10-CM Code | Diagnosis 3 ICD-10-CM Code |
---|---|---|---|---|---|---|---|---|
DS|MRN|1 | DS|11 | DS | MRN|1 | 11 | 34 | A01.09 |
- Source DS Institutional Claim Service Line Items (grain size = one record per non-reversed final action institutional claim service line item originating in source DS)
Restricts to institutional claims; restricts to non-reversed service line items (note the record with HCPCS code 87654 was removed due to a final action status of reversed); publishes only service line item-level fields (removing, e.g., Claim Plan Paid Amount). Example data:
Patient ID | Claim ID | Claim Service Line Item ID | Source ID | Source Local Patient ID | Source Local Claim ID | Service Line Number | HCPCS Code | Claim Service Line Plan Paid Amount |
---|---|---|---|---|---|---|---|---|
DS|MRN|1 | DS|11 | DS|11|1 | DS | MRN|1 | 11 | 1 | 42586 | 34 |
- Source DS Professional Claim Service Line Items (grain size = one record per non-reversed final action professional claim service line item originating in source DS)
Restricts to professional claims; restricts to non-reversed service line items. Example data:
Patient ID | Claim ID | Claim Service Line Item ID | Source ID | Source Local Patient ID | Source Local Claim ID | Service Line Number | HCPCS Code | Claim Service Line Plan Paid Amount | Claim Plan Paid Amount | Diagnosis 1 ICD-10-CM Code | Diagnosis 2 ICD-10-CM Code | Diagnosis 3 ICD-10-CM Code |
---|---|---|---|---|---|---|---|---|---|---|---|---|
DS|MRN|1 | DS|10 | DS|10|1 | DS | MRN|1 | 10 | 1 | 99123 | 100 | 150 | A01.0 | Z00.01 | |
DS|MRN|1 | DS|10 | DS|10|2 | DS | MRN|1 | 10 | 2 | 76543 | 50 | 150 | A01.0 | ||
DS|MRN|1 | DS|10 | DS|10|3 | DS | MRN|1 | 10 | 3 | 12345 | 0 | 150 | A01.0 |
The institutional and professional claims have now been separated. The three objects above are at the correct grain size to be loaded into the Institutional Claim Headers, Institutional Claim Service Line Items, and Professional Claim Service Line Items Natural objects, respectively.
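The Tranche 2 restrictions amount to a handful of filters and projections over the Tranche 1 object. A rough Python sketch follows; the field names and helper functions are hypothetical, and the input is a trimmed-down version of the Tranche 1 example data.

```python
# Hypothetical, trimmed-down Tranche 1 records (field names are illustrative).
FIELDS = ("claim_id", "line_number", "is_institutional", "hcpcs",
          "line_paid", "claim_paid", "is_final_reversal")
tranche_1 = [dict(zip(FIELDS, row)) for row in [
    ("DS|10", 1, 0, "99123", 100, 150, 0),
    ("DS|10", 2, 0, "76543", 50, 150, 0),
    ("DS|10", 3, 0, "12345", 0, 150, 0),
    ("DS|11", 1, 1, "42586", 34, 34, 0),
    ("DS|11", 2, 1, "87654", 0, 34, 1),
]]

def surviving_lines(lines, institutional):
    """Keep non-reversed lines of the requested claim class."""
    return [l for l in lines
            if bool(l["is_institutional"]) == institutional
            and not l["is_final_reversal"]]

def institutional_headers(lines):
    """Collapse to one record per claim with at least one surviving line."""
    headers = {}
    for l in surviving_lines(lines, institutional=True):
        headers.setdefault(l["claim_id"],
                           {"claim_id": l["claim_id"], "claim_paid": l["claim_paid"]})
    return list(headers.values())

def institutional_line_items(lines):
    """Publish only line-level fields for surviving institutional lines."""
    keep = ("claim_id", "line_number", "hcpcs", "line_paid")
    return [{k: l[k] for k in keep} for l in surviving_lines(lines, institutional=True)]

professional_line_items = surviving_lines(tranche_1, institutional=False)

print(institutional_headers(tranche_1))
print(institutional_line_items(tranche_1))
print([l["hcpcs"] for l in professional_line_items])
```

Because the reversal and claim-class logic lives in one shared helper, the three outputs cannot drift apart, which is the same motivation given above for doing the reconciliation once in Tranche 1.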
Tranche 3.
- Source DS Institutional Claim ICD Discharge Diagnoses (grain size = one record per ICD diagnosis associated with a non-reversed institutional claim originating in source DS)
- Source DS Professional Claim ICD Diagnoses (grain size = one record per ICD diagnosis associated with a non-reversed professional claim service line item originating in source DS)
These objects take the wide-form ICD-10-CM diagnosis codes from the surviving tranche 2 objects and reshape them into long-form data. (The benefit of building these objects from tranche 2 objects rather than the original tranche 1 object is that the logic identifying non-reversed claims and claim service line items does not need to be repeated in tranche 3.) These objects are now at the correct grain to be loaded into the Institutional Claim ICD Discharge Diagnoses and Professional Claim ICD Diagnoses Natural objects, respectively.
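The wide-to-long reshaping described above can be sketched in a few lines of Python. This is an illustration only: the function name, the field names (`diagnosis_1` through `diagnosis_3`, `diagnosis_position`, `icd10cm_code`), and the sample line item identifiers are hypothetical.

```python
def diagnoses_long(rows, id_field, n_positions=3):
    """Emit one record per populated diagnosis position in each wide row."""
    long_rows = []
    for row in rows:
        for position in range(1, n_positions + 1):
            code = row.get(f"diagnosis_{position}")
            if code:  # skip empty diagnosis slots
                long_rows.append({
                    id_field: row[id_field],
                    "diagnosis_position": position,
                    "icd10cm_code": code,
                })
    return long_rows

# Illustrative wide-form professional service line items.
professional_lines = [
    {"line_item_id": "DS|10|1", "diagnosis_1": "A01.0", "diagnosis_2": "Z00.01", "diagnosis_3": None},
    {"line_item_id": "DS|10|2", "diagnosis_1": "A01.0", "diagnosis_2": None, "diagnosis_3": None},
]

for record in diagnoses_long(professional_lines, id_field="line_item_id"):
    print(record)
```

The same helper could be pointed at the institutional claim headers by passing a claim-level identifier as `id_field`, since only the identifying field differs between the two tranche 3 objects.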
Example 2: Monthly plan enrollment periods
Consider a source data file from a payor providing periods of enrollment for each member. Following semantic mapping, the data might look like the following:
Source ID | Source Local Patient ID | Source Local Payor ID | Source Local Plan ID | Period Start Date | Period End Date | Member Policy Number | Is Plan Coverage Type Medical | Is Plan Coverage Type Pharmacy |
---|---|---|---|---|---|---|---|---|
DS | 1 | PC | GOLD4000 | 1/1/2021 | 6/3/2021 | A123456 | 1 | 0 |
DS | 1 | PC | GOLD4000 | 6/1/2021 | 1/1/2022 | A123456 | 1 | 0 |
DS | 1 | PC | GOLD4000 | 1/1/2022 | 1/1/2023 | A123456 | 1 | 0 |
DS | 1 | PC | RX15 | 1/1/2022 | 1/1/2023 | A234567 | 0 | 1 |
Note the following:
- There is an overlapping period of enrollment in the same plan for this patient on 6/1/2021 and 6/2/2021. (Following the convention that period end dates are exclusive, the period end date of 6/3/2021 means the last day of enrollment indicated by the first record is the full calendar day 6/2/2021, so 6/3/2021 is included only in the period associated with the second record.) This violates the non-overlapping requirement of the terminal Natural object, Patient-Plan Timelines of Plan Membership.
- The last record above (with Source Local Plan ID = "RX15") completely overlaps the second-to-last record (with Source Local Plan ID = "GOLD4000") over the period 1/1/2022 through 1/1/2023; however, because these records are for different plans, there is no violation of the non-overlapping requirement of Patient-Plan Timelines of Plan Membership.
Tranche 1.
- Source DS Patient-Plan Timelines of Membership (grain size = one record per non-overlapping period of patient-plan enrollment originating in source DS)
Running these data through a Simple Timeline object ensures that any overlapping periods are shortened so that no overlaps remain. (As usual, this is also the stage at which data model key values such as Patient ID and Payor ID are generated.) Example data:
Patient ID | Payor ID | Plan ID | Source ID | Source Local Patient ID | Source Local Payor ID | Source Local Plan ID | Period Start Date | Period End Date | Member Policy Number | Is Plan Coverage Type Medical | Is Plan Coverage Type Pharmacy |
---|---|---|---|---|---|---|---|---|---|---|---|
DS|1 | DS|PC | DS|GOLD4000 | DS | 1 | PC | GOLD4000 | 1/1/2021 | 6/1/2021 | A123456 | 1 | 0 |
DS|1 | DS|PC | DS|GOLD4000 | DS | 1 | PC | GOLD4000 | 6/1/2021 | 1/1/2022 | A123456 | 1 | 0 |
DS|1 | DS|PC | DS|GOLD4000 | DS | 1 | PC | GOLD4000 | 1/1/2022 | 1/1/2023 | A123456 | 1 | 0 |
DS|1 | DS|PC | DS|RX15 | DS | 1 | PC | RX15 | 1/1/2022 | 1/1/2023 | A234567 | 0 | 1 |
This object is now correctly structured to be loaded into the Patient-Plan Timelines of Plan Membership object.
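One simple way to remove overlaps, consistent with the example data above, is to shorten each period so that it ends no later than the start of the next period for the same patient-plan pair. The Python sketch below illustrates that rule only; it is not a description of how the Simple Timeline object is implemented, and the function name and tuple layout are assumptions.

```python
from datetime import date

def trim_overlaps(periods):
    """Shorten each period to end no later than the next period's start.

    `periods` is a list of hypothetical (key, start, end) tuples with
    exclusive end dates, where `key` identifies the patient-plan pair.
    A period left empty after trimming is dropped. Note that this simple
    rule discards the tail of a period that fully contains a later one.
    """
    ordered = sorted(periods)
    result = []
    for i, (key, start, end) in enumerate(ordered):
        if i + 1 < len(ordered) and ordered[i + 1][0] == key:
            end = min(end, ordered[i + 1][1])
        if start < end:
            result.append((key, start, end))
    return result

enrollment = [
    (("1", "GOLD4000"), date(2021, 1, 1), date(2021, 6, 3)),
    (("1", "GOLD4000"), date(2021, 6, 1), date(2022, 1, 1)),
    (("1", "GOLD4000"), date(2022, 1, 1), date(2023, 1, 1)),
    (("1", "RX15"), date(2022, 1, 1), date(2023, 1, 1)),
]

for key, start, end in trim_overlaps(enrollment):
    print(key, start, end)
```

With this input, only the first GOLD4000 period changes (its end date moves from 6/3/2021 to 6/1/2021); the RX15 period is untouched because it belongs to a different patient-plan pair.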
Tranche 2.
- Source DS Patient-Source Timelines of Data Coverage (grain size = one record per non-overlapping period of patient-source data coverage in source DS)
Now that the plan enrollment period data have been cleaned and any ambiguities related to overlapping periods resolved, we can use that Local Transform object to generate the slightly different patient-source timeline records describing the "data coverage" of source DS, as expected by the Patient-Source Timelines of Data Coverage Natural object.
The most straightforward way to achieve the desired timeline is to use a Complex Timeline object with the patient-plan periods of Source DS Patient-Plan Timelines of Membership treated as Potentially Overlapping Interval components. (Even though those records were just rendered non-overlapping for a given patient-plan pair, they may be overlapping for a patient-source pair, as they are in this example between 1/1/2022 and 1/1/2023, when the patient has both medical and pharmacy benefit plans active; therefore treating them as potentially overlapping when constructing the Complex Timeline object is appropriate.) Example data:
Patient ID | Source ID | Source Local Patient ID | Period Start Date | Period End Date | Is Institutional Claim Data Coverage | Is Institutional Claim Paid Amount Data Coverage | Is Institutional Claim Service Line Item Paid Amount Data Coverage | Is Professional Claim Data Coverage | Is Professional Claim Paid Amount Data Coverage | Is Professional Claim Service Line Item Paid Amount Data Coverage | Is Pharmacy Claim Data Coverage | Is Pharmacy Claim Paid Amount Data Coverage |
---|---|---|---|---|---|---|---|---|---|---|---|---|
DS|1 | DS | 1 | 1/1/2021 | 6/1/2021 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
DS|1 | DS | 1 | 6/1/2021 | 1/1/2022 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
DS|1 | DS | 1 | 1/1/2022 | 1/1/2023 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Again, the benefit of sequencing these two objects one after the other -- instead of building both directly off the relevant semantic mapping object -- is to save the database processing and additional maintenance burden of generating the same data model key values (e.g., Patient ID) in multiple objects, and to allow the second object, Source DS Patient-Source Timelines of Data Coverage, to benefit from the cleaning performed by the first object, Source DS Patient-Plan Timelines of Membership.
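The idea behind combining potentially overlapping intervals into a single patient-source timeline can be sketched as a sweep over period boundaries. This Python example is a simplified illustration, not the Complex Timeline implementation: the function name is hypothetical, and the many data coverage fields are reduced to two assumed flags, `medical` and `pharmacy`.

```python
from datetime import date

def merge_coverage(intervals):
    """Build non-overlapping segments from potentially overlapping intervals.

    `intervals` holds (start, end, flags) tuples for one patient-source
    pair, with exclusive end dates. Every distinct start or end date is a
    segment boundary; within each segment a flag is 1 if any interval
    covering that segment sets it. Adjacent segments with identical
    flags are left as separate records.
    """
    boundaries = sorted({d for start, end, _ in intervals for d in (start, end)})
    segments = []
    for seg_start, seg_end in zip(boundaries, boundaries[1:]):
        active = [flags for start, end, flags in intervals
                  if start <= seg_start and seg_end <= end]
        if not active:
            continue  # gap in coverage: emit no record
        names = sorted({name for flags in active for name in flags})
        combined = {name: max(flags.get(name, 0) for flags in active) for name in names}
        segments.append((seg_start, seg_end, combined))
    return segments

plan_periods = [
    (date(2021, 1, 1), date(2021, 6, 1), {"medical": 1, "pharmacy": 0}),
    (date(2021, 6, 1), date(2022, 1, 1), {"medical": 1, "pharmacy": 0}),
    (date(2022, 1, 1), date(2023, 1, 1), {"medical": 1, "pharmacy": 0}),
    (date(2022, 1, 1), date(2023, 1, 1), {"medical": 0, "pharmacy": 1}),
]

for start, end, flags in merge_coverage(plan_periods):
    print(start, end, flags)
```

For this input the sweep yields three segments, with both flags set only for the 2022 period in which the medical and pharmacy plans are active at the same time.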