Legacy-to-FHIR Mapping
Purpose
Legacy-to-FHIR mapping is the disciplined process of translating a source system’s data model (tables, messages, documents, proprietary APIs) into FHIR resources while preserving meaning, provenance, and queryability.
Good mapping is more than “field-to-field transformation”:
- Structural: selecting the right resource types and elements.
- Semantic: mapping codes, units, statuses, and clinical intent.
- Identity: deciding what constitutes “the same” patient/encounter/observation across systems.
- Operational: defining validation, error handling, and versioned documentation.
This page focuses on practical mapping patterns you’ll encounter in ETL pipelines, integration engines, and FHIR server ingestion.
One-to-one mapping
One-to-one mapping is the simplest case: one source record maps to one FHIR resource of a single type.
Typical examples:
- Patient demographics row → Patient
- Provider directory row → Practitioner/Organization
- Lab result row → Observation
- Medication order row → MedicationRequest
What to decide up front
| Decision | Why it matters |
|---|---|
| Target resource type | Prevents “almost right” resources that fail downstream queries |
| Stable identifiers | Determines idempotency and de-duplication behavior |
| Element-level mapping | Establishes which fields are authoritative vs derived |
| Terminology and units | Enables analytics and consistent downstream display |
| Missing/unknown data behavior | Avoids inconsistent null vs sentinel values |
Simple mapping table format
This table format is usually enough to make mapping testable:
| Source field | Target FHIR path | Transform | Notes |
|---|---|---|---|
| patient.mrn | Patient.identifier[MRN].value | trim | Slice identifier by system |
| patient.dob | Patient.birthDate | parse date | Use _birthDate extension only when needed |
| lab.result_value | Observation.valueQuantity.value | parse decimal | Convert locale formats |
| lab.unit | Observation.valueQuantity.unit | normalize | Prefer UCUM and/or code when possible |
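The rows above can be turned into small, testable functions. The following is a minimal sketch, not a reference implementation: the identifier system URL, the source date format (MM/DD/YYYY), the comma decimal separator, and the assumption that source units are already valid UCUM codes are all illustrative.

```python
from datetime import datetime
from decimal import Decimal

MRN_SYSTEM = "http://example.org/fhir/mrn"  # hypothetical identifier system
UCUM = "http://unitsofmeasure.org"


def map_patient(row: dict) -> dict:
    """Map one demographics row to one Patient resource."""
    # Assumption: the source stores dates as MM/DD/YYYY.
    birth_date = datetime.strptime(row["dob"], "%m/%d/%Y").date().isoformat()
    return {
        "resourceType": "Patient",
        "identifier": [{"system": MRN_SYSTEM, "value": row["mrn"].strip()}],
        "birthDate": birth_date,
    }


def map_quantity(row: dict) -> dict:
    """Map a lab value and unit to an Observation.valueQuantity fragment."""
    # Assumption: comma decimal separator, no thousands separators.
    value = Decimal(row["result_value"].strip().replace(",", "."))
    unit = row["unit"].strip()
    # Assumption: the source unit string is already a valid UCUM code.
    return {"value": float(value), "unit": unit, "system": UCUM, "code": unit}


patient = map_patient({"mrn": " 12345 ", "dob": "03/09/1980"})
quantity = map_quantity({"result_value": "13,5", "unit": "g/dL"})
print(patient["birthDate"], quantity["value"], quantity["code"])
```

Keeping one function per target element group makes each row of the mapping table directly unit-testable.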
One-to-many mapping
One-to-many mapping is when one source record produces multiple resources (often bundled together) because FHIR is more normalized than the source representation.
Common reasons:
- The source record contains a “header + line items” structure.
- The source record mixes concepts that FHIR separates into different resources.
- You need separate resources for references to work (e.g., Patient, Encounter, Observation).
Typical patterns
| Source shape | Output shape | Example |
|---|---|---|
| Encounter row with embedded diagnoses + procedures | Encounter + multiple Condition + multiple Procedure | Claims/EHR encounter exports |
| Lab panel row with multiple analytes | One Observation panel + many Observation components or separate observations | Lab systems |
| Medication order row with dispense history | MedicationRequest + MedicationDispense (+ optionally MedicationAdministration) | Pharmacy exports |
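As an illustration of the first pattern, the sketch below splits one encounter row into an Encounter and one Condition per diagnosis. The column names (encounter_id, patient_id, dx_codes), the pipe-delimited diagnosis list, the id scheme, and the use of ICD-10-CM are assumptions. It also omits elements a real profile would require (for example Encounter.class in R4).

```python
ICD10CM = "http://hl7.org/fhir/sid/icd-10-cm"


def split_encounter(row: dict) -> list:
    """Turn one encounter row into an Encounter plus one Condition per diagnosis."""
    # Assumption: patient_id is already a usable Patient logical id.
    patient_ref = {"reference": f"Patient/{row['patient_id']}"}
    encounter_id = f"enc-{row['encounter_id']}"  # hypothetical id scheme
    encounter = {
        "resourceType": "Encounter",
        "id": encounter_id,
        "status": "finished",  # assumption: R4 code; export holds closed encounters only
        "subject": patient_ref,
    }
    # Assumption: diagnoses arrive as a pipe-delimited string of ICD-10-CM codes.
    codes = [c.strip() for c in row["dx_codes"].split("|") if c.strip()]
    conditions = []
    for position, code in enumerate(codes, start=1):
        conditions.append({
            "resourceType": "Condition",
            "id": f"{encounter_id}-dx-{position}",
            "subject": patient_ref,
            "encounter": {"reference": f"Encounter/{encounter_id}"},
            "code": {"coding": [{"system": ICD10CM, "code": code}]},
        })
    return [encounter] + conditions


resources = split_encounter(
    {"encounter_id": "77", "patient_id": "p1", "dx_codes": "E11.9| I10 |"}
)
print([r["resourceType"] for r in resources])
```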
Bundle strategy (practical note)
When you generate multiple resources that reference each other, decide whether you:
- Write resources individually (requires careful ordering and retries), or
- Submit a transaction Bundle (atomic create/update semantics in many servers).
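A transaction Bundle for the second option might look like the sketch below. It uses a conditional update for the Patient, a conditional create for the Observation, and a urn:uuid fullUrl so the Observation can reference the Patient before the server assigns an id. The identifier systems and the function names are hypothetical, and server support for conditional operations should be confirmed before relying on this shape.

```python
import uuid

MRN_SYSTEM = "http://example.org/fhir/mrn"           # hypothetical
RESULT_SYSTEM = "http://example.org/fhir/result-id"  # hypothetical


def stable_urn(system: str, value: str) -> str:
    """Derive the same urn:uuid for the same identifier on every run."""
    key = f"{system}|{value}"
    return f"urn:uuid:{uuid.uuid5(uuid.NAMESPACE_URL, key)}"


def build_transaction(mrn: str, result_id: str) -> dict:
    patient_url = stable_urn(MRN_SYSTEM, mrn)
    patient = {
        "resourceType": "Patient",
        "identifier": [{"system": MRN_SYSTEM, "value": mrn}],
    }
    observation = {
        "resourceType": "Observation",
        "status": "final",
        "identifier": [{"system": RESULT_SYSTEM, "value": result_id}],
        "code": {"text": "Hemoglobin"},
        # Points at the Patient entry's fullUrl; the server rewrites it on commit.
        "subject": {"reference": patient_url},
    }
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [
            {
                "fullUrl": patient_url,
                "resource": patient,
                # Conditional update: match on identifier instead of a server id.
                "request": {"method": "PUT", "url": f"Patient?identifier={MRN_SYSTEM}|{mrn}"},
            },
            {
                "fullUrl": stable_urn(RESULT_SYSTEM, result_id),
                "resource": observation,
                # Conditional create: skipped if the result was already loaded.
                "request": {
                    "method": "POST",
                    "url": "Observation",
                    "ifNoneExist": f"identifier={RESULT_SYSTEM}|{result_id}",
                },
            },
        ],
    }


bundle = build_transaction("12345", "R-001")
```

Because the urn:uuid values are derived from identifiers rather than generated randomly, replaying the same source record yields a byte-identical Bundle, which simplifies retries.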
Many-to-one mapping
Many-to-one mapping is when multiple source records contribute to a single FHIR resource, often because:
- The source system is highly normalized (many tables per concept).
- The FHIR resource is a consolidated “view” used for interoperability.
Examples:
- Patient demographics table + contact table + identifier table → one Patient
- Organization table + location table + department table → one Organization + related resources
Key risks
| Risk | What it looks like | Mitigation |
|---|---|---|
| Partial updates | Some contributing rows arrive later | Implement upsert strategy + “completeness” flags |
| Race conditions | Concurrent writers produce inconsistent state | Idempotent writes; serialize per entity key |
| Conflicting sources | Two systems disagree on the “truth” | Source of truth rules; provenance; tie-breakers |
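One way to handle partial updates is to group contributing rows by entity key and report completeness alongside the assembled resource. The sketch below assumes three hypothetical source tables (demographics, contact, identifier) that share a patient_key column, and a "last row wins" rule for demographics. Real tie-breaking rules belong in the mapping contract.

```python
from collections import defaultdict

MRN_SYSTEM = "http://example.org/fhir/mrn"  # hypothetical identifier system


def group_rows(rows):
    """Group (table, row) pairs by the shared patient_key column."""
    grouped = defaultdict(lambda: {"demographics": [], "contact": [], "identifier": []})
    for table, row in rows:
        grouped[row["patient_key"]][table].append(row)
    return grouped


def assemble_patient(parts):
    """Return (patient, complete); patient is None until demographics arrive."""
    if not parts["demographics"]:
        return None, False
    demo = parts["demographics"][-1]  # assumption: the last row seen wins
    patient = {
        "resourceType": "Patient",
        "name": [{"family": demo["last_name"], "given": [demo["first_name"]]}],
    }
    # FHIR JSON does not allow empty arrays, so add lists only when populated.
    if parts["identifier"]:
        patient["identifier"] = [
            {"system": MRN_SYSTEM, "value": r["mrn"]} for r in parts["identifier"]
        ]
    if parts["contact"]:
        patient["telecom"] = [
            {"system": "phone", "value": r["phone"]} for r in parts["contact"]
        ]
    complete = bool(parts["identifier"]) and bool(parts["contact"])
    return patient, complete


rows = [
    ("demographics", {"patient_key": 1, "last_name": "Doe", "first_name": "Jane"}),
    ("identifier", {"patient_key": 1, "mrn": "12345"}),
    ("contact", {"patient_key": 2, "phone": "555-0100"}),
]
grouped = group_rows(rows)
patient_1, complete_1 = assemble_patient(grouped[1])
patient_2, complete_2 = assemble_patient(grouped[2])
```

Here patient 1 is assembled but flagged incomplete (no contact row yet), and patient 2 is held back until its demographics row arrives.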
Normalization tradeoffs
FHIR’s model is normalized in ways that improve reuse and interoperability, but normalization comes with costs during mapping:
- More resources and references to manage
- More joins during query and analytics
- More complex “merge” behavior when sources differ
Practical guideline
Prefer normalization when it improves interoperability and queryability (e.g., separate Patient and Observation). Consider controlled denormalization only when it is part of a clear downstream contract (e.g., a flattened analytics view), and document it explicitly.
Roundtripping and information loss
Roundtripping means: source → FHIR → source (or source → FHIR → another model). Information loss is the default unless you explicitly design against it.
Where information is often lost
- Source-specific flags that have no FHIR element equivalent
- Free-text nuances that are not captured as codes
- Provenance (who asserted the data and when)
- Inconsistent timestamps (event time vs recorded time)
Strategies to reduce loss
- Preserve stable source identifiers in identifier[] and/or meta.source where appropriate.
- Use Provenance (and/or provenance-related elements) for audit-critical flows.
- When you must carry a source-only concept, prefer a published extension over “stuffing it into note”.
- Document all intentional drops: “not mapped because …” is part of the contract.
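A roundtrip test is a cheap way to detect unintended loss. The sketch below maps a hypothetical lab row to an Observation and back, keeping the source key in identifier[], the originating system in meta.source, and a source-only flag in an extension. The system URL, the meta.source value, the extension URL, and the row fields are placeholders. In practice the extension would be defined and published in your implementation guide.

```python
SOURCE_ID_SYSTEM = "http://example.org/fhir/legacy-lab/result-id"  # hypothetical
REVIEW_FLAG_URL = "http://example.org/fhir/StructureDefinition/legacy-review-flag"  # hypothetical


def to_fhir(row: dict) -> dict:
    return {
        "resourceType": "Observation",
        "status": "final",  # assumption: only final results are exported
        "meta": {"source": "urn:example:legacy-lab"},  # hypothetical source URI
        "identifier": [{"system": SOURCE_ID_SYSTEM, "value": row["result_id"]}],
        # Source-only concept carried as an extension instead of free text.
        "extension": [{"url": REVIEW_FLAG_URL, "valueBoolean": row["needs_review"]}],
        "code": {"text": row["test_name"]},
        "valueString": row["result_text"],
    }


def from_fhir(obs: dict) -> dict:
    source_id = next(
        i["value"] for i in obs["identifier"] if i["system"] == SOURCE_ID_SYSTEM
    )
    flag = next(
        e["valueBoolean"] for e in obs.get("extension", []) if e["url"] == REVIEW_FLAG_URL
    )
    return {
        "result_id": source_id,
        "needs_review": flag,
        "test_name": obs["code"]["text"],
        "result_text": obs["valueString"],
    }


row = {"result_id": "R-001", "needs_review": True, "test_name": "Hemoglobin", "result_text": "13.5"}
roundtripped = from_fhir(to_fhir(row))
```

Comparing row with roundtripped in a test suite turns "no information loss for these fields" into an enforced property rather than an intention.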
Terminology and semantics
Most interoperability failures are semantic. Mapping should include an explicit terminology plan:
| Problem | Example | What to do |
|---|---|---|
| Local codes | LAB123 means “Hemoglobin” | Map to LOINC (or document why you can’t) |
| Status mismatch | source uses C / F / X | Map to FHIR status codes; document defaulting |
| Unit mismatch | mg/dL vs mmol/L | Convert and preserve original when needed |
| Ambiguous text | “neg” / “positive” | Standardize into coded interpretations |
If you can’t standardize all codes, at least make it explicit which dimensions are “clean” vs “best effort”.
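A terminology plan can be made executable with lookup tables that keep the original local code and report whether standardization succeeded. In the sketch below, the local code system URL, the choice of LOINC 718-7 for LAB123, and the meanings assigned to C / F / X are illustrative assumptions. Unit conversion is left out because conversion factors are analyte-specific.

```python
LOINC = "http://loinc.org"
LOCAL_SYSTEM = "http://example.org/fhir/local-lab-codes"  # hypothetical

# Assumption: LAB123 is a blood hemoglobin mass concentration test.
LOCAL_TO_LOINC = {"LAB123": ("718-7", "Hemoglobin [Mass/volume] in Blood")}

# Assumption: C = corrected, F = final, X = cancelled in this source system.
STATUS_MAP = {"C": "corrected", "F": "final", "X": "cancelled"}


def map_code(local_code: str):
    """Return (CodeableConcept, standardized) and always keep the local coding."""
    codings = [{"system": LOCAL_SYSTEM, "code": local_code}]
    standard = LOCAL_TO_LOINC.get(local_code)
    if standard:
        loinc_code, display = standard
        # Put the standard coding first, keep the local one for traceability.
        codings.insert(0, {"system": LOINC, "code": loinc_code, "display": display})
    return {"coding": codings}, standard is not None


def map_status(source_status: str) -> str:
    """Map a source status flag; unmapped flags default to 'unknown'."""
    return STATUS_MAP.get(source_status.strip().upper(), "unknown")


clean_concept, clean = map_code("LAB123")
fallback_concept, fallback_clean = map_code("LAB999")
```

The boolean returned by map_code is one way to track which records are "clean" and which are "best effort" for a given dimension.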
Identity and referencing
Identity decisions determine whether your pipeline is stable or chaotic:
- What key uniquely identifies a patient/encounter/order/result in the source?
- How do you prevent duplicates when ingesting incremental files or replaying events?
- How do you link records across multiple source systems?
Recommended practices
| Practice | Why |
|---|---|
| Use stable identifier.system + identifier.value pairs | Enables deterministic matching across loads |
| Treat id as an implementation detail unless agreed | Resource id often changes across environments |
| Create references from stable identifiers when possible | Avoid brittle “lookups by display name” |
| Plan for merges | Patient identity matching is never perfect |
If you’re building FHIR payloads, you’ll frequently map “foreign keys” into references (e.g., Observation.subject pointing to the right Patient). Getting this wrong breaks query joins and makes data look correct in isolation but useless in context.
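The foreign-key-to-reference step can be sketched as follows. The function always emits a logical reference (identifier system + value) and adds a literal reference only when a server id is already known. The identifier system and the id_index lookup (MRN to server-assigned id, built from earlier loads) are hypothetical. Whether downstream consumers accept logical-only references is something to confirm per server and profile.

```python
MRN_SYSTEM = "http://example.org/fhir/mrn"  # hypothetical identifier system


def patient_reference(mrn: str, id_index: dict) -> dict:
    """Build Observation.subject from a source foreign key (the patient's MRN)."""
    mrn = mrn.strip()
    if not mrn:
        # Fail loudly: an Observation without a subject is useless in context.
        raise ValueError("missing patient MRN; cannot build subject reference")
    reference = {
        "type": "Patient",
        "identifier": {"system": MRN_SYSTEM, "value": mrn},
    }
    server_id = id_index.get(mrn)  # hypothetical lookup from earlier loads
    if server_id:
        reference["reference"] = f"Patient/{server_id}"
    return reference


id_index = {"12345": "a1b2"}
resolved = patient_reference(" 12345 ", id_index)
logical_only = patient_reference("99999", id_index)
```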
Validation
Validation should be part of the mapping contract, not a late-stage activity:
- Structural validation: the resource is valid FHIR JSON.
- Profile validation: required elements and slicing rules (if you claim conformance).
- Terminology validation: required bindings, code systems, and units.
- Referential integrity: references point to resolvable resources (or accepted logical references).
Make validation failures actionable:
- Emit clear error reports (ideally including source record keys).
- Decide retry behavior and quarantine behavior.
- Track “known exceptions” separately from new regressions.
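The reporting and triage steps above can be sketched as a small routine that attaches source keys to every issue and separates known exceptions from new ones. The hand-written checks here are placeholders for illustration and do not replace a real FHIR validator. The rule names, source key format, and the KNOWN_EXCEPTIONS set are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    source_key: str
    rule: str
    detail: str


# Hypothetical allow-list of (source key, rule) pairs already under review.
KNOWN_EXCEPTIONS = {("lab:42", "missing-subject")}


def check_observation(source_key: str, obs: dict) -> list:
    """Run a few illustrative checks and return issues tagged with the source key."""
    issues = []
    if not obs.get("status"):
        issues.append(Issue(source_key, "missing-status", "Observation.status is required"))
    if not obs.get("code"):
        issues.append(Issue(source_key, "missing-code", "Observation.code is required"))
    subject = obs.get("subject") or {}
    if not (subject.get("reference") or subject.get("identifier")):
        issues.append(Issue(source_key, "missing-subject", "no literal or logical subject"))
    return issues


def triage(records):
    """Split records into accepted and quarantined; collect issues not yet known."""
    accepted, quarantined, new_issues = [], [], []
    for source_key, obs in records:
        issues = check_observation(source_key, obs)
        (quarantined if issues else accepted).append(source_key)
        new_issues.extend(
            i for i in issues if (i.source_key, i.rule) not in KNOWN_EXCEPTIONS
        )
    return accepted, quarantined, new_issues


records = [
    ("lab:1", {"status": "final", "code": {"text": "Hgb"}, "subject": {"reference": "Patient/1"}}),
    ("lab:42", {"status": "final", "code": {"text": "Hgb"}}),
    ("lab:43", {"code": {"text": "Hgb"}, "subject": {"reference": "Patient/1"}}),
]
accepted, quarantined, new_issues = triage(records)
```

In this run, lab:42 is quarantined but produces no new issue because it is a known exception, while lab:43 surfaces as a new regression.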
Documentation checklist
Use this checklist to keep mapping work maintainable over time:
- Mapping scope and target IG/profile(s) (if applicable)
- Source inventory (tables/messages, owners, refresh cadence)
- Entity keys and idempotency strategy
- Field-level mapping tables (source → FHIR path → transform → notes)
- Terminology decisions (code systems, value sets, unit conversion)
- Defaults and null handling (including “unknown” semantics)
- Reference rules (how IDs map to FHIR references)
- Validation rules and test cases (including edge cases)
- Error handling and reporting format
- Versioning and change log (when mappings change, what breaks)