Legacy-to-FHIR Mapping

Purpose

Legacy-to-FHIR mapping is the disciplined process of translating a source system’s data model (tables, messages, documents, proprietary APIs) into FHIR resources while preserving meaning, provenance, and queryability.

Good mapping is more than “field-to-field transformation”. It covers four dimensions:

Structural: selecting the right resource types and elements.
Semantic: mapping codes, units, statuses, and clinical intent.
Identity: deciding what constitutes “the same” patient/encounter/observation across systems.
Operational: defining validation, error handling, and versioned documentation.

This page focuses on practical mapping patterns you’ll encounter in ETL pipelines, integration engines, and FHIR server ingestion.

One-to-one mapping

One-to-one mapping is the simplest case: one source record maps to one FHIR resource of a single type.

Typical examples:

Patient demographics row → Patient
Provider directory row → Practitioner / Organization
Lab result row → Observation
Medication order row → MedicationRequest

What to decide up front

Decision	Why it matters
Target resource type	Prevents “almost right” resources that fail downstream queries
Stable identifiers	Determines idempotency and de-duplication behavior
Element-level mapping	Establishes which fields are authoritative vs derived
Terminology and units	Enables analytics and consistent downstream display
Missing/unknown data behavior	Avoids inconsistent null vs sentinel values

Simple mapping table format

This table format is usually enough to make mapping testable:

Source field	Target FHIR path	Transform	Notes
`patient.mrn`	`Patient.identifier[MRN].value`	trim	Slice identifier by `system`
`patient.dob`	`Patient.birthDate`	parse date	Use `_birthDate` extension only when needed
`lab.result_value`	`Observation.valueQuantity.value`	parse decimal	Convert locale formats
`lab.unit`	`Observation.valueQuantity.unit`	normalize	Prefer UCUM; also populate `system` and `code` when possible
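To show how rows of such a table turn into testable code, here is a minimal Python sketch. The identifier system URI, the source date format (MM/DD/YYYY), the decimal-comma locale, and the unit spellings are all assumptions made for illustration; substitute whatever your source and profiles actually define.

```python
from datetime import datetime

# Hypothetical identifier system for the MRN slice; use the URI your profile defines.
MRN_SYSTEM = "http://example.org/fhir/identifier/mrn"
UCUM_SYSTEM = "http://unitsofmeasure.org"

# Assumed spellings seen in the source, mapped to UCUM codes.
UNIT_TO_UCUM = {"mg/dl": "mg/dL", "g/dl": "g/dL", "mmol/l": "mmol/L"}


def map_patient(row: dict) -> dict:
    """One source demographics row -> one Patient resource."""
    # Assumes the source writes dates as MM/DD/YYYY; FHIR dates are YYYY-MM-DD.
    birth_date = datetime.strptime(row["dob"].strip(), "%m/%d/%Y").date().isoformat()
    return {
        "resourceType": "Patient",
        "identifier": [{"system": MRN_SYSTEM, "value": row["mrn"].strip()}],
        "birthDate": birth_date,
    }


def map_quantity(result_value: str, unit: str) -> dict:
    """Source value + unit strings -> an Observation.valueQuantity dict."""
    # Assumes a decimal-comma locale such as "13,5". float() is used for brevity;
    # a production pipeline may prefer Decimal to keep the stated precision.
    quantity = {
        "value": float(result_value.strip().replace(",", ".")),
        "unit": unit.strip(),
    }
    ucum = UNIT_TO_UCUM.get(unit.strip().lower())
    if ucum:
        # Only claim UCUM when the unit is actually recognized.
        quantity.update({"unit": ucum, "system": UCUM_SYSTEM, "code": ucum})
    return quantity
```

Keeping each table row as one small, pure transform makes it straightforward to write a unit test per row of the mapping table.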

One-to-many mapping

One-to-many mapping is when one source record produces multiple resources (often bundled together) because FHIR is more normalized than the source representation.

Common reasons:

The source record contains a “header + line items” structure.
The source record mixes concepts that FHIR separates into different resources.
You need separate resources for references to work (e.g., Patient, Encounter, Observation).

Typical patterns

Source shape	Output shape	Example
Encounter row with embedded diagnoses + procedures	`Encounter` + multiple `Condition` + multiple `Procedure`	Claims/EHR encounter exports
Lab panel row with multiple analytes	One panel `Observation` with `component` entries, or a panel `Observation` that links separate member observations via `hasMember`	Lab systems
Medication order row with dispense history	`MedicationRequest` + `MedicationDispense` (+ optionally `MedicationAdministration`)	Pharmacy exports

Bundle strategy (practical note)

When you generate multiple resources that reference each other, decide whether you:

Write resources individually (requires careful ordering and retries), or
Submit a transaction Bundle (all entries succeed or fail together, on servers that support transactions).
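The transaction option can be sketched as follows. This is an illustrative Python helper, not a reference implementation: the function name is hypothetical, and it assumes every entry is a plain create. It relies on the FHIR convention that entries in a transaction can reference each other through temporary `urn:uuid:` values in `fullUrl`, which the server rewrites to real ids.

```python
import uuid


def make_transaction_bundle(patient: dict, observations: list) -> dict:
    """Wrap one Patient and its Observations in a transaction Bundle."""
    # Temporary id for the Patient; the server replaces it when it assigns a real id.
    patient_url = f"urn:uuid:{uuid.uuid4()}"
    entries = [{
        "fullUrl": patient_url,
        "resource": patient,
        "request": {"method": "POST", "url": "Patient"},
    }]
    for obs in observations:
        # Copy the observation and point its subject at the Patient's temporary id.
        linked = {**obs, "subject": {"reference": patient_url}}
        entries.append({
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": linked,
            "request": {"method": "POST", "url": "Observation"},
        })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}
```

Because the references are resolved inside the Bundle, the pipeline does not need to create the Patient first and look up its id before writing the Observations.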

Many-to-one mapping

Many-to-one mapping is when multiple source records contribute to a single FHIR resource, often because:

The source system is highly normalized (many tables per concept).
The FHIR resource is a consolidated “view” used for interoperability.

Examples:

Patient demographics table + contact table + identifier table → one Patient
Organization table + location table + department table → one Organization + related resources

Key risks

Risk	What it looks like	Mitigation
Partial updates	Some contributing rows arrive later	Implement an upsert strategy + “completeness” flags
Race conditions	Concurrent writers produce inconsistent state	Idempotent writes; serialize per entity key
Conflicting sources	Two systems disagree on the “truth”	Source of truth rules; provenance; tie-breakers
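As a rough illustration of the many-to-one case and the “completeness” idea, the Python sketch below assembles one Patient from three kinds of source rows and reports which contributing parts have not arrived yet. The column names (`last_name`, `first_name`, `phone`, `system`, `value`) are assumptions about the source schema, not something FHIR or any particular system prescribes.

```python
from typing import Optional


def assemble_patient(demographics: Optional[dict], contacts: list, identifiers: list) -> tuple:
    """Combine rows from several source tables into one Patient.

    Returns (patient, missing), where `missing` names the source parts
    that were not available, so the caller can decide to upsert now
    and revisit later, or to hold the record back.
    """
    patient = {"resourceType": "Patient"}
    missing = []

    if demographics:
        patient["name"] = [{
            "family": demographics["last_name"],
            "given": [demographics["first_name"]],
        }]
    else:
        missing.append("demographics")

    if identifiers:
        patient["identifier"] = [
            {"system": row["system"], "value": row["value"]} for row in identifiers
        ]
    else:
        missing.append("identifiers")

    if contacts:
        patient["telecom"] = [{"system": "phone", "value": row["phone"]} for row in contacts]
    else:
        missing.append("contacts")

    return patient, missing
```

Running the same function again once the late rows arrive produces the full resource, which fits an upsert keyed on a stable identifier.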

Normalization tradeoffs

FHIR’s model is normalized in ways that improve reuse and interoperability, but normalization comes with costs during mapping:

More resources and references to manage
More joins during query and analytics
More complex “merge” behavior when sources differ

Practical guideline

Prefer normalization when it improves interoperability and queryability (e.g., separate Patient and Observation). Consider controlled denormalization only when it is part of a clear downstream contract (e.g., a flattened analytics view), and document it explicitly.
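A controlled denormalization for analytics might look like the sketch below. The output column names and the choice of the first coding are assumptions that would belong in the downstream contract; the function also assumes `subject.reference` values match the keys of the supplied lookup.

```python
def flatten_observation(obs: dict, patients_by_ref: dict) -> dict:
    """Join one Observation with its Patient into a flat analytics row."""
    # Resolve the reference using a lookup built by the caller,
    # e.g. {"Patient/123": {...patient resource...}}.
    patient = patients_by_ref[obs["subject"]["reference"]]
    quantity = obs.get("valueQuantity", {})
    return {
        "patient_id": patient["id"],
        "birth_date": patient.get("birthDate"),
        # Assumes the first coding is the one the analytics contract expects.
        "code": obs["code"]["coding"][0]["code"],
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
    }
```

The flattened row is derived output; the normalized resources remain the system of record.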

Roundtripping and information loss

Roundtripping means: source → FHIR → source (or source → FHIR → another model). Information loss is the default outcome unless you design the mapping to prevent it.

Where information is often lost

Source-specific flags that have no FHIR element equivalent
Free-text nuances that are not captured as codes
Provenance (who asserted the data and when)
Timestamp distinctions (event time vs recorded time) that are collapsed into a single field

Strategies to reduce loss

Preserve stable source identifiers in identifier[] and/or meta.source where appropriate.
Use Provenance (and/or provenance-related elements) for audit-critical flows.
When you must carry a source-only concept, prefer a published extension over “stuffing it into note”.
Document all intentional drops: “not mapped because …” is part of the contract.
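The first and third strategies can be sketched as a small roundtrip in Python. Everything named here is hypothetical: the source system URI, the `needs_review` flag, and the extension URL, which in practice would have to be defined and published before use.

```python
# Hypothetical URIs; replace with the values your organization publishes.
SOURCE_SYSTEM = "http://example.org/legacy-lab"
REVIEW_FLAG_URL = "http://example.org/fhir/StructureDefinition/legacy-needs-review"


def to_fhir(row: dict) -> dict:
    """Source result row -> Observation, keeping the source key and a source-only flag."""
    return {
        "resourceType": "Observation",
        "meta": {"source": SOURCE_SYSTEM},
        "identifier": [{"system": f"{SOURCE_SYSTEM}/result-id", "value": row["result_id"]}],
        # Source-only concept carried as an extension instead of free text in `note`.
        "extension": [{"url": REVIEW_FLAG_URL, "valueBoolean": row["needs_review"]}],
        "status": "final",
        "code": {"text": row["test_name"]},
    }


def to_source(obs: dict) -> dict:
    """Observation -> source row, recovering the fields preserved by to_fhir."""
    flag = next(
        (e["valueBoolean"] for e in obs.get("extension", []) if e["url"] == REVIEW_FLAG_URL),
        None,
    )
    return {
        "result_id": obs["identifier"][0]["value"],
        "needs_review": flag,
        "test_name": obs["code"]["text"],
    }
```

A roundtrip test of the form `to_source(to_fhir(row)) == row` is a cheap way to detect accidental information loss when the mapping changes.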

Terminology and semantics

Most interoperability failures are semantic. Mapping should include an explicit terminology plan:

Problem	Example	What to do
Local codes	`LAB123` means “Hemoglobin”	Map to LOINC (or document why you can’t)
Status mismatch	source uses `C` / `F` / `X`	Map to FHIR status codes; document defaulting
Unit mismatch	`mg/dL` vs `mmol/L`	Convert and preserve original when needed
Ambiguous text	“neg” / “positive”	Standardize into coded interpretations

If you can’t standardize all codes, at least make it explicit which dimensions are “clean” vs “best effort”.
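One way to make “clean” versus “best effort” explicit is to keep the local coding in every case and add the standard coding only when a mapping exists. The Python sketch below assumes a hypothetical local code system URI, assumes `LAB123` corresponds to LOINC 718-7 (hemoglobin, mass/volume in blood), and assumes the source status letters mean final, corrected, and cancelled. Each of these would need confirmation against the real source.

```python
LOINC = "http://loinc.org"
LOCAL_LAB_SYSTEM = "http://example.org/fhir/CodeSystem/local-lab"  # hypothetical

# Assumed mapping; a real one would come from a reviewed ConceptMap or lookup table.
LOCAL_TO_LOINC = {"LAB123": ("718-7", "Hemoglobin [Mass/volume] in Blood")}

# Assumed meanings of the source status letters.
STATUS_MAP = {"F": "final", "C": "corrected", "X": "cancelled"}


def map_code(local_code: str) -> tuple:
    """Return (CodeableConcept, mapped), where `mapped` says whether LOINC was found."""
    codings = [{"system": LOCAL_LAB_SYSTEM, "code": local_code}]
    hit = LOCAL_TO_LOINC.get(local_code)
    if hit:
        loinc_code, display = hit
        # Standard coding first, local coding kept for traceability.
        codings.insert(0, {"system": LOINC, "code": loinc_code, "display": display})
    return {"coding": codings}, hit is not None


def map_status(source_status: str) -> str:
    """Map a source status letter to an Observation.status code."""
    # Documented default: unrecognized letters become "unknown" rather than "final".
    return STATUS_MAP.get(source_status.strip().upper(), "unknown")
```

The boolean returned by `map_code` can feed a coverage metric, which makes it visible how much of the data is standardized and how much is best effort.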

Identity and referencing

Identity decisions determine whether your pipeline is stable or chaotic:

What key uniquely identifies a patient/encounter/order/result in the source?
How do you prevent duplicates when ingesting incremental files or replaying events?
How do you link records across multiple source systems?

Recommended practices

Practice	Why
Use stable `identifier.system` + `identifier.value` pairs	Enables deterministic matching across loads
Treat `id` as an implementation detail unless agreed	Resource `id` often changes across environments
Create references from stable identifiers when possible	Avoid brittle “lookups by display name”
Plan for merges	Patient identity matching is never perfect

If you’re building FHIR payloads, you’ll frequently map “foreign keys” into references (e.g., Observation.subject pointing to the right Patient). Getting this wrong breaks query joins and makes data look correct in isolation but useless in context.
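Two identifier-based techniques are sketched below in Python: a logical reference that carries the business identifier instead of a server `id`, and a conditional update request that makes replays idempotent. Both use standard FHIR elements (`Reference.identifier`, and `PUT` with a search URL), but server support for resolving logical references and for conditional operations varies, so treat this as something to confirm against your target server. The function names and example URI are hypothetical.

```python
from urllib.parse import quote


def logical_reference(resource_type: str, system: str, value: str) -> dict:
    """Build a Reference that identifies its target by business identifier."""
    return {"type": resource_type, "identifier": {"system": system, "value": value}}


def conditional_update_request(resource_type: str, system: str, value: str) -> dict:
    """Build a Bundle.entry.request for an upsert keyed on system|value."""
    # The token is percent-encoded so that ':' '/' and '|' survive as one query value.
    token = quote(f"{system}|{value}", safe="")
    return {"method": "PUT", "url": f"{resource_type}?identifier={token}"}
```

For example, `conditional_update_request("Patient", "http://example.org/mrn", "12345")` yields the URL `Patient?identifier=http%3A%2F%2Fexample.org%2Fmrn%7C12345`, so replaying the same file updates the existing Patient instead of creating a duplicate.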

Validation

Validation should be part of the mapping contract, not a late-stage activity:

Structural validation: the resource is valid FHIR JSON.
Profile validation: required elements and slicing rules (if you claim conformance).
Terminology validation: required bindings, code systems, and units.
Referential integrity: references point to resolvable resources (or accepted logical references).

Make validation failures actionable:

Emit clear error reports (ideally including source record keys).
Decide retry behavior and quarantine behavior.
Track “known exceptions” separately from new regressions.
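The reporting and quarantine ideas can be sketched with a deliberately small check in Python. It only tests for a handful of required elements and is not a substitute for a real FHIR validator or profile validation; the list of supported resource types and the report shape are assumptions.

```python
# Minimal required-element table for this sketch; Observation requires status and code.
REQUIRED_ELEMENTS = {"Patient": [], "Observation": ["status", "code"]}


def check_resource(resource: dict, source_key: str) -> list:
    """Return a list of error records, each tagged with the source record key."""
    rtype = resource.get("resourceType")
    if rtype not in REQUIRED_ELEMENTS:
        return [{
            "source_key": source_key,
            "path": "resourceType",
            "problem": f"unsupported or missing: {rtype!r}",
        }]
    return [
        {"source_key": source_key, "path": f"{rtype}.{element}", "problem": "required element missing"}
        for element in REQUIRED_ELEMENTS[rtype]
        if element not in resource
    ]


def partition(batch: list) -> tuple:
    """Split (source_key, resource) pairs into accepted resources and quarantined records."""
    accepted, quarantined = [], []
    for source_key, resource in batch:
        errors = check_resource(resource, source_key)
        if errors:
            quarantined.append({"resource": resource, "errors": errors})
        else:
            accepted.append(resource)
    return accepted, quarantined
```

Because every error carries the source key, a quarantined record can be traced back to the originating row and replayed after the fix.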

Documentation checklist

Use this checklist to keep mapping work maintainable over time:
