Healthcare Data Semantics

Two systems can exchange syntactically valid FHIR (well-formed resources, correct REST interactions, passing schema validation) and still fail to interoperate. If one system codes diagnoses in SNOMED CT and the other codes them in ICD-10-CM, the receiving system cannot reliably interpret the data without translation. If one lab uses a local code for a serum sodium test and another uses LOINC 2951-2, no downstream system can aggregate the results without a mapping table.

This is the semantic layer: the agreement on what codes mean and which code system to use for which data type. It is an architectural decision, not an implementation detail. Getting it wrong compounds over time — every consumer of your data inherits the semantic choices you made at the source.

This article covers what the major clinical vocabularies are, when to use each, and how to approach mapping between them. For FHIR-specific mechanics — CodeSystem, ValueSet, ConceptMap, terminology server operations — see FHIR Terminology.


SNOMED CT

SNOMED CT (Systematized Nomenclature of Medicine — Clinical Terms) is the broadest clinical terminology in healthcare. Where other code systems classify concepts for administrative or billing purposes, SNOMED CT is designed to represent clinical meaning — what a clinician actually observed, did, or documented.

Concept model

Every SNOMED CT concept has:

  • A concept identifier (numeric, e.g., 44054006 for Type 2 diabetes mellitus)
  • One or more descriptions: a Fully Specified Name (FSN, unique and unambiguous), one Preferred Term, and optional synonyms
  • Relationships to other concepts: IS A (hierarchy), finding site, associated morphology, causative agent, and dozens of others
  • A definition status: primitive (not fully defined by its relationships) or fully defined

Relationships make SNOMED CT a terminology, not just a code list. A system can infer that 44054006 | Type 2 diabetes mellitus | IS A 73211009 | Diabetes mellitus | IS A 362969004 | Disorder of endocrine system |, and use this hierarchy for subsumption queries: “find all patients with any form of diabetes.”
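A subsumption query can be sketched as a walk up the IS A edges. The example below is a minimal illustration that hard-codes only the three concepts named above; the `IS_A` table, the function names, and the patient identifiers are assumptions for illustration, and a production system would query a terminology server instead.

```python
# Toy IS-A edges taken from the example in the text. A real system would load
# relationships from a SNOMED CT release or ask a terminology server.
IS_A = {
    "44054006": ["73211009"],   # Type 2 diabetes mellitus -> Diabetes mellitus
    "73211009": ["362969004"],  # Diabetes mellitus -> Disorder of endocrine system
    "362969004": [],
}


def ancestors(concept_id: str) -> set:
    """Return every concept reachable by following IS-A edges upward."""
    seen = set()
    stack = list(IS_A.get(concept_id, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A.get(parent, []))
    return seen


def is_subsumed_by(concept_id: str, ancestor_id: str) -> bool:
    """True when concept_id is ancestor_id itself or one of its descendants."""
    return concept_id == ancestor_id or ancestor_id in ancestors(concept_id)


# Hypothetical problem lists keyed by an invented patient identifier.
problems = {"patient-a": ["44054006"], "patient-b": ["362969004"]}

# "Find all patients with any form of diabetes."
diabetic = [
    patient
    for patient, codes in problems.items()
    if any(is_subsumed_by(code, "73211009") for code in codes)
]
```

Only `patient-a` matches: the endocrine disorder recorded for `patient-b` is an ancestor of diabetes, not a kind of diabetes.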

Hierarchy

SNOMED CT has nineteen top-level hierarchies. The most commonly used in clinical integration:

Hierarchy | Description | Examples
Clinical finding | Observations, findings, diagnoses | Hypertension, fracture, Type 2 diabetes
Procedure | Clinical activities | Appendectomy, blood pressure measurement, vaccination
Body structure | Anatomical locations | Left femur, hepatic artery, cerebral cortex
Organism | Pathogens and organisms | Staphylococcus aureus, influenza virus
Substance | Drugs, chemicals | Penicillin, glucose, ethanol
Pharmaceutical / biologic product | Medicinal products | Amoxicillin 250mg capsule
Observable entity | What can be measured | Serum sodium concentration, body temperature
Situation with explicit context | Findings with context modifiers | Family history of cancer, no known allergies

Expressions and post-coordination

Simple SNOMED codes (single concept identifiers) are pre-coordinated — the full meaning is built into the concept definition. Post-coordination combines multiple concepts in an expression to represent nuanced meaning that no single concept captures:

64572001 |Disease| : 363698007 |Finding site| = 368209003 |Right arm structure|

This expression means “disease of the right arm structure.” Post-coordination is powerful but increases implementation complexity. Most real-world systems use pre-coordinated codes; support for post-coordination should be an explicit design decision.
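To make the implementation cost concrete, here is a minimal sketch that splits the expression above into a focus concept and one attribute/value refinement. It is deliberately incomplete: the full SNOMED CT compositional grammar also allows multiple refinements, attribute groups, and nested expressions, none of which this parser handles. The function names are invented for illustration.

```python
import re

# One concept reference: a numeric identifier followed by a |term| in pipes.
CONCEPT = re.compile(r"\s*(\d+)\s*\|([^|]*)\|\s*")


def parse_concept(text: str) -> tuple:
    """Return (identifier, term) for a single 'id |term|' reference."""
    match = CONCEPT.fullmatch(text)
    if match is None:
        raise ValueError(f"not a concept reference: {text!r}")
    return match.group(1), match.group(2).strip()


def parse_expression(expression: str) -> dict:
    """Parse 'focus : attribute = value' with at most one refinement."""
    focus_text, _, refinement = expression.partition(":")
    parsed = {"focus": parse_concept(focus_text), "refinements": []}
    if refinement:
        attribute_text, _, value_text = refinement.partition("=")
        parsed["refinements"].append(
            (parse_concept(attribute_text), parse_concept(value_text))
        )
    return parsed


expression = (
    "64572001 |Disease| : "
    "363698007 |Finding site| = 368209003 |Right arm structure|"
)
result = parse_expression(expression)
```

Parsing is the easy part. Storing, comparing, and subsuming such expressions is where the real cost lies, which is why support should be an explicit decision.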

Release cadence and licensing

SNOMED International historically released the International Edition twice yearly, in January and July, and moved to monthly releases in 2022; national editions follow their own schedules. Access requires a license. In member countries it is obtained through the National Release Center (the US NRC is at NLM), and clinical software deployed in a member country can use SNOMED CT at no additional cost under the national license. Verify your jurisdiction’s terms before deployment.

When to use SNOMED CT

SNOMED CT is the right choice for:

  • Problem lists and clinical diagnoses in provider-facing systems
  • Clinical decision support (subsumption queries require a proper terminology)
  • Procedures in clinical (non-billing) contexts
  • Clinical findings and observations that do not have a LOINC code

SNOMED CT is harder to implement than ICD-10 — it requires a terminology server capable of subsumption and expression queries. For billing submissions, ICD-10 is mandated regardless of what your clinical system uses internally.


LOINC

LOINC (Logical Observation Identifiers Names and Codes) is the standard for laboratory tests, clinical measurements, and clinical observations. Where SNOMED CT represents findings and diagnoses, LOINC represents what was measured and how. A common division of labor: LOINC codes the question (which observation was made), and SNOMED CT codes the answer when the result is itself a coded concept, such as an organism identified in a culture.

LOINC is produced by the Regenstrief Institute in Indianapolis and is free to use — download the full database from loinc.org. There is no licensing fee.

The six-part naming convention

Every LOINC code is defined by six axes. Together, they constitute a unique, fully specified observation. No two LOINC codes share the same combination of all six parts.

Axis | Name | Description | Example
1 | Component | What is measured | Sodium
2 | Property | Characteristic of what is measured | SCnc (substance concentration)
3 | Time Aspect | Interval or point in time | Pt (point in time)
4 | System | Specimen or system | Ser (serum)
5 | Scale | How the result is expressed | Qn (quantitative)
6 | Method | How the measurement was made | (optional; often blank for general methods)

Applying this to serum sodium: Component=Sodium, Property=SCnc (substance concentration), Time=Pt, System=Ser/Plas (serum or plasma), Scale=Qn (quantitative), Method=(unspecified) → LOINC 2951-2 Sodium [Moles/volume] in Serum or Plasma.

The six-part structure is why LOINC codes are unique: 2951-2 (sodium in serum or plasma) is distinct from 2955-3 (sodium in urine), because the System axis differs.
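The six axes can be modelled as an immutable record, which makes the "one axis differs, so the code differs" rule easy to see. This is a sketch: the class and field names are assumptions, and only the sodium code 2951-2 discussed above is registered. The urine record is left without a code so that nothing here is guessed.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class LoincParts:
    """The six LOINC axes; frozen so instances can be used as dictionary keys."""
    component: str
    prop: str            # the Property axis ("property" is a Python builtin)
    time_aspect: str
    system: str
    scale: str
    method: Optional[str] = None


serum_sodium = LoincParts("Sodium", "SCnc", "Pt", "Ser/Plas", "Qn")
urine_sodium = LoincParts("Sodium", "SCnc", "Pt", "Urine", "Qn")

# Only the code stated in the text is registered in this toy lookup.
CODE_BY_PARTS = {serum_sodium: "2951-2"}


def differing_axes(a: LoincParts, b: LoincParts) -> list:
    """Names of the axes on which two observation definitions disagree."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


difference = differing_axes(serum_sodium, urine_sodium)
```

The two definitions differ only on `system`, so the serum code cannot be reused for the urine test.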

LOINC code categories

Category | Count (approx.) | Use Cases
Laboratory | ~85,000 | Blood chemistry, hematology, microbiology, pathology, urinalysis
Clinical (vital signs) | ~2,000 | Blood pressure, heart rate, temperature, oxygen saturation, BMI
Survey instruments | ~7,000 | PHQ-9, GAD-7, AUDIT, Apgar, PROMIS scales
Document ontology | ~5,000 | Document type codes used in CDA and FHIR DocumentReference
Imaging | ~2,000 | Radiology study types (used in DICOM metadata and FHIR ImagingStudy)

Why lab interoperability fails without LOINC

Every laboratory has a local test catalog. The local code for “Complete Metabolic Panel” at Hospital A is different from Hospital B’s local code for the same panel. Without a common code, a receiving EHR cannot know that both systems are reporting the same test.

LOINC solves this: both labs map their local codes to the corresponding LOINC code. The receiving system understands LOINC and can aggregate, trend, and alert on results regardless of which lab generated them.

This mapping is not trivial. A lab may have hundreds or thousands of local codes, some of which map to a single LOINC code, some that require different LOINC codes depending on the specimen type, and some for which no exact LOINC match exists. This is real work that requires a LOINC expert reviewer.
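A minimal sketch of such a mapping table follows, assuming the lookup key is the pair (local code, specimen type). The local codes `NA` and `XYZ` and the specimen labels are invented. The point is the shape: several local entries may share one LOINC code, and anything without an entry is queued for expert review rather than guessed.

```python
# Hypothetical local catalog: the local codes and specimen labels are invented.
LOCAL_TO_LOINC = {
    ("NA", "serum"): "2951-2",   # Sodium [Moles/volume] in Serum or Plasma
    ("NA", "plasma"): "2951-2",  # many local entries can share one LOINC code
}


def map_local_code(local_code: str, specimen: str) -> dict:
    """Look up a LOINC code, or flag the entry for expert review."""
    loinc = LOCAL_TO_LOINC.get((local_code, specimen.lower()))
    if loinc is None:
        return {"status": "needs-review", "local": local_code, "specimen": specimen}
    return {"status": "mapped", "loinc": loinc}


mapped = map_local_code("NA", "Serum")
unmapped = map_local_code("NA", "urine")   # same local code, different specimen
unknown = map_local_code("XYZ", "serum")   # no entry at all
```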

LOINC for order vs. result codes

LOINC distinguishes order codes from result codes. The order code represents what the clinician requested; the result code represents what was actually measured. For a BMP (Basic Metabolic Panel), the order code might be 24320-4 (Basic metabolic panel) and the result components include individual codes for sodium, potassium, chloride, bicarbonate, BUN, creatinine, and glucose.

When building FHIR Observation resources, use the result LOINC code on the Observation, not the order code. The order belongs on the ServiceRequest.
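As a sketch of that split, the function below builds an Observation as a plain dictionary with the result code in `Observation.code` and a pointer to the order in `basedOn`. The function name and the reference values are placeholders, and the resource is not validated against any profile.

```python
def sodium_observation(value_mmol_per_l: float, patient_ref: str, order_ref: str) -> dict:
    """Build a minimal FHIR Observation for a serum/plasma sodium result."""
    return {
        "resourceType": "Observation",
        "status": "final",
        # Result code on the Observation itself.
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": "2951-2",
                "display": "Sodium [Moles/volume] in Serum or Plasma",
            }]
        },
        "subject": {"reference": patient_ref},
        # The order (and its order code) lives on the referenced ServiceRequest.
        "basedOn": [{"reference": order_ref}],
        "valueQuantity": {
            "value": value_mmol_per_l,
            "unit": "mmol/L",
            "system": "http://unitsofmeasure.org",
            "code": "mmol/L",
        },
    }


# Placeholder references; real ones would come from the ordering workflow.
observation = sodium_observation(140.0, "Patient/example", "ServiceRequest/example")
```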


RxNorm

RxNorm is the US standard terminology for drugs. It is produced and distributed by the National Library of Medicine (NLM), is free to use, and is published as a full monthly release with weekly updates. RxNorm provides normalised names and codes for medications and, critically, links across the different drug coding systems used in US healthcare.

Concept types (Term Types / TTY)

RxNorm organises drug concepts in a hierarchy of Term Types:

TTY | Name | Description | Example
IN | Ingredient | Active ingredient only | Metformin
PIN | Precise Ingredient | Specific salt or ester form | Metformin hydrochloride
MIN | Multiple Ingredients | Combination ingredient set | Metformin / sitagliptin
BN | Brand Name | Trade name | Glucophage
SCD | Semantic Clinical Drug | Ingredient + strength + dose form | Metformin 500 MG Oral Tablet
SBD | Semantic Branded Drug | Brand + strength + dose form | Glucophage 500 MG Oral Tablet
GPCK | Generic Pack | Generic multi-pack | Metformin 500 MG Oral Tablet [60 Tablets]
BPCK | Branded Pack | Branded multi-pack | Glucophage 500 MG Oral Tablet [60 Tablets]

Which TTY to use

For prescriptions and medication orders: use SCD (Semantic Clinical Drug) for generic prescribing or SBD for brand-required prescribing. SCD and SBD encode the information a pharmacist needs to dispense: the drug, the strength, and the form.

Using IN (ingredient only) for a prescription is insufficient — “Metformin” doesn’t tell the pharmacy what to dispense. Using NDC for a prescription is also incorrect — NDC is package-specific and changes when a manufacturer reformulates or repackages, which breaks medication history comparisons.

For dispensing records: NDC (National Drug Code) is appropriate because it identifies the specific packaged product dispensed. RxNorm provides mappings from NDC to RxNorm, enabling normalisation.

For medication history and reconciliation: SCD or IN level codes provide stable identifiers that aggregate across brand changes and reformulations.

NDC relationship

The NDC is a 10- or 11-digit code that identifies a specific drug product from a specific manufacturer in a specific package size. Multiple NDCs map to the same SCD — the same active ingredient, strength, and form from different manufacturers or in different package sizes all map to one SCD.

RxNorm is the normalisation layer that makes “Metformin 500mg tablet from Manufacturer A, 100-count bottle” and “Metformin 500mg tablet from Manufacturer B, 60-count bottle” comparable in clinical and analytics systems.
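The normalisation step amounts to a many-to-one lookup followed by a group-by. The sketch below uses placeholder identifiers throughout: the NDC strings and the `SCD-METFORMIN-500-TAB` key are invented stand-ins, not real NDCs or RxNorm concept identifiers. In practice the crosswalk would be loaded from RxNorm release files or an RxNorm API.

```python
from collections import defaultdict

# Placeholder crosswalk: both NDCs and the SCD key are invented for illustration.
NDC_TO_SCD = {
    "11111-1111-11": "SCD-METFORMIN-500-TAB",  # Manufacturer A, 100-count bottle
    "22222-2222-22": "SCD-METFORMIN-500-TAB",  # Manufacturer B, 60-count bottle
}

dispenses = [
    {"ndc": "11111-1111-11", "date": "2024-01-05"},
    {"ndc": "22222-2222-22", "date": "2024-02-04"},
    {"ndc": "33333-3333-33", "date": "2024-03-01"},  # not in the crosswalk
]


def group_by_clinical_drug(records: list) -> dict:
    """Group dispensing records by SCD; unmapped NDCs are kept under None."""
    grouped = defaultdict(list)
    for record in records:
        grouped[NDC_TO_SCD.get(record["ndc"])].append(record["date"])
    return dict(grouped)


history = group_by_clinical_drug(dispenses)
```

Keeping unmapped NDCs under an explicit `None` key, rather than dropping them, leaves a visible worklist for crosswalk maintenance.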

FHIR system URI

Use http://www.nlm.nih.gov/research/umls/rxnorm as the CodeSystem URI when coding medications in FHIR resources.


ICD-10

ICD-10 is the tenth revision of the International Classification of Diseases, produced by the World Health Organization. In the United States, two variants are used:

  • ICD-10-CM (Clinical Modification): diagnoses — used by all care settings for billing and reporting
  • ICD-10-PCS (Procedure Coding System): inpatient procedures — used by hospitals for inpatient billing

ICD-10-CM replaced ICD-9-CM in the US on 1 October 2015.

ICD-10-CM structure

ICD-10-CM codes are 3–7 characters. The structure encodes clinical specificity hierarchically:

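E11.649
│   ││└── 9 = without coma
│   │└─── 4 = hypoglycemia
│   └──── 6 = other specified complication
└──────── E11 = Type 2 diabetes mellitus

Character 1 is alphabetic and character 2 is numeric; characters 3 through 7 may be letters or digits. (The rule that excludes the letters I and O to avoid confusion with 1 and 0 belongs to ICD-10-PCS, not ICD-10-CM: I10 and O80 are valid ICD-10-CM codes.) Characters 4 onward follow a decimal point, which some electronic formats such as claim transactions omit. They add specificity: etiology, manifestation, laterality, severity, encounter type.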

Code length | Level of specificity | Example
3 characters | Category (broadest) | E11: Type 2 diabetes mellitus
4 characters | Etiology / body system | E11.6: Type 2 diabetes mellitus with other specified complications
5–7 characters | Manifestation, laterality, severity | E11.641: Type 2 diabetes mellitus with hypoglycemia with coma

In electronic systems, always use the most specific (longest) applicable code. A claim that carries a 3-character category when a more specific code exists is typically rejected as an invalid diagnosis code.
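A cheap first-line check is to test the shape of a code before any lookup. The pattern below follows the published ICD-10-CM format (a letter, a digit, a letter or digit, then up to four more characters after the decimal point). It is a sketch of a format check only: a well-shaped code can still be nonexistent or insufficiently specific, so validity must be confirmed against the code set release in use. The function names are assumptions.

```python
import re

# Letter, digit, letter-or-digit, then optionally "." plus 1-4 more characters.
ICD10CM_SHAPE = re.compile(r"[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?")


def has_icd10cm_shape(code: str) -> bool:
    """True when the string is shaped like an ICD-10-CM code (not a validity check)."""
    return ICD10CM_SHAPE.fullmatch(code.upper()) is not None


def category(code: str) -> str:
    """The 3-character category is the part before the decimal point."""
    return code.upper().split(".")[0]


checks = {code: has_icd10cm_shape(code) for code in ["E11.649", "E11", "I10", "E1", "E11."]}
```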

ICD-10-PCS structure

ICD-10-PCS codes are exactly 7 alphanumeric characters, drawn from the digits 0–9 and the letters A–Z excluding I and O (to avoid confusion with 1 and 0). The structure is:

Section | Body System | Root Operation | Body Part | Approach | Device | Qualifier

Every character has meaning. 0FB43ZX decodes as:

  • 0 = Medical and Surgical
  • F = Hepatobiliary System and Pancreas
  • B = Excision
  • 4 = Gallbladder
  • 3 = Percutaneous
  • Z = No Device
  • X = Diagnostic
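Because every position has a fixed role, splitting a code into its axes is mechanical. The sketch below labels each character with its axis name and rejects strings that are not seven characters from the ICD-10-PCS alphabet (digits and letters, with I and O excluded). It does not translate characters into meanings such as "Excision", which depends on the section and requires the ICD-10-PCS tables. Names are illustrative.

```python
AXES = [
    "Section", "Body System", "Root Operation", "Body Part",
    "Approach", "Device", "Qualifier",
]
# ICD-10-PCS uses digits 0-9 and letters A-Z except I and O.
ALLOWED = set("0123456789ABCDEFGHJKLMNPQRSTUVWXYZ")


def split_pcs(code: str) -> dict:
    """Label each of the 7 characters with its axis name."""
    code = code.upper()
    if len(code) != 7 or not set(code) <= ALLOWED:
        raise ValueError(f"not shaped like an ICD-10-PCS code: {code!r}")
    return dict(zip(AXES, code))


parts = split_pcs("0FB43ZX")
```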

ICD-10 for billing, not clinical systems

ICD-10 was designed for administrative classification and billing. It is excellent for that purpose. It is not designed for clinical precision — multiple distinct clinical conditions may share an ICD-10 code, and the code does not carry the semantic relationships needed for clinical decision support.

The implication: ICD-10 is mandatory for claims and encounter reporting. For the clinical problem list, SNOMED CT is preferable. When building FHIR Condition resources, you often need both: SNOMED for the clinical system of record, ICD-10 for billing-facing systems. See Clinical Data Mapping for how to handle dual coding on a single Condition resource.
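One way to carry both vocabularies is to place two codings in the same `Condition.code`, each tagged with its system URI, so that clinical and billing consumers each pick the one they understand. The sketch below is illustrative: the patient reference is a placeholder, the helper name is invented, and E11.9 was chosen as an example billing code for an uncomplicated case, not derived from a map.

```python
condition = {
    "resourceType": "Condition",
    "subject": {"reference": "Patient/example"},  # placeholder reference
    "code": {
        "coding": [
            {
                "system": "http://snomed.info/sct",
                "code": "44054006",
                "display": "Type 2 diabetes mellitus",
            },
            {
                "system": "http://hl7.org/fhir/sid/icd-10-cm",
                "code": "E11.9",
                "display": "Type 2 diabetes mellitus without complications",
            },
        ],
        "text": "Type 2 diabetes mellitus",
    },
}


def code_for(resource: dict, system: str):
    """Return the first code from the given system, or None if absent."""
    for coding in resource["code"]["coding"]:
        if coding["system"] == system:
            return coding["code"]
    return None


clinical_code = code_for(condition, "http://snomed.info/sct")
billing_code = code_for(condition, "http://hl7.org/fhir/sid/icd-10-cm")
```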


CPT

CPT (Current Procedural Terminology) is the standard for coding outpatient and physician procedures in the United States. It is maintained and licensed by the American Medical Association (AMA). Unlike SNOMED CT, LOINC, and RxNorm, CPT is not free — use requires a license from the AMA. Most EHR vendors include a CPT license in their software agreements, but redistribution and use in custom applications require separate licensing.

CPT categories

Category | Description | Example Range
Category I | Procedures and services | 00100–99499
Category II | Performance measurement tracking codes | 0001F–9007F
Category III | Emerging technology, services, and procedures | 0001T–0780T

Category I codes are what appear on claims. Category II codes are supplementary tracking codes, not billable as primary. Category III codes are temporary codes for new and experimental procedures — they may be promoted to Category I or retired.

CPT codes are updated annually on 1 January.

CPT for billing, not clinical characterisation

A single CPT code can map to multiple distinct clinical procedures. 99213 (office or other outpatient visit, established patient, low level of medical decision making) covers an enormous range of encounters. A CPT code tells you what was billed; it does not describe what happened clinically with the precision that SNOMED CT provides.

For clinical analytics and decision support, SNOMED CT codes for procedures carry the semantics. For billing, CPT is mandated by Medicare and virtually all commercial payers. When building Procedure resources in FHIR, model both when both are present — the CPT code for billing context, the SNOMED code for clinical context.


Code system comparison

Code System | Maintained By | Cost | Primary Use | Scope | FHIR System URI
SNOMED CT | SNOMED International / national release centers | Free via national license | Clinical findings, diagnoses, procedures | Global; clinical precision | http://snomed.info/sct
LOINC | Regenstrief Institute | Free | Lab tests, observations, document types | Global; laboratory and clinical measurements | http://loinc.org
RxNorm | NLM | Free | Drug terminology, normalization | US; links NDC, brand, generic | http://www.nlm.nih.gov/research/umls/rxnorm
ICD-10-CM | WHO / CDC (US version) | Free | Diagnosis coding for billing | US clinical modification; billing | http://hl7.org/fhir/sid/icd-10-cm
ICD-10-PCS | CMS | Free | Inpatient procedure coding | US inpatient procedures; billing | http://www.cms.gov/Medicare/Coding/ICD10
CPT | AMA | Licensed | Outpatient procedure and service coding | US; billing | http://www.ama-assn.org/go/cpt

Mapping strategy

When to map

Cross-system semantic mapping is required whenever data crosses a vocabulary boundary: sending ICD-10 diagnoses to a clinical system that expects SNOMED, receiving lab results with local codes that must be normalised to LOINC, aggregating medication data from systems using different drug terminologies.

Mapping is not free. Every mapping table must be created, validated, maintained, and versioned. A mapping created against ICD-10-CM 2023 may be incorrect for ICD-10-CM 2025 if codes were added, revised, or retired. Budget mapping maintenance as an ongoing operational cost, not a one-time project.

Mapping relationship types

Relationship | Description | Example
Equivalent (1:1 exact) | Source and target have the same meaning | SNOMED 44054006 ↔ ICD-10-CM E11 (approximate; see below)
Narrower than (1:many expansion) | Source is broader; multiple target codes required | One LOINC panel code maps to multiple LOINC component codes
Broader than (many:1 aggregation) | Multiple source codes collapse to one target | Multiple SNOMED finding codes → one ICD-10-CM billing code
Inexact / related | Approximate; loss of meaning acknowledged | Free-text diagnosis mapped to closest ICD-10 code
Unmappable | No target equivalent; must be documented | Local proprietary code with no standard equivalent

Note that SNOMED and ICD-10 are not semantically equivalent at any level of granularity. SNOMED 44054006 | Type 2 diabetes mellitus | and ICD-10-CM E11 are related, but the SNOMED concept is more precise and the ICD-10 category broader. Claiming these are exact equivalents is incorrect. The FHIR ConceptMap resource records the relationship on each mapping: R4 uses equivalence codes such as equivalent, wider, and narrower, and R5 uses relationship codes such as equivalent, source-is-narrower-than-target, and source-is-broader-than-target. Choose the code that reflects the actual relationship rather than defaulting to equivalent.
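A mapping row is more useful when it carries its relationship and the target release it was validated against, so that consumers can see both the loss of meaning and the staleness risk. The sketch below uses the R5-style ConceptMap relationship code `source-is-narrower-than-target`. The record layout, the release labels, and the function name are assumptions, not the ConceptMap resource structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str          # SNOMED CT concept identifier
    target: str          # ICD-10-CM code
    relationship: str    # R5-style ConceptMap relationship code
    target_release: str  # release the row was validated against (assumed label)


MAPPINGS = [
    # The SNOMED concept is more precise than the ICD-10-CM category.
    Mapping("44054006", "E11", "source-is-narrower-than-target", "2025"),
]


def translate(source_code: str, target_release: str) -> list:
    """Return mappings for a code, refusing rows validated against another release."""
    rows = [m for m in MAPPINGS if m.source == source_code]
    stale = [m for m in rows if m.target_release != target_release]
    if stale:
        raise LookupError(f"mappings for {source_code} need revalidation")
    return rows


current = translate("44054006", "2025")
```

Raising on a release mismatch is a strict design choice; a softer variant would return the rows with a warning flag. Either way the mismatch is surfaced rather than silently ignored.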

NullFlavor for unmappable concepts

When a source concept cannot be mapped to any target code, do not omit the coding or substitute a placeholder code. In HL7 v3 and CDA, use a NullFlavor code to explicitly document that a value exists but cannot be represented:

NullFlavor Code | Meaning
UNK | Unknown: a value exists but is not known
OTH | Other: a value exists but is not in the target code system
NASK | Not asked: the data was not collected
ASKU | Asked but unknown: asked, but patient/source could not provide

In FHIR, represent unmappable concepts using the data-absent-reason extension, or by sending a CodeableConcept that carries the original text and no coding. Document every unmappable concept in your mapping specification; each one is a semantic gap that accumulates as technical debt.
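The two FHIR options can be sketched as small builders that return plain dictionaries. The extension URL is the standard data-absent-reason extension; the `KNOWN_REASONS` set is a partial list of its codes, not the full value set, and the function names are invented.

```python
DATA_ABSENT_REASON_URL = "http://hl7.org/fhir/StructureDefinition/data-absent-reason"

# A subset of the data-absent-reason codes, enough for this illustration.
KNOWN_REASONS = {"unknown", "asked-unknown", "not-asked", "unsupported", "as-text"}


def text_only_concept(original_text: str) -> dict:
    """CodeableConcept that preserves the source text and carries no coding."""
    return {"text": original_text}


def absent_value(reason: str = "unknown") -> dict:
    """Element content consisting only of a data-absent-reason extension."""
    if reason not in KNOWN_REASONS:
        raise ValueError(f"unrecognised reason in this sketch: {reason!r}")
    return {"extension": [{"url": DATA_ABSENT_REASON_URL, "valueCode": reason}]}


# Hypothetical local concept with no standard equivalent.
concept = text_only_concept("Local code LAB-123: in-house stability index")
missing = absent_value("not-asked")
```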

Free-text to code is a different problem

Mapping between code systems is structured-to-structured translation. Mapping free text to a code (e.g., “patient has hypertension” → SNOMED 38341003) requires natural language processing (NLP). NLP-based coding is probabilistic, not deterministic, and requires human review workflows for clinical use. Do not conflate the two problems or assume that a mapping table solves free-text inputs.


Value set governance

A value set is a curated set of codes drawn from one or more code systems, assembled for a specific clinical purpose. The problem list value set defines which SNOMED codes are valid for problem list entries. The vital signs value set defines which LOINC codes are valid for vital sign observations.

Value sets are where vocabulary policy meets implementation. A FHIR profile binds a data element to a value set with a binding strength (required, extensible, preferred, example). A “required” binding means the system must use a code from that value set; deviating causes validation failures. An “extensible” binding means a code from the value set must be used when one applies, and other codes are permitted only when the value set has no suitable concept. “Preferred” and “example” bindings are guidance and do not cause validation failures.
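How binding strength changes the outcome of the same check can be sketched as follows. The value set is a toy two-code stand-in for a vital signs value set, and the return labels (`valid`, `error`, `review`) are invented. In particular, `review` reflects an assumption that a machine cannot decide on its own whether the value set truly lacks a suitable concept.

```python
LOINC = "http://loinc.org"

# Toy stand-in for a vital signs value set: (system, code) pairs.
VITAL_SIGNS = {
    (LOINC, "8867-4"),  # Heart rate
    (LOINC, "8310-5"),  # Body temperature
}


def check_binding(coding: tuple, value_set: set, strength: str) -> str:
    """Classify a coding against a value set under a given binding strength."""
    if strength not in {"required", "extensible", "preferred", "example"}:
        raise ValueError(f"unknown binding strength: {strength!r}")
    if coding in value_set:
        return "valid"
    if strength == "required":
        return "error"    # a code outside the value set fails validation
    if strength == "extensible":
        return "review"   # allowed only if no value set concept applies
    return "valid"        # preferred / example bindings are guidance


local = ("http://example.org/local-codes", "HR-1")  # hypothetical local code
outcomes = {s: check_binding(local, VITAL_SIGNS, s)
            for s in ["required", "extensible", "preferred"]}
```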

Governance failures cause interoperability failures

Governance failure happens when:

  • A value set is not versioned, so different systems pin different snapshots
  • A value set includes retired codes that downstream systems reject
  • A value set excludes legitimate codes, forcing systems into unmappable situations
  • Value set updates are not distributed to trading partners

The plumbing can work perfectly — messages are delivered, validated, processed — and the integration still fails because the codes in those messages are not mutually understood.
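Some of these failures can be detected mechanically before they reach production. The sketch below compares two snapshots of the same value set and reports version drift and retired codes. The snapshot layout, the issue labels, and the version strings are assumptions, and the retired-code list is invented for the example; it does not reflect the real status of any concept.

```python
def governance_issues(local: dict, partner: dict, retired: set) -> list:
    """Report basic governance problems between two value set snapshots."""
    issues = []
    if not local.get("version") or not partner.get("version"):
        issues.append("unversioned")
    elif local["version"] != partner["version"]:
        issues.append("version-mismatch")
    if set(local["codes"]) & retired:
        issues.append("contains-retired-codes")
    return issues


# Hypothetical snapshots of the same value set held by two trading partners.
ours = {"version": "2024-01", "codes": {"38341003", "44054006"}}
theirs = {"version": "2023-07", "codes": {"38341003"}}

# "0000000" is a made-up identifier standing in for a retired code.
issues = governance_issues(ours, theirs, retired={"0000000"})
```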

VSAC

The Value Set Authority Center (VSAC) is the authoritative US repository for value sets used in quality measures, clinical decision support, and regulatory programs. Value sets used in eCQMs (electronic Clinical Quality Measures), C-CDA templates, and ONC-endorsed programs are published in VSAC. Access requires a free UMLS license.

When building integrations for US clinical programs, check VSAC for authoritative value sets before defining your own. Using a VSAC-published value set rather than a locally defined one increases interoperability with other systems implementing the same measures or programs.


Cross-reference

For the FHIR mechanics of terminology — how to represent CodeSystem, ValueSet, and ConceptMap resources; how to use $expand and $validate-code; how to structure coded elements in resources — see FHIR Terminology.

For how vocabulary choices affect Observation, Condition, and Procedure mapping specifically — including the coding anti-patterns most commonly encountered in clinical data — see Clinical Data Mapping.

Section: interop | Content Type: standard | Audience: technical | Interoperability Level: Semantic
Published: 30/06/2023 | Modified: 11/01/2026
Keywords: SNOMED CT, LOINC, RxNorm, ICD-10, CPT, clinical terminology, healthcare vocabulary, semantic interoperability, value set governance, VSAC