Learning Objectives
By the end of this lesson you will be able to:
- Identify the key sections of a PDM mapping template
- Complete a mapping entry for a given source-to-target field mapping
- Explain how mapping templates feed into the ETL build
The Mapping Template
The PDM mapping template captures all five mapping rule types for each source-to-target field pairing. A well-completed mapping is the primary specification document for the ETL build - if the mapping is complete and correct, the ETL engineer has everything needed to implement the extraction, transformation, and load.
Structure of a Mapping Entry
A mapping template entry covers:
Extraction rules
- Source LDS and table/entity
- Navigation path from the root entity to the field being mapped
- Selection criteria: which records are in scope
- Timing: when is the extract taken (point in time, or ongoing)
In PDM, the normal English language approach applies: “Extract all of X, Excluding those with condition Y.” For example: “Extract all open invoices, Except those with a value of less than £1.” The Extract is the extraction rule. The Except is the exclusion rule.
Exclusion rules
- Records explicitly excluded from migration
- The rule or policy that drives the exclusion (e.g. “customers inactive for more than 3 years”)
- Reference to the DQR that manages fallout (records that fail validation and cannot be migrated)
- There may be a DQR that covers the exclusion, depending on whether it came as part of the data requirements or from the DQR process
Transformation rules
- Look-up tables used (e.g. legacy status code to target status code)
- External data enrichments (e.g. postcodes from a reference file)
- Parsing rules (e.g. splitting “First Last” into separate fields)
- Combining rules (e.g. merging two address fields)
- Data type conversions
- Default values for fields with no legacy equivalent
Loading rules
- Target table and field
- Load sequence (dependencies on other load steps)
- Load method (insert, update, or insert-or-update)
- Null handling
Data lineage rule
- The audit record: source LDS, source record identifier, date migrated, version of mapping applied
The Role of Business Input
Several transformation rules cannot be determined without business input:
- Code translations: A legacy status of “1” means “Active” in System A and “Pending” in System B. The correct mapping to the target code requires a business decision.
- Naming conventions: When two LDS have different names for the same thing, the canonical name for the target is a business choice.
- Default values: When a mandatory target field has no legacy equivalent, the default value will either be specified on the data requirements sheet or via the DQR process. It cannot be left solely to the business to decide without a structured process.
These are documented as DQR items if they cannot be resolved immediately, or as mapping annotations if they can be confirmed in a mapping review session.
Mapping Reviews
Completed mappings should be reviewed in a structured session involving:
- The migration analyst (who completed the mapping)
- Business Domain Experts for the relevant LDS
- Technical representatives from the target system
The purpose is to catch business-context errors before they reach the ETL build. A mapping error found in review costs an hour; the same error found in testing costs days.
Version Control
Mapping documents are configurable items, they must be under version control. If a DQR changes the transformation rule for a field, the mapping version increments, and the release document records which mapping version was used. Note that each release of the ETL does not necessarily correspond to a new version of the mapping: ETL releases can include changes to other aspects of the build (performance fixes, load sequencing) that leave the mapping unchanged.
Key Takeaways
- The mapping template covers all five rule types: Extraction, Exclusion, Transformation, Loading, Data Lineage
- Business input is required for code translations, naming decisions, and default values
- Mapping reviews before ETL build are a low-cost quality gate
- Mapping documents are configurable items - version control is essential
Book Reference
Practical Data Migration by Johny Morris (BCS, The Chartered Institute for IT):
- Chapter 11 - Gap Analysis and Mapping