Learning Objectives
By the end of this lesson you will be able to:
- Define a Unit of Migration from both a business and technical perspective
- Explain the significance of Units of Migration for rollback planning
- Describe how Units of Migration are defined for different Key Business Data Areas (KBDAs)
What Is a Unit of Migration?
A Unit of Migration is defined in PDM as:
The lowest level of data granularity for which rollback occurs.
Or equivalently, from a business perspective:
The objects we are migrating.
The distinction matters. From a technical perspective, a Unit of Migration defines the atomicity boundary for the migration: the smallest unit that either migrates completely or not at all. From a business perspective, it defines what the business is committing to migrate.
There is a natural tension here: the business wants the Unit of Migration to be as large as possible, while the technologists want it as small as possible. Resolving that tension explicitly is the point of the exercise.
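The atomicity boundary can be illustrated with a database transaction. The following is a minimal sketch, not a prescribed implementation: it assumes the unit is a customer plus order summaries, and the table names, column names and the `load_unit` function are hypothetical. It uses Python's built-in `sqlite3` module, where using the connection as a context manager commits on success and rolls back on an exception.

```python
import sqlite3

def load_unit(conn, customer, orders):
    """Load one hypothetical unit (customer + order summaries) all-or-nothing."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "INSERT INTO customer (id, name) VALUES (?, ?)",
                (customer["id"], customer["name"]),
            )
            conn.executemany(
                "INSERT INTO order_summary (customer_id, total) VALUES (?, ?)",
                [(customer["id"], o["total"]) for o in orders],
            )
        return True
    except sqlite3.Error:
        return False  # nothing from this unit remains in the target

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE order_summary (customer_id INTEGER NOT NULL, total REAL NOT NULL)"
)

ok = load_unit(conn, {"id": 1, "name": "Acme"}, [{"total": 120.0}])
# The second unit has an invalid order summary (NULL total), so the whole
# unit, including its customer row, is rolled back.
bad = load_unit(conn, {"id": 2, "name": "Bolt"}, [{"total": None}])
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

In this sketch the failed unit leaves no customer row behind, which is the "completely or not at all" behaviour described above. A larger unit widens what a single failure rolls back; a smaller one narrows it.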
Why Units of Migration Matter
Defining Units of Migration explicitly resolves several common sources of conflict on data migration projects:
- Partial migration ambiguity: without a defined unit, it is unclear what constitutes a successful migration of a record. A customer record with missing contact details - is that migrated or not?
- Rollback scope: if a migration run fails, what rolls back? The answer must be defined before the run, not discovered during a crisis at 3am.
- Acceptance criteria: a Data Owner accepting a migration is accepting that the defined units have been migrated to the defined quality. Vague units make acceptance impossible.
Defining Units of Migration for DHGS
The DHGS case study illustrates how Units of Migration are defined in practice:
Customer (CRM Functionality): Each unique current customer, with their orders from the last three years. Customers who have not ordered for more than three years are excluded. Only summary order values are migrated, not full order detail. Customers must be de-duplicated before migration (a DQR is raised to manage this).
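The Customer definition translates fairly directly into a selection rule. The sketch below is illustrative only: the record layout, the `build_customer_units` and `years_before` names, and the choice to treat a customer with no orders in the window as excluded are all assumptions, and de-duplication is assumed to have happened already.

```python
from datetime import date

def years_before(d, years):
    """Same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)

def build_customer_units(customers, orders, as_of):
    """Return one summary unit per customer with orders in the last three years."""
    cutoff = years_before(as_of, 3)
    units = []
    for c in customers:
        recent = [
            o for o in orders
            if o["customer_id"] == c["id"] and o["date"] >= cutoff
        ]
        if not recent:
            continue  # assumption: no orders in the window means excluded
        units.append({
            "customer_id": c["id"],
            "order_count": len(recent),
            # only summary values travel with the unit, not order detail
            "order_total": sum(o["value"] for o in recent),
        })
    return units

customers = [{"id": 1}, {"id": 2}]
orders = [
    {"customer_id": 1, "date": date(2023, 5, 1), "value": 100},
    {"customer_id": 1, "date": date(2019, 1, 1), "value": 50},
    {"customer_id": 2, "date": date(2018, 6, 1), "value": 75},
]
units = build_customer_units(customers, orders, as_of=date(2024, 1, 1))
```

With this sample data, customer 1 migrates with a single in-window order and customer 2 falls out because their last order is older than three years.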
Product and Supplier: The unique Product. Where a product is supplied exclusively by one supplier, at least one supplier record must be present. Additional supplier details can be added post-migration.
Order: Only complete orders where every line is valid. Only orders that have reached “Shipped” status. Historic orders from the last three years only. The Product and Customer migrations must have completed before Orders can be migrated.
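As a rough sketch, the Order rules can be expressed as a check that returns the reasons an order fails to qualify. What makes a line "valid" is not defined here, so the test used below (the product has been migrated and the quantity is positive) is an assumption, as are the field names and the `order_rejections` function.

```python
from datetime import date

def order_rejections(order, cutoff, migrated_customers, migrated_products):
    """Return why an order is not a migratable unit (empty list = migrate)."""
    reasons = []
    if order["status"] != "Shipped":
        reasons.append("not shipped")
    if order["date"] < cutoff:
        reasons.append("older than window")
    if order["customer_id"] not in migrated_customers:
        reasons.append("customer not migrated")
    lines = order["lines"]
    # Assumed validity rule: every line references a migrated product
    # and has a positive quantity. One bad line rejects the whole order.
    if not lines or any(
        line["product_id"] not in migrated_products or line["quantity"] <= 0
        for line in lines
    ):
        reasons.append("invalid line")
    return reasons

cutoff = date(2021, 1, 1)
good = {
    "status": "Shipped",
    "date": date(2023, 3, 1),
    "customer_id": 1,
    "lines": [{"product_id": "P1", "quantity": 2}],
}
in_flight = dict(good, status="Picking")
one_bad_line = dict(good, lines=good["lines"] + [{"product_id": "P9", "quantity": 1}])

results = [
    order_rejections(o, cutoff, {1}, {"P1"})
    for o in (good, in_flight, one_bad_line)
]
```

Returning reasons rather than a bare true/false keeps the fall-out explainable to the Data Owner who has to accept the result.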
Equipment: Equipment is categorised by type. Large capital items and safety-critical equipment must be complete. Smaller items are migrated with a lower completeness threshold. Work Registers schedule work against Equipment, so Equipment must migrate first. Because a single piece of equipment can appear on more than one Work Register, there is a “daisy-chain” effect. In addition, the Activity Based Costing policy requires maintenance practices to be standardised. Together these make Equipment the most complex unit to define. The DQR sessions are structured around the Work Register to ensure each one is ready to migrate.
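One possible reading of the daisy-chain effect is that Work Registers sharing a piece of equipment are linked, so their readiness may need to be assessed together. Under that assumption, the sketch below groups registers into chains. The register names, equipment identifiers and the `daisy_chain_groups` function are hypothetical.

```python
from collections import defaultdict

def daisy_chain_groups(registers):
    """Group Work Registers that are linked through shared equipment.

    registers: dict mapping register name -> set of equipment ids.
    Returns a list of sets of register names, in first-seen order.
    """
    by_equipment = defaultdict(set)
    for name, items in registers.items():
        for eq in items:
            by_equipment[eq].add(name)

    seen, groups = set(), []
    for start in registers:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            name = stack.pop()
            if name in group:
                continue
            group.add(name)
            # follow every piece of equipment to the other registers using it
            for eq in registers[name]:
                stack.extend(by_equipment[eq] - group)
        seen |= group
        groups.append(group)
    return groups

registers = {
    "WR-A": {"pump-1", "crane-2"},
    "WR-B": {"crane-2", "van-3"},
    "WR-C": {"van-3"},
    "WR-D": {"drill-9"},
}
groups = daisy_chain_groups(registers)
```

Here WR-A and WR-C share no equipment directly, yet they end up in the same group through WR-B, which is the chaining behaviour that makes this unit awkward to bound.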
Workforce: Workforce is Employees plus their linking data (holidays, qualification types, and so on). Some items must be complete - date of birth, sex, work qualifications and licences - while others can be partial. DHGS uses the National Insurance number as the unique identifier for an Employee, linking to the finance applications. The precise definition of “complete” Workforce data is settled later, at Gap Analysis & Mapping.
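A provisional version of the Workforce rule might look like the following sketch. The field names, the `split_workforce` function and the decision to treat an empty list of licences as "present" are assumptions, and the mandatory list would change once the definition of "complete" is settled.

```python
# Provisional mandatory items, taken from the description above.
MANDATORY_FIELDS = (
    "ni_number", "date_of_birth", "sex", "work_qualifications", "licences",
)

def missing_mandatory(employee):
    """List mandatory fields that are absent, None or an empty string."""
    return [f for f in MANDATORY_FIELDS if employee.get(f) in (None, "")]

def split_workforce(employees):
    """Separate employees ready to migrate from those needing a DQR."""
    ready, needs_dqr, seen_ni = [], [], set()
    for e in employees:
        ni = e.get("ni_number")
        problems = missing_mandatory(e)
        if ni and ni in seen_ni:
            problems.append("duplicate ni_number")  # NI number must be unique
        if ni:
            seen_ni.add(ni)
        (needs_dqr if problems else ready).append((ni, problems))
    return ready, needs_dqr

employees = [
    {"ni_number": "QQ123456A", "date_of_birth": "1980-02-01", "sex": "F",
     "work_qualifications": ["HGV"], "licences": [], "holidays": None},
    {"ni_number": "QQ654321B", "date_of_birth": None, "sex": "M",
     "work_qualifications": [], "licences": []},
]
ready, needs_dqr = split_workforce(employees)
```

The first employee is ready even though holiday data is missing, because holidays are not in the mandatory list; the second is held back for the missing date of birth.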
Vehicle: All vehicles, leased and owned, migrate with their service history, drivers, and ownership status. A Supplier record is only essential for leased vehicles. Employees must be migrated first, and the employee references in the vehicle records must match the HR Employee data set - a DQR is raised to enforce this.
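The two Vehicle conditions, matching employee references and a supplier for leased vehicles, can be sketched as a simple pre-load check. The record layout and the `vehicle_issues` function are hypothetical, and the sketch assumes drivers are recorded as employee identifiers.

```python
def vehicle_issues(vehicle, hr_employee_ids, supplier_ids):
    """Return the problems that would stop this vehicle migrating as a unit."""
    issues = []
    # Every driver reference must exist in the HR Employee data set.
    unknown = [d for d in vehicle["driver_ids"] if d not in hr_employee_ids]
    if unknown:
        issues.append(f"unknown employee references: {unknown}")
    # A supplier is only essential when the vehicle is leased.
    if vehicle["ownership"] == "leased" and vehicle.get("supplier_id") not in supplier_ids:
        issues.append("leased vehicle has no migrated supplier")
    return issues

hr_employee_ids = {"E1", "E2"}
supplier_ids = {"S1"}

owned = {"ownership": "owned", "driver_ids": ["E1"], "supplier_id": None}
leased = {"ownership": "leased", "driver_ids": ["E2", "E7"], "supplier_id": None}

owned_issues = vehicle_issues(owned, hr_employee_ids, supplier_ids)
leased_issues = vehicle_issues(leased, hr_employee_ids, supplier_ids)
```

The owned vehicle passes without a supplier, while the leased one is flagged twice: once for the unmatched driver and once for the missing supplier.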
Financial and History Data: Financial data can only be loaded after a year end (a go-live restriction recorded in the System Retirement Plan). The precise definition of a complete financial record is investigated and documented there.
Transitional Business Processes
For any Unit of Migration that has in-flight transactions at go-live - orders raised but not shipped, equipment maintenance in progress - Transitional Business Processes must be defined. These are the manual or semi-manual procedures for handling the gap between when the legacy system closes and when the target system has the data it needs to operate.
Transitional Business Processes are captured in the Business Transformation Realisation (BTR) documents, covered in the analysis phase.
Units of Migration in the ETL
Units of Migration have a direct technical implementation in the ETL:
- The unit defines the extraction selection criteria
- Records that do not form a complete unit either fall out (exclusion rule) or trigger a DQR
- The load sequence implements the dependency order: Customer before Order, Equipment before Work Register
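The dependency order can be derived rather than hand-maintained. As a minimal sketch, the dependencies stated in the DHGS definitions can be fed to a topological sort using `graphlib` from the Python standard library (3.9 or later). The unit names used as keys and the `DEPENDS_ON` mapping are illustrative, and the year-end restriction on financial data is a timing constraint rather than a dependency, so it is not modelled here.

```python
from graphlib import TopologicalSorter

# Each unit maps to the units that must be loaded before it.
DEPENDS_ON = {
    "Customer": set(),
    "Product": set(),
    "Equipment": set(),
    "Employee": set(),
    "Order": {"Customer", "Product"},
    "Work Register": {"Equipment"},
    "Vehicle": {"Employee"},
}

# static_order() yields every unit after all of its prerequisites and
# raises graphlib.CycleError if the dependencies are circular.
load_order = list(TopologicalSorter(DEPENDS_ON).static_order())

for unit in load_order:
    print(unit)
```

Several valid orders exist for this graph, so the sketch only guarantees that prerequisites come first, not a particular sequence among independent units.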
Key Takeaways
- A Unit of Migration is the lowest level of granularity for which rollback is defined
- From a business view it is the object being migrated; from a technical view it is the atomicity boundary
- Units must be defined explicitly before the migration run, not discovered during it
- In-flight transactions at go-live require Transitional Business Processes, captured in the BTR
Book Reference
Practical Data Migration by Johny Morris (BCS, The Chartered Institute for IT): Chapter 7, “Metadata and Key Business Data Areas”.