Learning Objectives
By the end of this lesson you will be able to:
- Describe the end-to-end technical design approach used for the DHGS migration
- Explain the ETL architecture choices and how they reflected the data landscape
- Outline how the System Retirement Plan was constructed for DHGS
- Describe the fallback design and what conditions would trigger it
From Analysis to Design
The analysis phase produced a detailed picture of the data and its problems. The technical design phase translated that picture into a plan for how data would actually move - extracted from source, transformed according to the agreed rules, and loaded to the target.
For DHGS, the technical design had three main components:
- End-to-end design - the overall architecture of the migration, from sources to target
- ETL design - the detailed design of the extraction, transformation, and load processes
- Decommissioning and fallback design - the plan for retiring the legacy systems and for what would happen if the go-live had to be reversed
These are covered in sequence in this lesson.
End-to-End Architecture
The end-to-end architecture for DHGS was straightforward in principle, though not in execution. Data would be extracted from the legacy sources, processed through a transformation layer that applied the agreed business rules, and loaded to the target system in the Unit of Migration sequence established during scoping.
The architecture placed an explicit staging layer between extraction and load. This was a deliberate choice, not a convenience. The staging layer served several purposes:
- It provided a snapshot of source data at a defined point in time, independent of ongoing changes in the live source system
- It allowed the transformation rules to be run and re-run against stable data without touching the source
- It gave the test and reconciliation process something to work with that was not the live source
The staging layer also made it possible to run multiple load iterations - trial migrations - without needing to re-extract from source each time. Given the volume of the DHGS equipment dataset, re-extraction was not trivial.
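The idea of extracting once and transforming many times can be sketched in a few lines of Python. This is a minimal illustration only, not the DHGS implementation: the helper names (take_snapshot, run_trial_load) and the field names are assumptions made for the example.

```python
import copy
from datetime import datetime, timezone


def take_snapshot(source_rows):
    """Copy source rows into staging with an extraction timestamp (hypothetical helper)."""
    return {
        "extracted_at": datetime.now(timezone.utc),
        "rows": copy.deepcopy(source_rows),
    }


def run_trial_load(snapshot, transform):
    """Apply a transformation to copies of the staged rows; the snapshot is left unchanged."""
    return [transform(dict(row)) for row in snapshot["rows"]]


# Illustrative data only; the field names are assumptions.
live_source = [{"equipment_id": "E1", "type": "pump "}]
staging = take_snapshot(live_source)

# The live source keeps changing after extraction...
live_source[0]["type"] = "valve"

# ...but repeated trial loads work from the same stable snapshot.
trial_1 = run_trial_load(staging, lambda r: {**r, "type": r["type"].strip()})
trial_2 = run_trial_load(staging, lambda r: {**r, "type": r["type"].strip().upper()})
```

Both trial runs read the staged copy, so a change to the rules can be tested without a new extraction and without the result being affected by edits made in the live source in the meantime.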
ETL Design
The ETL design document for DHGS specified, for each data entity and each field:
- The source system and field
- The extraction method
- The transformation rule
- The target field
- The validation check to be applied after load
For the equipment data, the transformation rules were the most complex part of the design. They needed to encode:
- The equipment type remapping (from the DQR0012 mapping table)
- The status derivation logic (combining multiple source fields into a single target status)
- The location code standardisation (converting free-text descriptions to structured codes where a match existed)
- The business transformation changes from the BTR (reclassifications, merges, and renames)
- The migration policies from the MSG (age thresholds, site assignment requirements)
Each transformation rule was specified precisely enough that it could be implemented by a developer who had not been involved in the analysis - and tested by someone who had not been involved in the development. Ambiguous rules are one of the most reliable sources of defects in migration ETL, and the DHGS team invested time in making the specifications unambiguous.
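To show what "unambiguous" can look like in practice, the sketch below expresses three of the rule types as small Python functions. All lookup values, source field names, and status values are hypothetical; the actual DHGS mappings (for example the DQR0012 table) are not reproduced in this lesson.

```python
# Hypothetical lookup tables, standing in for the DQR0012 mapping table
# and the agreed location code list.
TYPE_REMAP = {"PMP": "PUMP", "VLV": "VALVE"}
LOCATION_CODES = {"north depot": "LOC-001", "main plant": "LOC-002"}


def remap_type(source_type):
    """Return the target equipment type, or None if no mapping exists."""
    return TYPE_REMAP.get(source_type.strip().upper())


def derive_status(in_service, scrapped):
    """Combine two assumed source flags into a single target status."""
    if scrapped:
        return "RETIRED"
    return "ACTIVE" if in_service else "INACTIVE"


def standardise_location(free_text):
    """Convert a free-text description to a structured code where a match exists."""
    return LOCATION_CODES.get(free_text.strip().lower())


def transform(row):
    """Apply all three rules to one source row (assumed field names)."""
    return {
        "equipment_type": remap_type(row["type"]),
        "status": derive_status(row["in_service"], row["scrapped"]),
        "location_code": standardise_location(row["location"]),
    }
```

Each function has one input contract and one defined result for the "no match" case, which is the property that lets a developer implement the rule and a tester verify it independently.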
The ETL Content Matrix was the document that held all of this - a structured record of every source-to-target mapping, its transformation rule, its validation logic, and its current status. This was a living document through the build phase, updated as rules were refined and as testing surfaced issues that required rule changes.
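One way to picture a matrix entry is as a structured record with one attribute per column. The sketch below is an assumption about shape only: the source field names (EQ_TYPE, LOC_DESC), the rule wording, and the status values are invented for illustration and are not taken from the DHGS matrix.

```python
from dataclasses import dataclass


@dataclass
class MatrixEntry:
    source_system: str
    source_field: str
    extraction_method: str
    transformation_rule: str
    target_field: str
    validation_check: str
    status: str = "DRAFT"  # assumed status values: DRAFT, AGREED, TESTED


# Two illustrative entries; field names and rule text are hypothetical.
matrix = [
    MatrixEntry("KP Equipment Table", "EQ_TYPE", "full extract",
                "remap via DQR0012 mapping table", "equipment_type",
                "target value is in the agreed type list", "AGREED"),
    MatrixEntry("KP Equipment Table", "LOC_DESC", "full extract",
                "standardise free text to location code", "location_code",
                "code exists in the location list"),
]


def entries_not_ready(entries, ready_statuses=("AGREED", "TESTED")):
    """List target fields whose rules are not yet agreed or tested."""
    return [e.target_field for e in entries if e.status not in ready_statuses]
```

A query such as entries_not_ready(matrix) gives a quick view of which mappings still need attention before the next trial run.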
Version and Release Management
One of the practical challenges in the DHGS ETL design was managing versions. As the analysis refined the transformation rules, and as business decisions resolved DQR items, the rules in the ETL Content Matrix changed. The ETL code had to match the current version of the matrix at all times.
The team applied formal version control to both the ETL Content Matrix and the ETL code. Each version was tagged and dated. Every run of the migration - whether a trial run or a final execution - was recorded against the version of the rules it used.
This mattered because it made defects traceable. When a test load produced unexpected results, the team could check which versions of the rules and the code the run had used, compare them with earlier runs, and so determine whether the issue was in the data, in the transformation rules, or in the code. Without version control, that diagnosis is guesswork.
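Recording each run against its versions can be sketched as a small run log. The structure and the helper below are assumptions for illustration; the lesson does not describe the tooling the DHGS team actually used.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MigrationRun:
    run_id: str
    run_type: str        # "trial" or "final"
    run_date: date
    matrix_version: str  # tagged version of the ETL Content Matrix
    code_version: str    # tagged version of the ETL code


def changed_between(earlier, later):
    """Report which versioned artefacts differ between two runs.

    An empty list means rules and code were identical, which points
    the investigation towards the data.
    """
    changed = []
    if earlier.matrix_version != later.matrix_version:
        changed.append("rules")
    if earlier.code_version != later.code_version:
        changed.append("code")
    return changed
```

Comparing a failing run with the last good run narrows the search: if only the matrix version changed, the rule change is the first suspect; if nothing changed, the difference is likely to be in the data.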
System Retirement Plan
The DHGS System Retirement Plan (SRP) was the formal document setting out how each legacy system would be decommissioned after go-live.
For each legacy system, the SRP recorded:
- The system name and owner
- The data it held that was in scope for migration
- The data it held that was not being migrated (and what would happen to it)
- The dependency - what needed to happen in the target system before decommissioning could begin
- The decommissioning date
- The data retention requirement - how long the system needed to remain accessible in read-only mode after it ceased to be the operational system of record
- Who was responsible for executing the decommissioning
The SRP was produced in two versions: a pre-migration version that recorded the plan, and a final version updated after go-live to record what actually happened.
For DHGS, the decommissioning of the KP Equipment Table was the most significant activity. The system was used operationally by field teams, who needed time to transition their working practices to the target system. The SRP specified a transition period during which both systems would remain available, with the target as the system of record and the legacy system accessible in read-only mode. At the end of the transition period, the legacy system was switched off.
The SRP also addressed data that was not migrating. Archive records that had been excluded by the migration policies were retained in the legacy system under a separate data retention policy. This was documented explicitly in the SRP so that the decommissioning team knew what they could and could not delete.
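The per-system record in the SRP can be pictured as a structured entry with a simple readiness check. This is a sketch under assumptions: the attribute names are invented, and the rule that decommissioning can begin once the dependency is met and the planned date has arrived is a simplification for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RetirementEntry:
    system_name: str
    owner: str
    migrated_data: list
    non_migrated_data: dict  # dataset -> what will happen to it
    dependency: str          # what must happen in the target first
    dependency_met: bool
    decommission_date: date
    read_only_until: date    # data retention requirement
    responsible: str


def can_begin_decommissioning(entry, today):
    """Assumed rule: the dependency is met and the planned date has arrived."""
    return entry.dependency_met and today >= entry.decommission_date
```

Holding the non-migrated data and its treatment in the same record as the decommissioning date keeps the retention obligations visible to whoever executes the plan.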
Fallback Design
Every migration that involves a hard cutover - a point at which the new system goes live and the old system ceases to be operational - needs a fallback plan. The DHGS fallback design addressed the question: if we go live and discover a critical problem, what do we do?
The fallback design defined:
- Fallback triggers - the conditions under which a decision to fall back would be made. For DHGS, these included: failure of the load process to complete within the cutover window; failure of critical validation checks at a defined threshold; a business-critical function being found non-operational within the first hours of go-live.
- Fallback decision authority - who had the authority to call a fallback. This was agreed in advance because the stress of a go-live leaves no time to work out who is entitled to make the decision.
- Fallback procedure - the sequence of steps to revert to the legacy system if fallback was triggered. This included confirming that the legacy system had been kept live and up to date through the cutover window (it had), switching operational processes back to the legacy system, and communicating to users.
- Fallback window - the period after go-live within which a fallback was feasible. Beyond that window, the legacy system’s data would have diverged from the target to the point where reverting would create more problems than it solved.
The fallback design was not a sign that the team expected to fail. It was a risk management measure that gave the project sponsor and the business the confidence to proceed with a hard cutover, knowing that a safe route back existed if it was needed.
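The trigger conditions and the fallback window lend themselves to an explicit check. The sketch below is illustrative only: the threshold is passed in as a parameter because the lesson does not state the DHGS value, and all names are assumptions. The decision itself remained with the agreed decision authority; code like this would only inform it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class CutoverState:
    load_finished_at: Optional[datetime]  # None while the load is still running
    cutover_window_end: datetime
    critical_checks_failed: int
    critical_function_down: bool


def fallback_triggers(state, now, max_failed_checks):
    """Return the trigger conditions that currently apply (empty list = none)."""
    triggers = []
    # An unfinished load is judged against the current time.
    finish = state.load_finished_at if state.load_finished_at is not None else now
    if finish > state.cutover_window_end:
        triggers.append("load not completed within cutover window")
    if state.critical_checks_failed > max_failed_checks:
        triggers.append("critical validation failures above threshold")
    if state.critical_function_down:
        triggers.append("business-critical function non-operational")
    return triggers


def fallback_feasible(go_live, now, fallback_window):
    """Fallback is only an option while still inside the agreed window."""
    return now - go_live <= fallback_window
```

Writing the triggers down in this form forces the thresholds and the window to be agreed as concrete values before go-live rather than debated during it.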
Key Takeaways
- The DHGS end-to-end architecture used a staging layer to decouple extraction from transformation and load, enabling repeated trial runs and stable test data
- The ETL design for equipment records encoded the equipment type remapping, status derivation logic, location standardisation, business transformation changes, and migration policies in a single ETL Content Matrix
- Formal version control of both the rules and the ETL code made defects traceable and prevented version drift between the specification and the implementation
- The System Retirement Plan specified how each legacy system would be decommissioned, who was responsible, what data would be retained, and the timeline for doing so
- The fallback design defined trigger conditions, decision authority, the reversal procedure, and the window within which fallback was feasible - giving the business the confidence to commit to a hard cutover