Data Migration Process
The tools and data migration process for each project will differ depending on its complexity and scale. The correct approach will be identified from the outset, setting expectations of time and cost. Our preferred approach is laid out below and often removes the need for full migration software suites such as Informatica, Talend or Pandora.
1. Landscape Analysis
This is the discovery, review and documentation of the legacy data stores, including their linkages, data quality and key data stakeholders. It should cover every possible legacy data store, giving an understanding of the legacy environment and, through field and table analysis, of what data to migrate.
This will involve initial meetings to discuss and create the brief, which will define the size and life cycle of the project and, ultimately, where CCR can add value.
2. Gap Analysis and Mapping
This is a reality check: the opportunity to identify gaps between the source and target systems and to understand what is business critical. It will cover gaps in data, topology, the target model and the mapping of legacy data to the migration model, with suggestions for the correct mapping, and will identify the Demilitarised Zone (DMZ, see later).
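A field-level gap check of this kind can be sketched in a few lines of Python. This is a minimal illustration only: the field names, the mapping and the `find_gaps` helper are all hypothetical, and a real project would take its inventories from the landscape analysis.

```python
# Hypothetical field inventories for one legacy table and its target table.
source_fields = {"cust_id", "full_name", "postcode", "fax_number"}
target_fields = {"customer_id", "full_name", "postcode", "email"}

# Hypothetical agreed mapping from legacy field names to target field names.
mapping = {
    "cust_id": "customer_id",
    "full_name": "full_name",
    "postcode": "postcode",
}


def find_gaps(source_fields, target_fields, mapping):
    """Return (legacy fields with no mapping, target fields nothing maps to)."""
    unmapped_source = sorted(source_fields - set(mapping))
    unfilled_target = sorted(target_fields - set(mapping.values()))
    return unmapped_source, unfilled_target


unmapped_source, unfilled_target = find_gaps(source_fields, target_fields, mapping)
print("Legacy fields with no target:", unmapped_source)
print("Target fields with no source:", unfilled_target)
```

Each list is a prompt for a business decision: drop the legacy field, extend the target model, or agree a default value.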
3. Migration Design & Execution (MDE)
This will identify what form the migration will take (big bang, phased, parallel or always up), with a fall-back plan: what is the contingency should the migration not happen? It will also identify the Extract, Transform and Load phases.
The extract phase pulls the data from the source system or systems. It normally consolidates data from a number of different systems and areas, each of which may well use a different data format.
The goal of the extraction phase is to convert the data into a single format which is appropriate for the transformation processing. Poor data quality can delay the data migration, so data cleansing is a vital part of the process. The cleansing of the extracted data is followed by quality control to ensure that the data meets the expected pattern or structure. If not, the data may be rejected entirely or in part.
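As a rough sketch of the extraction and quality control steps described above, the Python below reads two hypothetical extracts (one CSV, one JSON), converts both into a single record format and rejects records that do not match an expected pattern. The sample data, field names and the simplified postcode pattern are assumptions made for illustration.

```python
import csv
import io
import json
import re

# Hypothetical extracts from two legacy systems that use different formats.
CSV_EXTRACT = "cust_id,postcode\n101,SW1A 1AA\n102,not a postcode\n"
JSON_EXTRACT = '[{"id": 201, "post_code": "M1 1AE"}]'

# Simplified illustrative pattern, not a complete UK postcode validator.
POSTCODE_PATTERN = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}$")


def extract_csv(text):
    """Yield records from the CSV extract in the common format."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {"customer_id": int(row["cust_id"]), "postcode": row["postcode"]}


def extract_json(text):
    """Yield records from the JSON extract in the common format."""
    for item in json.loads(text):
        yield {"customer_id": int(item["id"]), "postcode": item["post_code"]}


def quality_control(records):
    """Split records into those that match the expected pattern and those that do not."""
    accepted, rejected = [], []
    for record in records:
        if POSTCODE_PATTERN.match(record["postcode"]):
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


records = list(extract_csv(CSV_EXTRACT)) + list(extract_json(JSON_EXTRACT))
accepted, rejected = quality_control(records)
print("Accepted:", accepted)
print("Rejected:", rejected)
```

Keeping the rejected records, rather than discarding them, gives the data owners a list to cleanse and resubmit.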
The transform stage applies a series of rules or functions to allow the extracted data to be loaded into the end target. Some data sources will require very little or even no manipulation. In other cases, one or more transformation types may be required to meet the business and technical needs. Data cleansing and data enrichment can also take place at this stage.
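One simple way to picture "a series of rules or functions" is an ordered list of small functions applied to each record in turn. The sketch below assumes hypothetical field names and three illustrative rules; it is not a description of any particular tool.

```python
def strip_whitespace(record):
    """Trim leading and trailing whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def normalise_postcode(record):
    """Store postcodes in upper case."""
    return {**record, "postcode": record["postcode"].upper()}


def split_name(record):
    """Replace the legacy full_name field with first_name and last_name."""
    first, _, last = record["full_name"].partition(" ")
    out = {k: v for k, v in record.items() if k != "full_name"}
    out["first_name"] = first
    out["last_name"] = last
    return out


# Hypothetical rule list; the order matters, so whitespace is stripped first.
TRANSFORM_RULES = [strip_whitespace, normalise_postcode, split_name]


def transform(record, rules=TRANSFORM_RULES):
    """Apply each rule in order and return the transformed record."""
    for rule in rules:
        record = rule(record)
    return record


legacy = {"full_name": "  Ada Lovelace ", "postcode": "sw1a 1aa"}
transformed = transform(legacy)
print(transformed)
```

A source that needs no manipulation can simply be given an empty rule list.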
The load phase inputs the data into the end target database. This process can vary widely depending on the agreed requirements. This may overwrite existing data with cumulative information. Updating can be done live, hourly, daily, weekly or monthly.
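The overwrite behaviour mentioned above can be illustrated with Python's built-in `sqlite3` module and an in-memory database. The `customer` table, its columns and the sample rows are hypothetical, and `INSERT OR REPLACE` is just one possible way to overwrite existing rows; the agreed requirements may call for something different.

```python
import sqlite3

# In-memory database standing in for the end target (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, postcode TEXT)"
)
# A row that already exists in the target before the load runs.
conn.execute("INSERT INTO customer VALUES (101, 'OLD 1AA')")


def load(conn, records):
    """Insert new records and overwrite existing ones that share a primary key."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.executemany(
            "INSERT OR REPLACE INTO customer (customer_id, postcode) "
            "VALUES (:customer_id, :postcode)",
            records,
        )


load(
    conn,
    [
        {"customer_id": 101, "postcode": "SW1A 1AA"},
        {"customer_id": 201, "postcode": "M1 1AE"},
    ],
)
rows = conn.execute(
    "SELECT customer_id, postcode FROM customer ORDER BY customer_id"
).fetchall()
print(rows)
```

Running the load inside a single transaction means a failed batch leaves the target unchanged, which fits naturally with a fall-back plan.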
7. System Retirement Plan
These are the requirements for the legacy data store to be decommissioned.
We don’t just leave once the data has been imported. We need to make sure that it’s all working as it should, so we carry out full testing, debugging and handover routines.
These processes are supported by:
Data Quality Rules
This is a centralised list of all of the rules about particular fields, subset criteria and transformations. Generally the first DQR identifies the problem, the second analyses the data and the third fixes it. They form a contract between the business and the technicians about what constitutes quality data and how to go about securing it.
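A centralised rule list of this kind could be held as simple structured data. The sketch below is one possible shape, not a prescribed format: the `DataQualityRule` class, the rule identifiers and the two example rules are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DataQualityRule:
    """One entry in the centralised rule list (hypothetical structure)."""
    rule_id: str
    field: str
    description: str
    check: Callable[[object], bool]                   # identifies the problem
    fix: Optional[Callable[[object], object]] = None  # optional automatic fix


# Hypothetical register agreed between the business and the technicians.
DQR_REGISTER = [
    DataQualityRule(
        "DQR-001", "email", "Email must contain '@'",
        check=lambda v: isinstance(v, str) and "@" in v,
    ),
    DataQualityRule(
        "DQR-002", "postcode", "Postcode must be upper case",
        check=lambda v: isinstance(v, str) and v == v.upper(),
        fix=lambda v: str(v).upper(),
    ),
]


def apply_rules(record, rules=DQR_REGISTER):
    """Return a cleaned copy of the record and the IDs of rules it still fails."""
    cleaned = dict(record)
    failures = []
    for rule in rules:
        value = cleaned.get(rule.field)
        if rule.check(value):
            continue
        if rule.fix is not None and value is not None:
            cleaned[rule.field] = rule.fix(value)
            if rule.check(cleaned[rule.field]):
                continue
        failures.append(rule.rule_id)
    return cleaned, failures


cleaned, failures = apply_rules({"email": "ada.example.com", "postcode": "m1 1ae"})
print(cleaned, failures)
```

Rules without an automatic fix, such as the email check here, produce a failure list that can be passed to the data owners for a decision.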
Demilitarised Zone (DMZ) and key Data Stakeholders Management
The DMZ is the dividing line between the responsibilities of the project side and those of the supplier side. Some activities can be shared between the two parties, for example end-to-end testing and DQRs. Data owners must be contactable and empowered to make judgements and decisions.