Data validation ensures that the migrated data is accurate, complete, and consistent with the source system. It helps detect data loss, corruption, or transformation errors before final deployment.
* Verify that all records from the source exist in the target system.
* Compare row counts in source and destination databases.
* Example SQL queries for row count comparison:
-- Total rows in the source system
SELECT COUNT(*) FROM source_table;
-- Total rows in the target system after migration
SELECT COUNT(*) FROM target_table;
* If numbers don’t match, investigate missing or extra records.
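* One way to narrow down a mismatch is to compare grouped counts; created_date here is a hypothetical column assumed to exist in both tables, so substitute any column the two schemas share:
SELECT created_date, COUNT(*) FROM source_table GROUP BY created_date;
SELECT created_date, COUNT(*) FROM target_table GROUP BY created_date;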
* Ensure that data values in the target system match the source.
* Check for truncated fields, missing characters, or altered data types.
* Sample SQL to verify specific field values:
-- Rows present in the source but missing (or different) in the target
SELECT id, column_name FROM source_table
EXCEPT
SELECT id, column_name FROM target_table;
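* Running the same comparison in the opposite direction catches rows that exist only in the target, such as duplicates introduced during migration:
SELECT id, column_name FROM target_table
EXCEPT
SELECT id, column_name FROM source_table;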
* Use checksum or hash functions to compare datasets:
-- PostgreSQL example; order the rows deterministically so the two hashes are comparable
SELECT MD5(string_agg(column_name, ',' ORDER BY id)) FROM source_table;
SELECT MD5(string_agg(column_name, ',' ORDER BY id)) FROM target_table;
* Validate relationships and foreign keys between tables.
* Ensure referential integrity (e.g., no orphaned records).
* Example SQL query for foreign key validation:
-- Child rows whose parent_id has no matching parent (orphaned records)
SELECT child_table.id
FROM child_table
LEFT JOIN parent_table ON child_table.parent_id = parent_table.id
WHERE parent_table.id IS NULL;
* Compare control totals such as sums and averages of financial and other numerical data.
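* Example SQL for an aggregate comparison (amount is a placeholder for a numeric column in your schema):
SELECT COUNT(*) AS row_count, SUM(amount) AS total_amount, AVG(amount) AS avg_amount FROM source_table;
SELECT COUNT(*) AS row_count, SUM(amount) AS total_amount, AVG(amount) AS avg_amount FROM target_table;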
* Test if queries on the target system perform as expected.
* Compare response times between old and new systems.
* Identify slow queries that may indicate indexing or schema issues.
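* Example (PostgreSQL; other databases offer comparable EXPLAIN facilities): run a representative query on both systems and compare the reported plans and execution times. The filter below is only illustrative:
EXPLAIN ANALYZE
SELECT * FROM target_table WHERE column_name = 'sample_value';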
* Involve end-users to test real-world scenarios in the migrated system.
* Validate that reports, dashboards, and applications function correctly.
* Check that applications retrieve and display data correctly through their user interfaces.
* ETL Testing Tools: Informatica, Talend, Apache NiFi
* Database Comparison Tools: dbForge, Redgate SQL Data Compare
* Data Validation Scripts: Python, SQL queries
* Cloud-Based Tools: AWS Glue, Google Cloud Dataflow