Eliminated Data Latency and Optimized Cost by migrating Teradata, Hadoop and AbInitio ETL to Databricks on Microsoft

January 11, 2024 | Case Study

Challenges: The client was facing data challenges while running their legacy Teradata and Hadoop environments with high operational costs.


Our client, a Fortune 500 US-based global technology company, faced multiple challenges with their existing on-premises Teradata and Hadoop systems. They were determined to address these challenges while migrating to a modern, cost-effective Databricks on Azure environment. The primary challenges were:

  • Operational Costs: The legacy Teradata and Hadoop systems carried very high operational costs. The client wanted to migrate to a modern, scalable, and cost-efficient Databricks on Azure environment to gain better control over those costs.
  • Data Availability: The client faced data availability challenges and wanted data to be consistently available and accessible to meet the evolving analytical needs of the business.
  • Data Latency: This was one of the client’s major concerns, as data latency in the legacy environment ran to multiple hours. The client wanted data to be available in real time or near real time for analysis and decision-making.
  • Data Quality: The client also aimed to improve overall data quality by eliminating inconsistencies and errors in data sources, ensuring accurate insights.

Solutions:

Datametica stepped in to address the client’s challenges by orchestrating a seamless migration of their on-premises Teradata and Hadoop environments to Databricks on Azure. Here’s a breakdown of the solution provided:

  • Datametica used Eagle, its data assessment and migration planning product, to perform a detailed analysis of the existing Teradata and Hadoop data warehouses. This assessment helped in understanding the data pipelines, complexities, data flow patterns, table structures, data lineage, and the overall design of the existing system.
  • Eagle also supported detailed migration planning, covering the migration approach, milestones, and testing strategy.
  • Raven, the automated code conversion service, converted the existing database objects, including 18,000 tables, 15,000 views, 4,000 stored procedures, 3,000 Hive jobs, 200 Impala jobs, and 5,000 other scripts such as BTEQs and functions, to Databricks and Azure Synapse.
  • Raven also automatically converted 7,000 Ab Initio jobs, including 208 Ab Initio PSETs, 86 Ab Initio graphs, and 21 Ab Initio plans, to Databricks/Azure native components.
  • Datametica set up orchestration and scheduling for the workloads converted by Raven using Autosys and Azure Data Factory. Existing job orchestrations, dependencies, and schedules were implemented as-is to maintain continuity in data analytics and reporting.
  • Connectivity with Power BI was established to meet the client’s data consumption requirements. This integration enabled the client to continue using their preferred reporting and analytics tool seamlessly in the new environment.
  • Datametica’s AI-powered automated data validation technology, Pelican, was used for automated cell-level data validation. Pelican ensured that data integrity and accuracy were maintained throughout the migration process.
  • Parallel runs were conducted during the migration using Datametica’s Pelican. This approach allowed for continuous validation of data, so that any discrepancies or issues were addressed promptly and disruption to the client’s operations during the cloud migration was minimized.
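
To make the idea of cell-level validation during a parallel run more concrete, here is a minimal Python sketch. It is not Pelican’s implementation, which this case study does not describe. The function name compare_cells, the key column "id", and the sample rows are all hypothetical and chosen only for illustration. The sketch matches rows from a legacy extract and a migrated extract by key, then reports every cell whose values differ.

```python
from typing import Any

Row = dict[str, Any]
Mismatch = tuple[Any, str, Any, Any]


def compare_cells(source_rows: list[Row], target_rows: list[Row], key: str) -> list[Mismatch]:
    """Compare two row sets cell by cell (hypothetical helper, not a Pelican API).

    Returns (key_value, column, source_value, target_value) for every difference.
    A row present on only one side is reported once with the column "<row>".
    """
    source_by_key = {row[key]: row for row in source_rows}
    target_by_key = {row[key]: row for row in target_rows}
    mismatches: list[Mismatch] = []

    # Walk the union of keys so rows missing on either side are caught.
    for k in sorted(source_by_key.keys() | target_by_key.keys()):
        src = source_by_key.get(k)
        tgt = target_by_key.get(k)
        if src is None or tgt is None:
            mismatches.append((
                k,
                "<row>",
                "present" if src is not None else "missing",
                "present" if tgt is not None else "missing",
            ))
            continue
        # Compare every column that appears on either side.
        for col in sorted(src.keys() | tgt.keys()):
            if src.get(col) != tgt.get(col):
                mismatches.append((k, col, src.get(col), tgt.get(col)))
    return mismatches


# Illustrative data only: one changed cell and one row missing after migration.
legacy_rows = [
    {"id": 1, "amount": 10.0, "region": "US"},
    {"id": 2, "amount": 20.0, "region": "EU"},
    {"id": 3, "amount": 5.0, "region": "US"},
]
migrated_rows = [
    {"id": 1, "amount": 10.0, "region": "US"},
    {"id": 2, "amount": 25.0, "region": "EU"},
]

for mismatch in compare_cells(legacy_rows, migrated_rows, key="id"):
    print(mismatch)
```

In a parallel run, a check of this kind would be repeated after each load on both systems. At warehouse scale the comparison would normally be pushed down to the engines themselves rather than done in memory as shown here.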

Client Benefits: Improved data quality, operational cost optimization, seamless reporting, efficient data transformation

Datametica’s expertise in migrating Teradata and Hadoop to Databricks on Azure not only resolved the client’s existing challenges but also set them up for a modern, scalable, and cost-effective data warehousing solution with improved data quality and availability. This transformation also helped the client optimize their operational costs.

  • Resolved 100% of the data latency, data quality, and data availability issues by moving to the modern Databricks on Azure cloud environment.
  • Significantly improved control over operational costs, optimizing running costs for the client.
  • 100% efficiency in data migration and a 40% faster ramp to the cloud with Datametica’s automated migration suite.
  • Achieved 100% confidence in the accuracy and integrity of the migrated data with Pelican’s data validation.
