Unified Data Warehouse Implementation & Oracle and Hadoop Migration to GCP

December 17, 2021 | April 4th, 2022 | Case Study

Challenges:

A security services provider in the US faced challenges with its on-premise Oracle and Cloudera Hadoop environment, which led to excessive query and code execution times and higher operating costs.


Objective:

The client wanted to build a Unified Data Warehouse on GCP that leverages cloud-native technologies, lowers operational costs, and offers better risk mitigation than its current on-prem solutions. To facilitate this, the client also wanted to migrate the existing data warehouse functionality from on-premise Oracle and Cloudera Hadoop to GCP.

Another objective was to enable end users to run SQL queries on Google BigQuery even before the actual application migration.

Solutions:

  • Datametica helped the client migrate the historical and incremental loads of its on-prem Oracle and Cloudera Hadoop databases to the Google Cloud Platform.
  • Datametica performed a one-time history load: data was exported from the source system to a GCS bucket and then loaded into Google BigQuery.
  • Converted Oracle SQL code to Google BigQuery SQL using Raven, Datametica’s automated workload conversion tool.
  • Compared the Google BigQuery data with the source Oracle data using Pelican, Datametica’s automated validation tool, to ensure data integrity and quality.
  • Also used Pelican to validate the data loaded into Google BigQuery through a field-level comparison with the data on the on-prem Hadoop platform.
  • Implemented the existing database model with an optimal configuration of partition and cluster keys for cost optimization.
  • Implemented best practices suited to the Google BigQuery platform for efficient, performant execution.
  • Performed a comparative analysis of execution times in Google BigQuery against on-prem Oracle execution times.
  • Established IAM roles, firewalls, and VPCs for the GCP projects.
  • Leveraged Google-native tools such as Transfer Appliance, Google Cloud Storage, Google BigQuery, Cloud Dataproc, Cloud Composer, Stackdriver, and Cloud Dataflow to deliver a complete Google Cloud implementation.
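
To illustrate what a field-level comparison between source and target data involves, here is a minimal Python sketch. It is not Pelican's actual method, which this case study does not describe. The function name `compare_fields`, the row layout, and the `id` key are assumptions made for illustration only.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Mismatch = Tuple[Any, str, Any, Any]


def compare_fields(source_rows: List[Row], target_rows: List[Row], key: str) -> List[Mismatch]:
    """Compare two row sets field by field, matching rows on `key`.

    Returns (key_value, column, source_value, target_value) tuples.
    Rows that exist on only one side are reported with column "<row>".
    Hypothetical helper for illustration; not a Pelican API.
    """
    source_by_key = {row[key]: row for row in source_rows}
    target_by_key = {row[key]: row for row in target_rows}
    mismatches: List[Mismatch] = []

    # Walk the union of keys so rows missing on either side are caught.
    for key_value in sorted(source_by_key.keys() | target_by_key.keys()):
        src = source_by_key.get(key_value)
        tgt = target_by_key.get(key_value)
        if src is None or tgt is None:
            mismatches.append((
                key_value,
                "<row>",
                "present" if src is not None else "missing",
                "present" if tgt is not None else "missing",
            ))
            continue
        # Compare every column that appears on either side.
        for column in sorted(src.keys() | tgt.keys()):
            if src.get(column) != tgt.get(column):
                mismatches.append((key_value, column, src.get(column), tgt.get(column)))
    return mismatches


if __name__ == "__main__":
    oracle_rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
    bigquery_rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]
    for mismatch in compare_fields(oracle_rows, bigquery_rows, key="id"):
        print(mismatch)
```

An in-memory comparison like this only suits small samples. At warehouse scale, a validation tool would more plausibly push the comparison down to the databases, but that is a design assumption and not something this case study confirms.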

The Result:

  • Reduced scan cost by implementing partitioning and clustering.
  • Reduced execution time by 85%, from 70 minutes to 8 minutes.
  • Reduced cloud migration cost by 65%.
  • Ensured 100% data accuracy via Pelican.
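
As a rough illustration of the partitioning and clustering mentioned above, the following Python sketch builds a BigQuery `CREATE TABLE` statement with `PARTITION BY` and `CLUSTER BY` clauses. The table name, the column names, and the helper `build_partitioned_table_ddl` are hypothetical. The client's actual schema and key choices are not given in this case study, and the sketch assumes the partition column is a TIMESTAMP.

```python
from typing import Sequence, Tuple


def build_partitioned_table_ddl(
    table: str,
    columns: Sequence[Tuple[str, str]],
    partition_column: str,
    cluster_columns: Sequence[str],
) -> str:
    """Build BigQuery DDL for a date-partitioned, clustered table.

    Assumes `partition_column` is a TIMESTAMP column, hence DATE(...).
    Hypothetical helper for illustration only.
    """
    # BigQuery accepts between one and four clustering columns.
    if not 1 <= len(cluster_columns) <= 4:
        raise ValueError("BigQuery supports 1 to 4 clustering columns")
    column_defs = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n  {column_defs}\n)\n"
        f"PARTITION BY DATE({partition_column})\n"
        f"CLUSTER BY {', '.join(cluster_columns)}"
    )


if __name__ == "__main__":
    ddl = build_partitioned_table_ddl(
        table="analytics.events",  # hypothetical dataset.table
        columns=[("event_ts", "TIMESTAMP"), ("customer_id", "STRING"), ("amount", "NUMERIC")],
        partition_column="event_ts",
        cluster_columns=["customer_id"],
    )
    print(ddl)
```

Queries that filter on the partition column let BigQuery skip partitions outside the filter range, and clustering co-locates rows with similar key values. Both reduce the bytes scanned, which is what on-demand query pricing is based on.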

