Data Warehouse and Analytics Migration to Microsoft Azure Cloud

Customer

The end Customer is a North-American energy corporation focusing on oil and gas production.

Challenge

Though having a robust analytical system on-premises, the end Customer was looking for a solution that would allow them to process and analyze the continuously growing amount of data at high speed and eliminate the burden of maintaining costly hardware.

Thus, the end Customer was searching for an experienced vendor, who could migrate their data analytics solution to Microsoft Azure maintaining identical to their on-premises solution architecture quickly and with no interference to their business workflows.

Solution

The end Customer’s data analytics solution consisted of the following components:

  • Data sources (Microsoft SQL Server 2012, Oracle Database, Excel files, etc.).
  • ETL layer (more than 600 SQL Server Integration Services (SSIS) packages).
  • Data landing area (storing raw data).
  • Data staging area (storing pre-processed data).
  • Data warehouse (storing processed data ready for analysis).
  • Analytical server (12 Microsoft SQL Server Analysis Services (SSAS) OLAP cubes).
  • End-user layer (Excel and SQL Server Reporting Services (SSRS)).

ScienceSoft’s experts started the migration with converting SSIS packages into Azure Data Factory (ADF) pipelines and moving data from on-premises data sources to the Azure Synapse Analytics landing area in the ‘as is’ state. To ensure high speed and throughput while data migration, the developers used the Azure PolyBase technology, which required moving data to the temporary repository (Azure Blob Storage) first and then migrating data to the landing area.

After the landing area, the data was moved to the Azure Synapse Analytics staging area undergoing certain transformation and cleansing processes. Then, the data was migrated to the central data repository – Azure Synapse Analytics. To maintain the maximum similarity with the on-premises solution architecture, ScienceSoft’s experts re-engineered the code for the ETL processes to facilitate creating separate schemas to store data from different on-premises databases in one data warehouse.

The next step was to move 12 SSAS OLAP cubes to Azure. As all OLAP cubes had the multidimensional semantic data model, which Azure Synapse Analytics did not support, the SSAS cubes were stored on Azure Virtual Machines.

Finally, ScienceSoft’s database developers set up the data loading processes, identical to those applied on-premises, to facilitate automatic incremental data loading with scheduled triggers as well as running ADF pipelines on-demand manually.

Results

The end Customer got their on-premises data analytics solution moved to Microsoft Azure within the set time and budget. With the implemented solution, they obtained a cost-effective, scalable and easy to maintain data analytics solution.

Technologies and Tools

Microsoft SQL Server, Microsoft SQL Server Analysis Services, Azure Synapse Analytics, Azure Data Factory, Azure Blob Storage, PolyBase, Azure DevOps, Azure Virtual Machines, U-SQL, T-SQL, Python, .NET, JSON.

MORE CASE STUDIES
COVID-19 – An update to our clients
In the uncertain time of Coronavirus (COVID-19) outbreak, I want to assure you that ScienceSoft remains fully operational and dedicated to supporting the continuity of our customers’ businesses. Most of ScienceSoft’s employees work remotely, and we’re equipped to provide our services in new conditions, with no impact on the quality of service or communication.
In the uncertain time of Coronavirus (COVID-19) outbreak, I want to assure you that ScienceSoft remains fully operational and dedicated to supporting the continuity of our customers’ businesses. Most of ScienceSoft’s employees work remotely, and we’re equipped to provide our services in new conditions, with no impact on the quality of service or communication.
Stay safe and healthy,
Nikolay Kurayev,
Chief Executive Officer at ScienceSoft