Enterprise Data Warehouse: Overview

Enterprise Data Warehouse (EDW)

ScienceSoft has been rendering a full range of data warehousing services since 2005.

Enterprise data warehouse: core concepts

An enterprise data warehouse (EDW) is a central, structured, subject-oriented repository in which all of a company’s business data is consolidated and stored for subsequent analysis and reporting.

Data warehouse vs Enterprise data warehouse

DWH

  • Stores data for particular business units
  • Answers department-specific questions

Enterprise DWH

  • Consolidates and stores data for all business units
  • Answers enterprise-level and department-specific questions

Enterprise data warehouse architecture

An EDW is a core element of a BI solution, which consists of the following layers:

A data source layer

– data from internal and external data sources – ERP, CRM, accounting and financial software, IoT devices, social media, etc.

A staging area

– a temporary intermediate storage area where data is processed as part of the extract, transform, load (ETL) process. ETL consolidates data from multiple sources and transforms it into a modeled format suitable for storing in the enterprise DWH. Cloud-based EDWs, thanks to their scalability, often use ELT (extract, load, transform) instead, in which the transformation step is performed after the data is loaded into the enterprise data warehouse.
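
The difference between ETL and ELT described above can be illustrated with a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for a warehouse; the table names (stg_orders, fact_orders), the sample rows and the helper functions are hypothetical and chosen only for illustration, not taken from any particular EDW platform.

```python
import sqlite3

# Hypothetical raw records as they might arrive from two source systems.
RAW_ORDERS = [
    ("crm", " Alice ", "120.50"),
    ("erp", "BOB", "80"),
    ("erp", "alice", "19.50"),
]


def etl(conn):
    """ETL: transform in application code (the staging step), then load clean rows."""
    cleaned = [
        (source, name.strip().lower(), float(amount))
        for source, name, amount in RAW_ORDERS
    ]
    conn.execute("CREATE TABLE fact_orders (source TEXT, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", cleaned)


def elt(conn):
    """ELT: load raw rows first, then transform inside the warehouse with SQL."""
    conn.execute("CREATE TABLE stg_orders (source TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", RAW_ORDERS)
    conn.execute(
        """
        CREATE TABLE fact_orders AS
        SELECT source,
               lower(trim(customer)) AS customer,
               CAST(amount AS REAL) AS amount
        FROM stg_orders
        """
    )


def totals_by_customer(conn):
    """The same analytical query works regardless of how the data was prepared."""
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM fact_orders "
        "GROUP BY customer ORDER BY customer"
    )
    return rows.fetchall()


etl_conn = sqlite3.connect(":memory:")
etl(etl_conn)
elt_conn = sqlite3.connect(":memory:")
elt(elt_conn)

print(totals_by_customer(etl_conn))
print(totals_by_customer(elt_conn))
```

Both paths end with the same fact_orders content; the only difference is where the cleaning happens. In the ELT variant the raw rows also remain available in stg_orders, which is one reason the approach suits platforms with scalable compute.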

Data storage layer

– a centralized storage where data is made accessible for analytics (querying, reporting) and sharing.

Analytics and BI

– analytics, data mining, data reporting and visualization tools.

Figure: Architecture of an enterprise data warehouse with a staging area.

EDW key features

Core features

  • Enterprise-wide data integration from internal and external data sources.
  • Controlled data loading and data management procedures.
  • Storing of historical and non-volatile data.
  • Subject-oriented data repository.
  • Integration with analytics and reporting software.
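
As a rough illustration of the "historical and non-volatile data" feature listed above, the following Python sketch keeps every version of a record instead of overwriting it, so that past states can still be queried. It is a minimal sketch using the standard-library sqlite3 module; the customer_history table, its columns and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_history "
    "(customer_id INTEGER, segment TEXT, valid_from TEXT)"
)


def record_change(customer_id, segment, valid_from):
    """Append a new version instead of updating or deleting an existing row."""
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, segment, valid_from),
    )


def segment_as_of(customer_id, date):
    """Return the segment that was valid for the customer on the given ISO date."""
    row = conn.execute(
        """
        SELECT segment FROM customer_history
        WHERE customer_id = ? AND valid_from <= ?
        ORDER BY valid_from DESC
        LIMIT 1
        """,
        (customer_id, date),
    ).fetchone()
    return row[0] if row else None


record_change(1, "standard", "2021-01-01")
record_change(1, "premium", "2022-06-01")

print(segment_as_of(1, "2021-12-31"))  # standard
print(segment_as_of(1, "2023-01-01"))  # premium
```

Because rows are only ever appended, a report run for an earlier date returns the same result no matter how many changes were loaded afterwards.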

Advanced storage

  • Big data integration.
  • Real-time data warehousing (integration of sensor data, log file data, social media data, etc.).
  • Storing raw data.
  • Storage in multiple environments (cloud, on-premises, hybrid).

Operations

  • Instant scalability.
  • Automation of EDW maintenance tasks – backups, replication, patching, etc.
  • Granular access control.
  • Federated queries.

Important EDW integrations

To achieve maximum effectiveness, an EDW should integrate data from the company’s business-critical systems and external data sources.

Figure: Important EDW integrations.

To see the value a company gets from integrating comprehensive data from various sources, have a look at one of ScienceSoft’s projects. We helped a producer of phytotherapy products consolidate their disparate data sources into a unified central storage to enable company-wide reporting, performance benchmarking, etc.

Factors leading to EDW success

  • A clear link between the EDW solution and business objectives, with EDW capabilities economically justified in terms of their business value.
  • Architecture flexibility that allows the EDW to evolve without compromising its performance.
  • Automation of EDW maintenance and administration tasks (ETL monitoring, data quality management, data security management, etc.) to decrease operational costs.
  • EDW stability and availability, so that business-critical data can be quickly accessed in a centralized location for analysis and reporting. This reduces time to insight and helps expand data literacy across the enterprise.
  • High EDW security and data protection standards.
  • Out-of-the-box integrations with data sources and SDKs for the most common programming languages to reduce development costs.

EDW key benefits

  • Reduced time to insight due to consolidating a company’s data and making it ready for analysis.
  • Increased business users’ productivity due to quick and easy access to structured and high-quality data ready for analysis and reporting.

EDW platforms we recommend

The platforms we selected are recognized as leaders in enterprise data warehousing (Forrester Wave, Gartner Magic Quadrant) and fully meet the key criteria for an enterprise-scale DWH: almost instant scalability of compute and storage resources (due to their cloud-based nature), high performance and availability (up to 99.99% uptime), advanced security, etc.

Azure Synapse Analytics
Description

A scalable data warehousing solution with a node-based architecture, which employs parallel query processing to achieve fast query response time and high query throughput. Azure Synapse unifies the Azure Data Lake storage and the SQL data warehouse to allow direct querying of raw data and combining relational and non-relational data for deeper analytics insight.

Data security

Dynamic data masking, built-in authentication, authorization, data encryption, etc.

Pricing

  • Data storage: $122.88 per TB/month ($0.17/TB/hour). The data storage size includes your DWH data and 7 days of incremental snapshot storage.
  • Query performance pricing depends on the service level and region.

Amazon Redshift
Description

A scalable data warehousing service that achieves high performance through features such as massively parallel processing, columnar data storage, a query optimizer, and result caching. The Redshift Spectrum feature makes it possible to query data directly in Amazon S3 to enable data lake analytics.

Data security

End-to-end encryption, granular access controls, network isolation, etc.

Pricing

The price is charged according to the amount of stored data and the number of nodes. The on-demand pricing option starts from $0.25/hour (hourly rate based on the type and number of nodes in the cluster).
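
To get a feel for how the hourly on-demand rate translates into a monthly bill, here is a minimal arithmetic sketch in Python. It assumes the $0.25/hour entry rate quoted above applies per node, treats a month as 30 days (720 hours), and ignores storage charges; the function name and parameters are hypothetical.

```python
def on_demand_monthly_cost(hourly_rate_per_node, node_count, hours=24 * 30):
    """Estimate the compute cost of a cluster that runs for the given number of hours.

    Assumption: the hourly rate is charged per node, so the cluster rate is
    the per-node rate multiplied by the number of nodes.
    """
    return hourly_rate_per_node * node_count * hours


print(on_demand_monthly_cost(0.25, 1))  # 180.0
print(on_demand_monthly_cost(0.25, 4))  # 720.0
```

Actual rates depend on the node type and region, so the figures serve only to show how the number of nodes drives the cost.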

Google BigQuery
Description

A scalable data warehousing solution powered by the Dremel technology, which is designed to run queries on massive structured datasets within seconds.

Data security

Data encryption, Google’s virtual private cloud policy controls, etc.

Pricing

Storage costs: $0.02/GB/month ($0.01/GB/month for long-term storage).

Streaming inserts: $0.01/200 MB.

For query performance, two pricing options are available:

  • Pay-as-you-go ($5/TB, 1st TB/month is free).
  • Flat-rate pricing (from $10,000/month for a dedicated reservation of 500 processing units).
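
The choice between the two query pricing options above comes down to simple arithmetic. The following Python sketch uses only the figures quoted in this section ($5/TB with the first TB free, and $10,000/month for the entry flat-rate reservation); the constant and function names are hypothetical, and actual prices may differ by region or change over time.

```python
ON_DEMAND_PRICE_PER_TB = 5.0    # pay-as-you-go price quoted above
FREE_TB_PER_MONTH = 1.0         # first TB of queries each month is free
FLAT_RATE_PER_MONTH = 10_000.0  # entry flat-rate reservation quoted above


def pay_as_you_go_cost(tb_scanned):
    """Monthly pay-as-you-go cost for the given volume of data scanned by queries."""
    billable_tb = max(tb_scanned - FREE_TB_PER_MONTH, 0.0)
    return billable_tb * ON_DEMAND_PRICE_PER_TB


def cheaper_option(tb_scanned):
    """Name the option with the lower monthly cost (flat-rate wins a tie)."""
    if pay_as_you_go_cost(tb_scanned) < FLAT_RATE_PER_MONTH:
        return "pay-as-you-go"
    return "flat-rate"


# Volume at which both options cost the same.
break_even_tb = FLAT_RATE_PER_MONTH / ON_DEMAND_PRICE_PER_TB + FREE_TB_PER_MONTH

print(pay_as_you_go_cost(100))  # 495.0
print(break_even_tb)            # 2001.0
print(cheaper_option(100))      # pay-as-you-go
print(cheaper_option(3000))     # flat-rate
```

Under these assumptions, pay-as-you-go stays cheaper until queries scan roughly 2,000 TB per month. The sketch compares cost only and does not account for the fixed processing capacity that a flat-rate reservation provides.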

EDW implementation

With 15+ years of hands-on experience in delivering DWH solutions and partnerships with global technology leaders (including Microsoft, Amazon and Oracle), we know how to deliver tailored EDW solutions that help our clients meet their tactical and strategic business objectives.

EDW consulting and implementation

To help you establish an EDW solution, we cover:

  • Business needs analysis and requirements elicitation.
  • EDW implementation strategy design.
  • EDW configuration and development.
  • EDW integration.
  • Data management procedures.
  • User training.
  • EDW support and administration (if required).

EDW as a Service

To spare you EDW development, implementation and management, we customize an enterprise data warehouse and provide it to you on a subscription basis.

About ScienceSoft

ScienceSoft is a global IT consulting and IT service company headquartered in McKinney, TX, US. Since 2005, we have been providing a full range of data warehousing services, including consulting, implementation, migration and DWaaS, to support our clients’ agile and data-based decision-making.