Healthcare Data Warehouse
Overview
ScienceSoft has been providing a full range of data warehousing services since 2005.
Data Warehouse in Healthcare: the Fundamentals
A healthcare data warehouse (DWH) is a centralized repository for a healthcare organization’s data that is retrieved from disparate sources, then processed and structured for analytical querying and reporting. The healthcare data warehouse integrates with a data lake, ML and BI software. Implementation costs for a healthcare DWH start from $200,000 for a midsize healthcare company.
A data source layer
– healthcare data from internal and external data sources (ERP, EHR/EMR, CRM, claims management system, pharmacy management systems, etc.).
A staging area
– intermediate temporary storage, where healthcare data undergoes the extract, transform and load (ETL) or the extract, load and transform (ELT) process.
A data storage layer
– includes centralized structured storage. It may also have data marts – healthcare DWH subsets oriented to a specific business line (HR, accounting, etc.) or department (radiology, intensive care, pediatrics, etc.).
Analytics and BI
– business analytics, data mining, data reporting and visualization tools.
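The flow through these layers can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a real DWH platform; the table names, column names and sample rows are hypothetical and serve only to show how data moves from the source layer through staging into structured storage and a departmental data mart.

```python
import sqlite3

# Hypothetical source-layer extract, e.g. rows exported from an EHR system.
SOURCE_ROWS = [
    {"patient_id": "P1", "department": "radiology", "admitted": "2023-01-02", "discharged": "2023-01-05"},
    {"patient_id": "P2", "department": "pediatrics", "admitted": "2023-01-03", "discharged": "2023-01-04"},
    {"patient_id": "P3", "department": "radiology", "admitted": "2023-01-10", "discharged": "2023-01-12"},
]


def build_warehouse(rows):
    """Load source rows via a staging table into a fact table and a data mart."""
    conn = sqlite3.connect(":memory:")

    # Staging area: temporary storage that holds the raw extract as text.
    conn.execute(
        "CREATE TABLE stg_encounters "
        "(patient_id TEXT, department TEXT, admitted TEXT, discharged TEXT)"
    )
    conn.executemany(
        "INSERT INTO stg_encounters "
        "VALUES (:patient_id, :department, :admitted, :discharged)",
        rows,
    )

    # Storage layer: transformed, structured data (length of stay in days).
    conn.execute(
        "CREATE TABLE fact_encounters AS "
        "SELECT patient_id, department, "
        "CAST(julianday(discharged) - julianday(admitted) AS INTEGER) AS stay_days "
        "FROM stg_encounters"
    )

    # Data mart: a DWH subset oriented to one department.
    conn.execute(
        "CREATE VIEW mart_radiology AS "
        "SELECT patient_id, stay_days FROM fact_encounters "
        "WHERE department = 'radiology'"
    )

    # The staging data is temporary, so it is dropped after the load.
    conn.execute("DROP TABLE stg_encounters")
    return conn


if __name__ == "__main__":
    warehouse = build_warehouse(SOURCE_ROWS)
    # Analytics and BI layer: a reporting query against the data mart.
    print(warehouse.execute("SELECT AVG(stay_days) FROM mart_radiology").fetchone())
```

In a production setup, each layer would typically be a separate service or schema rather than tables in one in-memory database.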
The functionality of medical data warehouse solutions ScienceSoft delivers differs from customer to customer. Here, we’ve outlined the features commonly requested by healthcare organizations we work with:
Data integration
- Ingesting structured, semi-structured, and unstructured healthcare data (from EHR systems, ERP, HR management systems, public medical databases, claims management systems, etc.).
- ETL/ELT-based healthcare data integration.
- Full and incremental healthcare data extraction/load.
- Controlled healthcare data loading/management.
- Healthcare data transformation of varying complexity (data type conversion, summarization, etc.).
- Healthcare data loading and querying using SQL.
- Big data ingestion.
- Streaming data ingestion.
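The difference between full and incremental extraction can be sketched as follows. This is a simplified illustration, not a description of any specific ETL tool: it assumes each source row carries an `updated_at` timestamp (a hypothetical column) and that the pipeline remembers a "watermark", the latest timestamp it has already loaded.

```python
from datetime import datetime

# Hypothetical claims rows from a source system.
CLAIMS = [
    {"claim_id": 1, "updated_at": datetime(2023, 3, 1, 8, 0)},
    {"claim_id": 2, "updated_at": datetime(2023, 3, 2, 9, 30)},
    {"claim_id": 3, "updated_at": datetime(2023, 3, 3, 14, 15)},
]


def extract_full(source_rows):
    """Full extraction: copy every row, regardless of when it changed."""
    return list(source_rows)


def extract_incremental(source_rows, watermark):
    """Incremental extraction: copy only rows changed after the watermark.

    Returns the changed rows and the new watermark to store for the next run.
    """
    changed = [row for row in source_rows if row["updated_at"] > watermark]
    new_watermark = max((row["updated_at"] for row in changed), default=watermark)
    return changed, new_watermark


if __name__ == "__main__":
    last_loaded = datetime(2023, 3, 1, 12, 0)
    changed_rows, last_loaded = extract_incremental(CLAIMS, last_loaded)
    print([row["claim_id"] for row in changed_rows], last_loaded)
```

Full extraction is simpler but reprocesses all data on every run, while incremental extraction moves less data at the cost of tracking the watermark reliably.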
Data storage
- Integrated, historical, summarized, subject-oriented healthcare data storage.
- Protected Health Information (PHI) storage.
- Metadata storage.
- Options of healthcare data storage environments (cloud, on-premises, hybrid).
Database performance and reliability
- Elastic scaling of storage and compute resources.
- High-performance query processing thanks to healthcare data indexing, materialized view support, and result caching.
- ML capabilities to dynamically manage performance and concurrency.
- Automated data backup across various regions and zones within the cloud environment for fault tolerance and disaster recovery.
Security and compliance
- Granular row- and column-level security control.
- Multi-factor authentication.
- Healthcare data encryption at rest and in transit (including backups and network connections).
- Dynamic healthcare data masking.
- Ongoing threat detection and vulnerability assessment.
- Compliance with healthcare regulations and requirements (HIPAA, HITECH, FDA requirements, etc.).
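To show what row- and column-level security and dynamic data masking mean in practice, here is a minimal sketch in plain Python. The roles, column names and masking rule are hypothetical; real DWH platforms enforce such policies inside the database engine, so this only illustrates the idea.

```python
# Hypothetical query results containing PHI.
ROWS = [
    {"patient_name": "Ann Lee", "department": "radiology", "diagnosis": "fracture", "insurance_id": "INS-1"},
    {"patient_name": "Bob Ray", "department": "pediatrics", "diagnosis": "asthma", "insurance_id": "INS-2"},
]

# Hypothetical policies: visible columns, masked columns, allowed departments
# (None means rows from all departments are allowed).
POLICIES = {
    "radiologist": {
        "columns": ["patient_name", "department", "diagnosis"],
        "masked": [],
        "departments": ["radiology"],
    },
    "analyst": {
        "columns": ["patient_name", "department", "diagnosis"],
        "masked": ["patient_name"],
        "departments": None,
    },
}


def mask(value):
    """Dynamic masking: keep the first character, hide the rest."""
    text = str(value)
    return text[:1] + "*" * (len(text) - 1)


def query(rows, role):
    """Return rows as the given role is allowed to see them."""
    policy = POLICIES[role]
    result = []
    for row in rows:
        # Row-level security: skip rows outside the role's departments.
        allowed = policy["departments"]
        if allowed is not None and row["department"] not in allowed:
            continue
        # Column-level security: expose only the permitted columns,
        # masking those marked as sensitive for this role.
        visible = {}
        for column in policy["columns"]:
            value = row[column]
            visible[column] = mask(value) if column in policy["masked"] else value
        result.append(visible)
    return result


if __name__ == "__main__":
    print(query(ROWS, "radiologist"))
    print(query(ROWS, "analyst"))
```

The stored data is never changed: masking is applied at query time, which is why it is called dynamic.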
To maximize the value and cost-efficiency of the healthcare data warehouse, ScienceSoft recommends setting up the following integrations:
A data lake
Data lakes serve as a cost-effective repository of semi-structured and unstructured healthcare data at any scale (radiology images, audio/video recordings, streaming healthcare data from wearables and devices, etc.). The data lake keeps data before it is queried by the data warehouse, which stores only highly structured healthcare data ready for analysis. Data stored in the data lake can be further used to develop ML models (for example, for medical imaging diagnosis).
Machine learning (ML) software
This integration enables training ML models on structured healthcare data from the data warehouse. ML-powered advanced analytics helps predict clinical outcomes, deliver personalized healthcare recommendations, improve appointment scheduling, make grounded decisions about hospital spending, etc.
BI software
Healthcare data structured in the data warehouse is visualized and reported in immersive reports (hospital annual report, average hospital stay report, etc.) and interactive dashboards (patient demographics, physician allocation, insurance claims, etc.). Self-service BI tools facilitate shorter time-to-insight.
ScienceSoft’s Healthcare IT Consultant Alena Nikuliak shares her experience: "For a value-based care model, analysis of various information stored in a data warehouse provides important insights. The solution helps measure patient outcomes (e.g., based on length of hospital stay, time to return to work, readmissions), care costs (e.g., used resources, medical staff time), and assess the care value."
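Two of the outcome measures mentioned in the quote, length of hospital stay and readmissions, can be computed from encounter records roughly as follows. The sample data is invented, and the 30-day readmission window is an assumption made for this sketch rather than a definition taken from the article.

```python
from datetime import date

# Hypothetical encounter records as they might be stored in the DWH.
ENCOUNTERS = [
    {"patient_id": "P1", "admitted": date(2023, 1, 2), "discharged": date(2023, 1, 5)},
    {"patient_id": "P1", "admitted": date(2023, 1, 20), "discharged": date(2023, 1, 22)},
    {"patient_id": "P2", "admitted": date(2023, 2, 1), "discharged": date(2023, 2, 2)},
]


def average_length_of_stay(encounters):
    """Mean number of days between admission and discharge."""
    stays = [(e["discharged"] - e["admitted"]).days for e in encounters]
    return sum(stays) / len(stays)


def readmission_rate(encounters, window_days=30):
    """Share of encounters followed by a readmission within the window.

    window_days=30 is an assumption for illustration.
    """
    by_patient = {}
    for encounter in sorted(encounters, key=lambda e: e["admitted"]):
        by_patient.setdefault(encounter["patient_id"], []).append(encounter)

    readmitted = 0
    for visits in by_patient.values():
        # Compare each stay with the same patient's next stay.
        for previous, following in zip(visits, visits[1:]):
            gap = (following["admitted"] - previous["discharged"]).days
            if gap <= window_days:
                readmitted += 1
    return readmitted / len(encounters)


if __name__ == "__main__":
    print(average_length_of_stay(ENCOUNTERS))
    print(readmission_rate(ENCOUNTERS))
```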
Consider Professional Services for Healthcare DWH Development
Since 2005, ScienceSoft has offered IT solutions to healthcare organizations and provided a full range of data warehousing services to help them build robust healthcare DWHs and support the decision-makers with high-quality healthcare data.
Healthcare DWH consulting
- Eliciting requirements for a future healthcare DWH solution.
- Designing a healthcare DWH implementation/migration strategy.
- Outlining the optimal healthcare DWH vendors, technology stack and its configurations.
- Advising on healthcare data integration and data quality procedures.
- Conducting admin training.
Healthcare DWH implementation
- Healthcare data storage needs analysis and DWH solution architecture design.
- Integration of healthcare data sources (ERP, EHR/EMR, CRM, claims management system, pharmacy management systems, etc.) into a healthcare DWH.
- DWH platform integration into the data environment (a data lake, big data platform, BI tools, etc.).
- Setup of data management and metadata management procedures.
- Healthcare data cleaning and data migration.
- User training.
- DWH software maintenance and adaptation.
- DWH evolution (if required).
ScienceSoft as a Reliable DWH Consulting Partner
When we first contacted ScienceSoft, we needed expert advice on the creation of the centralized analytical solution to achieve company-wide transparent analytics and reporting. After a series of interviews, ScienceSoft’s consultants analyzed our workloads, documentation, and the existing infrastructure and provided us with a clear project roadmap.
They stayed in daily contact with us, which allowed us to adjust the scope of works promptly and implement new requirements on the fly. Additionally, the team delivered demos every other week so that we could be sure that the system aligned with our business needs.
Heather Owen Nigl, Chief Financial Officer, Alta Resources
Factors That Determine Medical Data Warehousing Success
Relying on 17+ years of experience in designing and implementing data warehousing solutions, ScienceSoft’s consultants have defined a set of factors that, if covered, help maximize ROI for DWH projects.
Healthcare DWH scalability and flexibility
The DWH should be able to quickly ingest any type (structured, semi-structured, unstructured) and amount of healthcare-related data so that it can efficiently address new data analytics objectives.
Security and healthcare data protection measures
We at ScienceSoft point out the following best practices:
- Store and process sensitive patient data within highly secure environments (AWS, Microsoft Azure, Google Cloud or private servers).
- Ensure data encryption at rest and in transit, dynamic data masking, restrictive data access, and multi-factor user authentication.
- Conduct healthcare DWH vulnerability assessment and penetration testing, etc.
Well-established data quality management
To ensure the high quality of data delivered from diverse data sources, ScienceSoft recommends conducting a comprehensive data warehouse system analysis and designing robust data governance practices. This helps address common data quality challenges such as different encoding formats, different units of measurement for the same attribute across source systems, conflicting key fields, etc.
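The three data quality challenges named above can be illustrated with a small normalization step. Everything in this sketch is hypothetical (the source system names, their encodings, and the record layout); it only shows the kind of rule a data governance practice would define for each challenge.

```python
# Hypothetical source systems and the text encoding each one uses.
SOURCE_ENCODINGS = {"ehr": "utf-8", "legacy_billing": "latin-1"}

# Conversion factors to a single unit of measurement (kilograms).
KG_PER_UNIT = {"kg": 1.0, "lb": 0.45359237}


def normalize(record, source):
    """Bring one source record to the warehouse's common conventions."""
    # Different encoding formats: decode raw bytes with the source's encoding.
    name = record["name"]
    if isinstance(name, bytes):
        name = name.decode(SOURCE_ENCODINGS[source])

    # Different attribute measurements: convert weight to kilograms.
    weight_kg = round(record["weight"] * KG_PER_UNIT[record["weight_unit"]], 2)

    # Conflicting key fields: two systems may both use id 42 for different
    # patients, so the warehouse key is qualified with the source name.
    warehouse_key = f"{source}:{record['id']}"

    return {"warehouse_key": warehouse_key, "name": name, "weight_kg": weight_kg}


if __name__ == "__main__":
    from_ehr = {"id": 42, "name": "José".encode("utf-8"), "weight": 70, "weight_unit": "kg"}
    from_billing = {"id": 42, "name": "Zoë".encode("latin-1"), "weight": 154, "weight_unit": "lb"}
    print(normalize(from_ehr, "ehr"))
    print(normalize(from_billing, "legacy_billing"))
```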
Healthcare Data Warehouse Investments
Healthcare data warehouse key cost drivers:
- Number of healthcare data sources (ERP, EHR/EMR, CRM, claims management system, pharmacy management systems, etc.).
- Healthcare data disparity (for example, difference in data structure, format, and use of values) across various source systems.
- Complexity of healthcare data (for example, big data, streaming data).
- Volume of healthcare data to be processed.
- Healthcare data security requirements.
- Number of healthcare data tables and columns used for analysis.
- Healthcare data warehouse performance requirements (velocity, scalability, fault tolerance, etc.).
Based on ScienceSoft’s experience in data warehouse software implementation, a medical data warehouse takes approximately 3 to 12 months to implement, and the project cost varies with the size of the healthcare organization.
*Monthly software license fee and other regular fees are NOT included.
Ballpark timelines for each stage of healthcare data warehouse implementation
A typical ScienceSoft project on healthcare data warehouse software implementation covers the following stages and timelines:
- Healthcare data warehouse goals elicitation: 3-20 days.
- Healthcare data warehouse solution conceptualization and tech stack selection: 2-15 days.
- Business case and project roadmap creation: 2-15 days.
- System analysis and healthcare data warehouse architecture design: from 15 days.
- Healthcare data warehouse solution development and stabilization: from 2 months.
- Healthcare data warehouse solution launch: from 2 days.
- After-launch support, maintenance, and evolution: as requested.
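As a rough cross-check, the minimum durations of the stages above can be added up. This sketch assumes that the stages run strictly one after another and that a month has 30 days; neither assumption comes from the article, and real projects may overlap stages.

```python
# Assumption for this sketch: 1 month = 30 days.
DAYS_PER_MONTH = 30

# Lower bounds of the stage timelines listed above, in days.
MIN_STAGE_DAYS = {
    "goals elicitation": 3,
    "conceptualization and tech stack selection": 2,
    "business case and project roadmap": 2,
    "system analysis and architecture design": 15,
    "development and stabilization": 2 * DAYS_PER_MONTH,
    "launch": 2,
}

# Assumes sequential stages with no overlap.
total_min_days = sum(MIN_STAGE_DAYS.values())
total_min_months = round(total_min_days / DAYS_PER_MONTH, 1)

if __name__ == "__main__":
    print(total_min_days, "days, about", total_min_months, "months")
```

Under these assumptions, the shortest possible project takes 84 days, or about 2.8 months, which is in line with the lower end of the 3 to 12 months quoted earlier.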
A healthcare data warehouse helps healthcare organizations:
- Improve health outcomes.
- Accelerate data-driven labor management.
- Improve healthcare resource management.
- Personalize care delivery and improve patients’ experience.
- Decrease healthcare operating costs.
DWH Platforms ScienceSoft Recommends for Healthcare
Our list of healthcare data warehouse platforms features leaders in Gartner’s Magic Quadrant and Forrester’s Wave for Data Management Solutions for Analytics, which makes them suitable for the majority of mid-sized and large healthcare organizations.
Amazon Redshift
Best for: big data warehousing
Features
- Integration of all healthcare data types (structured, semi-structured, unstructured) for storing and SQL-querying.
- Integrations with the AWS ecosystem (including S3, AWS Glue, Amazon EMR) and third-party tools (Power BI, Tableau, Informatica, Qlik, Talend Cloud).
- Federated queries support.
- ML capabilities for optimized performance under varying workloads.
- Separate scaling of compute and storage.
- Healthcare data encryption and fine-grained access control.
- HIPAA-compliant.
Pricing
- On-demand pricing: $0.25–$13.04/hour.
- Reserved instance pricing offers savings of up to 75% over the on-demand option (with a 3-year term).
- Data storage (RA3 node types): $0.024/GB/month.
Azure Synapse Analytics
Best for: advanced data analysis
Features
- SQL-querying of structured, semi-structured, unstructured healthcare data.
- Native integrations with a data lake, operational databases, BI and ML software.
- Integration with third-party BI tools, including Tableau, SAS, Qlik, etc.
- Separate billing for compute and storage.
- Healthcare data encryption, dynamic healthcare data masking, column- and row-level security.
- HIPAA-compliant.
Pricing
- Compute on-demand pricing: $1.20–$360/hour.
- Compute reserved instance pricing offers savings of up to 65% over the on-demand option (with a 3-year term).
- Data storage: $122.88/TB/month.
Oracle Autonomous Data Warehouse
Best for: hybrid healthcare DWHs
Features
- Querying across multiple healthcare data types (structured, semi-structured, unstructured).
- Built-in connectivity to Oracle Cloud Infrastructure Object Storage, Azure Blob Storage, Amazon S3.
- Integration with Oracle Analytics Desktop and third-party BI tools (Microsoft Power BI, Tableau, MicroStrategy, Qlik, etc.).
- Healthcare data encryption, privileged user and multifactor access control.
- Independent scaling of storage and compute.
- HIPAA-compliant.
Pricing
- Compute costs: $1.3441/CPU/hour.
- Data storage: $118.40/TB/month (in the public cloud).
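Because the storage prices of the three platforms are quoted in different units, a quick conversion helps compare them. The sketch below uses only the storage prices listed in this article, which may be out of date, and assumes 1 TB = 1,024 GB. Compute, license and other fees are not included.

```python
# Assumption for this sketch: 1 TB = 1,024 GB.
GB_PER_TB = 1024

# Storage prices as listed in this article, converted to USD per TB per month.
STORAGE_USD_PER_TB_MONTH = {
    "Amazon Redshift (RA3)": 0.024 * GB_PER_TB,
    "Azure Synapse Analytics": 122.88,
    "Oracle Autonomous Data Warehouse": 118.40,
}


def monthly_storage_cost(platform, terabytes):
    """Estimated monthly storage cost in USD for the given data volume."""
    return STORAGE_USD_PER_TB_MONTH[platform] * terabytes


if __name__ == "__main__":
    for name in STORAGE_USD_PER_TB_MONTH:
        print(f"{name}: ${monthly_storage_cost(name, 10):,.2f} per month for 10 TB")
```

Storage is only one cost component: compute pricing and the workload profile usually matter more for the total bill.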
About ScienceSoft
ScienceSoft is a global IT consulting and IT service vendor headquartered in McKinney, TX, US. Since 2005, we have provided a full range of data warehousing services to help healthcare organizations build from scratch or enhance their existing data warehouse platforms within the set timeframes and with minimal investments. Being ISO 13485 certified, ScienceSoft designs, develops and tests high-quality medical IT solutions according to the requirements of the FDA and the Council of the European Union.