5 Best Big Data Databases

With 8 years in big data services, ScienceSoft assists companies with selecting and implementing proper software for their big data initiatives.

Big data databases: the essence

Big data databases are flexible repositories for storing big data. They are mostly NoSQL databases built on a horizontally scalable architecture without rigid schemas, which enables quick and cost-effective processing of huge volumes of structured, semi-structured, and unstructured data.

Features of big data databases

Data storage

  • Storing petabytes of data.
  • Storing unstructured, semi-structured and structured data.
  • Distributed schema-agnostic big data storage.

Data model options

  • Key-value.
  • Document-oriented.
  • Graph.
  • Wide-column store.
  • Multi-model.
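To make the difference between these models more concrete, here is a minimal sketch that lays out one and the same record in each shape using plain Python structures. The order record, the key format, and the edge labels are hypothetical and chosen only for illustration; real databases add their own storage formats, indexes, and query languages on top.

```python
import json

# Hypothetical sample record, used only for illustration.
order = {
    "order_id": "o-1001",
    "customer": "c-42",
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Key-value: an opaque value that is looked up by a single key.
key_value = {"order:o-1001": json.dumps(order)}

# Document-oriented: the nested structure is stored as is,
# and individual fields stay addressable.
document = order

# Wide-column: rows are grouped under a partition key (here the customer),
# and each row carries its own, possibly sparse, set of columns.
wide_column = {
    "c-42": {
        ("o-1001", "A1"): {"qty": 2},
        ("o-1001", "B7"): {"qty": 1},
    }
}

# Graph: entities become nodes, relationships become typed edges.
graph = {
    "nodes": {
        "c-42": {"type": "customer"},
        "o-1001": {"type": "order"},
        "A1": {"type": "product"},
        "B7": {"type": "product"},
    },
    "edges": [
        ("c-42", "PLACED", "o-1001"),
        ("o-1001", "CONTAINS", "A1"),
        ("o-1001", "CONTAINS", "B7"),
    ],
}

# The same question ("how many units did customer c-42 order?")
# answered against the wide-column layout.
total_units = sum(row["qty"] for row in wide_column["c-42"].values())
```

A multi-model database exposes several of these shapes over the same engine, so the choice can be made per workload rather than per product.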

Data querying

  • Support for multiple concurrent queries.
  • Batch and streaming/real-time big data loading/processing.
  • Support for analytical workloads.

Database performance

  • Horizontal scaling for elastic resource setup and provisioning.
  • Automatic big data replication across multiple servers for minimized latency and high availability (up to 99.99%).
  • On-demand and provisioned capacity modes.
  • Automated deleting of expired data from tables.

Database security and reliability

  • Big data encryption in transit and at rest.
  • User authorization and authentication.
  • Continuous and on-demand backup and restore.
  • Point-in-time restore.

Best big data databases for comparison

Amazon DynamoDB

Description

A leader among Big Data NoSQL databases in the Forrester Wave Report.

  • Support for key-value and document data models.
  • ACID (atomicity, consistency, isolation, durability) transactions.
  • Integrations with AWS S3, AWS EMR, Amazon Redshift.
  • Microsecond latency with DynamoDB Accelerator.
  • Real-time data processing with DynamoDB Streams.
  • On-demand and provisioned read/write capacity modes.
  • End-to-end big data encryption.
  • Point-in-time recovery and on-demand backup and restore.

Best for

Operational workloads, IoT, social media, gaming, ecommerce apps.

Pricing

Database operations:

  • On-demand request units (RU): $1.25/million write RU and $0.25/million read RU.
  • Provisioned capacity units (CU): $0.00065/write CU/hour and $0.00013/read CU/hour.

Storage: first 25 GB/month – free, $0.25/GB/month thereafter.
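The two capacity modes can lead to very different bills for the same traffic. The sketch below is a rough estimate under stated assumptions, not an official calculator: it assumes the read price is $0.25 per million request units, that provisioned capacity units are billed per hour, that every request consumes exactly one unit, that traffic is perfectly steady, and that a month has 730 hours. All function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours

# Prices quoted in the list above (USD).
ON_DEMAND_WRITE_PER_MILLION = 1.25
ON_DEMAND_READ_PER_MILLION = 0.25
PROVISIONED_WRITE_CU_HOUR = 0.00065  # assumption: billed per unit per hour
PROVISIONED_READ_CU_HOUR = 0.00013   # assumption: billed per unit per hour
FREE_STORAGE_GB = 25
STORAGE_PER_GB_MONTH = 0.25


def on_demand_monthly(writes_per_sec, reads_per_sec):
    """Monthly cost when every request is billed individually."""
    seconds = HOURS_PER_MONTH * 3600
    write_millions = writes_per_sec * seconds / 1e6
    read_millions = reads_per_sec * seconds / 1e6
    return (write_millions * ON_DEMAND_WRITE_PER_MILLION
            + read_millions * ON_DEMAND_READ_PER_MILLION)


def provisioned_monthly(write_units, read_units):
    """Monthly cost when capacity is reserved around the clock."""
    hourly = (write_units * PROVISIONED_WRITE_CU_HOUR
              + read_units * PROVISIONED_READ_CU_HOUR)
    return hourly * HOURS_PER_MONTH


def storage_monthly(stored_gb):
    """Monthly storage cost after the free tier is deducted."""
    return max(0.0, stored_gb - FREE_STORAGE_GB) * STORAGE_PER_GB_MONTH


# Example: a steady 100 writes/s and 500 reads/s with 100 GB stored.
print(round(on_demand_monthly(100, 500), 2))
print(round(provisioned_monthly(100, 500), 2))
print(round(storage_monthly(100), 2))
```

Under these assumptions, the steady workload costs about $657 per month on demand versus about $95 provisioned, plus $18.75 for storage. Real traffic is rarely steady, so on-demand mode tends to pay off for spiky or unpredictable loads, where provisioned capacity would sit idle most of the time.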

Azure Cosmos DB

Description

A leader among Big Data NoSQL databases in the Forrester Wave Report.

  • Support for multiple data models.
  • Open-source APIs for SQL, MongoDB, Cassandra, Gremlin, etc.
  • Integration with Azure Synapse Analytics for real-time no-ETL analytics on operational data.
  • Support for ACID transactions.
  • On-demand and provisioned capacity modes.
  • Big data encryption (in transit and at rest) and access control.
  • 99.999% availability.

Best for

Operations management, ecommerce, gaming, IoT apps.

Pricing

Database operations:

  • Provisioned throughput: 100 request units/second, single-region write account - $0.012/hour (autoscale) and $0.008/hour (manual).
  • Provisioned throughput reserved capacity: up to 65% savings.
  • Serverless (bills for the request units (RU) used for each database operation) – $0.25 for 1,000,000 RU.

Storage: $0.25/month per 1 GB of consumed transactional (row-oriented) storage.
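A useful way to read the provisioned and serverless prices above is to ask at what utilization they cost the same. The sketch below works this out for the manual provisioned rate only; it assumes provisioned capacity is billed for every hour regardless of use and ignores autoscale, reserved capacity, and storage. Function names are hypothetical.

```python
# Prices quoted in the list above (USD).
SERVERLESS_PER_MILLION_RU = 0.25
MANUAL_PER_100_RUS_HOUR = 0.008  # 100 RU/s, single-region write, manual


def serverless_cost(total_ru):
    """Cost of consuming total_ru request units in serverless mode."""
    return total_ru / 1e6 * SERVERLESS_PER_MILLION_RU


def provisioned_cost(ru_per_sec, hours):
    """Cost of keeping ru_per_sec provisioned for the given number of hours."""
    return ru_per_sec / 100 * MANUAL_PER_100_RUS_HOUR * hours


def break_even_utilization():
    """Share of provisioned capacity that must be used for both modes to cost the same."""
    ru_available_per_hour = 100 * 3600  # what 100 RU/s can serve in an hour
    provisioned_price_per_ru = MANUAL_PER_100_RUS_HOUR / ru_available_per_hour
    serverless_price_per_ru = SERVERLESS_PER_MILLION_RU / 1e6
    return provisioned_price_per_ru / serverless_price_per_ru


print(f"{break_even_utilization():.1%}")
```

With the listed prices, the break-even point is roughly 9% utilization: below that, paying per request unit is cheaper, and above it, provisioned throughput wins. This is a simplification, since real workloads also have to respect the limits of each mode.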

Amazon Keyspaces (for Apache Cassandra)

Description

  • Support for Cassandra Query Language (CQL) API code, Apache 2.0-licensed Cassandra drivers, and developer tools for running Cassandra workloads.
  • Big data encryption at rest and in transit.
  • On-demand and provisioned capacity modes.
  • Integration with Amazon CloudWatch for performance monitoring.
  • Continuous backup of table data with point-in-time recovery.
  • 99.99% availability within AWS Regions.
  • Integration with AWS Identity and Access Management for database access control.

Best for

Fleet management, industrial maintenance apps.

Pricing

Database operations:

  • On-demand throughput: $1.45/million write RU, $0.29/million read RU.
  • Provisioned throughput: write RUs - $0.00075/hour, read RUs - $0.00015/hour.

Storage: $0.30/GB/month.

Amazon DocumentDB

Description

  • MongoDB compatibility.
  • Support for ACID transactions.
  • Migration support (e.g., MongoDB databases on-premises to Amazon DocumentDB) with AWS Database Migration Service.
  • Support for role-based access with built-in roles.
  • Network isolation.
  • Instance monitoring and repair.
  • Cluster snapshots.

Best for

User profiles, catalogs, and content management.

Pricing

  • On-demand instances: $0.277–$8.864/instance-hour consumed (Memory Optimized Instances, Current Generation).
  • Database I/O: $0.20/1 million requests.
  • Database storage: $0.10/GB/month.
  • Backup storage: $0.021/GB/month.
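Because this price list has four separate components, it helps to see how they add up for a concrete cluster. The following is a minimal sketch, assuming a 730-hour month, instances running around the clock, and the lowest quoted instance rate as the default. The `Workload` class and `monthly_breakdown` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # assumption: average month length in hours

# Prices quoted in the list above (USD).
INSTANCE_HOUR_MIN = 0.277  # lowest on-demand instance rate
IO_PER_MILLION = 0.20
STORAGE_GB_MONTH = 0.10
BACKUP_GB_MONTH = 0.021


@dataclass
class Workload:
    instances: int
    io_requests_millions: float
    storage_gb: float
    backup_gb: float
    instance_hour_rate: float = INSTANCE_HOUR_MIN


def monthly_breakdown(w):
    """Return the monthly cost per component plus the total."""
    parts = {
        "instances": w.instances * w.instance_hour_rate * HOURS_PER_MONTH,
        "io": w.io_requests_millions * IO_PER_MILLION,
        "storage": w.storage_gb * STORAGE_GB_MONTH,
        "backup": w.backup_gb * BACKUP_GB_MONTH,
    }
    parts["total"] = sum(parts.values())
    return parts


# Example: 3 small instances, 50 million I/O requests,
# 200 GB of data, and 100 GB of backups.
for name, cost in monthly_breakdown(Workload(3, 50, 200, 100)).items():
    print(f"{name}: ${cost:.2f}")
```

In this example the total comes to about $638.73 per month, of which about $606.63 is instance time. For small and medium clusters, instance hours are likely to dominate the bill, so instance sizing deserves the most attention.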

Amazon Redshift

Description

  • Flexible database management platform for big data querying with SQL.
  • Automated infrastructure provisioning.
  • On-demand and provisioned capacity modes.
  • Amazon Redshift Spectrum to query big data in the data lake (Amazon S3).
  • Federated queries support for operational data querying.
  • Big data encryption (in transit and at rest).
  • Network isolation.
  • Row- and column-level security.

Best for

BI and real-time operational analytics on business events.

Not suitable for online transaction processing (OLTP) that requires millisecond response times.

Pricing

  • On-demand pricing: $0.25/hour (dc2.large) - $13.04/hour (ra3.16xlarge).
  • Reserved instance pricing allows saving up to 75% over the on-demand option.
  • Managed storage pricing (for RA3 node types): $0.024/GB/month.

Big data database implementation

Big data consulting

We offer:

  • Big data storage, processing, and analytics needs analysis.
  • Big data solution architecture.
  • An outline of the optimal big data solution technology stack.
  • Recommendations on big data quality management and big data security.
  • Big data databases admin training.
  • Proof of concept (for complex projects).

Big data database implementation

Our team takes on:

  • Big data storage and processing needs analysis.
  • Big data solution architecture.
  • Big data database integration (integration with big data source systems, a data lake, DWH, ML software, big data analysis and reporting software, etc.).
  • Big data governance procedures setup (big data quality, security, etc.).
  • Admin and user training.
  • Big data database support (if required).