Big Data Consulting to Improve the Performance of Apache Cassandra Database

Industry: Energy
Technologies: Big data

Customer

The Customer is a European decentralized energy company partnering with hundreds of distributed electricity producers nationwide.

Facing a Decline in Energy Analytics App Performance

To balance energy supply and demand and avoid over- and underproduction, the Customer performs multiple time-series calculations related to the energy volume produced and stored by its partners.

The Customer’s analytics application automatically queries the Apache Cassandra database for 10,000+ energy production and consumption values every 5 minutes (for short-term monitoring) and every 24 hours (for daily reports). The company’s employees also use the database for ad hoc analytics.

As more partners joined the Customer’s energy network, the data load increased, and the app no longer returned calculation results at the desired speed. The Customer’s IT team identified low-performing Java functions and attributed the issue to incorrect Cassandra configurations. The Customer needed a professional Cassandra consultancy to confirm these assumptions and improve database performance.

Detecting Inefficiencies in Cassandra Configurations

ScienceSoft appointed a DevOps engineer and a senior data engineer to the project. The DevOps engineer checked the configurations of the application and its infrastructure and made sure there were no issues at that level. Meanwhile, the data engineer examined the database and spotted several Cassandra inefficiencies that could be causing poor app performance:

  • The energy consumption and production readings were grouped by 5-minute intervals, leading to additional calculations whenever the Customer needed to check combined values for a particular hour or 24 hours.
  • The excessive partition size (over 100MB) forced the app to read through large volumes of data in search of a single small value, which slowed down the analytics output.
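
To illustrate the first inefficiency, here is a minimal Python sketch of the client-side rollup an application has to perform when only 5-minute readings are available. The function name `hourly_totals` and the sample data are hypothetical and are not taken from the Customer's code; the point is only that every hourly value requires reading and summing 12 separate rows.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_totals(readings):
    """Sum (timestamp, value) pairs recorded every 5 minutes into per-hour totals."""
    totals = defaultdict(float)
    for ts, value in readings:
        # Truncate the timestamp to the start of its hour.
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        totals[hour_start] += value
    return dict(totals)


# Hypothetical sample: two hours of 5-minute readings, 1.5 units each.
start = datetime(2024, 1, 1, 0, 0)
readings = [(start + timedelta(minutes=5 * i), 1.5) for i in range(24)]

totals = hourly_totals(readings)
print(len(readings), "rows read for", len(totals), "hourly values")
```

A daily report built the same way would need 288 rows per producer for a single combined value.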

Cassandra Optimization Measures

ScienceSoft’s data engineer provided the Customer with several database optimization recommendations that would help increase the analytics performance:

  • Adding an extra table field (in the format year–month–day–hour) that would allow the application to immediately get energy consumption and production data for a certain hour instead of performing the calculation based on multiple 5-minute timestamps.
  • Reducing the partition size to the optimal range of 10MB to 100MB to ensure the queries address smaller, easy-to-process data chunks.
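
The effect of a finer time bucket on partition size can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes that one partition holds every reading written during one time bucket, borrows the 10,000 figure as an assumed number of rows written per 5-minute interval, and uses a hypothetical average row size of 100 bytes. None of these figures describe the Customer's actual schema.

```python
BYTES_PER_MB = 1024 * 1024


def estimate_partition_mb(rows_per_interval, intervals_per_bucket, bytes_per_row):
    """Rough partition size in MB: rows in one time bucket times average row size."""
    return rows_per_interval * intervals_per_bucket * bytes_per_row / BYTES_PER_MB


def in_target_range(size_mb, low_mb=10, high_mb=100):
    """Check a size against the 10MB to 100MB range recommended above."""
    return low_mb <= size_mb <= high_mb


ROWS_PER_INTERVAL = 10_000  # assumption: rows written every 5 minutes
BYTES_PER_ROW = 100         # assumption: average size of one stored row

# 288 five-minute intervals per day, 12 per hour.
daily_mb = estimate_partition_mb(ROWS_PER_INTERVAL, 288, BYTES_PER_ROW)
hourly_mb = estimate_partition_mb(ROWS_PER_INTERVAL, 12, BYTES_PER_ROW)

print(f"daily bucket:  {daily_mb:.1f} MB, in range: {in_target_range(daily_mb)}")
print(f"hourly bucket: {hourly_mb:.1f} MB, in range: {in_target_range(hourly_mb)}")
```

Under these assumed numbers, a daily bucket lands well above 100MB while an hourly bucket falls inside the target range. Real partition sizes depend on the actual schema, compression, and write volume, so they should be measured on the cluster rather than estimated.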

The data engineer also updated the calculation query code to match the new table structure.
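
As a rough sketch of what such a query update could look like, the Python snippet below derives a year-month-day-hour value from a timestamp and pairs it with a CQL statement that filters on that value. The table name `energy_readings`, the column names, and the helper functions are all hypothetical; the case study does not disclose the real schema or query code.

```python
from datetime import datetime

# Hypothetical table and column names, used for illustration only.
SELECT_HOUR_CQL = (
    "SELECT reading_time, produced_kwh, consumed_kwh "
    "FROM energy_readings "
    "WHERE producer_id = ? AND hour_bucket = ?"
)


def hour_bucket(ts: datetime) -> str:
    """Format a timestamp as year-month-day-hour, e.g. 2024-03-05-14."""
    return ts.strftime("%Y-%m-%d-%H")


def hour_query_params(producer_id: str, ts: datetime) -> tuple:
    """Build the bound parameters for SELECT_HOUR_CQL."""
    return (producer_id, hour_bucket(ts))


params = hour_query_params("producer-42", datetime(2024, 3, 5, 14, 35))
print(params)  # ('producer-42', '2024-03-05-14')
```

With a field like this in the key, the application can request one hour of data with a single equality condition instead of enumerating twelve 5-minute timestamps.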

Increasing Query Return Speed

Within just five days, the Customer received an expert audit of its application infrastructure and Cassandra database, complete with recommendations on how to optimize the database structure. Once implemented, the changes will allow the Customer to increase the performance of its energy analytics application and receive timely calculations, which is critical for informed decision-making.

Technologies and Tools

Apache Cassandra, Cassandra Query Language (CQL).
