Data Science Consulting for Electric Energy Consumption Analysis and Forecasting

Customer

The Customer is an international company providing managed software solutions and consulting services for businesses operating in the energy sector.

Challenge

The Customer had started developing a cloud-based data analytics product for electric power companies. The product was meant to support electric energy consumption analysis, deliver accurate consumption forecasts (hourly, daily, and weekly), and serve as the basis for load forecasting and price determination.

The project stalled because the Customer needed a third-party review of the already developed part of the software. The review had to identify its strengths and weaknesses and provide detailed recommendations on enhancing its analytical capabilities and designing the required machine learning (ML) models.

Solution

ScienceSoft’s team of data scientists and data engineers started by analyzing the Customer’s business objectives and requirements for the future software product. They then reviewed the existing software architecture and proposed an enhanced architecture (Figure 1) aligned with the Customer’s strategic and tactical goals.

Figure 1. Enhanced software architecture.

ScienceSoft’s experts then reviewed the ML code and suggested building ML models according to the following process:

  • Data acquisition. 
  • Exploratory data analysis to investigate data for patterns and anomalies, test hypotheses, and check assumptions.
  • Configuration of parameters and hyperparameters for training models.
  • Data preprocessing (dealing with missing values, categorical values, etc.).
  • Model exploration.
  • Model training.
  • Model evaluation and tuning.
  • Model retraining based on new data.
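
A few of the steps above (preprocessing, a first model, and evaluation) can be sketched in a handful of lines. The snippet below is a minimal illustration rather than the Customer's code: the hourly load series is synthetic, every function name is hypothetical, and a seasonal-naive baseline stands in for the model, since a baseline of this kind is a common yardstick for later ML models.

```python
import math
import random

def make_synthetic_load(n_hours, seed=0):
    """Hypothetical hourly load in MW: a daily cycle plus noise (not real data)."""
    rng = random.Random(seed)
    return [100.0 + 20.0 * math.sin(2 * math.pi * (h % 24) / 24) + rng.gauss(0, 2)
            for h in range(n_hours)]

def forward_fill(series):
    """Preprocessing: replace missing readings (None) with the last known value.
    A leading None would stay None; this sketch assumes the first reading exists."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def chronological_split(series, train_fraction=0.8):
    """Split without shuffling so the test set lies strictly in the future."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def seasonal_naive_forecast(history, horizon, season=24):
    """Baseline model: repeat the last full season (the last 24 hourly values)."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]

def mean_absolute_error(actual, predicted):
    """Evaluation metric: average absolute deviation, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Data acquisition (simulated): two weeks of hourly readings with two gaps.
load = make_synthetic_load(24 * 14)
load[50] = None
load[51] = None

clean = forward_fill(load)
train, test = chronological_split(clean)
baseline = seasonal_naive_forecast(train, len(test))
mae = mean_absolute_error(test, baseline)
print(f"seasonal-naive MAE: {mae:.2f} MW")
```

The split is chronological on purpose: shuffling a time series before splitting would let the model see the future during training and make the evaluation look better than it is.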

To ensure the high accuracy of ML models, the consulting team recommended that the Customer: 

  • Use the full potential of exploratory data analysis (EDA) to examine the interrelationships between features, discover interesting subsets, and determine correlations between the predictors and the target variable.
  • Employ pure time series models, including LSTM-based neural networks such as Seq2Seq models.
  • Use classical ML models such as LightGBM or XGBoost for time series forecasting and for revealing the time-dependent nature of the data.
  • Incorporate additional data (public holidays, local events, day length, geographical position, etc.) to further improve forecast accuracy.
  • Employ online machine learning to train models on newly arriving data.
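
Two of these recommendations lend themselves to a short sketch: turning timestamps and additional data into tabular features, and updating a model one observation at a time. The example below is an assumption-laden illustration, not the delivered design. The load data and the holiday calendar are made up, all names are hypothetical, and a small hand-written linear model trained by stochastic gradient descent stands in for a library learner. A feature row of this shape is also the kind of input a gradient-boosted model would take.

```python
import math
import random
from datetime import date, datetime, timedelta

HOLIDAYS = {date(2023, 1, 16)}  # hypothetical holiday calendar, illustration only

def synthetic_load(ts, rng):
    """Made-up hourly load in MW: daily cycle, lower on weekends and holidays."""
    quiet_day = ts.weekday() >= 5 or ts.date() in HOLIDAYS
    daily = 20.0 * math.sin(2 * math.pi * ts.hour / 24)
    return 100.0 + daily - (10.0 if quiet_day else 0.0) + rng.gauss(0, 2)

def make_features(ts, load_24h_ago):
    """Turn a timestamp and a lagged reading into one tabular feature row."""
    angle = 2 * math.pi * ts.hour / 24
    return [
        1.0,                                     # bias term
        load_24h_ago / 100.0,                    # lag feature, scaled to about 1
        math.sin(angle),                         # cyclical encoding of the hour
        math.cos(angle),
        1.0 if ts.weekday() >= 5 else 0.0,       # weekend flag
        1.0 if ts.date() in HOLIDAYS else 0.0,   # public holiday flag
    ]

class OnlineLinearModel:
    """Linear regressor updated one sample at a time (plain SGD, squared loss)."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        error = y - self.predict(x)
        self.weights = [w + self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]

rng = random.Random(0)
start = datetime(2023, 1, 2)
timestamps = [start + timedelta(hours=h) for h in range(24 * 60)]
loads = [synthetic_load(ts, rng) for ts in timestamps]

model = OnlineLinearModel(n_features=6)
errors = []  # absolute errors in MW, recorded before each update
for h in range(24, len(loads)):
    x = make_features(timestamps[h], loads[h - 24])
    y = loads[h] / 100.0                               # scaled target
    errors.append(abs(y - model.predict(x)) * 100.0)   # test first ...
    model.update(x, y)                                 # ... then train

early_mae = sum(errors[:24]) / 24
late_mae = sum(errors[-24 * 7:]) / (24 * 7)
print(f"MAE first day: {early_mae:.1f} MW, last week: {late_mae:.1f} MW")
```

The loop predicts each hour before learning from it, so the recorded errors reflect how the model would have performed on data it had not yet seen. In practice, a library learner with incremental updates (for example, scikit-learn estimators that expose `partial_fit`) would replace the hand-written class.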

Results

The Customer obtained a high-level software architecture and detailed recommendations on how to create ML models for accurate forecasting. Once delivered, the software would enable electric power companies to obtain accurate short-term and mid-term forecasts of electric energy consumption and to improve their load management and price determination processes.

Technologies and Tools

Google Cloud Platform, Microsoft SQL Server, Pandas, Python, Scikit-learn, TensorFlow, NumPy, Jupyter.
