en flag +1 214 306 68 37
Data Analytics Solution to Explore Equipment Testing Data across 300 Parameters

Data Analytics Solution to Explore Equipment Testing Data across 300 Parameters

Industry
Manufacturing
Technologies
Azure, MS SQL Server, Power BI

Customer

The Customer is a leading producer of fuel cell stacks for commercial and specialized vehicles, power units, and vessels.

Manual Data Analysis Was Inefficient and Time-Consuming

The Customer has hundreds of testing equipment units that validate the performance and safety of fuel cell stacks across nearly 300 parameters. Prior to the project, the testing results data was stored in different formats and locations and used to be analyzed manually. This approach was inefficient and time-consuming, so the Customer decided to build a reliable analytics solution that would enable centralized storage, visualization, and analysis of the testing results.

Defining Database Structure and Ensuring Accurate Data Transformation

ScienceSoft assigned a solution architect and two data engineers to the project. The team started with the audit of the testing results files (CSV tables) to understand the data specifics and define the structure of the future consolidated database. Based on the results, the team created the following database column structure: tests metadata, tests results, and calculated metrics (e.g., averaged and standard deviations).

The raw CSV files include test-specific parameters and have multiple variations in their structure (e.g., in the number of columns, the number of rows in the column header), which could become an obstacle to accurate data consolidation in the database. It was crucial to prevent data integration errors; at the same time, ScienceSoft’s engineers wanted to avoid describing each possible CSV table structure and its transformation rules. If done manually, this would delay the solution delivery and cause additional efforts for the Customer in the future, when new tests are added and have to be described the same way.

To solve this problem, ScienceSoft’s data engineers created several setup file formats that reflect the most frequent structures of the source CSV tables. The team coded the analytics solution to compare the processed files to the setup formats and perform data distribution based on the comparison results. Such an approach ensures correct data distribution even when the source table has a nonstandard structure.

Centralized Data Storage and Processing Mechanisms

ScienceSoft’s team designed and implemented an Azure-based solution that supports the following data management steps:

Raw data ingestion and storage

Test CSV files from the testing systems are ingested into the raw data storage (Azure Blob Storage). A special trigger puts every new file in the message queue.

Data processing and uploading to the database

The database management system (Microsoft SQL Server) checks the raw storage for new files every five minutes. Once a new file is detected, it is uploaded to the database (Azure SQL Database) through an LTE (load–transform–extract) job and removed from the queue. Azure Functions compares the incoming table structure to the setup formats and distributes the data across the database.

Data access

The Customer’s employees can access the consolidated testing data and use it in the following ways:

  • Get self-service analytics reports by accessing the database via Excel or Power BI.
  • Turn to centralized data storage for manual exploration of source files.

The Customer can grant its employees access to the database via SQL Server Management Studio.

Building Interactive Power BI Reports for Test Monitoring

ScienceSoft’s data engineers created Microsoft Power BI reports with dynamic visuals for different types of tests. During each test, the equipment records multiple readings every 5 seconds. The reports illustrate these readings (e.g., cells voltage, anode- and cathode-related values) in interactive graphs. When users change the time interval, the graphs change accordingly to illustrate the readings within the required time range: from any particular five-second interval to the entire test period.

Automated Analytics and Complete View of Testing Results

The Customer was fully satisfied with ScienceSoft’s services. Within six months, it received a reliable analytics solution that accumulated disparate, multi-source data into a single point of truth. The solution eliminated manual work for the Customer’s employees and accelerated data analysis speed. The Customer also received dynamic Power BI reports with interactive visuals that enable analysis of testing results at any required level of granularity.

Technologies and Tools

Azure, Azure SQL Database, Microsoft SQL Server, Microsoft Power BI.

Have a question to our team or need help with your project?

Our team is ready to provide client references, estimate your project, or answer any other question related to your IT initiative.

Upload file

Drag and drop or to upload your file(s)

?

Max file size 10MB, up to 5 files and 20MB total

Supported formats:

doc, docx, xls, xlsx, ppt, pptx, pps, ppsx, odp, jpeg, jpg, png, psd, webp, svg, mp3, mp4, webm, odt, ods, pdf, rtf, txt, csv, log

More Case Studies