Big Data Consulting and Team Augmentation to Assist a Jewelry Company in Enterprise Data Warehouse Development

Industry: Manufacturing, Retail
Technologies: Big data, Spark, Python

Customer

The Customer is a large jewelry manufacturer and retailer that distributes its products in online and offline stores across the US.

Challenge

The Customer had a legacy Informatica solution for data analytics, which started showing subpar performance as the company’s business grew. As the legacy solution was unable to handle the growing data volumes, the Customer initiated in-house development of a new Incorta-based enterprise data warehouse. The new DWH was to enable enterprise-wide analytics of the data coming from the company’s business-critical systems (e.g., CRM, ERP, SCM) to facilitate informed decision-making for the company’s management.

When implementing the Incorta Spark layer, the Customer faced a lack of big data skills in its in-house team. The team members needed expert guidance to speed up the rebuilding of legacy ETL processes on the new Incorta platform. The project was strategically important for the Customer, as each ETL pipeline migrated to Incorta allowed the company to run tens to hundreds of reports much faster than in the legacy system. To accelerate the project and ensure the reliability of the new ETL pipelines, the Customer started looking for an experienced big data consultant.

Cooperation

Trusting ScienceSoft’s nine years in big data services and a solid portfolio of successful big data projects, the Customer chose us as the consultant for the project.

ScienceSoft’s senior data engineer conducted an in-depth analysis of the solution under development. He also interviewed the Customer’s team about the difficulties they faced when rewriting the ETL business logic for the new solution. He discovered that the developers were highly proficient in SQL but lacked Python and Spark skills, which was slowing the project down.

To address the skill gaps, our big data expert started working with three of the Customer’s in-house developers, both individually and as a group. Considering the high professionalism of the in-house team, ScienceSoft’s specialist favored hands-on collaboration over formal training sessions dedicated to a particular topic. For instance, our expert helped the developers troubleshoot their Python, SQL, and Bash code whenever they faced an issue. They met daily on Zoom to investigate problems as they arose and come up with pragmatic, future-proof solutions. This approach significantly sped up ETL coding, testing, and deployment.

Within just three weeks of collaboration with ScienceSoft, the Customer’s team noticed great improvement in their ETL building skills. Satisfied with the efficiency of our consulting services, the Customer asked ScienceSoft’s senior data engineer to join the project as a developer and continue guiding the in-house team.

As of December 2022, our expert has spent over six months on the Customer’s team, providing practical recommendations on secure ETL migration and Spark tuning. As a developer, he built 7 ETL pipelines that the Customer is already using to get crucial insights for inventory management optimization, marketing campaign planning, tendering, and more.

In parallel with big data consulting and ETL implementation, ScienceSoft’s expert worked out several modifications to the current one-tier EDW architecture that would unlock the full potential of Spark and help achieve significant cost savings in the long run. In particular, he suggested creating a separate layer for the ETL pipelines to avoid keeping very large volumes of data in Incorta’s RAM, which can be extremely costly. The Customer highly appreciates our expert’s proactiveness and is considering implementing the suggested modifications in the future.
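
The idea of a separate ETL layer can be illustrated with a minimal sketch. It is written in plain Python as a stand-in for Spark and does not use any Incorta API; the function names, the file name, and the sample sales rows are all hypothetical. The staging step aggregates the raw data and persists the result on disk, so the in-memory analytics layer only has to hold the compact, pre-aggregated table.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def run_staging_etl(raw_rows, staging_dir):
    """Hypothetical staging layer: aggregate raw rows and persist the result on disk."""
    totals = defaultdict(float)
    for row in raw_rows:
        totals[row["sku"]] += row["quantity"] * row["unit_price"]

    staged_file = Path(staging_dir) / "sales_by_sku.csv"
    with staged_file.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["sku", "revenue"])
        for sku in sorted(totals):
            writer.writerow([sku, f"{totals[sku]:.2f}"])
    return staged_file


def load_into_analytics_layer(staged_file):
    """Hypothetical analytics layer: keep only the aggregated table in memory."""
    with Path(staged_file).open(newline="") as f:
        return {r["sku"]: float(r["revenue"]) for r in csv.DictReader(f)}


# Made-up sample data standing in for raw transactional records.
raw_rows = [
    {"sku": "RING-001", "quantity": 2, "unit_price": 150.0},
    {"sku": "RING-001", "quantity": 1, "unit_price": 150.0},
    {"sku": "NECK-002", "quantity": 3, "unit_price": 80.0},
]

with tempfile.TemporaryDirectory() as staging_dir:
    staged = run_staging_etl(raw_rows, staging_dir)
    in_memory_table = load_into_analytics_layer(staged)

print(in_memory_table)
```

In a real deployment, the staging step would presumably be a Spark job writing to durable storage, and the raw rows would never be loaded into the reporting layer at all.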

Results

By reaching out to ScienceSoft, the Customer significantly sped up the delivery of the new ETL pipelines for its enterprise data warehouse. Thanks to the expert guidance and knowledge transfer from ScienceSoft’s big data consultant, the Customer’s in-house developers are showing significant improvement in their Python and Spark skills and confidently rewriting the business logic of ETL pipelines for the new EDW solution.

The Customer is already using the 7 ETL pipelines built by our developer to improve the efficiency of inventory management, marketing campaign planning, and tendering. Appreciating our expert’s pragmatic suggestions for modifying the existing EDW architecture, the Customer is considering long-term collaboration with ScienceSoft.

Technologies and Tools

Development: Incorta, Python, Apache Spark, PySpark, SQL, Bash.
