How to Make Big Data Implementation a Success: Roadmap and Best Practices to Follow

Data Analytics Researcher, ScienceSoft


If you’d like some suspense, save it for an action movie, not for a promising company initiative like a big data project. To spare you any unexpected turns, ScienceSoft’s team has distilled its 6 years of experience in providing big data services into an implementation roadmap for a typical big data project. We have also chosen three real-life examples from our project portfolio to illustrate some best practices.

A step-by-step roadmap to big data implementation

For a typical big data project, we define 6 milestones:


Turning business needs into use cases

A big data project always starts with eliciting business needs. The main goal of this stage is to look beyond the needs that stakeholders explicitly voice and spot even those they may not have acknowledged yet. Once business needs are identified, they should be translated into use cases (e.g., a 360-degree customer view, predictive maintenance, or inventory optimization) that the future big data solution is to address.

Designing a big data architecture

At the end of this milestone, you should have identified the main components of your future big data solution, e.g., a data lake, a big data warehouse, and an analytics engine. You should also decide which technologies to base each architecture component on. Besides, you should formalize your data sources (both existing and potential) as well as data flows, to get a clear picture of where data comes from, where it goes, and what transformations it undergoes on the way.
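Formalized data flows can be captured in a simple, machine-checkable inventory. The sketch below (in Python, with purely illustrative source and component names) records each flow as a source-transformation-destination hop and traces the full upstream lineage of any component:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataFlow:
    """One hop of a data flow: source -> transformation -> destination."""
    source: str
    transformation: str
    destination: str

# Illustrative inventory; the component names are hypothetical.
flows = [
    DataFlow("crm_export", "deduplicate customer records", "data_lake"),
    DataFlow("web_clickstream", "sessionize raw events", "data_lake"),
    DataFlow("data_lake", "aggregate into star schema", "big_data_warehouse"),
    DataFlow("big_data_warehouse", "compute KPIs", "analytics_engine"),
]

def lineage(destination: str, flows: list) -> set:
    """Recursively collect every upstream source feeding a destination."""
    upstream = set()
    for f in flows:
        if f.destination == destination:
            upstream.add(f.source)
            upstream |= lineage(f.source, flows)
    return upstream

print(sorted(lineage("analytics_engine", flows)))
```

Keeping such an inventory alongside the architecture design makes it easy to answer "where does this number come from?" questions later on.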

Integrating big data with existing applications and systems

Results obtained during big data analysis can become valuable input for other systems and applications. To benefit from this synergy and leverage existing applications and processes, you need to identify the applications to be integrated with the big data solution and implement all the required APIs.
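As a minimal sketch of such an integration, the function below converts hypothetical churn-propensity scores produced by an analytics solution into a JSON payload that a downstream CRM's import API could consume. The payload shape and field names are assumptions, not a real CRM API:

```python
import json

def to_crm_payload(segment_scores, threshold=0.7):
    """Turn per-customer churn scores into a JSON payload for a
    downstream CRM import endpoint (illustrative shape)."""
    at_risk = [
        {"customer_id": cid, "churn_score": round(score, 2), "action": "retention_offer"}
        for cid, score in segment_scores.items()
        if score >= threshold
    ]
    return json.dumps({"at_risk_customers": at_risk})

payload = to_crm_payload({"C001": 0.91, "C002": 0.35, "C003": 0.78})
```

In a real project, such a payload would be pushed through the CRM's actual API, with authentication and error handling on top.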

Working on data quality

Consider the 5 main big data characteristics (the 5 Vs: volume, velocity, variety, veracity, and value) and find a trade-off between the quality level you find acceptable and the costs, effort, and time required to achieve it. Besides, while devising data quality rules for your big data solution, make sure they won’t ruin the solution’s performance.
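One common way to keep quality checks cheap is to express each rule as a lightweight predicate and apply all of them in a single pass over the data. The sketch below is a plain-Python illustration with made-up rules and field names:

```python
# Each rule is a cheap predicate; running them all in one pass over the
# records keeps the quality checks from becoming a performance bottleneck.
rules = {
    "non_empty_id": lambda r: bool(r.get("id")),
    "valid_amount": lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    "known_channel": lambda r: r.get("channel") in {"online", "in_store"},
}

def profile(records, rules):
    """Count rule violations in a single pass; returns {rule_name: violations}."""
    violations = dict.fromkeys(rules, 0)
    for r in records:
        for name, check in rules.items():
            if not check(r):
                violations[name] += 1
    return violations

sample = [
    {"id": "1", "amount": 10.0, "channel": "online"},
    {"id": "", "amount": -5, "channel": "fax"},
]
print(profile(sample, rules))
```

At big data scale, the same pattern would typically run as a distributed job, but the principle is identical: measure violation rates first, then decide which rules are worth enforcing.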

Turning the design into code

At the end of this milestone, you have your big data architecture deployed either in the cloud or on premises, your applications and systems integrated, and your data quality process running. If your big data solution is powered by data science, you’ll also have your machine learning models designed and trained at this stage.

Training users

Plan dedicated training sessions, which can take the form of workshops with Q&A sessions or instructor-led training. This will help various user groups understand how to use the solution to get valuable and actionable insights.

3 big data implementation projects by ScienceSoft + A bonus project from PepsiCo

This section is all about best practices. We briefly describe the use cases that three of our customers solved with their big data solutions and the technologies chosen in each case, and share some specifics of the projects.

Retail and hospitality industry: sales, customer and employee analytics based on Microsoft tools

For a multibusiness corporation, ScienceSoft designed and implemented a big data solution that was to provide a 360-degree customer view and analytics for both online and offline retail channels, optimize stock management, and measure employee performance.

The solution’s architecture was classic in terms of the required components, yet complex in terms of implementation. ScienceSoft designed and implemented a data hub, a data warehouse, 5 online analytical processing (OLAP) cubes, and a reporting module. All the components were based on Microsoft technologies.

To make use of the data previously locked within 15 diverse sources, including the legacy CRM and ERP systems as well as other applications specific to the customer’s business lines, we put significant effort into data integration.

To read the full story, including data quality, data security, and support activities, follow the link: Data analytics implementation for a multibusiness corporation.

Market research industry: Advertising channel analysis with Apache tools

Early on, a market research company recognized that their analytics solution, which perfectly satisfied their current needs, would be unable to store and process future data volumes. The forward-looking company turned to ScienceSoft for a new solution that relied on a classic mix of Apache technologies: Apache Hadoop for data storage, Apache Hive for data aggregation, querying, and analysis, and Apache Spark for data processing. After migrating to the new solution, the company was able to handle the growing data volume. Besides, they processed their data on the use and effectiveness of advertising channels for different markets up to 100 times faster.
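In production, aggregations of this kind run as Hive or Spark jobs over distributed storage. As a minimal plain-Python sketch of the same group-by logic (with invented field names, not the customer's actual schema), a per-market, per-channel conversion rate could be computed like this:

```python
from collections import defaultdict

def channel_effectiveness(events):
    """Group ad events by (market, channel) and compute a conversion rate,
    mirroring the kind of aggregation Hive runs at scale."""
    stats = defaultdict(lambda: {"impressions": 0, "conversions": 0})
    for e in events:
        key = (e["market"], e["channel"])
        stats[key]["impressions"] += 1
        stats[key]["conversions"] += e["converted"]
    return {
        key: round(s["conversions"] / s["impressions"], 3)
        for key, s in stats.items()
    }

events = [
    {"market": "DE", "channel": "social", "converted": 1},
    {"market": "DE", "channel": "social", "converted": 0},
    {"market": "FR", "channel": "search", "converted": 1},
]
print(channel_effectiveness(events))
```

The distributed version differs only in where the grouping happens, not in what is computed.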

Read the full story here: Big data implementation for advertising channel analysis in 10+ countries.

Telecom industry: Customer behavior analysis based on Amazon tools

For a telecom company, ScienceSoft designed and implemented a big data solution that allowed running insightful analytics on a plethora of data, such as users’ click-through logs, tariff plans, device models, and installed apps. Besides, with the help of the solution, the company was able to identify the preferences of a particular user and predict how they would behave.

On the technology side, the solution was mainly Amazon-based: it was deployed in the Amazon cloud, with Amazon Simple Storage Service (S3) and Amazon Redshift used for the data landing zone and the data warehouse, respectively.
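A typical way to move data from an S3 landing zone into Redshift is the warehouse's COPY command. The helper below assembles such a statement; the table, bucket, prefix, and IAM role are all placeholders, not values from the actual project:

```python
def build_copy_statement(table, bucket, prefix, iam_role):
    """Assemble a Redshift COPY command that loads files staged in the
    S3 landing zone into a warehouse table. Identifiers are illustrative."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS PARQUET;"
    )

sql = build_copy_statement(
    table="clickthrough_logs",                                # hypothetical table
    bucket="telecom-landing-zone",                            # hypothetical bucket
    prefix="logs/2024/",
    iam_role="arn:aws:iam::123456789012:role/redshift-load",  # placeholder role
)
```

In practice, such statements run on a schedule as part of the load pipeline, with the file format matching whatever the landing zone stages.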

Get all the project’s details here: Implementation of a data analytics platform for a telecom company.

A bonus project from PepsiCo

Though Pep Worx, PepsiCo’s big data platform, is not a ScienceSoft project, we still mention this case as a bonus, and for a simple reason: very few companies disclose real figures when describing the results achieved with big data, and PepsiCo is one of those few. Here’s what Jeff Swearingen, Senior Vice President of Marketing at PepsiCo, said:

“We were able to launch the product [Quaker Overnight Oats] using very targeted media, all the way through targeted in-store support, to engage those most valuable shoppers and bring the product to life at retail in a unique way. These priority customers drove 80% of the product’s sales growth in the first 12 weeks after launch.”

Make your project outshine even the most successful projects described

We hope that the roadmap and best practices we shared will help you achieve stunning results. If you need a helping hand in creating a comprehensive list of big data use cases specific to your business or you are searching for an experienced consultancy to implement your big data solution, ScienceSoft will be happy to have your success story in our project portfolio.

Implement Big Data Successfully

Big data is another step toward your business success. We will help you adopt an advanced approach to big data and unleash its full potential.