How to Develop AI Software
Project Roadmap, Team, Sourcing Models
ScienceSoft applies 33 years of experience in software development and data science to develop software with artificial intelligence (AI) capabilities.
The Essence of Developing Software with AI Capabilities
Development of software with AI capabilities implies building new software or evolving existing software to output AI analytics results to users (e.g., demand prediction) and/or trigger specific actions based on them (e.g., blocking fraudulent transactions).
Supported by AI, an application can automate business processes, personalize service delivery and drive business-specific insights. According to Deloitte, 94% of business leaders agree that “AI is critical to success over the next five years”.
ScienceSoft helps both enterprises and product companies plan and build full-scale AI solutions for 30+ industries, including manufacturing, healthcare, energy, retail and wholesale, professional services, financial services, and telecommunications.
Use cases for software with AI capabilities
Business process automation
- Search engines
- Automated document generation
- Optical character recognition engine for data extraction from paper documents
- Job candidates screening and shortlisting
- Predictive maintenance
- Demand and throughput forecasting
- Process quality prediction
- Production loss root cause analysis
- Sentiment analysis
- Customer behavior prediction
- Sales forecasting
- Counterparty risk analytics
- Potential damage prediction
- Fraud detection
Supply chain management
- Demand forecasting
- Lead time forecasting
- Inventory optimization
Personalized service delivery
- Customer segmentation
- Recommendation engines
The duration and sequence of the development stages depend on the scale and specifics of both the basic software functionality and the artificial intelligence you want to enrich it with. Below we present a generalized process outline based on ScienceSoft’s 33 years of experience in software development and data science.
Duration: 1 month
- Outlining high-level software requirements (in case of new software).
- Creating a proof of concept (PoC) for AI to check the technical and economic feasibility of enriching software with it and to estimate the scope of work, timeline, budget, and risks.
- Calculating ballpark ROI of AI implementation.
A best practice from ScienceSoft’s PMs: Pavel Ilyusenko, Head of PMO, says:
“To save on time and budget resources and increase the ROI of AI, we deliver a PoC to uncover possible AI-related roadblocks, such as low-quality data, data silos, data scarcity.”
Business analysis to elicit AI requirements
Duration: 1-6 weeks
ScienceSoft defines detailed functional and non-functional requirements for AI, such as the required level of accuracy (in some cases, AI can already deliver value at 65-80% accuracy), explainability, fairness, privacy, and the required response time.
ScienceSoft's best practice: When choosing the machine learning model that AI will leverage, we carefully consider the trade-offs between the requirements for AI (for example, some models can be less accurate but more explainable and fair).
Solution architecture design
Duration depends on the overall complexity of software functionality
ScienceSoft selects integration patterns and procedures. We design the architecture of the solution with integration points between its modules, including integration with an AI module.
Business processes preparation (in case of software development for internal use)
Duration: 1-3 months
Launching an initiative of integrating AI in business-critical software may require organizational changes to increase the chances for its successful implementation and adoption. ScienceSoft discusses with the business:
- Shifts in data policies to break down data silos across the departments to enable easy access to data and avoid duplicated or contradicting data that decreases AI accuracy.
- Defining a plan for adapting employees’ workflows to the use of updated (or new) software (e.g., user training and refreshed user guides and policies).
- Promoting continuous collaboration between business and tech stakeholders.
Software development (non-AI part)
Duration: 3-36 months
ScienceSoft develops the front end and the back end of software (the server side and APIs, including necessary APIs for AI module integration). We also run all necessary QA procedures throughout the development process to validate software quality.
AI module development
1. Data preparation
Duration: 1-2 weeks (this process can be reiterated to increase the quality of AI deliverables)
- Consolidating data from relevant data sources (internal and external, which can be acquired via one-time purchase or a subscription).
- Performing exploratory analysis on data to discover useful patterns in it, detect obvious errors, outliers, anomalies, etc.
- Cleansing data: standardizing, replacing missing or deviating variables, removing duplicates, and anonymizing sensitive data.
- Splitting the resulting data into training, validation and test sets.
ScienceSoft's best practice: To significantly streamline this time-consuming stage, we use automation tools (e.g., Trifacta, OpenRefine, DataMatch Enterprise, as well as tools within leading AI cloud platforms – Amazon SageMaker, Azure Machine Learning, Google AI Platform).
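The cleansing and splitting steps above can be illustrated with a minimal Python sketch. It is not ScienceSoft's tooling: the column names (`price`, `units_sold`), the median imputation rule, and the 70/15/15 split ratio are assumptions chosen for the example, and a real project would more likely rely on the automation tools mentioned above.

```python
import random
from statistics import median


def cleanse(rows):
    """Remove exact duplicate rows and fill missing values with the column median."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the raw data stays untouched
    columns = {col for row in unique for col in row}
    for col in columns:
        observed = [r[col] for r in unique if r.get(col) is not None]
        fill = median(observed) if observed else None
        for r in unique:
            if r.get(col) is None:
                r[col] = fill
    return unique


def split(rows, train=0.7, validation=0.15, seed=42):
    """Shuffle reproducibly and split into training, validation and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# Hypothetical raw data: 20 distinct rows, one duplicate, one missing price.
raw = [{"price": float(p), "units_sold": 200 - 5 * p} for p in range(1, 21)]
raw.append(dict(raw[0]))   # exact duplicate of the first row
raw[4]["price"] = None     # missing value to be imputed

clean = cleanse(raw)
train_set, validation_set, test_set = split(clean)
print(len(clean), len(train_set), len(validation_set), len(test_set))
```

Fixing the random seed keeps the split reproducible, which makes it easier to compare models trained at different times on the same data.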
2. ML model training
Duration: 1-4 weeks (depending on the model’s complexity)
ScienceSoft selects fitting machine learning algorithms and builds ML models. We train the models with training data and test them against a validation dataset, then increase their performance by fine-tuning hyperparameters. The best-performing models can be combined into a single model to decrease the error rate of the separate models. The final ML model is validated against a test dataset in the pre-production environment.
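As an illustration of the train / validate / tune / test loop described above, here is a small self-contained Python sketch. It assumes a synthetic two-class dataset and a toy k-nearest-neighbors classifier whose only hyperparameter is `k`; the dataset, the candidate values of `k`, and the split sizes are all hypothetical, and a production project would typically use an ML library rather than hand-written code.

```python
import random
from collections import Counter


def make_dataset(n_per_class=100, seed=0):
    """Synthetic data: two overlapping 2-D Gaussian clusters (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Majority vote among the k training points closest to `point`."""
    def squared_distance(item):
        (x, y), _ = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2

    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, evaluation, k):
    """Share of evaluation points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in evaluation)
    return hits / len(evaluation)


data = make_dataset()
train, validation, test = data[:140], data[140:170], data[170:]

# Hyperparameter tuning: pick the k that performs best on the validation set.
candidate_ks = (1, 3, 5, 7)
best_k = max(candidate_ks, key=lambda k: accuracy(train, validation, k))

# Final check on data the model has never seen during training or tuning.
test_accuracy = accuracy(train, test, best_k)
print(f"best k={best_k}, test accuracy={test_accuracy:.2f}")
```

The test set is used only once, after tuning, so the reported accuracy is not inflated by the hyperparameter search.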
Duration: 2-4 weeks
The configuration of the AI deployment infrastructure and approach to integrating AI into software depends on how AI should output results:
- In batches: AI outputs are cached according to pre-scheduled time intervals. Targeted software retrieves AI outputs from the data storage it is connected with. Higher latency is acceptable.
- As a web service: near-real-time outputs triggered by a user or a system request via API. Low latency is required.
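The difference between the two output modes can be sketched in a few lines of Python. Everything here is hypothetical: `predict_demand` is a stand-in formula rather than a trained model, the in-memory dictionary stands in for the data storage, and `handle_predict_request` only mimics what an HTTP endpoint would do with a JSON request body.

```python
import json


def predict_demand(features):
    """Hypothetical stand-in for a trained ML model."""
    return round(100 + 2.5 * features["promo_days"] - 4.0 * features["price"], 2)


# Batch mode: a scheduled job scores all records and caches the outputs.
def run_batch_job(records, store):
    for record_id, features in records.items():
        store[record_id] = predict_demand(features)


# The target software later reads the cached output instead of calling the model.
def read_cached_prediction(store, record_id):
    return store.get(record_id)


# Web service mode: the model is invoked per request, so latency matters.
def handle_predict_request(body):
    features = json.loads(body)
    return json.dumps({"prediction": predict_demand(features)})


records = {
    "sku-1": {"promo_days": 4, "price": 10.0},
    "sku-2": {"promo_days": 0, "price": 20.0},
}

prediction_store = {}  # stands in for a database table or cache
run_batch_job(records, prediction_store)
print(read_cached_prediction(prediction_store, "sku-1"))

response = handle_predict_request(json.dumps(records["sku-2"]))
print(response)
```

In batch mode, the cost of running the model is paid on a schedule and reads are cheap. In web service mode, each request pays the full inference cost, which is why low latency becomes a requirement.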
At ScienceSoft, we often start with a pilot deployment to a limited number of software users. The pilot verifies that AI integrates smoothly with the target software and is compatible with the infrastructure (latency, CPU and RAM usage). We also run user acceptance tests to handle possible issues before a full-scale rollout.
ScienceSoft's best practice: To accelerate the AI deployment, in our projects we leverage leading AI cloud platforms – Amazon SageMaker, Azure Machine Learning, Google AI Platform.
Maintenance and evolution of AI-powered software
ScienceSoft tracks and fixes software bugs and issues of integration with AI, optimizes software performance and enhances UI based on user feedback, develops new features and extends AI-enabled functionality drawing on evolving business or user needs.
Maintenance of AI is a separately controlled process. It includes monitoring of ML model performance to detect a ‘drift’ (decreasing accuracy and increasing bias when the data that AI processes grows and starts deviating from the initial training data).
If drift is detected, models should be retrained with new hyperparameters or newly engineered features reflecting shifts in data patterns. They can also be replaced by challenger models with higher performance (identified during A/B testing).
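One common way to detect that production data has started deviating from the training data is the Population Stability Index (PSI). The Python sketch below is an illustration of that technique, not a description of ScienceSoft's monitoring setup; the bin count, the sample data, and the 0.25 alert threshold (a widely used rule of thumb) are assumptions.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """Compare the distribution of one feature in training vs. production data."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline feature has no variance")
    width = (hi - lo) / bins

    def bin_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the baseline range
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_retraining(psi, threshold=0.25):
    """Rule-of-thumb check: a PSI above the threshold signals a significant shift."""
    return psi > threshold


# Hypothetical feature values: the production data has shifted upward.
training_values = [i / 100 for i in range(100)]            # spread over 0.00 to 0.99
production_values = [0.5 + i / 200 for i in range(100)]    # spread over 0.50 to 0.995

stable_psi = population_stability_index(training_values, training_values)
shifted_psi = population_stability_index(training_values, production_values)
print(stable_psi, needs_retraining(stable_psi))
print(round(shifted_psi, 2), needs_retraining(shifted_psi))
```

A check like this covers only input drift for a single feature. Monitoring model accuracy against fresh labeled data, where labels are available, complements it.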
Consider Professional Services for Development of AI-Powered Software
ScienceSoft applies 33 years of experience in software development and data science to create solid software with AI capabilities.
Consulting: software development with AI capabilities
- A feasibility study on integrating AI into your software (potential benefits, risks, and costs).
- A risk management strategy to mitigate AI-related risks.
- A development, deployment and integration plan.
- Choosing an optimal sourcing model.
- An efficient technology stack for software and its AI part.
Outsourced development of software with AI capabilities
- Feasibility study (including PoC).
- Eliciting requirements for software and AI.
- Software development and testing.
- AI development: data preparation, ML model building, training and tuning.
- AI integration and testing.
- User training.
- Software maintenance and evolution.
Why choose ScienceSoft to deliver software powered by AI?
- In software development since 1989.
- In data science and data analytics since 1989.
- In business intelligence and data warehousing since 2005.
- In big data services since 2013.
- In image analysis consulting and development services since 2013.
- Average experience of our PMs, BAs, solution architects, developers, data analysts, and other IT professionals: 7-20 years.
- ISO 9001 certified quality management.
- Guaranteed security of the customer data we access, proven by the ISO 27001 certificate.
Software powered by AI: ScienceSoft’s success stories
Facial Recognition Software for Retail
- Preprocessing input images to make customers' faces recognizable.
- Calculating landmarks, i.e. some particular facial features.
- Face recognition using calculated landmarks and special criteria.
Software for Remote Monitoring of LED Displays
- Detecting and classifying problems with displaying images.
- Reporting the results to the web server through HTTP requests in real time.
Defect Recognition Software for Oil Drilling Equipment
- Monitoring and analysis of the drilling equipment.
- Timely detection of drill bits’ defects.
- Recommendations on required drill bit replacement and maintenance.
Software for Defect Recognition in Polyurethane Film
- Detecting and reporting on film defects in real time.
- Running root cause analysis of defects and providing tailored recommendations for product quality increase.
Brain Tumor Localization Application
ScienceSoft developed a CNN-based application to:
- Analyze brain MRI scans.
- Define each brain tissue type.
Software for Remote Monitoring of Oil Storage Tanks
- Remote detection of the liquid level in oil tanks.
- Timely detection of oil leaks.
ScienceSoft as a reliable AI consulting and development partner
Two years ago, we commissioned ScienceSoft to audit and upgrade our partially developed AI-based software for clay pigeon shooting tracking.
ScienceSoft ramped up a development team consisting of two C++ developers, two data scientists, and a UI design expert to fulfill the project. The team identified core errors, which didn’t allow efficient solution operation, and implemented high-speed convolutional neural networks to fix them. As a result, the system could track a flying target in a real-life outdoor environment and faultlessly detect shooter’s performance.
Simen Løkka, CEO, Travision AS
To outline a project roadmap, manage the software & AI development life cycle, and foster collaboration between business and tech stakeholders.
To analyze business and user needs and translate them into technical requirements for software, AI, and integration between them.
To cleanse data for AI and engineer features; to build, train, test, and validate ML models. Domain experience is preferred.
To deploy AI and monitor it in production.
UX and UI designers
To design wireframes, create user stories and UI prototypes for AI-driven software, following the principles of user-centricity.
To build the software back end and front end and build and implement APIs necessary for integration with AI, and further evolve software.
To design and implement a test strategy to validate software quality.
All resources are in-house
Full control over the project; however, a lack of the required AI skills is likely. Growing in-house AI capabilities can be a strategic decision if the development of software with AI functionality is a part of company-wide adoption of AI technologies.
All resources are in-house, except for data scientists
High control over the project and access to competencies unavailable in-house. If you’re looking to grow an end-to-end in-house team in the future, look for a resource vendor who provides knowledge sharing.
Non-AI part is developed in-house, while the AI part is outsourced
Optimal resource usage and access to competencies unavailable in-house. However, establishing smooth team collaboration may pose a challenge.
PM and BA are in-house, all technical resources are external
Sufficient control over the project, better process transparency, and no problems with resource utilization after the project. This model requires a properly qualified PM and BA in-house.
Access to rare talent and the latest technologies, which results in faster development and lower costs but higher vendor risks. Thus, we suggest requesting a PoC from the chosen vendor.
Benefits of AI-Powered Software Development with ScienceSoft
- Data protection. Always obtaining informed consent for personal data collection and processing.
- Strong data security. Creating a protected environment (including DevSecOps practices and tools) for data processing and storage.
- Compliance. AI-powered solutions are fully compliant with industry and legal requirements (HIPAA, GLBA, GDPR, etc.).
- Data quality. Certified data engineers and data scientists, a wide set of tools to automate data validation, cleansing, and deduplication processes.
- Guaranteed value of the AI solution. Starting with a PoC, increasing the accuracy of the output by using a combination of white box and black box AI models.
- Tracked quality of analytics insights. Output quality KPIs: insights by value (high / average / low); forecast accuracy; missing alerts; business result-related KPIs, etc.
AI platforms help quickly set up, automate and manage each stage of the AI module development with pre-configured infrastructure and workflows. ScienceSoft recommends considering platforms by major cloud providers: Amazon, Microsoft, and Google. All of them are leaders in Gartner’s Magic Quadrant for Cloud AI Developer Services and offer integrated development environments (IDEs) with the following capabilities:
Some of the platforms’ distinctive features are outlined below:
Amazon SageMaker
- Powerful, enterprise-ready infrastructure offered by AWS (e.g., Amazon EC2 and Amazon S3-based) to support AI-related projects.
- Pre-configured data labeling workflows, access to pre-screened vendors offering data labeling services.
- One-click data import, 300 pre-configured data transformations, data visualization capabilities.
- A unified repository to store, organize and reuse ML features.
- Marketplace with pre-built ML algorithms and models.
Enterprise-scale AI integration initiatives.
Payment for compute and storage resources consumed. Pricing depends on the region, the services used within the platform, their configuration and hours of usage.
Azure Machine Learning Services
- Drag-and-drop UI for low-code model development.
- Data labeling service to manage and monitor labeling projects and automate iterative tasks.
- Flexible deployment options offered by Azure, including the hybrid cloud.
- Cost management with workspace and resource level quota limits.
Flexible AI deployment (on-premises/hybrid cloud).
Payment for compute and storage resources consumed. Pricing depends on the region, the services used within the platform, their configuration and hours of usage.
Google AI Platform
- Accelerated AI performance due to integrated proprietary Tensor Processing Unit (TPU).
- Advanced support of Kubernetes orchestration.
- Integration with BigQuery (Google’s hyperscale data warehouse) datasets.
- Data labeling service that connects companies with human labelers.
- Support of TensorFlow Enterprise.
- Pre-configured virtual machines and optimized containers for AI based on deep learning.
Integration of resource-intensive deep learning AI into software; startup-friendly.
Pricing depends on the region, the services used within the platform, their configuration (type and number of instances) and hours of usage.
The cost of software powered by artificial intelligence can vary greatly:
The estimates rely heavily on the specifics of AI module development:
- Data volume used for AI and the number of data sources to process.
- Data type (unstructured data is more expensive to work with than structured).
- Data origin (there may be a need to buy external data) and whether it needs labeling (tagging data samples with the desired output).
- Data quality (issues in data require more resources for cleansing).
- Required accuracy rate for AI (the higher it is, the more time-consuming and expertise-demanding ML model tuning will be).
- Complexity of ML algorithms.
- Deployment type (AI outputs are in batches or in near-real-time).
- AI maintenance costs (AI operating in a changeable data environment, e.g., feeding on dynamic user data, needs regular retraining).
- Infrastructure costs.
ScienceSoft is an international IT consulting and software development company headquartered in McKinney, TX. Relying on 33 years of practice in software development and data science for 30 industries, including manufacturing, healthcare, financial services and retail, we develop software enhanced with AI to optimize workflows and reduce operating costs, improve decision-making, and increase customer engagement.