Itransition delivered an ML-powered solution for intelligent brand tracking and analytics that can identify brands in images and generate bespoke reports.
Our customer is a global event sponsorship valuation company delivering business insights to clients in the sports and entertainment domains using data-driven software. The scope of their offering included a solution for identifying brands (image and text logos) in sports-related images, including photographs from games, tournaments, sports-related events, and social media. It also had features for collecting and aggregating data on various metrics, like logo size and quantity, and generating reports. The solution was targeted toward brands, sporting event organizers, and marketing agencies, helping them assess brand performance and their marketing campaign effectiveness.
However, the customer had serious issues with the solution in various aspects:
With time, the customer decided to replace the legacy solution with more up-to-date software. Our customer was looking for an IT provider who would develop the new system, design its architecture and infrastructure, create a new UX/UI, and set up DevOps practices.
The customer chose Itransition for the project due to our experience in developing analytics solutions and providing machine learning services, as well as broad expertise in the sports and entertainment domain.
Itransition created a new ML-based brand recognition solution to identify brands in images and generate reports. By combining several ML and non-ML models, we enabled the solution to not only recognize logos in media but also determine where the logo is located exactly (billboard, sports uniform, interview board, etc.). For the solution, we developed an AWS microservices infrastructure and set up a CI/CD pipeline. To make the solution more user-friendly and ensure good user adoption, our team designed a new interface with simple navigation capabilities.
At the project’s start, Itransition analyzed the customer’s existing system and its capabilities and limitations. Our team also communicated with the customer to learn users’ pain points and business needs and identify areas for improvement.
For image recognition, the legacy solution utilized a brute force method. This is a non-ML algorithm that identifies a brand by comparing a particular image with all brand examples in the system. The process was slow because the solution had to sort through all available images to identify one logo.
Having clarified the initial requirements, we suggested building an ML-powered solution as it would provide higher recognition speed and accuracy compared to the brute force method. To showcase the system’s viability, we suggested starting with a PoC and creating a fully functioning part of the ML model.
For the PoC, our team opted for the development of an ML detector that automatically finds image segments that possibly contain a brand logo. It is complemented by other ML models that identify what brand is in the segment. After our team had successfully developed the PoC, our customer decided to continue with the project and develop a full-scale brand recognition solution.
The brand recognition pipeline includes various ML models and can be broken into several stages:
However, even with the ML model, the system sometimes can’t recognize a brand due to an unconventional logo position or lighting. To counter the issue, Itransition’s team implemented a brute force algorithm, similar to the one in the legacy solution, that users can manually activate when reviewing brand recognition results.
For the ML model to identify a particular brand logo in images, users first need to select the necessary brand from the menu on the brands page. The list of available brands is aligned with our customer’s internal solutions to preserve brand consistency. When adding a brand, users also need to upload images and text examples, like a brand logo, advertising slogan, or alternative brand names.
The ML model leverages these examples to run the recognition process in the detector and classifier sections. However, to ensure the image classifier works properly, we developed a vectorization functionality that creates image vector representations based on brand image examples. Vector representations are then utilized for image recognition in the image classifier.
Users can configure and start the recognition process, review its results, and download reports through workspaces. In addition to creating a workspace for a particular project (e.g., a series of sports matches), they can also add images, select brands, and configure parameters for location determination.
First, the images are processed with the detector that marks rectangle areas possibly containing a brand image.
When developing the detector, one of the challenges we faced was creating a unified segment out of a complex multi-item shape, for example, a logo that contains an image and a brand name. We couldn’t mark both the image and the name with one large rectangle as the extra empty space at the corners would make the brand appear larger. Alternatively, if we marked the logo image and the brand name with two separate rectangles, the system started to perceive one logo as two separate entities, which affected brand quantity.
To solve the issue, Itransition’s team developed the merging feature. First, the system marks the complex logo with several rectangles, marking each rectangle as a separate logo. To automatically merge them into one, we created a system of rules and parameters to help the solution determine whether multiple rectangles constitute one logo. As a result, we enabled the system to accurately process complicated brand shapes without compromising on the customer’s formulas for assessing brand presence.
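The actual merging rules and parameters are not described here. As a minimal sketch of one plausible rule, the code below assumes boxes are `(x1, y1, x2, y2)` tuples and treats rectangles as parts of one logo when the gap between them does not exceed a hypothetical `max_gap` threshold. The logo area is taken as the sum of its parts, so empty space between the parts does not inflate the size.

```python
from itertools import combinations

# Hypothetical box format: (x1, y1, x2, y2) in pixels.
Box = tuple[float, float, float, float]


def gap(a: Box, b: Box) -> float:
    """Largest axis-aligned gap between two boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return max(dx, dy)


def group_boxes(boxes: list[Box], max_gap: float) -> list[list[Box]]:
    """Group boxes linked by gaps of at most max_gap (illustrative rule only)."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(boxes)), 2):
        if gap(boxes[i], boxes[j]) <= max_gap:
            parent[find(i)] = find(j)

    groups: dict[int, list[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def logo_area(group: list[Box]) -> float:
    """Sum of part areas; assumes the parts of one logo do not overlap."""
    return sum((x2 - x1) * (y2 - y1) for x1, y1, x2, y2 in group)


# An emblem and a brand name close together, plus an unrelated logo far away.
boxes = [(0, 0, 10, 10), (12, 0, 30, 10), (100, 100, 110, 110)]
groups = group_boxes(boxes, max_gap=5)
print(len(groups), [logo_area(g) for g in groups])
```

With this grouping, the emblem and the brand name count as one logo for quantity metrics, while the measured size stays close to the visible logo parts.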
After the detector has processed the images, the identified segments are processed by the text classifier, which consists of two parts. First, the text detector in the text classifier identifies precise text location within the marked segment. After text detection, the ML model activates the text classifier, which identifies the actual text and compares it to text examples from the brands page. For the text classifier, our team used the Hamming Distance principle, which helps the system see the difference between significant and minor changes to the brand’s name or slogan. For example, an advertising slogan can have an extra exclamation mark, and the solution would still accurately recognize the brand.
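To illustrate the idea, here is a minimal sketch of Hamming-distance matching. Classic Hamming distance is defined for strings of equal length, so this sketch pads the shorter string, which is an assumption made for illustration. The `max_distance` threshold, the normalization step, and the brand names are hypothetical as well.

```python
def hamming_distance(a: str, b: str, pad: str = "\0") -> int:
    """Count positions where the strings differ, padding the shorter one."""
    width = max(len(a), len(b))
    a, b = a.ljust(width, pad), b.ljust(width, pad)
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


def match_brand(recognized: str, examples: dict[str, list[str]], max_distance: int = 2):
    """Return the brand whose text example is closest to the recognized text.

    Returns None when no example is within max_distance (hypothetical threshold).
    """
    text = recognized.strip().lower()
    best_brand, best_distance = None, max_distance + 1
    for brand, texts in examples.items():
        for example in texts:
            distance = hamming_distance(text, example.strip().lower())
            if distance < best_distance:
                best_brand, best_distance = brand, distance
    return best_brand


# Hypothetical text examples uploaded on the brands page.
examples = {"Acme": ["just run it"], "Globex": ["globex"]}
print(match_brand("Just run it!", examples))       # extra exclamation mark
print(match_brand("totally different", examples))  # no close example
```

A trailing extra character costs a distance of one in this sketch, so the slogan still matches. A character inserted in the middle would shift every later position, which is a known limitation of position-wise comparison.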
If the system is unable to identify a brand using text recognition (for instance, a brand logo does not have text in it), it activates an image classifier. Thus, the detector and the text classifier are the only mandatory stages of brand recognition. We made the image classifier stage optional, as we had discovered during the ML model training that the text classifier produced highly accurate results, and additionally activating the image classifier didn’t substantially affect recognition quality.
Different ML models, even if trained on the same dataset, produce different results. Thus, to enhance recognition accuracy, our team combined multiple ML models for the image classifier. When developing the image classifier, our ML engineers not only selected several neural networks but also used different datasets and parameters and altered the training process to ensure the maximum variety of neural networks’ output.
Another challenge when developing the solution was creating a unified system to identify any brand. Typically, a neural network is trained to recognize certain sets of objects. Thus, to have the solution identify new types of objects, it would have to be fully retrained and reconfigured, which is time- and effort-consuming. However, our customer needed a versatile tool that would enable users to add a new brand the system had not seen before, with the new logo not decreasing brand recognition accuracy.
To eliminate the need for constant ML model retraining, we suggested removing the last layer, which is responsible for output creation, from the image classifier. This layer is the one most affected by the addition of new objects, and it is what makes system reconfiguration necessary. Without the last layer, the image classifier converts the segments identified by the detector into vectors and compares them to the vectors of brand examples using the k-nearest neighbor algorithm. The algorithm enables each neural network to formulate three conclusions about a brand in the marked segment based on the greatest similarities between the vectors.
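The comparison step can be sketched as a nearest-neighbor lookup over brand example vectors. This is a minimal illustration, not the production code: the cosine similarity metric, the two-dimensional vectors, and the brand names are assumptions, and real vectors would come from the truncated neural networks.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_brands(segment_vector, example_vectors, k=3):
    """Return up to k distinct brands ranked by their most similar example.

    example_vectors: list of (brand, vector) pairs built from brand examples.
    """
    ranked = sorted(
        example_vectors,
        key=lambda item: cosine_similarity(segment_vector, item[1]),
        reverse=True,
    )
    brands = []
    for brand, _ in ranked:
        if brand not in brands:
            brands.append(brand)
        if len(brands) == k:
            break
    return brands


# Hypothetical example vectors; adding a brand only means adding entries here.
examples = [
    ("Acme", [1.0, 0.0]),
    ("Acme", [0.9, 0.1]),
    ("Globex", [0.0, 1.0]),
    ("Initech", [0.7, 0.7]),
]
print(top_brands([1.0, 0.05], examples))
```

Because the lookup only reads the stored example vectors, a new brand becomes recognizable as soon as its examples are vectorized, with no retraining of the networks.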
To assess the conclusions without the last layer, our team designed a confidence score algorithm that takes into account various combinations of results to determine which brand the ML model should consider the final output. For example, if two neural networks determine the same brand as their top choice, the system will consider this brand the final conclusion. The same will happen if two networks have two similar brands as their secondary choices.
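The two example rules can be sketched as a rank-by-rank vote. The full confidence score algorithm covers more result combinations than shown here, so treat this as an illustration only. It assumes each network returns a ranked list of brand candidates and that agreement between two networks (the hypothetical `min_votes` parameter) is enough.

```python
from collections import Counter


def final_brand(predictions: list[list[str]], min_votes: int = 2):
    """Pick a brand that at least min_votes networks agree on at the same rank.

    predictions: one ranked list of brand candidates per neural network.
    Ranks are checked from the top choice down; returns None without agreement.
    """
    depth = max((len(p) for p in predictions), default=0)
    for rank in range(depth):
        votes = Counter(p[rank] for p in predictions if len(p) > rank)
        brand, count = votes.most_common(1)[0]
        if count >= min_votes:
            return brand
    return None


# Two of three hypothetical networks agree on the top choice.
print(final_brand([
    ["Acme", "Globex", "Initech"],
    ["Acme", "Initech", "Globex"],
    ["Hooli", "Globex", "Acme"],
]))

# No agreement on the top choice, but two networks agree on the second one.
print(final_brand([
    ["Acme", "Globex", "Initech"],
    ["Initech", "Globex", "Hooli"],
    ["Hooli", "Acme", "Initech"],
]))
```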
We also had to ensure satisfactory image processing speed. Initially, the system processed images one by one, from the detector through to the image classifier. Our team recognized that this created a workload imbalance because while one part of the ML model was active, the others were waiting. To accelerate image processing, our team established asynchronous processing, so that all images would first undergo the detector, then the text classifier, and so on. As a result, we increased the file processing speed by 50%, to 0.5 FPS, with the current speed fully matching the customer’s business needs.
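The change in processing order can be sketched as follows. This is a simplified illustration of running a whole batch through one stage before the next; the stage functions are hypothetical stand-ins for the real models, and the asynchronous scheduling itself is left out.

```python
def run_stagewise(images, stages):
    """Run every image through one stage before moving on to the next stage."""
    items = list(images)
    for stage in stages:
        items = [stage(item) for item in items]
    return items


def make_stage(name, log):
    """Build a stand-in stage that records its calls (hypothetical helper)."""
    def stage(item):
        log.append((name, item["id"]))
        return {**item, name: True}
    return stage


log = []
stages = [make_stage("detector", log), make_stage("text_classifier", log)]
results = run_stagewise([{"id": 1}, {"id": 2}], stages)

# The log shows both images passing the detector before the text classifier runs.
print(log)
```

Grouping the work by stage keeps one model busy with the whole batch at a time instead of switching between models for every image.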
Location determination was one of our customer’s key requests since the legacy solution could not identify where a brand logo was located (on a billboard, a T-shirt, etc.). Consequently, when reviewing the results of the recognition process, users had to manually assign the location to each identified brand logo. Having analyzed our customer’s requirements, Itransition built the feature using an algorithmic approach.
We enabled users to create complex rules for determining the brand location by leveraging various parameters, such as logo size, color, etc. Each workspace can have a unique set of rules for certain brands. For example, users can enter information about the direct link between a particular color in a brand and its location. Thus, if the system identifies this specific color, the logo must be located on a T-shirt. After the system has run the brand recognition process, it checks all the rules users set up and applies algorithms to determine the exact brand location.
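A rule set of this kind can be sketched as a list of conditions checked in order. The `LocationRule` structure, the attribute names (`color`, `area`), and the first-match policy are assumptions made for illustration; the actual rule format is not described here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LocationRule:
    """Hypothetical rule: if the condition holds for a brand, assign the location."""
    brand: str
    location: str
    condition: Callable[[dict], bool]  # receives a detected logo's attributes


def determine_location(logo: dict, rules: list[LocationRule]) -> Optional[str]:
    """Return the location of the first rule that matches the logo, else None."""
    for rule in rules:
        if rule.brand == logo["brand"] and rule.condition(logo):
            return rule.location
    return None


# Workspace-specific rules for a hypothetical brand.
rules = [
    LocationRule("Acme", "T-shirt", lambda logo: logo["color"] == "red"),
    LocationRule("Acme", "billboard", lambda logo: logo["area"] > 5000),
]

print(determine_location({"brand": "Acme", "color": "red", "area": 100}, rules))
print(determine_location({"brand": "Acme", "color": "blue", "area": 9000}, rules))
```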
In addition to editing reports with custom scripts in the legacy solution, users had to manually check large volumes of data to find errors and add missing data and content to the reports.
When developing the new solution, Itransition’s developers created a reporting functionality that generates different types of bespoke reports with statistics and calculations drawing on brand recognition results. We automated the reporting process by implementing the mathematical formulas the customer uses for calculating brand presence on particular images. We also designed report templates that fully meet the needs of the customer’s clients for comprehensive information, making sure our customer does not need to manually edit the reports anymore.
Examples of report types:
As part of ongoing long-term collaborations with some of their clients, the customer performed recurrent brand recognition tasks. For example, an event organizer might ask our customer to monitor a championship with a fixed set of brands. However, the legacy solution couldn’t schedule the brand recognition process, so when new images appeared, users had to manually set up and launch the recognition process.
We developed a task manager feature that enables recognition scheduling, prioritizing, monitoring, and automatic processing. We also configured the system so that users could set up a task for the ML model once and the system would automatically activate the recognition process, retrieve the images from the object storage and process them, generate reports, and put them in the selected folder. When setting up the task, our team enabled users to select a workspace, image folder, time, report type, report target folder, and more.
Our team also partially automated adding new images to the workspace, setting up the system to routinely check the selected image folder on AWS file storage and run the brand recognition process if there are new files. Furthermore, to make the solution more user-friendly, we created a task overview list with scheduled tasks and their statuses, allowing users to easily monitor the recognition process.
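The folder check can be sketched as a comparison between the current folder listing and the files already processed. This is a minimal illustration: `list_folder` and `run_recognition` are hypothetical callables standing in for the storage listing and the recognition pipeline, and the scheduling of the check itself is left out.

```python
def find_new_files(folder_listing: list[str], processed: set[str]) -> list[str]:
    """Return files present in the folder that have not been processed yet."""
    return sorted(set(folder_listing) - processed)


def poll_once(list_folder, run_recognition, processed: set[str]) -> list[str]:
    """Check the folder once and run recognition only if new files appeared.

    list_folder: hypothetical callable returning the current file names.
    run_recognition: hypothetical callable that processes a batch of files.
    """
    new_files = find_new_files(list_folder(), processed)
    if new_files:
        run_recognition(new_files)
        processed.update(new_files)
    return new_files


# Simulated folder with one file already processed.
processed = {"match_01.jpg"}
batches = []
poll_once(lambda: ["match_01.jpg", "match_02.jpg"], batches.append, processed)
print(batches)
```

Calling `poll_once` on a schedule gives the behavior described above: nothing happens while the folder is unchanged, and each new file is processed once.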
Since the solution would be strictly for internal use and the customer wanted to accelerate its release, we prioritized intuitiveness, ease of use, and seamless integration between frontend and backend parts for UX/UI design development.
After eliciting the customer’s design requirements, our business analyst created preliminary wireframes, which we adjusted based on the customer’s feedback to make the system easier to navigate.
We suggested leveraging Ant Design, an enterprise-class UI design language and React UI library with high-quality ready-made components like layouts, menus, and buttons. We ended up using only the ready-made components and didn’t have to design new elements, accelerating interface development and significantly lowering the cost of UX/UI. Moreover, the chosen design approach simplified the integration with the frontend part, because all assets in the library had already been tested and fine-tuned.
We utilized Java and the Spring framework for the backend. The frontend part is written in React and communicates with the backend and the ML model through the REST API. We selected MySQL for the database and established a CI/CD deployment pipeline using TeamCity.
The detector is powered by YOLOv4, which we chose for its detection speed and its ability to recognize graphic objects in images. Because Python is the default option when working with YOLO, we used it for customizing, configuring, and training the ML model. Our team also used the OpenCV library, which includes programming functions for working with computer images.
To develop the text classifier, we used vgg_15 and SATRN, which allowed the customer to achieve impressive text recognition accuracy and speed. To enable the highest recognition quality in the image classifier, we chose neural networks like EfficientNet, EfficientNet_augm75, and Xception and used ImageNet, an image database tailored for visual object recognition, to train them.
In addition to the standard image database, we also utilized images and brand examples provided by the customer to emulate the exact processing conditions. To train the ML model, our team leveraged PyTorch, an open-source machine learning framework.
When developing the solution’s architecture, we suggested opting for microservices, as this would allow greater flexibility and easier scalability compared to a monolithic architecture. Furthermore, our team wanted to deploy all the ML features on separate microservices so that changes to other parts of the system would not affect the ML model configuration. For the same reason, we also decided to distribute various ML models among separate microservices:
Our team also developed a fifth microservice for all the non-ML features like reporting and the task manager. Additionally, the fifth microservice unites all other microservices, monitors the recognition process and task manager, and provides the frontend API.
Since our customer did not have IT specialists to manage local servers, Itransition’s team suggested hosting the new platform on AWS as an easier, more scalable and cost-efficient option.
Amazon Route 53
Domain name assignment to the load balancer
Storage of images and brand recognition examples
A computing instance for application components and security groups instance for incoming traffic control
Application logs collection
AWS Backup for S3
Prevention of accidental file deletion or damage and file backup with a 14-day retention period
Auto Scaling Group
EC2 instances management for performance and cost optimization
Database setup, management and scaling and automated nightly system backup
Amazon S3 File Gateway
Streamlined image upload to the cloud
AWS Elastic Load Balancing
Integrated with Auto Scaling Group to optimize the load
To automate the infrastructure’s maintenance and support, our team used Terraform scripts that contain information about the services and their configurations, reducing time and effort for manual settings adjustment. Moreover, the Terraform scripts preserve infrastructure version history, which enables the team to roll back to older versions if necessary.
Itransition’s team consisted of a project manager, a business analyst, a UX/UI designer, a tech lead, an ML engineer, a frontend developer, backend engineers, and a QA specialist.
To work out the project’s vision, our project manager and business analyst collaborated with the customer’s product owner, responsible for strategy and decision-making.
Together with the customer’s team, we created an environment that facilitated efficient two-way collaboration. Also, to ensure maximum transparency and predictability in the project timeline and deliverables, we ran the project on a fixed-time, fixed-budget basis. Prior to starting the solution’s development, our team collaborated with the product owner to create a high-level list of technical and functional requirements and then break them into tasks.
We suggested adhering to the Scrum methodology because it offered the necessary planning tools to ensure flexible low-level scope definition during the development process. We used all the typical Scrum elements, including two-week sprints with bi-weekly planning sessions, retro meetings, daily sync-ups, and demos with stakeholders.
Itransition delivered an ML-powered brand recognition solution for identifying brands and their locations and generating bespoke reports based on configurable recognition parameters. We developed a comprehensive ML model for image recognition and processing and achieved a 50% increase in image processing speed. Our team also established an AWS-based microservice infrastructure to streamline the solution’s management and created an easy-to-navigate UI.