One of the pioneers in SaaS BI planned to launch brand-new software for predictive and sentiment analysis that would overcome the major challenges of big data processing.
Business intelligence remains a key investment area for many companies. To survive in a highly competitive environment, organizations now have to deal with large volumes of business-critical data that previously was not even recognized as important.
The nature, volume, and dynamics of that data often make traditional BI tools ineffective or altogether useless. That is why businesses expand their BI facilities with data mining techniques to identify valuable information, support decision making, and enhance business intelligence in general.
Such smart BI solutions require heavy investment in both software implementation and hardware infrastructure, which is hard to justify even for enterprises and is almost out of reach for smaller businesses. This is where cloud-based SaaS becomes a perfect option.
Our customer, a pioneer in SaaS BI, was inspired by the idea of providing big data analytics on demand.
Our customer planned to launch brand-new software for predictive and sentiment analysis that would overcome the major challenges of big data processing. It was intended to help retail and logistics companies test statistical models, reveal dependencies between various business parameters, and forecast the impact of decisions.
Time to market was vital for our client to stay ahead of the competition in the rapidly changing BI domain and amid cutting-edge cloud technologies. The project had a tight time frame and therefore required intensive development, agile project management, and precise coordination.
The company engaged Itransition as a mature technology partner to deliver a SaaS product that would enable analytical processing of bulk data uploaded online. Because the application deals with huge data arrays in on-demand mode, high performance and scalability were crucial requirements. These were to be achieved through a carefully designed solution architecture optimized for cloud development.
Business users are often frustrated by the deployment cycles, costs, complicated upgrade processes and IT infrastructures demanded by on-premises BI solutions. SaaS- and cloud-based BI is perceived as offering a quicker, potentially lower cost and easier-to-deploy alternative, though this has yet to be proven.
Keeping in mind the best OPD methodologies and practices, Itransition developed a software product that serves as an analytical platform, giving users multiple options to process bulk data and obtain predictive analysis results.
The solution is designed to work on Amazon EC2 and comprises three major blocks: a data uploading module, a processing kernel, and a visualization module.
Users can upload multiple data files via a web-based interface with a drag-and-drop feature. The software supports various file formats such as SVM, CSV, ARFF, etc. Service subscribers can manage the cloud infrastructure and specify which folders or areas to use, down to every single file. To enable automatic data uploading, Itransition built a simple and versatile API that allows integration with various data sources and applications.
C&RT stands for Classification and Regression Tree.
Generally, the software allows users to process data using C&RT methods. The platform provides a set of tools to build, train, and test appropriate statistical models. Users can also configure which data files participate in model building and choose testing methods for different datasets and files.
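To illustrate what a C&RT method does at each node of a tree, the sketch below picks the threshold on a single numeric feature that minimizes the weighted Gini impurity of the two resulting groups. It is a minimal, generic example and not the platform's actual code: the class name GiniSplitSketch, its methods, and the sample data are all hypothetical, and Gini impurity is assumed as the split criterion.

```java
import java.util.Arrays;

// Hypothetical sketch of one C&RT step: choosing a split threshold by Gini impurity.
class GiniSplitSketch {

    // Gini impurity of a binary-labelled group: 1 - p0^2 - p1^2.
    static double gini(int zeros, int ones) {
        int total = zeros + ones;
        if (total == 0) return 0.0;
        double p0 = (double) zeros / total;
        double p1 = (double) ones / total;
        return 1.0 - p0 * p0 - p1 * p1;
    }

    // Weighted impurity of splitting rows into (x <= threshold) and (x > threshold).
    static double splitImpurity(double[] x, int[] label, double threshold) {
        int l0 = 0, l1 = 0, r0 = 0, r1 = 0;
        for (int i = 0; i < x.length; i++) {
            if (x[i] <= threshold) {
                if (label[i] == 0) l0++; else l1++;
            } else {
                if (label[i] == 0) r0++; else r1++;
            }
        }
        int n = x.length;
        return (l0 + l1) * gini(l0, l1) / n + (r0 + r1) * gini(r0, r1) / n;
    }

    // Tries midpoints between consecutive distinct sorted values and
    // returns the threshold with the lowest weighted impurity (NaN if none exists).
    static double bestThreshold(double[] x, int[] label) {
        double[] sorted = x.clone();
        Arrays.sort(sorted);
        double best = Double.NaN;
        double bestImpurity = Double.POSITIVE_INFINITY;
        for (int i = 0; i + 1 < sorted.length; i++) {
            if (sorted[i] == sorted[i + 1]) continue;
            double candidate = (sorted[i] + sorted[i + 1]) / 2.0;
            double impurity = splitImpurity(x, label, candidate);
            if (impurity < bestImpurity) {
                bestImpurity = impurity;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up data: order value versus whether the order was returned (1) or not (0).
        double[] orderValue = {10, 20, 30, 40, 50, 60};
        int[] returned = {0, 0, 0, 1, 1, 1};
        System.out.println("Best threshold: " + bestThreshold(orderValue, returned));
    }
}
```

A full tree builder would apply the same search to every feature and then repeat it recursively on each of the two groups until a stopping rule is met.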
The analytical output is supported by comprehensive data visualization that helps users recognize and interpret the results.
Due to the specifics of big data processing and modeling, Itransition used a specialized plotting library to enable fast and accurate data visualization.
Amazon Cloud is used as the deployment foundation to provide high on-demand scalability. This objective required special cloud development techniques and a special architectural approach from Itransition. Amazon EC2 enables the hosted application to scale up and down within minutes according to the volume of uploaded datasets. A user can run hundreds of server instances simultaneously.
Java was identified as the most suitable technology to deliver the required functionality scope and the proper level of scalability and performance. Leveraging its strong Java skills, Itransition delivered a modular application that is easy to maintain and extend and is fully compatible with Amazon EC2.
HDFS stands for Hadoop Distributed File System.
Hadoop MapReduce is a programming model and software framework that gives the solution the ability to rapidly process vast arrays of data in parallel on large clusters of compute nodes. HDFS was selected for its ability to store extremely large data sets and to stream data at high bandwidth to user applications.
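The MapReduce programming model can be illustrated without a cluster. The sketch below runs the three phases (map, shuffle, reduce) in memory with plain Java streams to total sales per product. It deliberately does not use the Hadoop API, and the class name MapReduceSketch, the "product,amount" record layout, and the sample data are assumptions made for the example only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration of the MapReduce model (not the Hadoop API).
class MapReduceSketch {

    // Map phase: turn one "product,amount" record into a (key, value) pair.
    static Map.Entry<String, Double> mapRecord(String line) {
        String[] parts = line.split(",");
        return new SimpleEntry<>(parts[0].trim(), Double.parseDouble(parts[1].trim()));
    }

    // Shuffle and reduce phases: group the pairs by key, then sum each group's values.
    // parallelStream() stands in for the parallelism a real cluster would provide.
    static Map<String, Double> run(List<String> records) {
        return records.parallelStream()
                .map(MapReduceSketch::mapRecord)
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        TreeMap::new,
                        Collectors.summingDouble(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<String> records = List.of("shoes,10.0", "hats,5.0", "shoes,2.5");
        System.out.println(run(records));
    }
}
```

In a real Hadoop job the same two functions would be written as mapper and reducer classes, and the framework would distribute the records and the grouped keys across the nodes of the cluster.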
The modular architecture minimizes overhead by starting only the GlassFish Server modules that are required to serve the running application. It supports high availability and scalability of application clusters.
The Akka billing mechanism tracks and summarizes all operations performed per session and generates a custom bill based on the resources consumed (which depend, in turn, on the processed data volumes and the complexity of the models).
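The idea of usage-based billing per session can be sketched in a few lines. The example below leaves Akka out entirely and only shows the accounting: each operation adds billing units that grow with data volume and model complexity, and the bill is the session total times a unit price. The class name SessionBillingSketch, the unit formula, and the price are hypothetical and are not taken from the actual product.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-session usage metering; formula and price are illustrative only.
class SessionBillingSketch {

    // Assumed price of one billing unit, in cents.
    static final long CENTS_PER_UNIT = 2;

    // Accumulated billing units, keyed by session id.
    private final Map<String, Long> unitsBySession = new HashMap<>();

    // Records one operation: units grow with data volume and model complexity.
    void record(String sessionId, long megabytesProcessed, int complexityWeight) {
        long units = megabytesProcessed * complexityWeight;
        unitsBySession.merge(sessionId, units, Long::sum);
    }

    // Returns the session's bill in cents (0 if nothing was recorded for it).
    long billInCents(String sessionId) {
        return unitsBySession.getOrDefault(sessionId, 0L) * CENTS_PER_UNIT;
    }

    public static void main(String[] args) {
        SessionBillingSketch billing = new SessionBillingSketch();
        billing.record("session-1", 100, 3); // a complex model on 100 MB
        billing.record("session-1", 50, 1);  // a simple model on 50 MB
        System.out.println("Bill for session-1 (cents): " + billing.billInCents("session-1"));
    }
}
```

Amounts are kept as whole cents in a long so that repeated additions never pick up floating-point rounding errors.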
The project release was delivered in accordance with the Agile methodology, which brought a rapid, incremental, and efficient approach to application development. The major part of the application was assembled in less than a year, as the customer intended to launch the platform at the earliest possible date to speed up the return on investment.
Right after the platform's production release, the client requested ongoing active development support to enhance the application and make its functionality more attractive from a business standpoint. The productive collaboration, now 3.5 years long, is still in progress.