Five Cornerstones of Using Big Data to Make Business Decisions


According to Forrester Blogs, the majority of enterprises leverage only 40% of their structured data and 31% of their unstructured data to make data-driven decisions. If effective use of data is essential to gaining a competitive edge, why are enterprises still lagging behind? Today I want to look at five cornerstones of using big data to make business decisions.

1. Get data wherever you can get it

All the data you are able to find (customer information, sales, follower trends, subscriptions, returning customers, the frequency of their repeat purchases and even word-of-mouth data) should be used to drive decision-making. Don't be discouraged by the fact that you will have to deal with big data from both within and outside the enterprise, data that comes in a variety of forms and can be either structured or unstructured.

2. Know your data types before you start data processing

We work with any type of data. The point of our involvement as a custom software service provider is to write a driver for each type of data. A driver lets us work with any data type, regardless of where or whom it came from and the format it was initially stored in. By developing drivers and the subsequent system processing, we arrive at an easy-to-digest result in a universally accepted format. It's also vital to know how to work with different data sources: the first step in data processing is to determine the type of the data source.

  • CSV. Most of the time we utilize a data management tool which maps out the data path and informs us of the data type. We don’t expect that the type will change in the process. When we work with CSV data, for example, the CSV driver launches, loading the data in a format most convenient for us and the next step starts.
  • Binary data. Some clients send us data in a binary format. This requires a custom driver, since binary data is hard to parse. Before accepting binary data, we make sure we know its structure beforehand.
  • Databases. Cases where we get a whole database to work with are similar to CSV data processing, only we don’t load data from a CSV but from tables of other databases.
  • Custom formats. At times business owners purchase data in custom formats. Knowing the structure of received files, we write custom drivers for data processing.
  • Data from APIs. In theory we can also load data from APIs. Data processing is similar: we write a driver that converts data into convenient formats. We will be able to build reports, join this data with other data, and give clients processed data for further steps (data verification and validation, inconsistency reporting, and so on).
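The driver approach described above can be sketched as a common interface that every source-specific driver implements, so that all sources end up in one uniform representation. This is an illustrative sketch only: the class and method names (`Driver`, `load`) are assumptions, not Itransition's actual API.

```python
import csv
import io
import json
from abc import ABC, abstractmethod

# Hypothetical sketch: every source-specific driver converts its raw input
# into one uniform, easy-to-digest representation (here, a list of dicts).
class Driver(ABC):
    @abstractmethod
    def load(self, raw: bytes) -> list[dict]:
        """Convert raw source data into the uniform format."""

class CsvDriver(Driver):
    def load(self, raw: bytes) -> list[dict]:
        reader = csv.DictReader(io.StringIO(raw.decode("utf-8")))
        return [dict(row) for row in reader]

class ApiDriver(Driver):
    def load(self, raw: bytes) -> list[dict]:
        records = json.loads(raw)
        return records if isinstance(records, list) else [records]

rows = CsvDriver().load(b"country,sales\nFrance,100\n")
# rows == [{"country": "France", "sales": "100"}]
```

Once every source is normalized this way, reporting, joining and verification code only has to deal with a single format, regardless of where the data originated.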

3. Visualize and centralize your data

Build a strong front end where data processing outputs have a uniform look: easily consumable by the average business user and readily available to the right players with the power to transform business decisions into successfully met goals. Make sure decision-makers have real-time access to this front end via any device and mobile network. The more you visualize, the easier your data is to consume.

Below is an image describing Big Data Processing by Itransition. The client uses different data sources (from Data Vendor 1, 2, 3, etc.) in varying formats. Itransition utilized a custom-written automated decision-maker tool called Driver Selector, which directs data to the appropriate Driver. From there, data is centralized in a Microsoft SQL Server database, verified, prepared for output and finally delivered to the front end. (The complex process of data verification will be explained in detail in the next point.)
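A minimal sketch of what such a selector might look like: detect the type of an incoming source and dispatch it to the matching driver. The detection rules and driver names here are assumptions for illustration, not Itransition's actual Driver Selector.

```python
import json

# Illustrative driver selector: detect the source type of incoming data and
# dispatch it to the driver registered for that type.
def detect_source_type(filename: str, raw: bytes) -> str:
    if filename.endswith(".csv"):
        return "csv"
    if raw[:1] in (b"{", b"["):   # looks like JSON coming from an API
        return "api"
    return "binary"               # fall back to a custom binary driver

DRIVERS = {
    "csv": lambda raw: [line.split(",") for line in raw.decode().splitlines()],
    "api": lambda raw: json.loads(raw),
    "binary": lambda raw: raw,    # placeholder: real parsing is format-specific
}

def process(filename: str, raw: bytes):
    """Route data to the appropriate driver and return the processed result."""
    return DRIVERS[detect_source_type(filename, raw)](raw)
```

The benefit of this shape is that adding a new vendor format means registering one more driver, while the rest of the pipeline (centralization, verification, output) stays untouched.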

Big Data Processing by Itransition

4. Data verification

Sometimes, when data does not come from a reliable source, it has to be verified. This is a partially automated process: we have ready-made algorithms for different types of data sources. For example, if you take data by country, one way to catch an inconsistency is to pay attention to the geographical divisions a particular country is split into. If we are supposedly working with data on Russia but receive tables with information on states, we reject this data, since it is most probably data on the USA.
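The geography check described above can be sketched as a rule comparing the administrative divisions found in the data with those expected for the declared country. The division lists below are abbreviated illustrative examples, not a complete reference.

```python
# Expected first-level administrative divisions per country (abbreviated
# examples for the sketch).
EXPECTED_DIVISIONS = {
    "Russia": {"oblast", "krai", "republic"},
    "USA": {"state"},
}

def check_country_consistency(declared_country: str, division_type: str) -> bool:
    """Accept data only if its divisions match the declared country."""
    return division_type in EXPECTED_DIVISIONS.get(declared_country, set())

check_country_consistency("Russia", "state")   # False: likely US data, reject
check_country_consistency("Russia", "oblast")  # True: consistent, accept
```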

Our system of approval consists of three steps:

  • Level 1. Data is always loaded into a database, but each type of data has a different approval level. First it is brought into the system and checked by our automated tests; if it passes, it reaches the first level of approval. This data is not yet ready to be visible on the front-end website.
  • Level 2. Reports are made and checked by a responsible party who confirms the validity of data.
  • Level 3. Then the data goes to the analysts who work with this type of data source, and if they legitimize it, we change the approval level for this data and show it to end users on the website, where it is finally available for further action.
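The three approval levels above can be modeled as a simple state progression, where a record only becomes visible on the front end after passing all three. The enum and function names are illustrative assumptions, not the actual implementation.

```python
from enum import IntEnum

# Hypothetical model of the approval levels described above.
class Approval(IntEnum):
    LOADED = 0            # in the database, not yet verified
    AUTO_VERIFIED = 1     # level 1: passed automated tests
    REPORT_CHECKED = 2    # level 2: report confirmed by a responsible party
    ANALYST_APPROVED = 3  # level 3: legitimized by analysts

def promote(level: Approval) -> Approval:
    """Advance a record one approval level at a time."""
    return Approval(min(level + 1, Approval.ANALYST_APPROVED))

def is_visible_on_front_end(level: Approval) -> bool:
    # Only fully approved data is shown to end users on the website.
    return level == Approval.ANALYST_APPROVED
```

Keeping the level as an explicit field on each record means partially verified data can already live in the database without any risk of leaking to end users.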

Below is an image describing Data Verification by Itransition. Clients present raw data to us, which is processed by our Automated Data Verification System. The system consists of three aspects: consistency checking, automated guesses and data adjustments. At the consistency check stage, data can either be validated and immediately prepared for output, or analyzed for inconsistencies. A blatant inconsistency can lead to automatic data rejection. When further information is needed, Itransition consults the client, adds the necessary changes to the algorithm, relaunches it and receives adjusted data ready for output. The next time the same inconsistency is discovered, it will already be covered by the algorithm, speeding up the entire data verification cycle.

Data Verification by Itransition

It is very important to know your data verification challenges. We are often faced with text and number changes that need to be verified or rejected. If the configuration changes dramatically, we alert the client that the structure has changed, which is sometimes approved but in other cases leads to rewrites of the algorithm.

Below is an image describing an example of verifying data with the client based on a country name inconsistency. Data sources 1 and 2 have country names in the correct format, and are therefore labelled as valid data and sent to output. Data source 3 needs to be verified. Once the client confirms that “S. Afr.” means “South Africa”, this information is added to the system, data is adjusted and prepared for output.
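The adjustment loop in this example can be sketched as a growing alias table: once the client confirms a mapping like "S. Afr." → "South Africa", it is added to the system, so the same inconsistency is resolved automatically on the next run. The country lists and function names below are illustrative.

```python
# Sketch of the client-confirmation loop: confirmed mappings accumulate in
# an alias table so that repeated inconsistencies resolve automatically.
VALID_COUNTRIES = {"South Africa", "France", "Brazil"}
ALIASES: dict[str, str] = {}

def verify_country(name: str) -> tuple[str, str]:
    """Return (status, canonical_name) for an incoming country value."""
    if name in VALID_COUNTRIES:
        return ("valid", name)
    if name in ALIASES:
        return ("adjusted", ALIASES[name])
    return ("needs_verification", name)

# First pass: "S. Afr." is unknown and gets flagged for the client.
verify_country("S. Afr.")           # -> ("needs_verification", "S. Afr.")

# The client confirms the mapping, and it is added to the system.
ALIASES["S. Afr."] = "South Africa"
verify_country("S. Afr.")           # -> ("adjusted", "South Africa")
```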

Example of Data Verification

5. Focus your support

Many enterprises struggle when it comes to choosing a strategy for working with big data. Some opt for ready-made solutions for each data source, snowballing their support of each unstructured source into a state of utter chaos, where the same bugs have to be fixed 15 times, adding to costs and stretching out schedules. With a single customizable solution written by a reliable provider, you get faster updates, faster fixes, less time spent on support and more time dedicated to new features, because you are only catering to one development branch.

Even though managing big data is never a walk in the park, using the tips mentioned above and teaming up with a reliable service provider can really help you reap all the benefits it has to offer.