Even though insight-driven businesses are known to have higher chances of commercial success, only 32% of the interviewed C-suite executives claim their companies can create value from data, according to the Human Impact of Data Literacy report by Accenture. These are poor results, particularly in light of the heavy investments made in business analytics initiatives across the industries surveyed. Among the underlying reasons are:
- BI initiatives remain purely IT projects that are not linked to business priorities
- Low adoption rates due to bad user experience and reluctant potential adopters
- Lack of executive support
- Technology constraints despite high BI market maturity
This situation has pushed many companies to turn to business intelligence services in search of ways to build a business intelligence architecture that is aligned with business needs and therefore helps make effective decisions.
What is BI architecture?
Business intelligence (BI) architecture is the infrastructure a company deploys to support all the stages of the BI process – from data collection, cleansing, structuring, storage, and analysis to reports and dashboard delivery and insight operationalization.
BI architecture is a part of the enterprise architecture, which defines the IT environment (the structure) and business processes (operation of the company) for the company to achieve its strategic and tactical goals.
5 components to build a business intelligence architecture
1. Data sources
Anything that produces the digital information BI systems consume is considered a data source. Data sources can be internal, with the information captured and maintained within a company, and external, when the information is generated outside the organization.
As new business requirements are formulated, data sources provide the BI solution with different types of information (structured, semi-structured, or unstructured), coming in streams or batches, varying in volumes from kilobytes to petabytes and more. However, not all of the corporate data sources are destined for the BI solution, as some information may be inaccessible, untrustworthy, or simply irrelevant for current business needs.
2. Data integration and data quality management layer
The second step of the BI process is aimed at consolidating datasets from multiple sources for a unified view – that is how the information becomes viable for analytics and operational purposes. There are several data integration methods, the choice of which is dictated by the information type, format, and volumes, as well as the purpose – operational reporting, business analysis, ML use cases, etc.
Extract, Transform, and Load (ETL)
Extract, Transform, and Load (ETL) involves the retrieval of batches of information from the sources of data, conversion into another format/structure, and placement into ultimate storage. While the extract and load parts are rather mechanical, the transformation stage is a complex activity, which involves:
- Data profiling – a detailed examination of information, its type, structure, and quality, which would define what kind of transformations are reasonable.
- Data mapping – matching the data fields of the source to those of the destination.
- Code creation and actual execution to convert data according to the mapping rules.
- Data audit to ensure the performed transformation is correct and the output data adheres to the set requirements.
The exact transformations (which are multiple) are defined by business rules. They may be:
- Aggregation of several columns into a single one or vice versa, the split of a column into several ones.
- Encoding of values or translation of the existing ones (‘Male’ to ‘M’, ‘Female’ to ‘F’, or ‘1’ to ‘Female’, ‘2’ to ‘Male’, etc.).
- Creation of new calculations (for example, to follow varying accounting rules).
- Conversion of low-level data attributes into high-level attributes.
- Derivation of new attributes out of the existing ones, etc.
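The transformation stage described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names, the `FIELD_MAP` and `GENDER_CODES` rules, and the 1000 threshold are hypothetical and stand in for whatever business rules a real project defines.

```python
# Hypothetical mapping rules: source field -> destination field.
FIELD_MAP = {"cust_name": "customer_name", "sex": "gender", "amt": "amount"}

# Hypothetical encoding rule for the gender column.
GENDER_CODES = {"Male": "M", "Female": "F"}


def transform(row: dict) -> dict:
    """Apply mapping, encoding, and derivation rules to one source record."""
    # Data mapping: keep only mapped fields and rename them.
    out = {FIELD_MAP[k]: v for k, v in row.items() if k in FIELD_MAP}
    # Encoding: translate values via the lookup table ("U" = unknown).
    out["gender"] = GENDER_CODES.get(out["gender"], "U")
    # Derivation: turn a low-level amount into a high-level band.
    out["amount_band"] = "high" if out["amount"] >= 1000 else "low"
    return out


def audit(rows: list) -> bool:
    """Data audit: check that the output adheres to the set requirements."""
    return all(r["gender"] in {"M", "F", "U"} and r["amount"] >= 0 for r in rows)


source = [
    {"cust_name": "Ann", "sex": "Female", "amt": 1200.0, "legacy_id": 7},
    {"cust_name": "Bob", "sex": "Male", "amt": 80.0, "legacy_id": 8},
]
target = [transform(r) for r in source]
print(target, audit(target))
```

In practice these rules usually live in an ETL tool or in SQL rather than in hand-written functions, but the sequence of mapping, converting, and auditing stays the same.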
Extract, Load, and Transform
An alternative to the ETL process, the ELT approach implies that the transformation happens after data loading. First, this approach saves time and resources. Second, it better suits the needs of data scientists and data analysts, who often want to experiment with raw data and perform non-trivial transformations. That explains the predominant application of this approach in ML, AI, and big data scenarios.
Data replication
Data replication takes place either in batches or in real-time streams and involves copying information from the source system to its destination with no or minimal transformations. It is an optimal data integration method when a company needs to copy data for backup, disaster recovery, or operational reporting.
Change data capture
This is a real-time data integration method that aims to detect what data changes happened in the source systems and update the destination storage accordingly.
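A minimal sketch of the idea follows. It detects changes by comparing two snapshots keyed by a primary key, which is a simplification chosen for readability: production systems more often read the database transaction log. The function names and sample rows are hypothetical.

```python
def capture_changes(previous: dict, current: dict) -> list:
    """Compare two snapshots keyed by primary key and list change events."""
    events = []
    for key, row in current.items():
        if key not in previous:
            events.append(("insert", key, row))
        elif previous[key] != row:
            events.append(("update", key, row))
    for key in previous:
        if key not in current:
            events.append(("delete", key, None))
    return events


def apply_changes(destination: dict, events: list) -> None:
    """Replay the captured events against the destination storage."""
    for op, key, row in events:
        if op == "delete":
            destination.pop(key, None)
        else:
            destination[key] = row


before = {1: {"status": "new"}, 2: {"status": "paid"}}
after = {1: {"status": "shipped"}, 3: {"status": "new"}}

destination = dict(before)  # destination starts in sync with the old state
events = capture_changes(before, after)
apply_changes(destination, events)
print(events)
```

Only the changed rows travel to the destination, which is what makes the method suitable for keeping storage up to date without reloading full tables.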
Streaming data integration
This method implies continuous integration of real-time data for operational reporting, real-time analytics, or temporary storage before further processing.
Data virtualization
This data integration method stands apart from the rest, as it does not involve the physical movement of information. Instead, it provides a consolidated data view by creating a unified logical view accessible via a semantic layer.
Data quality management
Data integration and data cleansing are two processes happening in parallel. Data ingested from multiple sources may be inconsistent, or data sets may be duplicated. In addition to the problems that occur when data is collected from numerous sources, data may simply be of poor quality, with some information missing, outdated, or of little value. To deal with these issues, this component includes technologies that automate:
- Assessment of data quality and identification of data issues (data profiling)
- Correction of data errors, data scrubbing (removing duplicate and bad data), data enrichment, etc.
- Audit of data quality against the set data quality metrics
- Reporting on data quality issues, trends, performed activities, etc.
Data quality is commonly tracked with the following metrics:
- Consistency – percentage of data sets that are the same across different locations and storage.
- Completeness – percentage that shows how much information you have versus the amount of information required for the task.
- Accuracy – percentage of correct values (representing actual events, amounts, statistics, etc.) out of the total information.
- Timeliness – percentage of accurate data that is available and accessible within a set time period.
- Validity – percentage of data that adheres to the set formatting requirements out of the total information.
- Uniqueness – percentage of the original data (not duplicated or overlapping) against total information.
- Auditability – percentage of information with the full and transparent edit history (the last time the data was updated and by whom or what).
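Three of the percentages described above can be computed with a short script: the share of required values that are present, the share of records that are not duplicated, and the share of values that adhere to a format requirement. The sketch below is illustrative only; the sample records and the simplified `EMAIL_FORMAT` rule are assumptions, not part of any particular data quality tool.

```python
import re

records = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "bob[at]example.com"},
    {"id": 1, "email": "ann@example.com"},  # duplicate of the first record
]

# Simplified, assumed formatting requirement for the email field.
EMAIL_FORMAT = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1) if whole else 0.0


def present_share(rows, field):
    """Share of records where the required field is filled in."""
    return pct(sum(r[field] is not None for r in rows), len(rows))


def distinct_share(rows):
    """Share of records that remain after exact duplicates are removed."""
    distinct = {tuple(sorted(r.items())) for r in rows}
    return pct(len(distinct), len(rows))


def format_share(rows, field, pattern):
    """Share of filled-in values that match the formatting requirement."""
    present = [r[field] for r in rows if r[field] is not None]
    return pct(sum(bool(pattern.match(v)) for v in present), len(present))


print(present_share(records, "email"))
print(distinct_share(records))
print(format_share(records, "email", EMAIL_FORMAT))
```

Tracking such figures over time, rather than as one-off checks, is what allows the reporting on data quality trends mentioned earlier.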
3. Data repositories
This component encompasses various repositories that structure and store data for further processing. There are two major data repository groups:
Analytical data stores
Enterprise data warehouse – a unified repository with cleaned, consolidated data. Businesses use different types of databases for this purpose:
- relational (stores data in rows)
- columnar (stores data in columns)
- multidimensional (stores data in a data cube format)
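The difference between the three layouts can be illustrated with plain Python structures. This is a conceptual sketch with made-up order data, not a description of how any specific database engine stores its files.

```python
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "US", "amount": 80},
    {"order_id": 3, "region": "EU", "amount": 200},
]

# Row-oriented layout: all attributes of one record are kept together.
row_store = [tuple(r.values()) for r in rows]

# Column-oriented layout: all values of one attribute are kept together.
column_store = {name: [r[name] for r in rows] for name in rows[0]}

# Multidimensional layout: a pre-aggregated cell per dimension value.
cube = {}
for r in rows:
    cube[r["region"]] = cube.get(r["region"], 0) + r["amount"]

# Summing one attribute touches every record in the row layout ...
total_from_rows = sum(record[2] for record in row_store)
# ... but only a single list in the column layout.
total_from_columns = sum(column_store["amount"])

print(total_from_rows, total_from_columns, cube)
```

The column layout tends to favor analytical queries that scan a few attributes across many records, while the row layout favors reading or writing whole records.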
Data marts – storage repositories tailored to the analytics and reporting needs of particular user groups (departments, divisions, etc.). Depending on the chosen approach to building a data warehouse, they may complement an EDW or function as self-contained analytics repositories.
Since the volumes and content of data vary significantly these days, having analytical repositories alone may not be sufficient for a company. Storing all data in such repositories is resource- and time-consuming, not to mention costly. That is why many businesses today, if they do not already have complementary storage repositories, set up a technology environment that allows quick implementation of additional stores in the future. Such repositories could be:
Operational data stores – before getting uploaded into the data warehouse, data may be replicated to operational data stores and stored there in its raw format until overwritten with newer data. ODSs store current information to support such use cases as operational reporting and lightweight analytics queries (the ones that are not complex and do not need historical information).
A data lake is another data repository that may reside within the BI environment to accompany an EDW. Most commonly, it is used to store voluminous data in its native format for backup and archive, interim storage before loading data into the EDW, or further processing with ML and big data technologies.
4. BI and analytics layer
This layer encompasses solutions for accessing and working with data, aimed at data analysts, data scientists, or business users.
This layer naturally reflects the organization’s BI maturity and its data analytics objectives: for some companies, descriptive and diagnostic analytics capabilities are sufficient, while others need to run comprehensive analysis supported with ML and AI via a self-service user interface.
The portfolio of tools may include:
- Query and reporting tools to request specific information and create reports with the derived insights. The reports may be delivered to business users via email on a scheduled basis or may be triggered by some events (for example, by a sudden drop in sales). They also may be embedded into applications business users leverage daily for enhanced user experience and quick operationalization.
- Online Analytical Processing (OLAP) tools to roll up, drill down, slice and dice, etc. business data placed into multidimensional cubes beforehand.
- Data mining tools to search for trends, patterns, and hidden correlations in data.
- ML and AI tools to create models that help companies predict future events, model what-if scenarios, automate analytics-related processes for people without domain background, etc.
- Data visualization tools to create dashboards and scorecards which then can be shared in a secure viewer environment, via a public URL, or through embedding into user applications.
If the solution is equipped with self-service capabilities, business users may not only passively consume the reports and dashboards curated for them by dedicated teams, but also run their analysis, build dashboards and scorecards, edit the existing content, and share their findings with colleagues.
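The OLAP operations named in the tool list above can be sketched over a tiny fact table. The data, the dimension names, and the helper functions `aggregate` and `slice_` are hypothetical; real OLAP tools run these operations against pre-built cubes rather than Python lists.

```python
from collections import defaultdict

facts = [
    {"year": 2023, "quarter": "Q1", "region": "EU", "sales": 100},
    {"year": 2023, "quarter": "Q2", "region": "EU", "sales": 150},
    {"year": 2023, "quarter": "Q1", "region": "US", "sales": 90},
    {"year": 2024, "quarter": "Q1", "region": "EU", "sales": 120},
]


def aggregate(rows, dims):
    """Sum sales for every combination of the given dimensions."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r["sales"]
    return dict(totals)


def slice_(rows, **fixed):
    """Keep only the rows where each named dimension has the given value."""
    return [r for r in rows if all(r[k] == v for k, v in fixed.items())]


by_year = aggregate(facts, ["year"])                      # roll up to years
by_year_quarter = aggregate(facts, ["year", "quarter"])   # drill down to quarters
eu_only = slice_(facts, region="EU")                      # slice: fix one dimension
eu_2023 = slice_(facts, region="EU", year=2023)           # dice: fix several dimensions

print(by_year)
print(by_year_quarter)
```

Rolling up and drilling down change the level of detail, whereas slicing and dicing narrow the data to a subset of dimension values.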
5. Data governance layer
This element is closely intertwined with the other four, as its major aim is to monitor and govern the end-to-end BI process. With data governance standards and policies in place, a company controls who accesses the information and how, whether the information used for analysis is of proper quality and is well safeguarded, etc. All these policies and standards make up a data management program that can be automated with data governance tools offering capabilities like:
- Data catalogs – capturing data and cataloging it with categories, tags, indexes, etc., which helps both tech and business users know what data is available, where it is maintained, who can access it, what the sensitivity risks are, etc.
- Business glossaries – authoritative sources with the common definitions of business terms for business users from different departments to eliminate any ambiguity.
- Low-code or no-code creation and configuration of data governance workflows and built-in data stewardship functions for data governance teams to manage data-related issues (for example, approve business glossary entries).
- Automated data lineage documentation for data quality management and compliance with data privacy laws.
- Role-based access control for setting user permissions.
- Automated data quality metrics generation, measurement, quality levels monitoring, etc.
- Centralized data policies and standards management (creation, configuration, monitoring of adoption and compliance, etc.), and so on.
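Of the capabilities listed above, role-based access control is the simplest to show in code. The following is a minimal sketch with hypothetical roles, users, and permission names; actual governance tools expose this through their own configuration rather than a Python dictionary.

```python
# Hypothetical roles and the permissions each role grants.
ROLE_PERMISSIONS = {
    "business_user": {"view_dashboard"},
    "data_analyst": {"view_dashboard", "run_query", "edit_dashboard"},
    "data_steward": {"view_dashboard", "approve_glossary_entry"},
}

# Hypothetical users and the roles assigned to them.
USER_ROLES = {
    "ann": {"data_analyst"},
    "bob": {"business_user", "data_steward"},
}


def is_allowed(user: str, permission: str) -> bool:
    """A user may act if at least one of their roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("ann", "run_query"))
print(is_allowed("bob", "run_query"))
```

Permissions are attached to roles rather than to individual users, so changing what a role can do updates access for everyone who holds it.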
Business intelligence teams: core roles and responsibilities
A BI team may include project roles, that is, specialists who perform particular BI activities, and program roles, that is, those involved in maintaining and managing BI activities across the company. Depending on the company's size and allocated budget, one specialist may fulfill several roles and have similar or even overlapping responsibilities. Some of the common BI team roles include:
BI program manager
- Defines the scope of the BI program and each BI initiative, as well as timeframes, resources, deliverables, etc.
- Establishes the collaboration of the involved parties (BI teams involved in different BI initiatives)
- Oversees the business intelligence program execution and recommends changes to the existing BI processes based on their analysis, industry trends, new goals, objectives, etc.
BI project manager
- Defines the scope of the BI project, its objectives, stages, deliverables, success metrics, etc.
- Provides project estimates and project scheduling, assesses project risks and suggests solutions
- Oversees project execution, ensuring deadlines and goals are met
- Sets up communication with all involved stakeholders
BI solution architect
- Cooperates with business analysts and business stakeholders to define business requirements, designs appropriate BI solutions, and supervises their development
- Evaluates the existing BI environment, creates new requirements, prioritizes change requests, etc.
- Defines and improves data governance and data security practices
- Develops the BI solution, including data models, ETL/ELT pipelines, reports and dashboards, etc.
- Manages and maintains BI solution components
- Performs troubleshooting, optimizes reports and dashboards, handles data quality issues, etc.
- Reviews and validates business data and develops policies for data management
- Manages master and metadata including creation, update, and deletion
- Gathers business user requirements, supports end-users, and consults leadership
BI systems administrator
- Manages BI systems, monitoring systems performance, availability, backup, updates, etc.
- Installs and configures security settings, user access controls, etc.
- Performs troubleshooting and provides tech support to BI users, etc.
BI architecture benefits
Maximized data value
By implementing a BI architecture, a company gets a high-performing information management environment, with all components connected and working together. With such a system in place, companies can gain maximum value from their data with minimal manual intervention and decrease the amount of dark data.
Offloaded IT department
Even though a BI architecture does not make business users fully self-sufficient and can’t completely offload dedicated IT and data analytics teams, it considerably relieves the IT and DA departments of tedious data management tasks, such as collecting information from corporate systems, modeling, preparing routine reports, etc.
Increased efficiency of business users
As the implementation of the BI architecture results in streamlined information management and analytics processes, business users don’t have to delay decisions or make them intuitively.
Although implementing a full-scale BI architecture is an expensive initiative, it promises cost savings, because a company:
- Doesn’t need to deploy and run multiple systems to satisfy specific analytics needs of different departments
- Automates and unifies all data management activities
- Minimizes mistakes stemming from poor data quality, inadequate security, etc.
- Prevents shadow IT
The Importance of Investing in Data and Analytics Pipelines study conducted by IDC reveals that improved operational efficiency, increased revenue, and increased profit are the top three metrics tied to investments in data management and analytics. The same study states, however, that investing in individual technological solutions for data collection, transformation, analysis, etc. is not enough: such discrete investments may result in more silos and ‘data leaks’. This is why an end-to-end BI architecture, when implemented and maintained with proper expert guidance, is set to bring greater enterprise intelligence and, subsequently, better business outcomes.