What Is a Data Mart
A data mart refers to a subset within a data warehouse that is targeted at a particular business line. Data marts are repositories that contain summarized data, which can be used to analyze a particular section or unit of an organization.
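As a rough illustration of "a summarized subset of a warehouse", the sketch below builds a small in-memory SQLite table and exposes a marketing-only summary as a view. The table name `warehouse_sales`, the view name `marketing_mart`, the columns, and the rows are all made up for this example; a real data mart would normally live in a dedicated database or schema.

```python
import sqlite3

# Hypothetical warehouse table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("marketing", "north", 100.0),
        ("marketing", "north", 50.0),
        ("marketing", "south", 25.0),
        ("finance", "north", 400.0),
    ],
)

# The "data mart": only marketing rows, summarized per region.
conn.execute(
    """
    CREATE VIEW marketing_mart AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS row_count
    FROM warehouse_sales
    WHERE department = 'marketing'
    GROUP BY region
    """
)


def marketing_summary():
    """Return the mart's summarized rows, ordered by region."""
    return conn.execute(
        "SELECT region, total_amount, row_count "
        "FROM marketing_mart ORDER BY region"
    ).fetchall()


print(marketing_summary())
```

The finance row never reaches the mart, and the marketing rows appear only as per-region totals, which is the "subset plus summary" idea in miniature.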
What Is a Data Warehouse
A data warehouse can be described as a large central repository of data that includes information from multiple sources within an organization. This data is used to support business decisions by providing analysis, reporting, and data mining tools.
Comparison: Data Warehouse vs Data Mart
Data Mart
- Focus – A single topic or functional area of an organization
- Data Sources – A few sources linked to a single line of business
- Size – Less than 100 GB
- Normalization – No preference between normalized and denormalized structures
- Decision Types – Tactical decisions regarding particular business lines or ways of doing things
- Cost – Usually $10,000 and upwards
- Setup Time – 3-6 months
- Data Held – Typically summarized information
Data Warehouse
- Focus – Enterprise-wide repository for disparate data sources
- Data Sources – Many sources, both internal and external, from different parts of an organization
- Size – Minimum 100 GB, but large organizations often require terabytes
- Normalization – Modern warehouses are often denormalized to speed up data querying and read performance
- Decision Types – Strategic decisions that impact the whole enterprise
- Cost – Varies, but usually greater than $100,000. Cloud solutions can cost significantly less because organizations pay per usage
- Setup Time – Minimum one year for on-premise warehouses. Cloud data warehouses can be set up much faster
- Data Held – Summary data, raw data, and metadata
Inmon vs. Kimball
Ralph Kimball and Bill Inmon, two data warehouse pioneers, differ in how they view the role of the data warehouse within an organization.
Bill Inmon’s design favors a top-down approach in which the data warehouse acts as the central data repository and is the most critical component of an organization’s data systems.
Inmon’s approach starts with the creation of a centralized corporate data model. The data warehouse is the physical representation of that model. When necessary, dimensional data marts can be created from the data warehouse to support specific business lines.
In Inmon’s model, data is integrated in the data warehouse first, and the warehouse is the source for the data used in the different data marts. This is how data consistency and integrity are maintained across the organization.
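The top-down flow described above can be sketched with an in-memory SQLite database: one central table is loaded first, and two marts are then derived from it. Because both marts read from the same source, their totals reconcile. The table names (`dw_orders`, `sales_mart`, `finance_mart`) and the sample rows are hypothetical, and this is only a minimal sketch of the idea, not Inmon's full methodology.

```python
import sqlite3

ALLOWED_MARTS = ("sales_mart", "finance_mart")


def build_warehouse():
    """Create the central warehouse table (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dw_orders ("
        "order_id INTEGER PRIMARY KEY, product TEXT, month TEXT, revenue INTEGER)"
    )
    conn.executemany(
        "INSERT INTO dw_orders VALUES (?, ?, ?, ?)",
        [
            (1, "widget", "2024-01", 120),
            (2, "widget", "2024-02", 80),
            (3, "gadget", "2024-01", 200),
        ],
    )
    return conn


def build_marts(conn):
    """Derive each departmental mart from the warehouse, never from raw sources."""
    conn.execute(
        "CREATE TABLE sales_mart AS "
        "SELECT product, SUM(revenue) AS revenue FROM dw_orders GROUP BY product"
    )
    conn.execute(
        "CREATE TABLE finance_mart AS "
        "SELECT month, SUM(revenue) AS revenue FROM dw_orders GROUP BY month"
    )


def mart_total(conn, mart):
    """Total revenue in one mart; the name is checked against a fixed allow-list."""
    if mart not in ALLOWED_MARTS:
        raise ValueError(f"unknown mart: {mart}")
    return conn.execute(f"SELECT SUM(revenue) FROM {mart}").fetchone()[0]


conn = build_warehouse()
build_marts(conn)
print(mart_total(conn, "sales_mart"), mart_total(conn, "finance_mart"))
```

The sales mart slices revenue by product and the finance mart by month, yet both add up to the same figure because neither was loaded independently.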
Ralph Kimball’s data warehouse design begins with the most critical business processes. This approach allows an organization to create data marts that combine relevant data from different subject areas. The data warehouse is the sum of all the data marts within an organization.
In other words, the Kimball data warehouse is a conglomerate of many data marts, whereas Inmon’s approach creates the data marts from information already held in the warehouse. Kimball stated in 1997 that “the data warehouse” is nothing but the union of all data marts.
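To make the "union of all data marts" idea concrete, the sketch below merges two independently built marts into one warehouse-style view. It assumes both marts are keyed by the same product identifier (Kimball's methodology calls such a shared key a conformed dimension). The mart contents and the helper `union_of_marts` are hypothetical and exist only for this example.

```python
# Two marts built separately, each keyed by the same (hypothetical)
# product identifier so that their facts can be lined up later.
sales_mart = {
    "P1": {"units_sold": 10},
    "P2": {"units_sold": 4},
}
inventory_mart = {
    "P1": {"units_in_stock": 7},
    "P3": {"units_in_stock": 12},
}


def union_of_marts(*marts):
    """Combine marts into one view keyed by the shared dimension.

    Facts from different marts that share a key end up side by side;
    keys that appear in only one mart are kept as-is.
    """
    warehouse = {}
    for mart in marts:
        for key, facts in mart.items():
            warehouse.setdefault(key, {}).update(facts)
    return warehouse


warehouse = union_of_marts(sales_mart, inventory_mart)
for product_id in sorted(warehouse):
    print(product_id, warehouse[product_id])
```

The merge only works cleanly because both marts agree on what a product key means. Without that shared key, the union would be a collection of unrelated tables rather than a warehouse.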
Use Cases: Data Warehouse vs Data Mart
The following use cases show when a data mart approach fits and when a centralized data warehouse is the better choice.
Data Marts Use Cases
- A data mart approach is preferred for marketing analysis and reporting, because these activities are usually performed within a specific business unit and don’t require enterprise-wide data.
- An analyst in finance can use a financial data mart for financial reporting.
Centralized Data Warehouse Use Cases
- To make informed decisions about expansion, a company must incorporate data from multiple sources. A data warehouse is required to combine data from marketing, sales, store management, customer loyalty, supply chain, and other areas.
- There are many factors that influence profitability in an insurance company. A central data warehouse is required to allow insurance companies to report on their profits. It can combine information from the claims department, sales, customer demographics, and investments.
Do Data Marts Still Have Relevance in Cloud Architecture?
Data mart architecture is a subtype of the data warehouse. It is the architecture that meets the needs of a particular user group. Data-driven decision-makers in organizations face a dilemma: when should they use a data warehouse, and when should they use data marts?
Data marts are useful for guiding tactical decisions at the departmental level, while data warehouses help with high-level strategic business decisions through a consolidated view of all organizational data.
Two approaches to this challenge reflect the classic Bill Inmon versus Ralph Kimball debate.
- Based on Bill Inmon’s view, the first step is to create a data warehouse that will serve as the central repository for all enterprise data. Data marts can then be created later to meet specific departmental needs.
- In keeping with Ralph Kimball’s ideas, the second approach is to create separate data marts, which hold aggregate data about the most important business processes, and then merge these data marts into a data warehouse.
Although data warehouses offer a single, convenient place to store all enterprise data, the costs of setting up such a system are much higher than those for building data marts. An on-premise data warehouse also takes a long time to build.
Cloud-based data warehouse services make data warehouses easier, quicker, and more affordable to set up. This eliminates the need to “start small”.
Because these services are also scalable, organizations of any size can use cloud infrastructure to create a central data warehouse from the outset and build on it from there.