The Data Vault modeling approach
In the Data Warehouse environment there are two well-known modeling approaches, based on Kimball and Inmon, which have been used for many years to store data. Since these methods emerged, the demands on technologies, concepts and best practices have evolved constantly. Larger data volumes and the flexibility required of today's systems often pose major problems for these approaches. It is therefore questionable whether they are still appropriate for all of today's questions and requirements.
This led to the Data Vault modeling approach. This hybrid approach combines the advantages of the third normal form with those of the star schema. Today in particular, companies have to transform their business in ever shorter cycles and map these transformations in the Data Warehouse. Data Vault supports precisely these requirements without significantly increasing the complexity of the Data Warehouse over time. Compared with Kimball and Inmon, this avoids the steadily rising IT costs caused by extensive implementation and test cycles and by a long list of possible dependencies.
The Data Integration Architecture of the Data Vault approach has robust standards and definition methodologies that gather information for a targeted use. The model consists of three basic table types:
- Hub (blue): Contains a list of unique business keys, such as customer numbers
- Link (orange): Establishes relationships between the business keys. Links are often used to handle changes in data granularity and reduce the impact of adding a new business key to a linked hub
- Satellite (turquoise): Contains descriptive attributes that can change over time. Where hubs and links form the structure of the data model, satellites contain temporal and descriptive attributes, including metadata that links them to their parent hub or link tables
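The three table types can be illustrated with a small sketch. It is only a minimal example using Python's built-in sqlite3 module, not a reference implementation: the table and column names (hub_customer, customer_hk, load_date, record_source and so on), the sample values, and the SHA-256 hash key helper are assumptions chosen for illustration.

```python
import hashlib
import sqlite3


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic surrogate key from one or more business keys.

    Hypothetical helper: hashing normalized business keys is a common
    convention, but the exact rule here is an assumption.
    """
    normalized = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: one row per unique business key.
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_no   TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_product (
    product_hk    TEXT PRIMARY KEY,
    product_no    TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
-- Link: relationship between the two hubs.
CREATE TABLE link_customer_product (
    link_hk       TEXT PRIMARY KEY,
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    product_hk    TEXT NOT NULL REFERENCES hub_product(product_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
-- Satellite: descriptive attributes, one row per change over time.
CREATE TABLE sat_customer (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
""")

c_hk = hash_key("C-1001")
p_hk = hash_key("P-42")
conn.execute("INSERT INTO hub_customer VALUES (?, ?, ?, ?)",
             (c_hk, "C-1001", "2024-01-01", "crm"))
conn.execute("INSERT INTO hub_product VALUES (?, ?, ?, ?)",
             (p_hk, "P-42", "2024-01-01", "erp"))
conn.execute("INSERT INTO link_customer_product VALUES (?, ?, ?, ?, ?)",
             (hash_key("C-1001", "P-42"), c_hk, p_hk, "2024-01-01", "shop"))

# A changed attribute becomes a new satellite row; the hub row is untouched.
conn.execute("INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?)",
             (c_hk, "2024-01-01", "crm", "Ada", "Berlin"))
conn.execute("INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?)",
             (c_hk, "2024-03-01", "crm", "Ada", "Hamburg"))

history = conn.execute(
    "SELECT load_date, city FROM sat_customer "
    "WHERE customer_hk = ? ORDER BY load_date",
    (c_hk,),
).fetchall()
hub_rows = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
```

In this sketch the customer's move is recorded as a second satellite row, while the hub still holds exactly one row for the business key. Adding a further source system or attribute group would mean adding a satellite rather than altering existing tables.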
Thanks to its structure and defined standards, the Data Vault approach offers many advantages:
- Massive reduction of development time for the implementation of business requirements
- Earlier Return on Investment (ROI)
- Scalable Data Warehouse
- Traceability of all data including the source system
- Near-real-time loading (in addition to classical batch runs)
- Big Data processing (terabyte scale and beyond)
- Iterative, agile development cycles with incremental expansion of the DWH
- A small number of ETL patterns that can be automated
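To illustrate why so few ETL patterns are needed, the following sketch shows one generic, insert-only hub loader that could serve every hub table. It is a simplified example under stated assumptions, not a production loader: the function name load_hub, the table layout, and the columns load_date and record_source are hypothetical, and SQLite's INSERT OR IGNORE stands in for whatever mechanism a real platform offers.

```python
import sqlite3


def load_hub(conn, hub, key_column, business_keys, load_date, record_source):
    """Insert business keys that are not yet in the hub; return the number added.

    The same function serves every hub because only the table and column
    names differ. They are interpolated into the SQL text, so they must
    come from trusted metadata, never from user input.
    """
    before = conn.execute(f"SELECT COUNT(*) FROM {hub}").fetchone()[0]
    conn.executemany(
        f"INSERT OR IGNORE INTO {hub} ({key_column}, load_date, record_source) "
        "VALUES (?, ?, ?)",
        [(key, load_date, record_source) for key in business_keys],
    )
    after = conn.execute(f"SELECT COUNT(*) FROM {hub}").fetchone()[0]
    return after - before


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hub_customer ("
    "customer_no TEXT PRIMARY KEY, "
    "load_date TEXT NOT NULL, "
    "record_source TEXT NOT NULL)"
)

# First batch from one source system, second batch from another.
first = load_hub(conn, "hub_customer", "customer_no",
                 ["C-1", "C-2"], "2024-01-01", "crm")
second = load_hub(conn, "hub_customer", "customer_no",
                  ["C-2", "C-3"], "2024-01-02", "erp")

# The key C-2 keeps the source that delivered it first.
origin = conn.execute(
    "SELECT record_source FROM hub_customer WHERE customer_no = 'C-2'"
).fetchone()[0]
```

Because the load only inserts keys that are missing, it can be repeated or run incrementally without changing existing rows, and the record_source column keeps each key traceable to the system that first delivered it.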
This technological advance is already proving to be very effective and efficient for many of our customers. We would like to build the agile Data Warehouse of the future together with you.
The right vendor for every project
Are you looking for technical support in analytics, whether in Data Integration, Big Data/Data Warehouse, Business Intelligence, Data Science or Corporate Performance Management? We work with open-source technologies and commercial solutions from the following vendors: