Data model in data warehouse

Introduction


In data warehouse, a data model is like a blueprint that shows how information is arranged. It helps that data is stored and used in a way that's good for analyzing and reporting.

Fact tables

A fact table is a central table in data warehouse. Fact tables have foreign key columns that link to the primary keys of dimension tables. These foreign keys establish relationships between the fact table and dimension table

Dimension tables

Dimension tables store descriptive attributes that provide context to the quantitative data stored in the fact table. Each dimension table has a primary key, a unique identity for each record in the table. The primary key is used to establish relationships with the fact table. Dimension tables can have different levels that show how things are related in more or less detail. for example year > quarter > month > day

Types of data model

  • Star Schema

The star schema is a popular type data model because it is easy to understand and query.Its allows fast query response time. In a star schema, each dimension table is connected to the fact table by a foreign key relationship. But dimension table is not connect with other dimension table.



                                                                                                 Figure 1.1

       From above Figure 1.1 you can easily understand the star schema.by give example we can more clarify the Star Schema.





In this example, the SALES table contains only four columns with IDs from the dimension tables, TIME, ITEM, BRANCH, and LOCATION, instead of four columns for time data, four columns for ITEM data, three columns for BRANCH data, and four columns for LOCATION data.  the size of the fact table is reduced. If we need to change an item, we need only make a single change in the dimension table, instead of making many changes in the fact table.

  • Snowflake Schema

A snowflake schema is equivalent to the star schema. "A schema is known as a snowflake if one or more dimension tables do not connect directly to the fact table but must join through other dimension tables." The snowflake schema is like a star schema, but with more details.
 
In the star schema, each main point spreads out into even more detailed points, making it more complex. The schema is diagramed with each fact surrounded by its associated dimensions, and those dimensions are related to other dimensions, branching out into a snowflake pattern. The snowflake schema consists of one fact table which is linked to many dimension tables, which can be linked to other dimension tables through a many-to-one relationship. Tables in a snowflake schema are generally normalized to the third normal form.




                                                                                    Figure 1.2



    From Figure 1.2 can easily understand that how snowflake schema is look like.by give example we can more clarify the Snowflake schema 




A snowflake schema is designed for flexible querying across more complex dimensions and relationship. It is suitable for many to many and one to many relationships between dimension levels.



  • Fact Constellation 
 is a schema for representing multi-dimensional model. It is a collection of  multiple fact tables having some common dimension table. It can be collection of n number star schemas.
hence also known as Galaxy schema.It is more complex than star schema as well as snowflake schema .




                                                                                                Figure 1.3


From figure 1.3 we can easily understand that what is fact contellation .by give example we can more clarify the fact contellation.




This schema defines two fact tables, sales, and shipping. Sales are treated along four dimensions, namely, time, item, branch, and location. The schema contains a fact table for sales that includes keys to each of the four dimensions, along with two measures: Rupee_sold and units_sold. The shipping table has five dimensions, or keys: item_key, time_key, shipper_key, from_location, and to_location, and two measures: Rupee_cost and units_shipped.

The primary disadvantage of the fact constellation schema is that it is a more challenging design because many variants for specific kinds of aggregation must be considered and selected.



        















Comments