[arve url=”https://www.youtube.com/watch?v=t5N7PzcbDM0″ mode=”normal” align=”right” promote_link=”no” description=”dimensional approach of data modeling process for data warehouse/ETL/BI business reports” autoplay=”yes”/]
Data modeling is the process of building a database to meet the business requirement. The dimensional approach is one of the data modeling approaches to establishing a data mart or data warehouse database.
There are two types of schema building and it will be selected based on advantages and disadvantages between these two.
- Star Schema
- Snowflake schema
This type schema contains the fact table in center position. As we know that fact table contains a reference to dimension tables.
Then the fact table will be surrounded by dimension tables with foreign key reference. The dimension table will not have a reference with any other dimension.
Because of this, whole structure looks like a star, this type of schema is called star schema.
This type also contains a fact table in center position. The fact table has a reference to dimension tables.
The dimension table will have a reference to another dimension.
The data will be stored in the more normalized form.
Because of this, whole structure looks like a snow spread on the earth, it’s called Snowflake schema.
In below example, the Project dimension having a reference with Role dimension.
Difference Between Star Schema and Snowflake schema:
|Feature||Star Schema||Snowflake schema|
|Performance||As there is no relationship between dimensions to other dimensions the performance will be high.||Due to multiple links between dimensions the performance will be low.|
|Query complexity||The number of joins will be less which makes query complexity low||The number of joins will be more which makes query complexity high|
|Database size||Consider the Project dimension mentioned in above example it has Role column where the Role name value will be stored against for each project in case of start schema, the size of the table will be high||The role information is separately stored in a table and the reference will be linked in Project dimension, it reduces the table size|
|Normalization||Data will be stored in de-normalized format in dimension table||Data will be stored in more normalized format in dimension tables|
How to Select Schema Type:
- These are the deciding factors to select the schema type if you worry about the size of the database, then go ahead with snowflake which gives normalized approach.
- Same time if you look for more performance then start with Star schema approach.