Star, Snowflake and Galaxy Schema in Datawarehouse with examples

Star, snowflake, and galaxy schemas are three common data warehouse schema designs used for organizing and structuring data for efficient querying and reporting.

1. Star Schema:

  • Description: In a star schema, a central fact table is connected to multiple dimension tables through foreign key relationships. The fact table contains quantitative data (measurements or metrics), while the dimension tables store descriptive attributes that provide context to the data in the fact table.
  • Example:
    • Fact Table: Sales
      • Columns: OrderID, ProductID, CustomerID, DateID, Quantity, Revenue
    • Dimension Tables:
      1. Products
        • Columns: ProductID, ProductName, Category, Brand
      2. Customers
        • Columns: CustomerID, CustomerName, Address, Phone
      3. Dates
        • Columns: DateID, Date, DayOfWeek, Month, Year

2. Snowflake Schema:

  • Description: The snowflake schema is an extension of the star schema. In a snowflake schema, dimension tables are normalized, meaning that they are organized into multiple related tables. This results in a structure that resembles a snowflake when visualized.
  • Example:
    • Fact Table: Sales
      • Columns: OrderID, ProductID, CustomerID, DateID, Quantity, Revenue
    • Dimension Tables:
      1. Products
        • Columns: ProductID, ProductName
      2. ProductDetails
        • Columns: ProductID, Category, Brand
      3. Customers
        • Columns: CustomerID, CustomerName
      4. CustomerDetails
        • Columns: CustomerID, Address, Phone
      5. Dates
        • Columns: DateID, Date, DayOfWeek, Month, Year

3. Galaxy Schema (Constellation Schema):

  1. Description: The galaxy schema is a hybrid design that combines elements of both the star and snowflake schemas. It allows for more flexibility and complexity by connecting multiple fact tables to multiple dimension tables. This can be useful when dealing with multiple business processes or areas.
  2. Example:
    • Fact Table 1: Sales
      • Columns: OrderID, ProductID, CustomerID, DateID, Quantity, Revenue
    • Fact Table 2: Inventory
      • Columns: ProductID, WarehouseID, StockLevel, LastStockUpdate
    • Dimension Tables:
      1. Products
        • Columns: ProductID, ProductName, Category, Brand
      2. Customers
        • Columns: CustomerID, CustomerName, Address, Phone
      3. Dates
        • Columns: DateID, Date, DayOfWeek, Month, Year
      4. Warehouses
        • Columns: WarehouseID, WarehouseName, Location

Conclusion:

Each schema design has its own advantages and considerations. The star schema is often preferred for its simplicity and ease of use in query performance, while the snowflake schema offers normalization benefits at the cost of increased complexity. The galaxy schema provides a balance between the two, allowing for more intricate data relationships. The choice of schema depends on the specific requirements and characteristics of the data warehouse environment.

For more detailed explanation refer to below video:

The Datapedia, sponsored by “The Data Channel”,

Leave a comment

About the author

Sophia Bennett is an art historian and freelance writer with a passion for exploring the intersections between nature, symbolism, and artistic expression. With a background in Renaissance and modern art, Sophia enjoys uncovering the hidden meanings behind iconic works and sharing her insights with art lovers of all levels.

Get updates

Spam-free subscription, we guarantee. This is just a friendly ping when new content is out.