What is a slowly changing dimension (SCD)? Explain SCD Type 1, Type 2, and Type 3.

In data warehousing, a Slowly Changing Dimension (SCD) refers to how you handle changes to dimension data over time. Because dimension data, such as customer addresses or product descriptions, can change, you need a strategy to manage those changes. Here's an explanation of the most common SCD types:

What is a Slowly Changing Dimension (SCD)?

  • Essentially, SCDs are methods for managing changes in dimension tables. They address the problem of how to handle changes to attribute values in a dimension table.

SCD Types :

  • SCD Type 1: Overwrite
    • This is the simplest method. When a dimension attribute changes, the existing record is overwritten with the new value.
    • Characteristics:
      • Historical data is lost. Only the current value is stored.
      • Easy to implement.
      • Suitable for attributes where historical tracking is not required.
    • Example: If a customer's phone number changes, the old phone number is replaced with the new one.
  • SCD Type 2: Add New Row
    • When a dimension attribute changes, a new record is added to the dimension table.
    • Characteristics:
      • Historical data is preserved.
      • Each record is typically given effective start and end dates to indicate its validity period.
      • Requires a surrogate key to distinguish between different versions of the same dimension member.
      • Allows for historical analysis.
    • Example: If a customer moves to a new address, a new record is added to the customer dimension table with the new address and the corresponding effective dates.
  • SCD Type 3: Add New Attribute
    • When a dimension attribute changes, a new column is added to the dimension table to store the previous value.
    • Characteristics:
      • Limited historical tracking (typically only the previous value is stored).
      • Can lead to wide tables.
      • Suitable for attributes where only a limited history is required.
    • Example: If a product's price changes, a "previous price" column is added to the product dimension table.

Key Considerations :

  • The choice of SCD type depends on the specific requirements of the data warehouse and the analysis that will be performed.
  • SCD Type 2 is often the most common because it provides the most comprehensive historical tracking.