Methods for Period Over Period (PoP) Analysis in L...

molly_lippsett · ‎06-23-2022

Looker will not be updating this content, nor guarantees that everything is up-to-date.

This post describes the period over period (pop) Method 8: Arbitrary Period and Many Previous Periods highlighted in the main Methods for Period Over Period (PoP) Analysis in Looker Community post.

Period-over-Period Method 8: Arbitrary Period and Many Previous Periods - Compare Any Period to Any Other Period

This method allows users to choose the current period, the grouping for that period, which period to compare (pivot), and how many periods to compare. This is the most complex PoP method, but it is also the most flexible.

Below is an example with order items in the current month-to-date, broken down by day and compared to last month:

Below is an example with an arbitrary date range, broken down by day and compared to the same date range for the previous three years:

Advantages and Disadvantages

Advantages:

The date range can be broken to a more granular level (e.g., months, days, weeks).
You can be very specific about which prior periods to compare, and you can quickly compare multiple periods at once (e.g., "x periods ago" or "past y periods").
This method defines possible previous periods in bulk. This is in contrast to Method 4, where previous periods are defined manually and individually.
The joins in this method allow for overlapping period comparison.
The LookML for this method lives in its own explore, so it is kept separate from any complexity in the rest of the model.
Although the initial setup is time consuming, the approach is modular and relatively easy to scale to add new date dimensions. You will only need a single new derived table and join for each date dimension. To achieve this, make a new view with a derived table for each new date, add the desired measures to the derived table definition, and join the new view to the explore. See the pop_order_items_created and pop_order_items_delivered views in the LookML example below.
- The structure is similar to joining different views to a central date dimension table. This makes it quick to compare different aggregations for the same period without triggering symmetric aggregates, which can limit expensive queries.

Disadvantages:

This method requires its own explore, whereas other approaches can be incorporated into a single view file.
- Any complex data transformations or joins required for the explore need to be replicated for each new date added, as the model is scaled.
- Users must navigate to a new Explore solely for PoP analysis.
Measures need to be defined in advance, rather than existing measures being reused. Any existing logic would need to be replicated, as the model is scaled to incorporate new dates.
The LookML is complex; maintenance and collaboration may be difficult.

Method

Create two views: one to configure the filters and plotted date dimensions, and a second to establish the maximum number of previous periods for comparison. These two views will remain untouched, as the method is scaled to include additional dates. See the pop and numbers views in the LookML below.
In the first view, create four hidden fields that become the filters users interact with, as depicted in the examples above. Two are filter fields, and two are dimensions:
These filters are hidden to avoid cluttering the Explore field picker. The filters are surfaced in the Explore with an always_filter parameter. See the flexible_pop explore under EXPLORE in the LookML example below.
- Filter 1 — Date Range: A filter of type: date for users to select the date range on which to filter. See the date_filter field under (HIDDEN) FILTERS TO CUSTOMIZE THE APPROACH in the LookML example below.
- Past Periods Filter: A filter of type: number for users to choose the number of past periods to compare. If the user makes no selection, the number of past periods will default to 1. See the over_how_many_past_periods field under (HIDDEN) FILTERS TO CUSTOMIZE THE APPROACH in the LookML example below.
- Filter 2 — Date Range Breakdown: A dimension of type: string for users to select the granularity of the selected date range (in quarters, months, weeks, days) by which to group aggregates. This dimension uses LookML case parameters to control the values the user can choose from. See the within_period_type field under (HIDDEN) FILTERS TO CUSTOMIZE THE APPROACH in the LookML example below.
- Filter 3 — Previous Period: A second dimension of type: string for users to select the time period for comparison and the granularity by which to group aggregates. Similar to Filter 2, this dimension uses LookML case parameters to control the values the user can choose from. See the over_period_type field under (HIDDEN) FILTERS TO CUSTOMIZE THE APPROACH in the LookML example below.
Next, in the second maximum periods view (numbers below), create a hidden dimension of type: number to represent the number of periods. We will use this dimension in the next step. See the n dimension in the numbers view below.
Returning to the first view, create three dimensions:
- One hidden dimension to serve as an order_by field. This dimension calculates the interval for the dimension to be plotted as rows and ensures it will always be ordered correctly. See the reference_date dimension under DIMENSIONS WE WILL ACTUALLY PLOT in the LookML example below.
- A second dimension to plot for rows, with appropriate date formatting applied. This field will be ordered by the hidden dimension above. See the reference_date_formatted dimension under DIMENSIONS WE WILL ACTUALLY PLOT in the LookML example below.
- The last dimension will be for pivoting the prior periods for comparison. It is ordered by the type: number number of periods dimension (n, in this example) from the second maximum periods view. This ensures that each previous period is ordered in the correct way. See over_periods_ago under DIMENSIONS WE WILL ACTUALLY PLOT in the LookML example below.
Create an explore, joining the first two views together. The maximum periods view, numbers will be joined twice — once for calculating the size of the selected current period, in the units selected by the Date Range Breakdown filter (within_period_type), and again to determine the number of previous periods, as selected by the Past Periods Filter (over_how_many_past_periods). See the within_periods and over_periods joins in the flexible_pop explore under EXPLORE in the LookML example below.
Now, create measures, or aggregations, based on each date field for users to compare. For example, an orders created date or an orders delivered date. This requires developing one derived table view per date field.
- Create one view and one derived table per date field. The derived table sql definition needs to include a date for joining the view to the original explore. This join date is the date field, which is updated dynamically to match the date granularity selected in Filter 2 — Date Range Breakdown from the first view. The derived table sql definition also should contain the desired aggregations, such as COUNT and SUM. See the pop_order_items_created view under VIEWS TO EDIT in the LookML example below.
- Define the aggregations as LookML measures that users can select and view in an Explore. See agg_1 and agg_2 in the pop_order_items_created view under VIEWS TO EDIT in the LookML example below.
- Join the view into the explore. See the pop_order_items_created join in the flexible_pop explore under EXPLORE in the LookML example below.
- Repeat the above steps for the additional date field aggregations.

LookML Example

All SQL in the LookML sql parameters in the following examples is specific to the Redshift dialect and will need to be adapted to your database dialect.

    ###  Period over Period Method 8: Compare any period to any other period
    # the most complex implementation, but also the most flexible


    #### ------------ VIEWS TO LEAVE ALONE  ------------ ###

    # This view sets up the config (doesn't need editing)
    view: pop {
      sql_table_name: (SELECT NULL) ;;
      view_label: "_PoP"

    ### ------------ (HIDDEN) FILTERS TO CUSTOMIZE THE APPROACH ------------


    # Choose a date range to filter on
    filter: date_filter  {
    label: "1. Date Range"
    hidden: yes
    type: date
    convert_tz: no
    }

    # A second filter to choose number of past periods. Defaults to 1 if this is not selected
    filter: over_how_many_past_periods {
    label: "Override past periods"
    description: "Apply this filter to add past periods to compare to (from the default of current vs 1 period ago)"
    type: number
    default_value: "<=1"
    }


    # Choose how to break the range down - normally done with a parameter but here is a dimension
    dimension: within_period_type {
    label: "2. Break down date range by"
    hidden: yes
    type: string
    #Using case just to get friendlier UI experience in filters. Otherwise, could have a no-sql filter field
    case: {
        when: {
        sql: {% parameter pop.within_period_type %}='quarter' ;;
        label: "quarter"
        }
        when: {
        sql: {% parameter pop.within_period_type %}='month' ;;
        label: "month"
        }
        when: {
        sql: {% parameter pop.within_period_type %}='week' ;;
        label: "week"
        }
        when: {
        sql: {% parameter pop.within_period_type %}='day' ;;
        label: "day"
        }
        when: {
        sql: {% parameter pop.within_period_type %}='hour' ;;
        label: "hour"
        }
    }
    }

    # Choose the previous period
    # Again we use a dimension here instead of a parameter
    dimension: over_period_type {
    label: "3. Compare over"
    hidden: yes
    type: string
    #Using case just to get friendlier UI experience in filters. Otherwise, could have a no-sql filter field
    case: {
        when: {
        sql: {% parameter pop.over_period_type %}='year' ;;
        label: "year"
        }
        when: {
        sql: {% parameter pop.over_period_type %}='quarter' ;;
        label: "quarter"
        }
        when: {
        sql: {% parameter pop.over_period_type %}='month' ;;
        label: "month"
        }
        when: {
        sql: {% parameter pop.over_period_type %}='week' ;;
        label: "week"
        }
        when: {
        sql: {% parameter pop.over_period_type %}='day' ;;
        label: "day"
        }
    }
    }

    ### ------------ DIMENSIONS WE WILL ACTUALLY PLOT ------------


    # This is the dimension we will plot as rows
    # This version is always ordered correctly
    dimension: reference_date {
    hidden: yes
    #type: date_time <-- too aggressive with choosing your string formatting for you
    #type: date <-- too aggressive with truncating the time part
    #convert_tz: no
    #type: nothing <-- just right
    sql: DATE_TRUNC({% parameter pop.within_period_type %},DATE_ADD({% parameter pop.within_period_type %},0 - ${within_periods.n} - 1,{% date_end pop.date_filter %}));;
    }

    # This is the version we will actually plot in the data with nice formatting
    dimension: reference_date_formatted {
    type: string
    order_by_field: reference_date
    label: "Reference date"
    sql:  TO_CHAR(
        ${reference_date},
        CASE {% parameter pop.within_period_type %}
        WHEN 'year' THEN 'YYYY'
        WHEN 'month' THEN 'MON YY'
        WHEN 'quarter' THEN 'YYYY"Q"Q'
        WHEN 'week' THEN 'MM/DD/YY' --or 'YYYY"W"WW' or 'YY-MM"W"W'
        WHEN 'day' THEN 'MM/DD/YY'
        WHEN 'hour' THEN 'MM/DD HHam'
        ELSE 'MM/DD/YY'
        END)
        ;;}


    # This is the dimension we will plot as pivots
        dimension: over_periods_ago  {
        label: "Prior Periods"
        description: "Pivot me!"
        sql: CASE ${over_periods.n}
                    WHEN 0 THEN 'Current '||{% parameter pop.over_period_type %}
                    WHEN 1 THEN ${over_periods.n}||' '||{% parameter pop.over_period_type %} || ' prior'
                    ELSE        ${over_periods.n}||' '||{% parameter pop.over_period_type %} || 's prior'
                    END;;
        order_by_field: over_periods.n
        }
    }

    # This view establishes the max number of previous periods (doesn't need editing)
    view: numbers {
    sql_table_name: (
                SELECT 00 as n UNION ALL SELECT 01 UNION ALL SELECT 02 UNION ALL
                SELECT 03 UNION ALL SELECT 04 UNION ALL SELECT 05 UNION ALL
                SELECT 06 UNION ALL SELECT 07 UNION ALL SELECT 08 UNION ALL
                SELECT 09 UNION ALL SELECT 10 UNION ALL SELECT 11 UNION ALL
                SELECT 12 UNION ALL SELECT 13 UNION ALL SELECT 14 UNION ALL
                SELECT 15 UNION ALL SELECT 16 UNION ALL SELECT 17 UNION ALL
                SELECT 18 UNION ALL SELECT 19 UNION ALL SELECT 20 UNION ALL
                SELECT 21 UNION ALL SELECT 22 UNION ALL SELECT 23 UNION ALL
                SELECT 24 UNION ALL SELECT 25 UNION ALL SELECT 26 UNION ALL
                SELECT 27 UNION ALL SELECT 28 UNION ALL SELECT 29 UNION ALL
                SELECT 30 UNION ALL SELECT 31 UNION ALL SELECT 32 UNION ALL
                SELECT 33 UNION ALL SELECT 34 UNION ALL SELECT 35 UNION ALL
                SELECT 36 UNION ALL SELECT 37 UNION ALL SELECT 38 UNION ALL
                SELECT 39 UNION ALL SELECT 40 UNION ALL SELECT 41 UNION ALL
                SELECT 42 UNION ALL SELECT 43 UNION ALL SELECT 44 UNION ALL
                SELECT 45 UNION ALL SELECT 46 UNION ALL SELECT 47 UNION ALL
                SELECT 48 UNION ALL SELECT 49 UNION ALL SELECT 50 UNION ALL
                SELECT 51 UNION ALL SELECT 52 )
            ;;
    dimension: n {
    type: number
    hidden:  yes
    sql: ${TABLE}.n ;;
    }
    }

    #### ------------ VIEWS TO EDIT  ------------ ###

    view: pop_order_items_created {
    view_label: "Order Items (By created date)"
    sql_table_name: (
    SELECT
        DATE_TRUNC({% parameter pop.within_period_type %}, order_items.created_at) as join_date,
        COUNT(*) as agg_1,
        SUM(order_items.sale_price) as agg_2
        FROM order_items
    -- Could add templated filters here
    -- OPTIONAL : Filter inner query on min/max dates (since query optimizer probably won't)
        GROUP BY 1
        ) ;;

    measure:  agg_1 {
    type:  number
    label: "Count"
    sql: SUM(${TABLE}.agg_1) ;;
    }
    measure:  agg_2 {
    type:  number
    label: "Total Sales"
    sql: SUM(${TABLE}.agg_2) ;;
    }
    }



    #### ------------ EXTENSIBLE VERSION (Multiple dates)  ------------ ###
    ## This is what the above looks like if we scale it to work with multiple date fields
    ## We make one view per date and join them all to the model

    view: pop_order_items_delivered {
    view_label: "Order Items (By delivered)"
    sql_table_name: (SELECT
        DATE_TRUNC({% parameter pop.within_period_type %},order_items.shipped_at) as join_date,
        COUNT(*) as agg_1,
        SUM(order_items.sale_price) as agg_2
        FROM order_items
        WHERE {%condition pop_order_items_delivered.sale_price %}order_items.sale_price{% endcondition %}
        GROUP BY 1
        ) ;;
    }

    ### ------------ EXPLORE ----------------###

    explore: flexible_pop {
        label: "PoP Method 8: Flexible implementation to compare any period to any other"
        from:  pop
        view_name: pop

        # No editing needed - make sure we always join and set up always filter on the hidden config dimensions
        always_join: [within_periods,over_periods]
        always_filter: {
          filters: [pop.date_filter: "last 12 weeks", pop.within_period_type: "week", pop.over_period_type: "year"]
        }

    # No editing needed
        join: within_periods {
          from: numbers
          type: left_outer
          relationship: one_to_many
          fields: []
          # This join creates fanout, creating one additional row per required period
          # Here we calculate the size of the current period, in the units selected by the filter
          # The DATEDIFF unit is in days, so if we want hours we have to multiply it by 24
          # (It might be possible to make this more efficient with a more granular function like TIMESTAMPDIFF where you can specify the interval units)
          sql_on: ${within_periods.n}
                <= (DATEDIFF( {% parameter pop.within_period_type %},{% date_start pop.date_filter %},{% date_end pop.date_filter %} ) - 1 )
                 * CASE WHEN {%parameter pop.within_period_type %} = 'hour' THEN 24 ELSE 1 END;;
        }
    # No editing needed
        join: over_periods {
          from: numbers
          view_label: "_PoP"
          type: left_outer
          relationship: one_to_many
          sql_on:
                      CASE WHEN {% condition pop.over_how_many_past_periods %} NULL {% endcondition %}
                      THEN
                        ${over_periods.n} <= 1
                      ELSE
                        {% condition pop.over_how_many_past_periods %} ${over_periods.n} {% endcondition %}
                      END;;
        }

    # Rename (& optionally repeat) below join to match your pop view(s)
        join: pop_order_items_created {
          type: left_outer
          relationship: many_to_one
          #Apply join name below in sql_on
          sql_on: pop_order_items_created.join_date = DATE_TRUNC({% parameter pop.within_period_type %},
                          DATEADD({% parameter pop.over_period_type %}, 0 - ${over_periods.n},
                              DATEADD({% parameter pop.within_period_type %}, 0 - ${within_periods.n},
                                  {% date_end pop.date_filter %}
                              )
                          )
                      );;
        }

      }

LucianaPadua · ‎06-26-2022

Hi,

I’m in a situation where I need to use this and trying to implement.

However, not sure how I would build dynamic measures comparisons.

I understand that this method would build the absolute measures, i.e in this case Order items delivered in each compared periods. This would look already good in a Visualization, but what if I would like to also have the measure itself of agg 1 vs agg 2?

I would like to avoid table calculations.

Also, the example here is very simple in the sense that it’s just assuming our aggregation level is date and no other, whereas in reality we have usually channels, countries, etc.

So would this approach work with more aggregations?

Is there any analytical block laready available that can help me achieve more complex comparisons?

thanks.