Arbitrary Period Comparisons

Google Analytics has a date selector that allows for comparison of two arbitrary time periods on the same chart. Period over Period analysis is useful for looking at this 30 days vs the previous 30 days, but it doesn’t let you compare how two arbitrary periods might compare.

For example: What did registration for next week’s webinar look like in comparison to a similar one we did 3 months ago?

Arbitrary time period analysis in Google Analytics

One way to do this in Looker is to use liquid variables to create an explore that looks like this:

There’s a few things going on here:

  1. First Period Filter & Second Period Filter: View filters to take input from the users
  2. Days from First Period: The magic to make this possible; conditional logic to find the lowest non-negative value from each period.
  3. Period Selected: used to pivot the chart and create the First Period and Second Period series

And here’s the LookML to put it together (for Redshift):


filter: first_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }
  
  filter: second_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }
  
  dimension: days_from_start_first {
    hidden: yes
    type: number
    sql: DATEDIFF('day',  {% date_start first_period_filter %}, ${created_date}) ;;
  }
  
  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATEDIFF('day',  {% date_start second_period_filter %}, ${created_date}) ;;
  }
  
  dimension: days_from_first_period {
    group_label: "Arbitrary Period Comparisons"
    type: number
    sql:
      CASE
       WHEN ${days_from_start_second} >= 0
       THEN ${days_from_start_second}
       WHEN ${days_from_start_first} >= 0
       THEN ${days_from_start_first}
      END;;
  }
  
  
  dimension: period_selected {
    group_label: "Arbitrary Period Comparisons"
    type: string
    sql:
        CASE
          WHEN ${created_raw} >=  {% date_start first_period_filter %}
          AND ${created_raw} <= {% date_end first_period_filter %}
          THEN 'First Period'
          WHEN ${created_raw} >=  {% date_start second_period_filter %}
          AND ${created_raw} <= {% date_end second_period_filter %}
          THEN 'Second Period'
          END ;;
  }

For BigQuery:

  filter: first_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }

  filter: second_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }

dimension: days_from_start_first {
hidden: yes
type: number
sql: DATE_DIFF( ${session_date}, CAST({% date_start first_period_filter %} AS DATE), DAY) ;;
  }

  dimension: days_from_start_second {
hidden: yes
type: number
sql: DATE_DIFF(${session_date}, CAST({% date_start second_period_filter %} AS DATE), DAY) ;;
  } 

    dimension: days_from_first_period {
type: number
sql:
  CASE
   WHEN ${days_from_start_first} >= 0
   THEN ${days_from_start_first}
   WHEN ${days_from_start_second} >= 0
   THEN ${days_from_start_second}
  END;;
}

  dimension: period_selected {
    group_label: "Arbitrary Period Comparisons"
    type: string
    sql:
        CASE
          WHEN ${session_raw} >=  {% date_start first_period_filter %}
          AND ${session_raw} <= {% date_end first_period_filter %}
          THEN 'First Period'
          WHEN ${session_raw} >=  {% date_start second_period_filter %}
          AND ${session_raw} <= {% date_end second_period_filter %}
          THEN 'Second Period'
          END ;;
  }
3 12 1,503
12 REPLIES 12

bpowis
Participant III

Hi @ernesto1

I am looking to implement exactly this functionality but having some problems copying over your code.

I am getting the error:

Failed to retrieve data - A valid date part argument is required, but found mydata.Date at [13:77]

I replaced ‘created_date’ in your code with ‘session_date’ from my data, where sessions_date is a time dimension group. Any idea where I am going wrong? Does this data field need to be a specific type?

Ben

Hey @bpowis! (where’s Holly btw?)

Which database? I wrote this for Redshift which expects DATEDIFF ( datepart, {timestamp}, {timestamp} ) but some databases the datepart is at the end.
`

bpowis
Participant III

Hi @ernesto1

Aha, of course! I have updated the code BigQuery and shared below. However, when I run this my results aren’t looking quite right - how do you get the data to align (row wise) on days from first period as this is different for each date range?

Code for BigQuery:

  filter: first_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }

  filter: second_period_filter {
    group_label: "Arbitrary Period Comparisons"
    type: date
  }

dimension: days_from_start_first {
hidden: yes
type: number
sql: DATE_DIFF( ${session_date}, CAST({% date_start first_period_filter %} AS DATE), DAY) ;;
  }

  dimension: days_from_start_second {
hidden: yes
type: number
sql: DATE_DIFF(${session_date}, CAST({% date_start second_period_filter %} AS DATE), DAY) ;;
  } 

    dimension: days_from_first_period {
type: number
sql:
  CASE
   WHEN ${days_from_start_first} >= 0
   THEN ${days_from_start_first}
   WHEN ${days_from_start_second} >= 0
   THEN ${days_from_start_second}
  END;;

}

  dimension: period_selected {
    group_label: "Arbitrary Period Comparisons"
    type: string
    sql:
        CASE
          WHEN ${session_raw} >=  {% date_start first_period_filter %}
          AND ${session_raw} <= {% date_end first_period_filter %}
          THEN 'First Period'
          WHEN ${session_raw} >=  {% date_start second_period_filter %}
          AND ${session_raw} <= {% date_end second_period_filter %}
          THEN 'Second Period'
          END ;;
  }

@bpowis- thanks for sharing a working BQ version, very cool!

I think all you’re missing is to turn off Plot Null Values

bpowis
Participant III

@ernesto1 I tried that, but I still have a big gap between my date periods, rather than then stacking on top of one another like in your example. Here’s a screenshot of my data, where I don’t have any alignment in pivoted columns, because they all have different Days from first period, does this make sense? 🙂

Ben

Could you try reversing the logic in days_from_first_period ?

  dimension: days_from_first_period {
    type: number
    sql:
      CASE
       WHEN ${days_from_start_first} >= 0
       THEN ${days_from_start_first}
       WHEN ${days_from_start_second} >= 0
       THEN ${days_from_start_second}
      END;;
  }

I think the other dialect intricacy is that date diff arguments for start/end are swapped in BigQuery vs Redshift. So you can do the above or flip arguments.

bpowis
Participant III

Hi @ernesto1

I tried this and the results are the same. The BigQuery docs say date_diff:

Returns the number of date_part boundaries between the two date_expression s. If the first date occurs before the second date, then the result is non-positive.

I am wondering if I need to change the code below too?

  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATE_DIFF(CAST({% date_start second_period_filter %} AS DATE), ${session_date}, DAY) ;;
  }

What is ${session_date} (my date field) doing here? Should this be today’s date, or something else?

Thanks for your help with this!
Ben

Hi @bpowis,

I recreated in BigQuery and you will need to swap your arguments for DATE_DIFF around - no need to swap the days_from_first_period around after you’ve done this.

dimension: days_from_start_first {
    hidden: yes
    type: number
    sql: DATE_DIFF( ${session_date}, CAST({% date_start first_period_filter %} AS DATE), DAY) ;;
  }

  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATE_DIFF(${session_date}, CAST({% date_start second_period_filter %} AS DATE), DAY) ;;
  } 

bpowis
Participant III

Amazing! Thanks again for your help getting this working with BigQuery - Holly will be delighted! 🧚

Hi,

First of all thanks @ernesto1 for this solution, Period over Period are a basic type of analysis to track temporal metrics and GA have a very nice approach.
Following the method described here I evolved the code (Redshift) to have this PoP comparisons in several granularities (hourly, daily, weekly, hour of day, …). The code is almost the same adding a granularity parameter to filter and several modifications adding CASEs to the days_from_first_period which I renamed to comparison_date:

parameter: granularity {
    default_value: "hourly"
    allowed_value: {
      label: "Hourly"
      value: "hourly"
    }
    allowed_value: {
      label: "Daily"
      value: "daily"
    }
    allowed_value: {
      label: "Weekly"
      value: "weekly"
    }
    allowed_value: {
      label: "Monthly"
      value: "monthly"
    }
    allowed_value: {
      label: "Yearly"
      value: "yearly"
    }
    allowed_value: {
      label: "Hour of day"
      value: "hour_of_day"
    }
    allowed_value: {
      label: "Day of the Week"
      value: "day_of_week"
    }
  }

dimension: comparison_date {
    group_label: "Arbitrary Period Comparisons"
    type: date_time
    sql:
      CASE
      WHEN {% parameter granularity %} = 'hourly'
        THEN dateadd(days, -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+(CASE
                               WHEN ${days_from_start_second} >= 0
                               THEN ${days_from_start_second}
                               WHEN ${days_from_start_first} >= 0
                               THEN ${days_from_start_first}
                              END),
                              dateadd(hours, ${created_hour_of_day}, {% date_end second_period_filter %}::date))

       WHEN {% parameter granularity %} = 'daily'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %})
               WHEN ${days_from_start_first} >= 0
               THEN dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}-1, {% date_end second_period_filter %})
              END
       WHEN {% parameter granularity %} = 'weekly'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN date_trunc('week',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %}))
               WHEN ${days_from_start_first} >= 0
               THEN date_trunc('week',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}, {% date_end second_period_filter %}))
              END
        WHEN {% parameter granularity %} = 'monthly'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN date_trunc('month',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %}))
               WHEN ${days_from_start_first} >= 0
               THEN date_trunc('month',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}, {% date_end second_period_filter %}))
              END
        WHEN {% parameter granularity %} = 'yearly'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN date_trunc('year',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %}))
               WHEN ${days_from_start_first} >= 0
               THEN date_trunc('year',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}, {% date_end second_period_filter %}))
              END
        WHEN {% parameter granularity %} = 'hour_of_day'
          THEN dateadd(hours, ${created_hour_of_day}, {% date_end second_period_filter %}::date)
         WHEN {% parameter granularity %} = 'day_of_week'
          THEN dateadd(day, ${created_day_of_week_index}, dateadd(days, -7,{% date_end second_period_filter %}::date))

      END;;
  }

I also put this PoP code on a diferent view wich can be included in any view with:
include: "/views/PoP.view.lkml"
at the top of the target view
and
extends: [pop]
under the view section.
The only requirements for this to work as an external include is the target view have a dimension_group named created.

This way the x axis show dates instead of numbers for better readability. It still can be improved 🙂

Full code here:

view: pop {
  parameter: granularity {
    default_value: "hourly"
    allowed_value: {
      label: "Hourly"
      value: "hourly"
    }
    allowed_value: {
      label: "Daily"
      value: "daily"
    }
    allowed_value: {
      label: "Weekly"
      value: "weekly"
    }
    allowed_value: {
      label: "Monthly"
      value: "monthly"
    }
    allowed_value: {
      label: "Yearly"
      value: "yearly"
    }
    allowed_value: {
      label: "Hour of day"
      value: "hour_of_day"
    }
    allowed_value: {
      label: "Day of the Week"
      value: "day_of_week"
    }
  }

  filter: first_period_filter {
    label: "First date"
    description: "This dates ALWAYS MUST be before the second date"
    group_label: "Arbitrary Period Comparisons"
    type: date_time
  }

  filter: second_period_filter {
    label: "Second date"
    description: "This dates ALWAYS MUST be after the first date"
    group_label: "Arbitrary Period Comparisons"
    type: date_time
  }

  dimension: period_selected {
    group_label: "Arbitrary Period Comparisons"
    type: string
    sql:
        CASE
          WHEN ${created_raw} >=  {% date_start first_period_filter %}
          AND ${created_raw} <= {% date_end first_period_filter %}
          THEN 'First Period'
          WHEN ${created_raw} >=  {% date_start second_period_filter %}
          AND ${created_raw} <= {% date_end second_period_filter %}
          THEN 'Second Period'
          END ;;
  }

  dimension_group: created {
    type: time
    timeframes: [
      raw,
      hour,
      time,
      hour_of_day,
      day_of_week,
      day_of_week_index,
      day_of_month,
      date,
      week,
      month,
      quarter,
      year
    ]
  }

  dimension: days_from_start_first {
    hidden: yes
    type: number
    sql: DATEDIFF('day',  {% date_start first_period_filter %}, ${created_date}) ;;
  }

  dimension: days_from_start_second {
    hidden: yes
    type: number
    sql: DATEDIFF('day',  {% date_start second_period_filter %}, ${created_date}) ;;
  }

  dimension: comparison_date {
    group_label: "Arbitrary Period Comparisons"
    type: date_time
    sql:
      CASE
      WHEN {% parameter granularity %} = 'hourly'
        THEN dateadd(days, -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+(CASE
                               WHEN ${days_from_start_second} >= 0
                               THEN ${days_from_start_second}
                               WHEN ${days_from_start_first} >= 0
                               THEN ${days_from_start_first}
                              END),
                              dateadd(hours, ${created_hour_of_day}, {% date_end second_period_filter %}::date))

       WHEN {% parameter granularity %} = 'daily'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %})
               WHEN ${days_from_start_first} >= 0
               THEN dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}-1, {% date_end second_period_filter %})
              END
       WHEN {% parameter granularity %} = 'weekly'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN date_trunc('week',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %}))
               WHEN ${days_from_start_first} >= 0
               THEN date_trunc('week',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}, {% date_end second_period_filter %}))
              END
        WHEN {% parameter granularity %} = 'monthly'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN date_trunc('month',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %}))
               WHEN ${days_from_start_first} >= 0
               THEN date_trunc('month',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}, {% date_end second_period_filter %}))
              END
        WHEN {% parameter granularity %} = 'yearly'
        THEN CASE
               WHEN ${days_from_start_second} >= 0
               THEN date_trunc('year',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_second}, {% date_end second_period_filter %}))
               WHEN ${days_from_start_first} >= 0
               THEN date_trunc('year',dateadd(days , -(datediff(days, {% date_start second_period_filter %}, {% date_end second_period_filter %}))+${days_from_start_first}, {% date_end second_period_filter %}))
              END
        WHEN {% parameter granularity %} = 'hour_of_day'
          THEN dateadd(hours, ${created_hour_of_day}, {% date_end second_period_filter %}::date)
         WHEN {% parameter granularity %} = 'day_of_week'
          THEN dateadd(day, ${created_day_of_week_index}, dateadd(days, -7,{% date_end second_period_filter %}::date))

      END;;
  }
}

@ernesto1 what does ${created_raw} represent in period_selected?

Hey @ernesto1 got this working but wondered if it can be extended to do three dates? Will give it a go myself but if you have any thoughts?

Also if it’s included as a view is there anyway to have the date be different named dimensions or would you always need to present created (or another common dimension_group).

Top Labels in this Space
Top Solution Authors