Date Comparison Block

Date Comparison Block

What is this block?

This block allows for comparisons from one date period to another, or from one date period to a number of periods. It provides extensive flexibility in comparing date ranges, and presenting these back as a single visualisation. The code can be found in this repo in the Datatonic GitHub.

Motivation

There are already a few blocks that allow for comparison between periods in Looker. The motivation for this block was to combine all the elements from the other blocks in a way that is intuitive, flexible and fast to use.

Solution

The approach was to use the intuitive style of date comparison that is used in Google Analytics - where the basic idea is that you choose the current date range that you’re interested in, then choose the comparison date range in the past. There are some templated period filters to make this easy for the user, but also custom filters to allow flexibility in comparing dates. The features are:

  1. Templated comparison periods - Once the initial period of interest is chosen, rather than manually choose the period before, there is a range of options presented such as “Previous Period”, “Previous Month”, “Previous Year” etc

  2. Custom comparison periods - in the situation where any of the above don’t fit what you are trying to see, then a custom previous date range may be chosen

  3. Any granularity - Choose how granular you want your results to be by picking the appropriate date dimension from the dimension group

  4. Multiple periods - Choose the number of periods you would like to compare, this is only available for templated periods, e.g January this year vs January last year vs January 2 years ago etc.

How to use

The process for using this in your explore is as follows:

  1. Add the filter Date Range to choose your initial date range
  2. Add the filter Compare To (Templated) to choose a templated comparison range OR Add the filter Compare To (Custom) to choose a custom comparison range
  3. If you have chosen a templated range, you may choose to add more periods by using the Comparison Periods filter
  4. Choose your date dimension - only pick from the dimension group Current Period Date, don’t use any date dimension from any other view
  5. Choose your other dimensions and measures as usual
  6. Finally, pivot on Period
  7. Hit run

How to implement

Syntax assumes a BigQuery connection, you may need to adjust this for other database connections. To have this available to use in your explores and dashboards there are a few steps:

  1. Copy the view file _date_comparison.view.lkml into your project
  2. In the view file where the date dimension you would like to be able to compare is, extend the _date_comparison view by adding the parameter extends: [_date_comparison]
  3. In the same view file, add two new dimensions, event_date and event_raw.These are simply <your_date_dimension>_date and <your_date_dimension>_raw respectively. This step is just so that naming convention used in the _date_comparison view works correctly
  4. In the relevant explore LookML, add in the sql_always_where clause defined in the model file here. Replace all instances of <your_view_name> with your view name.
14 63 12.1K
63 REPLIES 63

Edgars
Participant I

Thanks for great post! @bencannon, can you please also upload pics again? All of them are broken unfortunately.

bens1
Participant V

Does anyone know how to handle the

DATE({% date_end current_date_range %})

in MS SQL 2012’s dialect?

Thanks! Yes just reuploaded them, had some issues displaying them on github for some reason. Can’t upload them to this post as I’m limited to 1 picture per post as a new user.

Hey Ben, so the bit within the brackets will just return a timestamp relating to the end of the the current_date_range filter (thats “1. Date Range” in the field picker). This is just to convert that timestamp to a date using BQ’s DATE() function - I’d imagine ms sql would be the same if not very similar.

Not applicable

Just posting some examples of using this in the explore:

Using templated and number of periods

Using custom with date ranges that are not the same lengh

Templated & matches (advanced)

That didn’t seem right, given the quality of this post, so I bumped you up manually! Welcome, Ben 🙂 Thanks for sharing this.

bens1
Participant V

Thanks so much Ben - I got it going now. I’ve got to say - this is absolutely fantastic. I love how lean the code is too. If anyone wants, I can post the MS SQL dialect version of this.

Edgars
Participant I

Would be great, thanks in advance!

bens1
Participant V

For _date_comparison.view.lkml for MS SQL:

# this is the code for the date comparison tool, which mimics what google 360 does in the browser in comparing two different date ranges. use with _date_dim.view.lkml
view: _date_comparison {
  extension: required
  filter: current_date_range {
    view_label: "Timeline Comparison Fields"
    label: "1. Date Range"
    description: "Select the date range you are interested in using this filter, can be used by itself. Make sure any filter on Event Date covers this period, or is removed."
    type: date
  }
  filter: previous_date_range {
    view_label: "Timeline Comparison Fields"
    label: "2b. Compare To (Custom):"
    group_label: "Compare to:"

    description: "Use this if you want to specify a custom date range to compare to (limited to 2 comparison periods). Always use with '1. Date Range' filter (or it will error). Make sure any filter on Event Date covers this period, or is removed."

    type: date
  }

  dimension: days_in_period {
    description: "Gives the number of days in the current period date range"
    type: number
    sql: DATEDIFF(day, {% date_start current_date_range %}, {% date_end current_date_range %}) ;;
    hidden:  yes
  }

  dimension: period_2_start {
    description: "Calculates the start of the previous period"
    type: date
    sql:
    {% if compare_to._in_query %}
      {% if compare_to._parameter_value == "Period" %}
        DATEADD(day, -1*${days_in_period},{% date_start current_date_range %} )
      {% else %}
        DATEADD({% parameter compare_to %},-1,{% date_start current_date_range %} )
      {% endif %}
    {% else %}
      {% date_start previous_date_range %}
    {% endif %};;
    hidden:  yes
  }

  dimension: period_2_end {
    description: "Calculates the end of the previous period"
    type: date
    sql:
    {% if compare_to._in_query %}
      {% if compare_to._parameter_value == "Period" %}
        DATEADD(day,-1,{% date_start current_date_range %})
      {% else %}
        DATEADD({% parameter compare_to %},-1,DATEADD(day,-1,{% date_end current_date_range %} ) )
      {% endif %}
    {% else %}
      {% date_end previous_date_range %}
    {% endif %};;
    hidden:  yes
  }

  dimension: period_3_start {
    description: "Calculates the start of 2 periods ago"
    type: date
    sql:
    {% if compare_to._parameter_value == "Period" %}
        DATEADD(day,-2*${days_in_period},{% date_start current_date_range %})
    {% else %}
        DATEADD({% parameter compare_to %},-2,{% date_start current_date_range %}  )
    {% endif %};;
    hidden: yes

  }

  dimension: period_3_end {
    description: "Calculates the end of 2 periods ago"
    type: date
    sql:
    {% if compare_to._parameter_value == "Period" %}
      DATEADD(day,-1,${period_2_start})
    {% else %}
      DATEADD({% parameter compare_to %},-2,DATEADD(day,-1,{% date_end current_date_range %}) )
    {% endif %};;
    hidden: yes
  }

  dimension: period_4_start {
    description: "Calculates the start of 4 periods ago"
    type: date
    sql:
    {% if compare_to._parameter_value == "Period" %}
        DATEADD(day,-3*${days_in_period},{% date_start current_date_range %} )
    {% else %}
        DATEADD({% parameter compare_to %},-3,{% date_start current_date_range %} )
    {% endif %};;
    hidden: yes
  }

  dimension: period_4_end {
    description: "Calculates the end of 4 periods ago"
    type: date
    sql:
      {% if compare_to._parameter_value == "Period" %}
      DATEADD(day,-1,${period_2_start})
      {% else %}
      DATEADD({% parameter compare_to %},-3,DATEADD(day,-1,{% date_end current_date_range %}) )
      {% endif %};;
    hidden: yes
  }

  parameter: compare_to {
    description: "Choose the period you would like to compare to. Must be used with Current Date Range filter"
    label: "2a. Compare To (Templated):"
    type: unquoted
    allowed_value: {
      label: "Previous Period"
      value: "Period"
    }
    allowed_value: {
      label: "Previous Week"
      value: "Week"
    }
    allowed_value: {
      label: "Previous Month"
      value: "Month"
    }
    allowed_value: {
      label: "Previous Quarter"
      value: "Quarter"
    }
    allowed_value: {
      label: "Previous Year"
      value: "Year"
    }
    default_value: "Period"
    view_label: "Timeline Comparison Fields"
  }

  parameter: comparison_periods {
    label: "3. Number of Periods"
    description: "Choose the number of periods you would like to compare - defaults to 2. Only works with templated periods from step 2."
    type: unquoted
    allowed_value: {
      label: "2"
      value: "2"
    }
    allowed_value: {
      label: "3"
      value: "3"
    }
    allowed_value: {
      label: "4"
      value: "4"
    }
    default_value: "2"
    view_label: "Timeline Comparison Fields"
  }

  dimension: period {
    view_label: "Timeline Comparison Fields"
    label: "Period"
    description: "Pivot me! Returns the period the metric covers, i.e. either the 'This Period', 'Previous Period' or '3 Periods Ago'"
    type: string
    order_by_field: order_for_period
    sql:
       {% if current_date_range._is_filtered %}
         CASE
           WHEN {% condition current_date_range %} ${event_raw} {% endcondition %}
           THEN 'This {% parameter compare_to %}'
           WHEN ${event_date} between ${period_2_start} and ${period_2_end}
           THEN 'Last {% parameter compare_to %}'
           WHEN ${event_date} between ${period_3_start} and ${period_3_end}
           THEN '2 {% parameter compare_to %}s Ago'
           WHEN ${event_date} between ${period_4_start} and ${period_4_end}
           THEN '3 {% parameter compare_to %}s Ago'
         END
       {% else %}
         NULL
       {% endif %}
       ;;
  }

  dimension: order_for_period {
    hidden: yes
    view_label: "Timeline Comparison Fields"
    label: "Period"
    description: "Pivot me! Returns the period the metric covers, i.e. either the 'This Period', 'Previous Period' or '3 Periods Ago'"
    type: string
    sql:
       {% if current_date_range._is_filtered %}
         CASE
           WHEN {% condition current_date_range %} (${event_raw}) {% endcondition %}
           THEN 1
           WHEN ${event_date} between ${period_2_start} and ${period_2_end}
           THEN 2
           WHEN ${event_date} between ${period_3_start} and ${period_3_end}
           THEN 3
           WHEN ${event_date} between ${period_4_start} and ${period_4_end}
           THEN 4
         END
       {% else %}
         NULL
       {% endif %}
       ;;
  }

  dimension_group: date_in_period {
    description: "Use this as your date dimension when comparing periods. Aligns the all previous periods onto the current period"
    label: "Current Period"
    type: time
    sql: DATEADD(day,${day_in_period}-1,{% date_start current_date_range %} ) ;;
    view_label: "Timeline Comparison Fields"
    timeframes: [date, week, month, quarter, year]
  }

  dimension: day_in_period {
    description: "Gives the number of days since the start of each periods. Use this to align the event dates onto the same axis, the axes will read 1,2,3, etc."
    type: number
    sql:
    {% if current_date_range._is_filtered %}
      CASE
        WHEN {% condition current_date_range %} (${event_raw}) {% endcondition %}
        THEN DATEDIFF(day, {% date_start current_date_range %},${event_date})+1

        WHEN ${event_date} between ${period_2_start} and ${period_2_end}
        THEN DATEDIFF(day,${period_2_start},${event_date})+1

        WHEN ${event_date} between ${period_3_start} and ${period_3_end}
        THEN DATEDIFF(day,${period_3_start},${event_date})+1

        WHEN ${event_date} between ${period_4_start} and ${period_4_end}
        THEN DATEDIFF(day,${period_4_start},${event_date})+1
      END

    {% else %} NULL
    {% endif %}
    ;;
    hidden: yes
  }

}

bens1
Participant V

To add to your base explore’s SQL_ALWAYS_WHERE for MS SQL:

 {% if your_view_name.current_date_range._is_filtered %}
    (
    {% condition your_view_name.current_date_range %} (${event_date}) {% endcondition %}

    {% if your_view_name.previous_date_range._is_filtered or your_view_name.compare_to._in_query %}
      {% if your_view_name.comparison_periods._parameter_value == "2" %}
      or
      ${event_date} between ${period_2_start} and ${period_2_end} )

      {% elsif your_view_name.comparison_periods._parameter_value == "3" %}
        or
        ${event_date} between ${period_2_start} and ${period_2_end}
        or
        ${event_date} between ${period_3_start} and ${period_3_end} )


      {% elsif your_view_name.comparison_periods._parameter_value == "4" %}
        or
        ${event_date} between ${period_2_start} and ${period_2_end}
        or
        ${event_date} between ${period_3_start} and ${period_3_end}
        or
        ${event_date} between ${period_4_start} and ${period_4_end} )

      {% else %} AND 1 = 1 )
      {% endif %} 
    {% endif %}
  {% else %} 1 = 1
  {% endif %}

Not applicable

hi all,
I just discovered a bug if you have the Database Time Zone and Query Time Zone not set to the default. It causes the query the bug out as additional arguments are passed through the query which cause an argument type conflict. Will try and work on a fix for this

A bug that I don’t have to get Looker’s engineers to fix 😮 I’d automatically tagged this for bug filing haha.

Let me know if you get stumped and I can load it up into a project and give it a try.

John_nk
Participant I

I’m struggling to implement this when I use my date dimension as
dimension_group: fulldate {
type: time
timeframes: [year, month, week_of_year, day_of_week, week, date, raw]
sql:${TABLE}.datepartition ;;
}

I get a No matching signature for function FORMAT_TIMESTAMP for argument types: STRING, DATE, STRING. Supported signature: FORMAT_TIMESTAMP(STRING, TIMESTAMP, [STRING]) at [110:77] error, has anyone encountered this?

thanks 🙂

bens1
Participant V

To anyone who used my code - I forgot a few critical closing brackets in my SQL code. What’s dangerous is that it won’t error out but depending on your code before it, could have an impact. I’ve updated the code in my original post!

John, I think this is related to this

where casting in the SQL specifically or specifying the datatype might resolve the error.

John_nk
Participant I

Thanks for the quick response.

Can you give an example of the casting workaround please?

Thanks

John

John_nk
Participant I

I’m seeing a similar issue where it tries to convert to a timestamp using london zone and takes off an hour in months during BST then it converts to date meaning the calculations are a day out

** update **
fixed it using

convert_tz: no

CaitlinK
Participant II

I’ve found an interesting iteration of the problem that causes the error “Failed to retrieve data - No matching signature for function FORMAT_TIMESTAMP for argument types: STRING, DATE, STRING. Supported signature: FORMAT_TIMESTAMP(STRING, TIMESTAMP, [STRING]) at [147:35]” (At least I think I have, I’m testing some scenarios)

We have a table in our system that records an artificial transaction posting date for all transactions in case we need to use an adjusted date rather than the row’s created at timestamp for a variety of reasons. When we pull it into our BigQuery data warehouse using a 3rd-party ETL service, the column is defaulted to a type of timestamp, even though the data is a date, not a timestamp. When we pull the data into Looker, we’re using a warehouse view that casts the column back to date, so the LookML of ${TABLE}.transaction_post_date generates SQL of CAST(our_table.our_date as DATE) for that field.

The Looker dimension_group for this field has a type of time, data_type of date, and convert_tz set to no. This results in the following statement being generated for the Date Comparison Block’s sql_always_where statement:

((( TIMESTAMP((CAST(TIMESTAMP(FORMAT_TIMESTAMP('%F %T', (CAST(CAST(transactions.trandate AS TIMESTAMP) AS DATE)) , 'America/Chicago')) AS DATE))) ) >= (TIMESTAMP(FORMAT_TIMESTAMP('%F %T', TIMESTAMP('2019-01-01 00:00:00')), 'America/Chicago')) AND ( TIMESTAMP((CAST(TIMESTAMP(FORMAT_TIMESTAMP('%F %T', (CAST(CAST(transactions.trandate AS TIMESTAMP) AS DATE)) , 'America/Chicago')) AS DATE))) ) < (TIMESTAMP(FORMAT_TIMESTAMP('%F %T', TIMESTAMP('2019-06-12 00:00:00')), 'America/Chicago'))))

It’s applying the “cast as date” after the “cast as timestamp” and before the format_timestamp, so it’s attempting to format a date as a timestamp. Then it’s doing all the additional casts resulting in a obvious clusterfluff. (There were more aggressive swear words in real life, lol.)

I think what I need to do is 1) Abstract the warehouse level cast-as-date away from what Looker sees (so either a PDT in Looker or putting the data into a table in the warehouse), and/or 2) not use a Looker dimension with a data_type of date for this comparison block, since the additional cast seems to break the universe.

I’ll report back with findings!

ETA: well, the original problem is still a problem, but even using a different timestamp field that is a timestamp in the database, warehouse, and Looker results in the same error. I have no idea what it thinks is a date now. This is definitely not the Bigquery error mentioned above, as this is never a datetime, only a timestamp. Back to the drawing board.

CaitlinK
Participant II

Jamie_Fry, are you referring to the error that happens when attempting the DATE_SUB function in the period_start dimensions? ex: DATE_SUB(DATE({% date_start current_date_range %}), INTERVAL 2*${days_in_period} DAY)

I’m not exactly sure what you mean by time zones being set to the “default”, but my database timezone is UTC, and my query timezone is US Central Time, and I’m getting the type conflict error. The statement above breaks because the DATE_SUB returns a date, but that DATE_SUB is wrapped in TIMESTAMP(FORMAT_TIMESTAMP('%f %T', [date sub goes here]),'America/Chicago')) (which is then also wrapped in a cast-as-date as part of the code block).

I’ve set convert_tz to no in several places/permutations in my view with no effect, and I expect that behavior based on the use of DATE_SUB in the period range calculations. This error seems unavoidable unless timezone conversion is turned off at the system level, or convert_tz is set to no in the _date_conversion view and for all fields being filtered using this code block.

Additionally, no parameter selection of a date part smaller than days will produce expected effects - the DATE_DIFF for days_in_period would produce a 0 day period, and the downstream DATE_SUBs will produce unexpected effects.

I have some ideas on how to fix this if we ignore the smaller-than-day problem, but I don’t want to reinvent the wheel! (Parameterizing this to all get-out to solve for that problem would be cool, if I had time, though!)

Not applicable

Yes that’s exactly the problem, thanks for taking the time to write it out.

I’ve not had a chance to look into a work around just yet.

A previous version of the code didn’t use the date_sub function, we replaced it to make the code alot shorter and easier to use. I’d rather not role it back to this so finding a more elegant solution is best.

Message me if you have any brain waves!

John_nk
Participant I

I managed to get some joy with using type: date_raw. In general dates in Looker seem a hassle to deal with!

CaitlinK
Participant II

Aw, bugger, I hadn’t seen the following in the documentation before, so I thought some settings could solve most of the problem. If I turn off conversion, I will inevitably wreck existing reports. Argh! Going to put some fresh eyes on this in the morning.

"Common Challenges

convert_tz: no applies only to a dimension, not to a filter that uses the dimension. In other words, filters always perform time zone conversion. When you specify convert_tz: no , time-based data values are displayed in the database time zone, but are filtered using the query time zone.

Because filters always do time zone conversion, a difference between the database time zone and query time zone could cause data to unexpectedly be included or excluded from a dataset. To avoid this, ensure that the query time zone is set to the same value as the database time zone."

(This seems like a pretty huge functionality problem, to be honest…)

John_nk
Participant I

managed to get it work, to be honest I’m not sure what I’ve done!

happy to share the code if you’d like it

CaitlinK
Participant II

So, I bit the bullet and basically created a new code block to handle my requirements and constraints. I used concepts from this block and the Flexible Period-over-Period Block by fabio to create something I think is pretty neat - it’s very flexible and should work for almost anyone. It doesn’t handle comparing a specific date range to another specific range like this block does.

So, introducing for the first time, Looker PoP Comparison. Please feel free to tear it apart! Any feedback or suggestions are welcome!

I’d love to try this @CaitlinK … but I’m using snowflake and not Big query. Anyone know of any good conversion engines out there? I tried to convert it myself without luck 😦

Caitlin, you should make a whole new post about it! It’s definitely cool enough that it deserves its own article.

simon_onfido
Participant IV

+1 to this!

Not applicable

nice! will make sure to check it out and incorporate learning into this block.

CaitlinK
Participant II

I wish I had time to put together some examples for you, but I took a quick look at Snowflake syntax vs BigQuery, and the translation should be possible. This is the reference for all BigQuery data functions. I know it’s extremely icky to parse apart all the functions within functions, so what I usually do is just dump the individual statements in a text editor and add a ton of line breaks and tabs til I can see what’s going on.

Let’s all take a moment to share our collective irritation that date functions aren’t standard.

CaitlinK
Participant II

I’ve managed to find an issue already… lol. Working on it now. The Anchor Date dimension has disappeared. I think I know why, and my fix I thought would work is making it worse.

CaitlinK
Participant II

There was also another issue that cropped up with date formatting when I tried a different kind of date range. Both issues have been resolved (at least enough to keep the report I’m working on from breaking!).

Not applicable

@izzymiller we now have a number of date comparison / flexible date blocks. which have all taken inspiration from each other (all different). Do you know if looker is going to add any of this functionality to looker? Any advice you can give us to help show how these differentiate between each other so users find it easy in the forums to pick which one works for them?

CaitlinK
Participant II

Seconded. I’d love to help put something together to show off each tool’s benefits in one place. It’s pretty clear that there are two needed use cases - comparing analogous/matching date periods in the past and comparing potentially “mismatched” date ranges that aren’t necessarily over a fixed period. I do both of those things rather frequently, often on an ad hoc basis for operations execs wanting quick insight.

That is something I’ll have to ask about. It’s definitely not on our immediate roadmap.

I was just thinking the same thing while reading through this thread. Let me reach out internally and see if we would want to make an official Help Center article in the Patterns section that directs to / incorporates these resources and can be the Authority on PoP analysis.

I’ll loop back here on if we’re going to put it in Help Center (in which case soliciting feedback from y’all will be step numero uno) or if it’ll be a meta-post in Community (in which case I may leave it entirely in your capable hands). More soon!

Not applicable

@izzymiller I think I have fixed the code with some adjustments (no errors). It wasn’t giving me the right results however, but after I took a look at @John_nk code today and saw he applied convert_tz: no to his filters. This made complete sense to me as the user is selecting that filter as if its their local timezone. This then gave me the right results. We have some slight differences in our code, so more testing is needed to see if the differences makes any impact.

I think this bit mentioned above was miss interpreted:

convert_tz: no applies only to a dimension, not to a filter that uses the dimension. In other words, filters always perform time zone conversion.

The filters in this code do not rely on any dimension and can therefore have convert_tz: no, as they are parameter filters. From what i can tell this seems to work in the SQL.

I’m keen to test this some more on some different datasets and database locations before i update the public repo with it, as there are alot of dependencies. Hopefully will have the fixes up this week.

CaitlinK
Participant II

Using convert_tz: no on my filters didn’t work (or in any combination on the dimension, filter, or both), though I do have a bit of a unique date situation. We have a field meant to be a date stored as a datetime, and because it’s at midnight, things get especially weird when you throw in the database vs. query time zone issue.

Quick update on the mission to create a resource for date block comparison: One of our professional services people ended up (without even knowing about this conversation!) creating an awesome guide that walks through the different options. He’s putting the finishing touches on it, and then has promised to share it up here for everyone!

Hello!

This was really awesome. Im trying to compare sales between years.

When I only run on a specific date I get a result but then if I run between several dates the result changes on the date I searched for earlier.

An example. If I only search for 2018-01-01 then I get a value of 15000 in sales. But then I search for 2019-01-01 and compare with the date for the year before then get 20000 for sales in the result for 2018-01-01.

Im doing something wrong? I did everything from the instruction.

Seeing the SQL generated for this might be helpful to know if there’s a filtering issue going on.

Top Labels in this Space
Top Solution Authors