Question

Date handling from LookML to SQL Server, suggestions

  • 30 April 2018
  • 5 replies
  • 393 views

Userlevel 1

I’ve recently been having some performance issues from some looks that use date range filters, such as Date is in the past 8 weeks etc.


When interrogating the SQL for this, the code generated by Looker is extremely convoluted, with multiple converts between the datetime datatype and strings and always ends up with a string. The issue with this, is this affects the cardinality estimator within SQL Server and renders any indexing on the table largely useless; after all strings are ordered differently than dates.


The LookML is effectively trying to work out the start and end date of the range, but the way it’s doing this seems a little… backwards?


For example, to get the start date of the range for in the past 8 weeks, the following SQL code is generated (I’ve stuck a SELECT there in order for you to run it):-


  SELECT	((CONVERT(
varchar(10)
, DATEADD(
WEEK
, -7
, CONVERT(
varchar(10)
, DATEADD(
DAY
, (0
- (((DATEPART(dw, CONVERT(datetime, CONVERT(varchar, CURRENT_TIMESTAMP, 102), 120))
- 1) - 1 + 7) % (7)))
, CONVERT(datetime, CONVERT(varchar, CURRENT_TIMESTAMP, 102), 120))
, 120))
, 120)));

The whole statement can be replaced with the following logic (and if you wanted minutes/hours/days/months/years/quarters etc, simply replace every instance of the Week keyword with the appropriate keyword):-


SELECT DATEADD(WEEK,-7,DATEADD(WEEK, DATEDIFF(WEEK,0,GETDATE()),0))


This works by using an anchor date: 0, which equates to ‘1900-01-01’. We then calculate the number of weeks between this date and GETDATE (or CURRENT_TIMESTAMP if preferred, or SYSDATETIME etc). We then add that number of weeks back on to the anchor date to return the start of the current week, and from there move forwards or backwards the number of weeks we want to go, in this case -7 to go eight weeks back.


The advantage of this method is that at no point is any datatype conversion required, and the datatypes entirely remain within the date realm.


I would strongly recommend trying to switch the date handling syntax to this style if possible, as it will reap serious performance benefits compared to the existing method, along with being simpler code which is easier to understand.


This will also ensure that if the login connected to SQL Server has a different DATEFIRST setting (i.e. which day is considered the first of the week) then the behaviour of week calculations will be consistent, with the existing logic, this is not the case.


5 replies

Hi Matthew – Thank you very much for your feedback and suggestions! I’ve relayed them to our product team.


I definitely understand where you’re coming from and why this seems backwards. Looker does this in order to standardize across different database dialects.


If you’re looking to reference dates in the same format as your database, you can use the raw timeframe for dimension groups of type: time, and you can perform date operations on that. Note that it is only accessible from LookML, not from explores. Here is a Discourse article detailing the raw timeframe.

Userlevel 1

Hi Carl,


In this case we are using the raw timeframe in joins; however when we had a filter to the explore is when we get the problem with the above cardinality estimate issues due to the conversion; an example of the join code is below:-


  join: dim_loan_applications  {
view_label: "Applications"
sql_on: CONVERT(date,${dim_loan_applications.application_date_raw}) = ${dim_calendar.date_raw} ;;
type: inner
relationship: one_to_many
}

We then use a filter in the explore, which is where the code in my initial post is being produced, and is what seems to cause the issue. Would this be something that could be worked around with parameters, and have the somehow updated from the filter?


From a performance point of view, for the query in question, this actually results in the query generated by LookML taking 30 seconds, whilst the simply changing the calculation of the dates to the style I suggested, down to 2 seconds. The same could be achieved by actually hard coding the strings; so perhaps the start and end date ranges from those filters could be calculated in the front end code and passed directly to the db?

Userlevel 6
Badge

Hi Matthew,


The design point for Looker filters is such that the right hand side of the relation should always evaluate to a constant so it can be used by indexes. It sounds like you’ve found a case with week filters in MSSQL where we are performing sub-optimally.



This is an interesting solution. We’ll take a look. Sounds like we might have another problem that our weeks will be inconsistent based on server settings.


BTW: we don’t emit constant date/times so that the SQL generated for relative filters will always operate relatively, without editing. Dates and times will be relative to the database server, not the Looker server.

Userlevel 1

Hi Lloyd,


I think that’s a fair point on not sending through the constants; there are also other reasons for not doing this with plan caching and so on.


Obviously I’m coming at all of this from a SQL Server point of view, which is my area of expertise, which can be quite fussy with execution plans and performance.


Cheers for getting back.


Matthew

Userlevel 3

I recently wrote a discourse article covering various topics around time zone conversion for Microsoft SQL Server. I recommend using a time zone User Attribute for datetime conversions.


Time Zone Conversions for SQL Server Using User Attribute

Reply