Question

Allow ability to add query hints and join hints, e.g. HASH join etc

  • 4 February 2019
  • 12 replies
  • 226 views

Userlevel 1

It would be really useful to have a mechanism to add query hints to the generated SQL from LookML. I have some queries where the optimiser for SQL Server is struggling due to the casting of date columns to character and back to facilitate month date ranges, this is resulting in a cardinality estimate issue, which in turn is resulting in a nested loop join instead of a hash join.


Simply specifiying the HASH hint in the join clause sorts the problem out completely; so having the ability to ensure that these are added via LookML would be very beneficial.


12 replies

Userlevel 6
Badge

Its kind of unfortunate, but if you are replacing a sql: parameter within a join you need to either stick with sql:. If you start with an sql_on: you need to stick with an sql_on: in the extended version in both the base and extended declarations. Extends isn’t smart enough to replace a sql: with an sql_on: when overriding the declaration.


The extends works in a context free way over objects. The same problem exists with sql_table_name: and derived_table: declarations in views. If you start with a sql_table_name:, you can’t extend it to be a derived table. This is one of the regrets for me in LookML.


There is a reasonable work around. You can create a version an ancestor without either sql: or sql_on: defined and force the declaration in subsequent uses. Not ideal, but probably more readable.


explore: foo_base {
extension: required
join: some_view {
}
}

explore foo1 {
extends: [foo_base]
join: some_view {
sql: .... ;;
}
}

explore foo2 {
extends: [foo_base]
join: some_view {
sql_on: .... ;;
}
}
Userlevel 1

Yes, that works. Had some messing about to do with some aliasing, as this is also in an extended view, but got things working ok. So thanks!

I didn’t find this in the feature requests, should we expect to be able to apply hints in Looker anytime soon?

Hints in the select statement seem to work if you add them directly in the ‘sql’ parameter of a dimension definition (although you’ll have to add it to all dimensions you want to apply them to). One problem is that you need that dimension to be the first one selected (as the hint needs to be right after the select statement) otherwise the hint won’t be taken into account. I haven’t found a solution for that last part yet so if you have any ideas I’d be happy to try them out.


Disclaimer: I haven’t tested this very thoroughly yet.

Userlevel 1

I’ll give that a whirl; I wasn’t aware of that as an option. It should be very very useful in a lot of scenarios here!

Userlevel 1

izzy, I have a further question on this. We’re currently using this in an extended model, where the same table is referred multiple times, once in the original model, the second in the extended part (to join back to the original model).


This is in order to tie data back to a calendar view, and allow the same logical joins of data, but would allow that same logical set to be queried based on different event dates, e.g. application date, deal date and so on. The reason we used an extended model is so we didn’t need to maintain the logical joins more than once, just the anchor for the date selection.


The original SQL output for this would actually result in just a single join to that table, as the lookML implementation seemed clever enough to realise that’s what we’re doing.


However, with using sql: in the extension we now get errors for that table where it’s complaining about having both a sql: and sql on: parameter for the same table. I can get around this by using a different name in the extension, however this then results in two joins. Which is fine, but has a bit of an impact on performance and readability.

Userlevel 1

Sorry Izzy, missed your reply. I’ll try and grab the error and code for replication in a moment.

Userlevel 1

OK, that can work. The original join will therefore look something like this:-


  join: dim_all_allocations_with_departmentgroup {
view_label: "Applications"
#sql_on: ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id}
# -- AND ${dim_all_allocations_with_departmentgroup.finance_exec_id} IS NOT NULL
# ;;
#type: left_outer
sql: LEFT OUTER HASH JOIN DIM.AllAllocations AS dim_all_allocations_with_departmentgroup
ON ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id};;
relationship: one_to_many
}

and the extended view like this:-


  join: dim_all_allocations_with_departmentgroup  {
view_label: "Applications"
#sql_on: CONVERT(date,${dim_all_allocations_with_departmentgroup.allocated_from_raw}) = ${dim_calendar.date_raw} ;;
#type: inner
# switching to sql: in order to specify the join hint
sql: INNER HASH JOIN DIM.AllAllocations AS dim_all_allocations_with_departmentgroup
ON CONVERT(date,${dim_all_allocations_with_departmentgroup.allocated_from_raw}) = ${dim_calendar.date_raw};;
#AND ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id};;
relationship: one_to_many
}

join: dim_loan_applications {
view_label: "Applications"
sql_on: ${dim_all_allocations_with_departmentgroup.application_id} = ${dim_loan_applications.application_id};;
type: inner
relationship: many_to_one
}

Seems to be working ok in the intial testing, so thanks for that!

Userlevel 7
Badge +1

This is a lovely idea— I will move it over to the feature requests section where it can be tracked. Having an options parameter in LookML would be cool.


That said, you might want to explore using the sql: parameter: https://docs.looker.com/reference/explore-params/sql-for-join. It’s been pushed down in favor of the much better sql_on:, but if you want total control over your join SQL, that parameter gives it to you.


You could do something like


join: table {
view_label: "sweet table"
relationship: many_to_one
sql: LEFT OUTER HASH JOIN table ON ${table.id} = ${othertable.id} ;;
}

or even


join: table {
view_label: "sweet table"
relationship: many_to_one
sql: LEFT OUTER JOIN table ON ${table.id} = ${othertable.id} OPTION(HASH JOIN) ;;
}

I don’t have a SQL server db to mess around with, so this is all conjecture— Would love to know if it works!

Userlevel 7
Badge +1

It’s kinda funny, I literally last week was like “nobody uses this darn parameter anymore” and made an edit to the product code to further deprecate it/add warnings for it. Guess I’ll delete that pull request…


Glad that looks promising!

Userlevel 7
Badge +1

This might be part of the reason why that parameter is now deprecated in favor of sql_on:. With the sql: parameter, Looker just kind of gives you carte blanche and assumes you know what you’re doing— We don’t parse stuff out from there in the same clever way as we do with sql_on.


I’ll give this a real test tomorrow morning and see if I can’t figure out a way around it… What’s the actual wording of the error you’re seeing before renaming?

Totally agree with this feature, especially since this is quite consistent cross engines (you don’t have that much differences). Except SQL Server that uses the OPTION keyword in SELECT’s BNF, most others (including Oracle, MySQL, Vertica…) use the SELECT /*+hint*/ syntax.

Reply