Dimension SQL: Self-Join

  • 26 April 2019
  • 5 replies


I am new to Looker but very experienced in SQL. I attended the LookerML seminar yesterday and the instructor told me that a self-join within a dimension is possible. Looker documentation supports that SQL queries can be incorporated into a dimension LookerML statement. However I am getting conflicted information on how to construct this or if it is even possible.

Here is what I am trying to do: Create a dimension where I self join a table to return the records that only contain the most recent invoice date for a given account ID. Comparing account IDs and the max invoice date to existing invoice date. Returning the Invoice Number meeting this criteria. This statement can be made in the SQL runner but I am not really sure how to incorporate this into a look. Which is where it would be actually used.

Can anyone assist me in resolving this?

   dimension: most_recent_invoice_number {
type: string
sql: (
SELECT ${invoice_number}
FROM invoice_table as it
ON it.account_id = ${TABLE}.account_id AND it.max_invoice_date =
${TABLE}.invoice_date ) ;;

5 replies

Userlevel 7

You have got into a little muddle/confusion. I assume you have a view file (invoices) where you have declared dimensions and measures and you have a model file with an explore in which uses invoices. You can do something like this (not proper code here btw!)

explore: invoices

fields: [-most recent invoice]

join: invoices2

from: invoices

sql_on: ${invoices.account_id} = ${invoices2.account_id} AND ${invoices.max_invoice_date } = ${invoices2.invoice_date } ;;

fields: [most recent invoice]

so here you only have to create a single view file containing fields for both invoices and invoices2 and in the explore you say which fields you want from each instance if the view

Userlevel 7

I have simplified the fields attribute here, it depends if your main view in the explore is the invoices view or not, if it is then you will need to look into fields -

If its used in a join then you can just tag it onto the join attribute for invoices.

Userlevel 7

I would do this in the explore, you can declare another join but use the same looker view using "from"and join in using the join condition you wrote above. In the single view file you are now using twice you can then declare dimensions and measures with different labels and descriptions. In the explore in the joins use the fields attribute to remove the fields which are not relevant for the instance of the view you are joining.

Another approach to avoid all the self-join issues would be to use window functions

quick example (adjust accordingly for column names / dialect)

dimension: most_recent_invoice_number {
type: string
sql: LAST_VALUE(invoice_number) OVER (PARTITION BY account_id ORDER BY invoice_date);;

While I can see this in the documentation I am not really sure I understand what this is. If I open up explore into this LookerML space.

 view: invoice_table {
sql_table_name: db1.invoice ;;

Where would I make the change you are describing?