This article has been retired as the information is now in documentation on this page.
If you reached here from the product, our apologies – we’re working on getting that link redirected to this LookML reference page.
This is awesome! This will save your users so much time. I’ve done a lot of painstaking copying and pasting to generate derived tables from explore results (and then replacing generated table names with ${table.SQL_TABLE_NAME} syntax)… this is a WAY better solution.
This is honestly one of the best features Looker has ever launched! Big time saver. One weird thing I found so far is when using date as a dimension. The timezone conversion doesn’t work correctly so I had to create a new time dimension in my original view and use CONVERT_TIMEZONE before referencing it in the NDT. This seemed to work well.
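For anyone hitting the same thing, here is a rough sketch of that kind of workaround (field names are hypothetical, and CONVERT_TIMEZONE assumes a Redshift/Snowflake-style dialect): do the conversion explicitly in the source view, then have the NDT reference the already-converted dimension.
# Rough sketch only -- event_time and the timezones are placeholders
dimension_group: event_converted {
  type: time
  timeframes: [raw, date, week, month]
  convert_tz: no   # avoid double conversion; it is handled in the sql below
  sql: CONVERT_TIMEZONE('UTC', 'America/New_York', ${TABLE}.event_time) ;;
}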
Hi Devin,
We’re excited about this feature too. Glad to hear it’s useful for you. Do you mind sharing the syntax that didn’t work and the solution you used either here or in an email to help.looker.com? We’d be interested in taking a look so we can document the behaviour and work on a fix if needed.
Thanks!
Yeah no problem. I’ll send an email to support.
Does anyone know if you can join one of these NDTs back to another table?
You can definitely join one of these NDTs back to another table. Once defined, they really just behave exactly like the current sql-defined derived tables.
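A minimal sketch of what that looks like (explore, view, and field names here are hypothetical, not from this thread) -- an NDT view joins into an explore the same way any other derived-table view does:
explore: orders {
  join: user_order_facts_ndt {
    sql_on: ${orders.user_id} = ${user_order_facts_ndt.user_id} ;;
    type: left_outer
    relationship: many_to_one
  }
}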
thanks
Will there be (or is there already) support for joins, unions, and templated filters in the Native Derived Table definition?
I was recently experiencing a weird situation: I defined a PDT (PDT1) that contained a templated filter (TF1), defined another PDT (PDT2) that referenced the initial PDT, defined an NDT (NDT1pdt2) that performed an aggregation on the second PDT, and then finally joined the NDT (NDT1pdt2) back to the second PDT (PDT2). The resulting SQL would properly evaluate the templated filter (TF1) within the initial PDT (PDT1) but not within the NDT (NDT1pdt2), and would look something like this:
with PDT1 as
(
select accountname
from SOMESQLTABLE1
where [****PROPERLY EVALUATED TF1****]
group by accountname
),
PDT2 as
(
select
date,
accountname,
sum(value) as sumvalue
from SOMESQLTABLE2
inner join PDT1
on SOMESQLTABLE2.accountname = PDT1.accountname
group by date, accountname
)
select
PDT2.date,
PDT2.accountname,
PDT2.sumvalue,
NDT1pdt2.sumvalue_avg
from PDT2
left outer join
(
select
date,
avg(sumvalue) over (partition by date) as sumvalue_avg
from SOMESQLTABLE2
inner join
(
select accountname
from SOMESQLTABLE1
where [****IMPROPERLY EVALUATED TF1: 1=1 ****]
group by accountname
) as PDT1hashcode
on SOMESQLTABLE2.accountname = PDT1hashcode.accountname
) as NDT1pdt2
on PDT2.date = NDT1pdt2.date
LOOKML:
explore: PDT2 {
join: PDT1 {
sql_on: ${PDT1.accountname}=${PDT2.accountname} ;;
type: inner
relationship: one_to_one
}
join: NDT1pdt2 {
sql_on: ${PDT2.date} = ${NDT1pdt2.date} ;;
type: left_outer
relationship: one_to_one
}
}
view: PDT1 {
derived_table: {
sql:
select uid, accountname from
(
select 1 uid, 'x' accountname UNION ALL
select 2 uid, 'y' accountname
) a1
where a1.accountname = {% parameter accountname %}
;;
}
dimension: PDT1uid {
hidden: yes
type: number
value_format_name: id
sql: ${TABLE}.uid ;;
primary_key: yes
}
dimension: accountname {
hidden: no
type: string
sql: ${TABLE}.accountname ;;
}
}
view: PDT2 {
derived_table: {
sql:
select a2.uid, a2.date, a2.accountname, a2.value from
(
select 1 uid, 100 value, 'x' accountname, '20161230' date UNION ALL
select 2 uid, 200 value, 'x' accountname, '20161230' date UNION ALL
select 3 uid, 500 value, 'x' accountname, '20161230' date UNION ALL
select 4 uid, 800 value, 'y' accountname, '20161230' date UNION ALL
select 5 uid, 250 value, 'y' accountname, '20161230' date UNION ALL
select 6 uid, 100 value, 'x' accountname, '20161231' date UNION ALL
select 7 uid, 400 value, 'x' accountname, '20161231' date UNION ALL
select 8 uid, 600 value, 'x' accountname, '20161231' date UNION ALL
select 9 uid, 100 value, 'y' accountname, '20161231' date
) a2
inner join ${PDT1.SQL_TABLE_NAME} pdt1
on pdt1.accountname = a2.accountname;;
}
dimension: PDT2uid {
hidden: yes
type: number
value_format_name: id
sql: ${TABLE}.uid ;;
primary_key: yes
}
dimension: date {
type: date
sql: ${TABLE}.date ;;
}
dimension: accountname {
hidden: yes
type: string
sql: ${TABLE}.accountname ;;
}
measure: sumvalue {
type: sum
sql: ${TABLE}.value ;;
value_format_name: decimal_2
}
}
view: NDT1pdt2 {
derived_table: {
explore_source: PDT2 {
column: date {}
column: sumvalue {}
derived_column: sumvalue_avg {
sql: AVG(sumvalue) OVER (PARTITION BY date) ;;
}
}
}
dimension: date {
hidden: yes
type: date
}
measure: sumvalue_avg {
type: average
value_format_name: decimal_2
}
}
RESULTING SQL:
WITH PDT1 AS (select uid, accountname from
(
select 1 uid, 'x' accountname UNION ALL
select 2 uid, 'y' accountname
) a1
********* where a1.accountname = 'x' *********
)
, PDT2 AS (select a2.uid, a2.date, a2.accountname, a2.value from
(
select 1 uid, 100 value, 'x' accountname, '20161230' date UNION ALL
select 2 uid, 200 value, 'x' accountname, '20161230' date UNION ALL
select 3 uid, 500 value, 'x' accountname, '20161230' date UNION ALL
select 4 uid, 800 value, 'y' accountname, '20161230' date UNION ALL
select 5 uid, 250 value, 'y' accountname, '20161230' date UNION ALL
select 6 uid, 100 value, 'x' accountname, '20161231' date UNION ALL
select 7 uid, 400 value, 'x' accountname, '20161231' date UNION ALL
select 8 uid, 600 value, 'x' accountname, '20161231' date UNION ALL
select 9 uid, 100 value, 'y' accountname, '20161231' date
) a2
inner join PDT1 pdt1
on pdt1.accountname = a2.accountname)
, NDT1pdt2 AS (
SELECT as53f08f27e18.*,
AVG(sumvalue) OVER (PARTITION BY date) AS sumvalue_avg
FROM (SELECT
CONVERT(VARCHAR(10),PDT2.date ,120) AS date,
COALESCE(SUM(PDT2.value ), 0) AS sumvalue
FROM (select a2.uid, a2.date, a2.accountname, a2.value from
(
select 1 uid, 100 value, 'x' accountname, '20161230' date UNION ALL
select 2 uid, 200 value, 'x' accountname, '20161230' date UNION ALL
select 3 uid, 500 value, 'x' accountname, '20161230' date UNION ALL
select 4 uid, 800 value, 'y' accountname, '20161230' date UNION ALL
select 5 uid, 250 value, 'y' accountname, '20161230' date UNION ALL
select 6 uid, 100 value, 'x' accountname, '20161231' date UNION ALL
select 7 uid, 400 value, 'x' accountname, '20161231' date UNION ALL
select 8 uid, 600 value, 'x' accountname, '20161231' date UNION ALL
select 9 uid, 100 value, 'y' accountname, '20161231' date
) a2
inner join (select uid, accountname from
(
select 1 uid, 'x' accountname UNION ALL
select 2 uid, 'y' accountname
) a1
****** where a1.accountname = '' ******
) pdt1
on pdt1.accountname = a2.accountname) AS PDT2
GROUP BY CONVERT(VARCHAR(10),PDT2.date ,120)) as53f08f27e18
)
SELECT
TOP 500
CONVERT(VARCHAR(10),PDT2.date ,120) AS "pdt2.date_1",
PDT1.accountname AS "pdt1.accountname_1",
AVG(NDT1pdt2.sumvalue_avg) AS "ndt1pdt2.sumvalue_avg_1",
COALESCE(SUM(PDT2.value ), 0) AS "pdt2.sumvalue_1"
FROM PDT2
INNER JOIN PDT1 ON PDT1.accountname=PDT2.accountname
LEFT JOIN NDT1pdt2 ON (CONVERT(VARCHAR(10),PDT2.date ,120)) = (CONVERT(VARCHAR(10),NDT1pdt2.date,120))
WHERE
(PDT1.accountname = 'x')
GROUP BY CONVERT(VARCHAR(10),PDT2.date ,120),PDT1.accountname
ORDER BY 3 DESC
It seems to be working as expected for me in 4.18.6; the templated filter is evaluated correctly in the SQL I get (below). What version are you running? I know we fixed some bugs around this recently.
SELECT
PDT1.accountname AS pdt1_accountname_1,
DATE(PDT2.date ) AS pdt2_date_1,
AVG(NDT1pdt2.sumvalue_avg) AS ndt1pdt2_sumvalue_avg_1,
COALESCE(CAST(SUM(PDT2.value ) AS FLOAT), 0) AS pdt2_sumvalue_1
FROM (select a2.uid, a2.date, a2.accountname, a2.value from
(
select 1 uid, 100 value, 'x' accountname, '20161230' date UNION ALL
select 2 uid, 200 value, 'x' accountname, '20161230' date UNION ALL
select 3 uid, 500 value, 'x' accountname, '20161230' date UNION ALL
select 4 uid, 800 value, 'y' accountname, '20161230' date UNION ALL
select 5 uid, 250 value, 'y' accountname, '20161230' date UNION ALL
select 6 uid, 100 value, 'x' accountname, '20161231' date UNION ALL
select 7 uid, 400 value, 'x' accountname, '20161231' date UNION ALL
select 8 uid, 600 value, 'x' accountname, '20161231' date UNION ALL
select 9 uid, 100 value, 'y' accountname, '20161231' date
) a2
inner join (select uid, accountname from
(
select 1 uid, 'x' accountname UNION ALL
select 2 uid, 'y' accountname
) a1
where a1.accountname = 'test'
) pdt1
on pdt1.accountname = a2.accountname) AS PDT2
...
Hi Lloyd, sorry, maybe I wasn’t clear enough. I meant: if I currently have a SQL-defined PDT that joins two or more tables together, can I write a corresponding LookML-defined PDT that does this join, or is that currently not supported?
Yes, for sure. In the example model for Native Derived Tables, the user/order fact table below joins users, products, and inventory items to produce the SQL shown further down.
The normal ‘explore’ object for order_items defines the joins.
The Native Derived Table:
include: "order_items.explore"
view: user_order_facts {
derived_table: {
explore_source: order_items {
column: user_id {field:order_items.user_id}
column: lifetime_revenue {field:order_items.total_revenue}
column: lifetime_number_of_orders {field:order_items.order_count}
column: lifetime_product_categories {field:products.category_list}
column: lifetime_brands {field:products.brand_list}
}
}
dimension: user_id {hidden:yes}
dimension: lifetime_revenue {type:number}
dimension: lifetime_number_of_orders {type:number}
dimension: lifetime_product_categories {}
dimension: lifetime_brands {}
}
This produces the following SQL:
SELECT
order_items.user_id AS user_id,
COALESCE(SUM(order_items.sale_price ), 0) AS lifetime_revenue,
COUNT(DISTINCT order_items.order_id ) AS lifetime_number_of_orders,
STRING_AGG(DISTINCT CAST(products.category AS STRING), '|RECORD|') AS lifetime_product_categories,
STRING_AGG(DISTINCT CAST(products.brand AS STRING), '|RECORD|') AS lifetime_brands
FROM thelook_web_analytics.order_items AS order_items
LEFT JOIN thelook_web_analytics.inventory_items AS inventory_items ON order_items.inventory_item_id = inventory_items.id
LEFT JOIN thelook_web_analytics.products AS products ON inventory_items.product_id = products.id
GROUP BY 1
This is a great feature! We’re trying it out already!
Just one question: is it necessary to declare all columns inside the ‘explore_source’, or is there a way to retrieve all the columns available in the explore?
For example, if we create a NDT that is retrieving columns from an explore that uses the view ‘orders’, when we create a new measure inside ‘orders’ view, do we also have to declare this column inside the ‘explore_source’ of the NDT?
Hi,
That is correct: if you want to use a new measure (or any other field) in the NDT, that field must be declared as a column in the explore_source.
Cheers,
Vincent
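A minimal sketch of what this means in practice (view and field names are hypothetical): if a new measure orders.total_tax is added to the orders view, it must also be declared as a column in the NDT’s explore_source before the NDT can reference it.
view: order_facts_ndt {
  derived_table: {
    explore_source: orders {
      column: user_id {field: orders.user_id}
      column: total_tax {field: orders.total_tax}   # the newly created measure
    }
  }
  dimension: user_id {hidden: yes}
  dimension: total_tax {type: number}
}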
I noticed that the include is for a “.explore.lkml” file. How do I create one of those and save it like that? I am only aware of .view and .model files. Alternatively, how can I include only the explore file from within the model to prevent validation errors?
To add an explore file, we recommend choosing + and Create View, then specifying the explore’s name with a “.explore.lkml” extension. Looker displays the file in the Other section of the LookML IDE.
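A hedged sketch of that layout (file, view, and field names here are hypothetical): the explore file contains only the explore plus the view includes it needs, and the NDT view file then includes it with include: "order_items.explore" as in the example above.
# Contents of a hypothetical order_items.explore.lkml file
include: "order_items.view"
include: "products.view"

explore: order_items {
  join: products {
    sql_on: ${order_items.product_id} = ${products.id} ;;
    relationship: many_to_one
  }
}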
Has anyone gotten this to work by including the model file rather than by creating a separate explore file?
I get a duplicate explore definition error, which I assume is created by circular dependencies, i.e. model M includes all views, including view V, which includes model M in order to use one of its explores as an explore source.
Is there a way to fix this without moving the explore into its own file and being careful about the view includes in that one?
Adding onto this question because it’s related. I just noticed that this whole NDT approach forces every model (defined in the default way) to include every explore file. That’s because the models include all the views, and the NDT view includes the explore it’s based on.
Should we stop importing *.view and start listing them explicitly? I’d still like to be able to use an NDT-defined view in a model without exposing its underlying explore, but that seems like a minor point.
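To illustrate the explicit-include idea (file names are hypothetical, and this is only a sketch of the question rather than a recommendation):
# my_model.model.lkml -- listing views explicitly instead of include: "*.view"
include: "orders.view"
include: "users.view"
# user_order_facts.view (the NDT, which itself includes the explore file)
# is included only in the models that actually use it.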
I also want to add that I am loving this new feature overall. It’s easily the most exciting advancement in LookML in the three years that we’ve been working with you guys. Making every query potentially reusable as a view is a big deal. It’s a step toward LookML becoming a full replacement for SQL.
Probably a more important question than my other one. Is it possible to use a custom filter in an NDT? It doesn’t appear that the syntax supports it, yet.
In view v1:
dimension_group: event_time {
type: time
convert_tz: yes
timeframes: [
time,
date,
week,
month,
raw
]
sql: ${TABLE}.event_time ;;
}
Then, when exploring based on event_time_date, the resulting SQL for the dimension is:
DATE(CONVERT_TIMEZONE('UTC', 'America/New_York', v1.event_time))
All good so far.
But if I create view v2 based on this explore with the column event_time_date (even using the Looker-provided native derived table syntax based on the explore), the resulting SQL for view v2 is missing the CONVERT_TIMEZONE piece.
Yes, look at the bind_filters: syntax. Here is an example:
https://github.com/looker/bq_thelook/blob/master/filtered_lookml_dt.view.lkml
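A rough sketch along the lines of the linked example (view, filter, and field names here are illustrative, not taken from that repo): the NDT view defines a filter field, and bind_filters passes its value through to a field in the source explore.
view: filtered_lookml_dt {
  derived_table: {
    explore_source: order_items {
      column: user_id {field: order_items.user_id}
      column: lifetime_revenue {field: order_items.total_revenue}
      bind_filters: {
        from_field: filtered_lookml_dt.filter_date
        to_field: order_items.created_date
      }
    }
  }
  filter: filter_date {type: date}
  dimension: user_id {hidden: yes}
  dimension: lifetime_revenue {type: number}
}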