Cascading PDTs w/Different Datagroups on Each Dependency (last pdt wait for others to build first)

Knowledge Drop

Last tested: Oct 30, 2020
 

This is pretty jank. Optimally, you should instead set the child PDTs to use the same caching policy, that way Looker handles cascading pdts internally for you, like in this article: https://help.looker.com/hc/en-us/articles/360023904433-Using-datagroups-with-Cascading-PDTs

However, this won't work if you only want the final PDT to build after all dependencies have updated, and those tables all build based on different triggers. In that case, keep reading:

HOW IT WORKS

We are going to manually create a table in their scratch schema to track the updates of each table. Create a table with a date field and X booleans, one for each dependency PDT (in this example code, X = 3). Upon creation, also add in a row with the current time and false for all the other values.

From there, change all the PDTS to use create_process. The first sql_step should create the table as normal, then the second one we are going to need to update the update_table like so:

insert into update_table (date, A, B, C)

select now(), true, update_table.b, update_table.c

from update_table

order by date desc

limit 1

This would be the sql_step for table A in this example. For table B and C you need to change which column the "true" is being inserted into.

Then, the sql_trigger for your final parent PDT will be:

select max(date) where a = true and b = true and c = true

This will only trigger once all the child PDTs have updated, and will trigger every new time all 3 are updated. As the final step, add another sql_step to the final PDT after it is built to reset the table update flags:

insert into update_table (date, A, B, C)

select now(), false, false, false

This content is subject to limited support.                

Version history
Last update:
‎07-07-2021 01:13 PM
Updated by: