I’m currently working with a client where Postgres is used as the data warehouse DB. One of Postgres’ features is that CTEs in queries, i.e. WITH subquery AS ( SELECT ... )
act as ‘optimisation fences’, meaning that, if you reference subquery
later in your SQL and include WHERE
predicates, e.g.
SELECT * FROM subquery WHERE date > '2019-01-01' AND date < '2020-04-01'
…the date predicates will not be pushed down into the CTE - Postgres will run the complete CTE regardless of dates, and only apply the outer query to these results afterwards…
Unfortunately, since Looker uses CTEs for §DTs, Postgres’ query performance suffers a lot from this, since your PDT SQL with always be in a CTE and any filters you apply through Looker will not be pushed down.
My workaround is to use the sql_table_name:
parameter. Whatever SQL you put in here will be inserted into queries where the table reference would normally go, i.e. an inline subquery, where Postgres will do predicate push-downs.
You need to wrap your DT SQL in brackets so that the query will look like
SELECT * FROM ( SELECT... ) AS subquery
and be syntactically correct.
This works fine for ephemeral (non-persisted) DTs which is all we’re using with this client. It should also work for PDTs, albeit with some extra code, by using a second view for the persistence, referencing the first with ${viewname.SQL_TABLE_NAME
} but I haven’t verified it. A similar idea is noted here in the discussion about avoiding CTEs in (P)DTs as they cause nested CTEs and can be brittle: https://docs.looker.com/data-modeling/learning-lookml/derived-tables#monitoring_and_troubleshooting_...