Hi all,
I have a Redshift cluster with event_timestamp set as both the sort key and the dist key. The event_timestamp values are stored as epoch seconds (integers), which I then convert into a time dimension (event_time) for ease of querying within Looker.
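For reference, my dimension is defined along these lines (a sketch of my setup, with names assumed; the sql: conversion matches the generated SQL below):

```lookml
dimension_group: event {
  type: time
  timeframes: [raw, time, date, week, month]
  # event_timestamp is epoch seconds, so convert it to a timestamp inline
  sql: timestamp 'epoch' + ${TABLE}.event_timestamp * interval '1 second' ;;
}
```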
However, I’ve found query performance to be poor, and I believe it’s because Redshift doesn’t like the SQL that Looker generates for the resulting date filter. For example, if I filter for the last 7 days, the Looker-generated SQL is:
(
  (timestamp 'epoch' + CAST(master_prod.event_timestamp AS INT) * interval '1 second')
    >= CONVERT_TIMEZONE('Europe/London', 'UTC',
         DATEADD(day, -6, DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'Europe/London', GETDATE()))))
  AND
  (timestamp 'epoch' + CAST(master_prod.event_timestamp AS INT) * interval '1 second')
    < CONVERT_TIMEZONE('Europe/London', 'UTC',
        DATEADD(day, 7,
          DATEADD(day, -6, DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'Europe/London', GETDATE())))))
)
Because the filter is applied to an expression wrapped around event_timestamp rather than to the raw column, I believe even a simple query performs a scan of the entire master_prod table, loading and decompressing every timestamp. Obviously this is not what I intended, as I wanted to restrict the scan to just the rows relevant to the last 7 days!
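By contrast, the shape of predicate I was hoping for would compare the raw integer sort-key column against precomputed epoch boundaries, so Redshift could use its zone maps to skip blocks. A rough sketch of what I mean (assuming event_timestamp holds UTC epoch seconds; I haven’t confirmed Redshift prunes here):

```sql
-- Convert the window boundaries to epoch seconds once, then compare
-- them against the raw sort-key column so block pruning can kick in.
WHERE master_prod.event_timestamp >= EXTRACT(epoch FROM
        CONVERT_TIMEZONE('Europe/London', 'UTC',
          DATEADD(day, -6, DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'Europe/London', GETDATE())))))
  AND master_prod.event_timestamp < EXTRACT(epoch FROM
        CONVERT_TIMEZONE('Europe/London', 'UTC',
          DATEADD(day, 7,
            DATEADD(day, -6, DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'Europe/London', GETDATE()))))))
```

I don’t know of a clean way to get Looker to emit SQL of this shape, which is really what I’m asking.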
Has anyone had similar issues, or found a workaround? Thanks in advance!