Question

Redshift SortKey and Epoch Timestamps

  • 8 August 2017
  • 4 replies
  • 134 views

Hi all,


I have a redshift cluster with event_timestamp assigned as sort and dist key. The event_timestamp values are stored as epoch seconds (integers) which I then use to create a time dimension (event_time) for ease of querying within looker.


However, I’ve found query performance to be poor as I believe redshift doesn’t like the way that Looker generates the SQL for the resulting date period. For example, if I wanted to filter for the last 7 days the looker generated sql would be:


(((timestamp 'epoch' + (Cast(master_prod.event_timestamp as INT)) * interval '1 second') >= ((CONVERT_TIMEZONE('Europe/London', 'UTC', DATEADD(day,-6, DATE_TRUNC('day',CONVERT_TIMEZONE('UTC', 'Europe/London', GETDATE())) )))) AND (timestamp 'epoch' + (Cast(master_prod.event_timestamp as INT)) * interval '1 second') < ((CONVERT_TIMEZONE('Europe/London', 'UTC', DATEADD(day,7, DATEADD(day,-6, DATE_TRUNC('day',CONVERT_TIMEZONE('UTC', 'Europe/London', GETDATE())) ) ))))))


I believe a simple query would perform a scan of the entire master_prod table, loading and decompressing every timestamp. Obviously this is not what I intended as I wanted to restrict the query to just the rows relevant to last 7 days!


Anyone had similar issues or a workaround? thanks in advance!


4 replies

Userlevel 3

Redshift handles native timestamp types just fine. You might try just changing to a regular timestamp column and using that.

Thanks Mike, I have added a regular timestamp column now but still experience the same issue. If I have my data in a timestamp column, what type would I specify in looker dimension? I tried date and date_time but that then causes redshift to perform a conversion of the value which I understand destroys sort key performance as per notes from AWS below:


Bear in mind that queries evaluated against a sort key column must not apply a SQL function to the sort key; instead, ensure that you apply the functions to the compared values so that the sort key is used. This is commonly found on TIMESTAMP columns that are used as sort keys.

Userlevel 3

Hi Kashif,


Could you please visit help.looker.com with the details of how your field is defined. We’d love to take a closer look at this to help find a solution.


Thanks,

Rufus


Done thanks

Reply