Question

Restrict join on Redshift


I have an issue with Redshift scanning very large tables joined in an explore and I’d like to know if someone else had a similar problem. For instance, let’s assume that we have two tables: person and purchase. Now I want to list all people and their purchases from the last month (even if someone has’t bought anything, s/he should appear on the query). The query would look like:


SELECT * FROM person p

LEFT JOIN purchase c ON c.person_id = p.id AND c.date > DATEADD(day, -30, GETDATE())


The problem is that it seems that Redshift doesn’t filter the purchase table before joining, scanning the whole table. This increases a lot the query cost.

But if I use a query like this:


SELECT * FROM person p

LEFT JOIN purchase c ON c.person_id = p.id

WHERE c.date > DATEADD(day, -30, GETDATE())


Redshift will filter the table (because of the WHERE clause), but the query result will not answer my problem because only people who has bought in the last 30 days will appear.

Is there a way to solve this problem?


0 replies

Be the first to reply!

Reply