Suppose I have a table called sales_fact and a table item_groups in following layout:
sales_fact:
| sale_id | date | item_id | amount |
item_groups:
| item_id | group_name |
sales_fact and item_groups are linked together in 1:m relationship i.e. a items has multiple sales groups.
I want create a dashboard to report the value of each group.
If the cardinality of the sale_fact and item_groups is small then creating a PDT is ok.
If the cardinality of the two tables is quite big, let’s 110^9 rows for sales and an average ratio item to groups is 1:10 then the PDT would end up ~ 110^10.
However, in most use cases there would be no need to materialise the join between the two table because the both sales_fact and item_group tables would be filtered reducing a lot the number of row on which the join is done.
Essentially I would like to be able to build the following query
SELECT s.*, i.* FROM sale_fact s LEFT JOIN item_groups i ON i.item_id = s.item_id WHERE s.date BETWEEN ? AND ? AND i.group_name IN (?)
and being able to inject the value for the placeholders ?