Post Context: This content is adapted from our internal SME (Subject Matter Expert) curricula. Similar to knowledge drops, this content may be highly Looker specific and may become out of date in the future. The goal is to convey a high level understanding of a common SQL query issue and convey next steps for troubleshooting and resolving.
ELI5:
Partitions essentially work by creating a reduced subset of rows that the Database can quickly scan to reduce the number of rows it has to evaluate to fetch query results.
To restate this a different way, and in the context of getting Looker to work with them:
For example, using the following sample data:
ID | Animal Name | Danger Level (1-5) | Has Hair (y/n) |
1 | Panda | 3 | Yes |
1 | Bear | 5 | Yes |
1 | Lion | 5 | Yes |
2 | Shark | 5 | No |
2 | Salmon | 1 | No |
2 | Trout | 1 | No |
3 | Butterfly | 1 | No |
3 | Bird | 2 | No |
If we were to run select * from table where Animal = butterfly
, the DB has to scan over all 8 rows to fetch you your result at row 7.
However if you partition by ID, and adjust the query to reference the appropriate partition like so; select * from table where Animal = butterfly and ID = 3
the following occurs:
While in the example above, the performance gain is negligible, in large tables (hundreds of thousands or millions of rows), the performance gain can be exponential.