Knowledge Drop

Why do we recommend many_to_one joins over one_to_many joins?

  • 7 May 2021
  • 0 replies
  • 122 views

Userlevel 3

Last tested: Aug 2, 2019

Minimize joins

Because in most scenarios, the most granular table usually has the most valuable information, and is therefore the most likely one to always be necessary in your queries! For example, for a B2B SaaS company, most of your sales involve opportunity data (because that's where # of deals and $ ACV live). If you therefore start your explore with the Opportunity table, and then join into other tables (which usually have many_to_one relationships), your queries will on average contain fewer joins than if you started your explore with a less used table, like accounts.

Avoid Symmetric Aggregates when possible

Looker is less likely to unnecessarily use symmetric aggregates if you define your explore from the most granular table. Let’s say you have order_items, orders, and users. You could define your explore starting with order_items (most granular) or users (least granular)

Starting with the most granular table:

explore: order_items {
 join: orders {
   relationship: many_to_one
   sql_on: ${order_items.order_id} = ${orders.id} ;; } 

join: users {
   relationship: many_to_one
   sql_on: ${orders.user_id} = ${users.id} ;; }
}

If you define it this way, starting from the most granular table, Looker won't use symmetric aggregates against measures in the order_items table.

Starting with the least granular table:

explore: users {

join: orders {
  relationship: one_to_many
  sql_on: ${orders.user_id} = ${users.id} ;; }

join: order_items {
  relationship: one_to_many
  sql_on: ${order_items.order_id} = ${orders.id} ;; }
}

If you define it this way, and you build a query that uses columns from all 3 tables, Looker sees two one_to_many joins (users->orders and users->order_items) and has no way to know whether the join to orders causes order_items to be fanned out. Even though
we know that the join from users->orders->order_items are chined and that order_items isn't fanned out, Looker does not! (You might think it does because it knows it has to join in orders to join order_items, but Looker does not infer any relationships between multiple joined-in tables within an explore, it only considers the relationship defined between the base view and joined in views.) In this case, Looker will unnecessarily use symmetric aggregates against measures in the order_items table, which will be a big performance hit compared to setting up your explore from the most granular table.

 

A historical note: Before symmetric aggregates, Looker wouldn't display any measures from potentially fanned-out tables in explore, so you had to define your explores from the most granular table to be able to use measures at all!!

 

 

This content is subject to limited support.                

 

 


0 replies

Be the first to reply!

Reply