Re: Looker Over Complex Queries

Nick_Lam · 05-29-2023 10:11 AM

Hi, I'm having an issue right now in looker where I have an explore based on one table let's just call it "amount". Then I have another explore that I join the "amount" table to which we can call "accounts". If I use the amount measure from the amount table, looker generates a simple query like ->

COALESCE(SUM(( "amount" ) ), 0) AS "Amount"

However in the second "accounts" explore, when I join the "amount" table in and attempt to use the same measure, Looker generates a very complex query to achieve the same effect ->

COALESCE(CAST( ( SUM(DISTINCT (CAST(FLOOR(COALESCE( ( amount ) ,0)*(1000000*1.0)) AS DECIMAL(38,0))) + (TO_NUMBER(MD5( concat((amount."H"),'9',(amount."HP"),'9',(amount."IB"),'9',(TO_CHAR(TO_DATE(amount."U" ), 'YYYY-MM-DD'))) ), 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') % 1.0e27)::NUMERIC(38, 0) ) - SUM(DISTINCT (TO_NUMBER(MD5( concat((amount."H"),'9',(amount."HP"),'9',(amount."IB"),'9',(TO_CHAR(TO_DATE(amount."U" ), 'YYYY-MM-DD'))) ), 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') % 1.0e27)::NUMERIC(38, 0)) ) AS DOUBLE PRECISION) / CAST((1000000*1.0) AS DOUBLE PRECISION), 0) AS "Amount"

No matter how I change the join to the explore or other joins in the explore, Looker produces this query which essentially changes the run time from 5 seconds to longer than 5 minutes. I haven't actually been able to get it to run successfully.

Using SQL Runner and subbing in the original simpler query in place, I'm able to get a result instantly. Has anyone else deal with this kind of issue / how were you able to resolve it?

The measure type is sum and not sum_distinct just for reference.

Tashina

Are you trying to join a table "accounts" to an Explore "amounts", or are you truly trying to join an Explore to an Explore?

madmax

It may be the symmetric aggregates functionality that generates the code depending on the relationship between the views to ensure correct calculation of the measures. It should not affect performance, so I would check if the views that referencing the explore have a defined primary key, the type of relationship between them, etc.

understanding-symmetric-aggregates

Michael_Jassows

This is clearly a symmetric aggregate. It is introduced when you try to do a SUM on data joined with a many to many relationship. Without this function, you have the potential to return data which has fanout, and a plain SUM will return an incorrect result.

You can find a good description of why this is needed for many-to-many joins, and why this works here:

https://cloud.google.com/looker/docs/best-practices/understanding-symmetric-aggregates

And yes: there is often a big performance penalty for this more complicated equation.

If your query seems to work -- and returns the correct result -- with just a regular sum, then maybe your relationship is not really many-to-many? Or because of the way you've filtered the data, there is no longer fanout? If that's the case, you should be able to add the filter condition to the explore and change the relationship from many-to-many to one-to-many or many-to-one. If adding the filter condition to the explore makes it unusable for other scenarios which require the fanout inducing data, consider making a copy of the explore, or extending it for this use case.