Summing rows with a specific condition

Maartje
Participant I

Hi,
I am struggling with finding a function in Looker that would enable summing rows that satisfy a specific condition. I have multiple lines with items belonging to 1 client and a table that contains multiple clients. For each specific client, I want to show the total number of items (in order to calculate the relative frequency per item).

In for example Excel, one would simply use a sumif function. This would sum all values from the column items that have a value for Client in their row, which is the same as the value for Client in the row for which the formula was set up. This is illustrated below:

How can you construct this in looker?
Best,
Maartje

1 20 23.6K
20 REPLIES 20

Hello @Maartje,

It looks like what you are looking for is a Table Calculation!

Our documentation on Table Calculations can be found [here] (https://docs.looker.com/exploring-data/using-table-calculations). Another helpful page is our “Creating Looker Expressions” document, found here.

Table calculations are great for recreating something like an Excel sumif function. In this situation, I would recommend using a combination of the ‘sum’ and ‘if’ functions, something along the lines of

sum(if(${Client} = ‘client1’, ${items} ,null))

It’s important to note that you want to include the ‘sum’ outside of the ‘if’, not inside, otherwise you would be summing all items!

Please let me know if you have any other questions.

Maartje
Participant I

Hi Emma,
This is almost what I need as I don’t want to fix ‘client1’ in the formula, but I want it to fix the value for client in that specific row. So for each row, I want to sum all $items which have the same $Client as that row.
Best,
Maartje

Hi @Maartje,

It would be really helpful to see your current setup in Looker. What dimensions/measures you have at the moment. As I think this can be achieved several ways. Feel free to pop on chat with this so we can assist you.

Best,

Sasha

Maartje
Participant I

Hi Sasha,
How can I access the chat?
Best,
Maartje

Maartje
Participant I

The setup is actually fairly the same as in the table above, only the number of clients is not set and quite large. So manually typing “client1”, “client2”, etc. will be quite a problem. So I need the sum to recognise the specific Client in that row and only sum the Items for that Client.

Hey @Maartje,

You have to be an admin or developer to have access to chat. You might need to talk to your Looker admin to get the roles. So even if you can paste the screenshot from Looker with relevant fields it would give me an idea.

Best,

Sasha

Maartje
Participant I

Ahh that explains the reason I cannot find the chat function. As the look I am building does not display yet what I want, I can better show the Excel example of what exactly I want to build:

In this case there are two dimensions (Client and Item) with a measure for Item. Furthermore I used a before and after measure, that compares two periods. Furthermore I want to make two table calculations: Total Items and Relative frequency items per client. Finally those two relative frequencies are used to calculate the change in relative frequencies, again using a table calculation.

In red the only calculation I have difficulty with, the total items count. As there are many lines (which are not always the same) manual calculation using “client1”, “client2” etc. does not work. So I need a way to sum values from only specific rows.

Best,
Maartje

Hey again @Maartje,

Well the thing is that if you have Client/dimension, Item/dimension, Count_item/measure, Total_item/measure, Looker will always Group_By all the dimensions used on the explore so if you exclude Item/dimension & Count_item/measure. You would get what you looking for without doing any table calculations. This is why I am asking you for a screenshot from Looker, so I can see all the fields used.
Also it’s probably a good idea to talk to your Looker admin to be able to get Developer permissions if you are developing in Looker.

Best,

Sasha

Maartje
Participant I

Hi Sasha,
These are actually all the fields, but I see I made a mistake in the Item column, the third row should be item 3 (so not item 1). So everything is split out over different client/dimension - item/dimension combinations. Excluding item/dimension and count_item_measure are not what I want to do (although that would indeed easily give me the totals) as I am eventually interested in relative frequency of the items per client and especially the change in relative frequencies. Or is there a way to exclude item/dimension and count_item_measure just for the calculations of the total? No right?..
Best,
Maartje

Maartje
Participant I

Just to make sure, I currently need the total items column in order to calculate the relative frequency items / client. So that is the reason why I asked about it. If you have another way to calculate the relative frequency items per client column than that would work as well!
Best,
Maartje

Hey @Maartje could you please send us an email at help.looker.com with the details of this issue and the relevant Looker screenshots so we may help you further with this?
thanks

Hi Maartje,

Have u managed to tackle this problem yet? Right now, I am currently stuck on a problem like you. I want another extra column to calculate sumif dynamically based on the column. I don’t want to create derived table, and I think it is possible to create the calculation for that directly in looker.

adstott90
Participant III

Hi Chalee,

Do you want to do this as a measure or a table calculation? If wanting to do this as a measure, you could use a CASE statement or filter in a new sum measure to create a new measure, or if using table calculations you could use one column with an if() calculation to bring back only the values you want, and then sum this “if column”. Hope this helps, let me know if it’s unclear or you need any more information on this.

I am currently doing like what @Marrtje was doing which is to create another dynamic columns like total items that can accept data from that row which is clientID 1 to create sum if or group by countitem value. Doing it anywhere is fine for me whether it is a measure or table calculation. Doing CASE statement is impossible, because we can have lots of clientID, having dynamic sumif will be a lot better (a must). I am looking for a table calculation that can handle dynamically like for example, for the row with client ID = x, then the table calculation will be sum(if client ID = x, then filter only client ID x to do summation) (client ID =x is meaning that the table calculation will dynamically accept value from that row). Thank you so much for helping me tackle the problem. Let me know more if you don’t clear of what I am saying.

Is there an update on the above? I am facing the same issue as Chalee and Maartje.

Same here.
I have two columns with a date. I want to then take the max date of these columns and then a count without duplicate rows in my visualization.

If the only solution for this is to make a lookml dimension that combines the columns its kind of strange. It seems like a very easy thing.

I’m trying to do the same thing here and stuck. @Chalee_Fongamor, were you ever able to figure this out?

I know I’m a bit late to the game here, but it doesn’t look like anyone has really solved this here yet, so here is my solution for summing all rows with matching criteria using only table calculations. 

NOTE: Before creating any of these fields, you will need to sort your table by your criteria, as this method is heavily dependent on your criteria being grouped together.

Field #1: Partition Row Number

if(match(${criteria}, ${criteria})=offset(match(${criteria},${criteria}),-1)
 , 1+row()-match(${criteria},${criteria}) , 1)

Field #2: Subtotal (Only appears on last instance of criteria)

if(
 NOT(${criteria} = offset(${criteria},1)),
 sum(offset_list(${count}, -(${partition_row_number}-1), ${partition_row_number})),

null)

Field #3: Subtotal Row (For referencing in Field #4)

if(${subtotal}>0,row(),null)

Field #4: Total For Rows Matching Criteria

index(${subtotal},  min(offset_list(${subtotal_row}, 0, 5000)))

HOW IT WORKS:

  • The Partition Row Number is used to identify when a change in ${criteria} happens.  
  • Subtotal uses the Partition Row Number to sum all of the values within that specific ${criteria}.
  • Total For Rows Matching Criteria looks at all future values and returns the next row with a non-null value in Subtotal Row, which is the Subtotal value for that criteria.

Personally I needed to take this a step further and added a 5th field in order to see what % of the corresponding criteria each row represented:

${count}/${total_for_rows_matching_criteria}

If you want to know why this works, it plays off of several concepts which I brought together from these resources:

Thanks for this solution! It worked great for me! (Mostly)

One alteration I suggest if your Subtotal (Field #2) results in a 0 when it is non null, which was my case, is to change the Subtotal Row (Field #3) to the following: if(is_null(${subtotal}), null , row()).

I know I’m a bit late to the game here, but it doesn’t look like anyone has really solved this here yet, so here is my solution for summing all rows with matching criteria using only table calculations. 

NOTE: Before creating any of these fields, you will need to sort your table by your criteria, as this method is heavily dependent on your criteria being grouped together.

Field #1: Partition Row Number

if(match(${criteria}, ${criteria})=offset(match(${criteria},${criteria}),-1)
 , 1+row()-match(${criteria},${criteria}) , 1)

Field #2: Subtotal (Only appears on last instance of criteria)

if(
 NOT(${criteria} = offset(${criteria},1)),
 sum(offset_list(${count}, -(${partition_row_number}-1), ${partition_row_number})),

null)

Field #3: Subtotal Row (For referencing in Field #4)

if(${subtotal}>0,row(),null)

Field #4: Total For Rows Matching Criteria

index(${subtotal},  min(offset_list(${subtotal_row}, 0, 5000)))

HOW IT WORKS:

  • The Partition Row Number is used to identify when a change in ${criteria} happens.  
  • Subtotal uses the Partition Row Number to sum all of the values within that specific ${criteria}.
  • Total For Rows Matching Criteria looks at all future values and returns the next row with a non-null value in Subtotal Row, which is the Subtotal value for that criteria.

Personally I needed to take this a step further and added a 5th field in order to see what % of the corresponding criteria each row represented:

${count}/${total_for_rows_matching_criteria}

If you want to know why this works, it plays off of several concepts which I brought together from these resources:

This works for me! But I am running into a problem. My proceeding non-null value is still taking a sum of all previous values of different criteria… so say my first sum should be 10,000 and my next sum should be 15,000. Instead of giving me 10,000 and then 15,000… it gives me 10,000 and then 25,000 (as its pulling in the 10,000 into the second summation operation). I hope I explained this well. Do you have idea what the problem could be? 

Top Labels in this Space
Top Solution Authors