Question

How do I count duplicate values in a table?

  • 5 October 2018
  • 12 replies
  • 6721 views

I’m looking to count (and show the count) for duplicate values in a table, similar to how in Excel I can use the @ symbol to refer to a cell in the current row, then count how many times that row appears in the column, i.e. COUNTIF([CustomerName],[@CustomerName])


How do I replicate this in looker, using table calculations?


12 replies

Hi Wayne!


One way to do this using table calculations is to wrap a sum function around an if statement. An example of this is:


sum(if(contains(${city.city},"Woodridge"),1,0))


Here I’ve got a city name (type string), Woodridge, that I’m using boolean logic on. I’m then summing all the 1’s that occur, so all the instances where the city name is Woodridge.


Hope that helps!


Sara L.

Hey Sara!


Thanks a lot for your response, which would be useful in some circumstances, but as I’m dealing with 1000s of rows, I was after something similar but dynamic, so in other words, I wouldn’t have to specify “Woodridge” but rather “count how many times the specified cell in this row appears in the specified column”, so going down the column, the formula counts occurrences of “Woodridge”, “Libertyville”, “Richmond” etc, without me having to specify for each one. Does that make sense and do you know of a way to do that?


Thanks in advance!

Hi Henry,


We can count the Frequency of a Value in a Column by, modifying the table calcs for subtotals described here then, we can use the table calc below:


count(${view.field}) - match(${view.field}, offset(${view.field}, count(${view.field}) - row() * 2 + 1)) + 2


match(${view.field}}, ${view.field})


Please let us know if that helps!

Hi Sara!


My apologies for the delay in replying to your post, due to being on vacation. Thanks a lot for that calculation, it was exactly, what I was looking for and actually opened a lot more possibilities. Thanks a lot!

Wayne

We had a similar duplicate count request, and we used the above steps, however the table is messy. We are currently using the suggested partition_row_number and duplicate count custom fields. We are trying to get a total number of ‘# of reviews’ for each Project ID. Right now, when a project is reviewed more than once the project ID shows up the number of time it was reviewed. An example of what we would like to show is to have the project ID show up on one line with the TOTAL # of reviews on the same line. We are trying to count the number of times a project ID was in Workscope Review.


Example of Current Visualization:

Project ID Review Review Task Details Task Name # of Reviews

1 Passed Review 1

1 Error Review 2

2 Error Review 1

2 Error Review 2

2 Error Review 3

2 Error Review 4

2 Error Review 5


This is what we would like the Visualization to look like:

Project ID Review Review Task Details Task Name # of Reviews

1 Passed Review 2

2 Passed Review 5


Thank you,

Sarah

Here is a better screenshot of the tables. I recreated them in Excel so I could remove sensitive information. Thanks again!

Looker%20Request

I was able to answer my own question via: Filtering in Visualizations using Table Calculations (3.28+)


Thanks!

Sarah

Userlevel 7
Badge

Nice! That’s exactly the solution I would recommend too 😄


Hi Sara,


I’ve got a similar question to the original poster, however when I try and adapt the code you’ve posted, I get an ‘Expression incomplete’ error and also one of the ‘}’ brackets highlighted in the bottom row to indicate it is not required. I’ve attached an example, albeit without the correct column names for now.


Any help would be greatly appreciated.


Thanks,


Shaun


I’ve not read all the details in this thread, so I don’t know exactly what your formula is supposed to do, but it is obvious to me it is syntactically incorrect. While I don’t know if it will do what you want here, here is revision of your formula that will at least not give an error:


count(${view.field}) - match(${view.field}, offset(${view.field}, count(${view.field}) - row() * 2 + 1)) + 2 + match(${view.field}}, ${view.field})

Hi Quinn,


Appreciate you taking the time to respond and fixing the error in the formula. Happily the formula worked but unfortunately it didn’t achieve what I was hoping it would do. I’ll raise a new thread in regards to this.


Many thanks,


Shaun

Userlevel 3

Hi All

I have a similar question - however I found that the formulae you suggest above does a kind of ranking or just something really wierd.

In Excel terms I want to do the following:
  I have a bunch of IDs possibly duplicated over multiple rows as they might have different individual payments to them. And I want to count the number of times an ID occurs (hence counting the number of payments per ID).
Using the Looker count fields are not working, as they just give me a count of 1 per row.
In Excel terms, I want to do a ‘ =countif([ID], [@ID]) ‘

 

 

Reply