Question

Distinct count or select across several columns

  • 1 July 2019
  • 5 replies
  • 903 views

Hi!


I have a table with three columns. I want to be able to count all the unique values across the 3 columns. Basically, in the end, I would like to have a column with each ID found in the 3 previous ones, but only once (thus, only have unique values) - a column with the total number of unique values, from the 3 initial ones.


Any ideas on how to do that or whether it’s possible to do it?


Thanks a lot!


5 replies

Hi Miruna,


If you simply want to count the number of unique items in the columns I would use the count_distinct function in the calculation to do something like:


count_distinct(column 1) + count_distinct(column 2) + count_distinct(column 3)


Alternatively, you could possibly create a separate calculation for each column (if you want to see the count of unique values per column) and then add these together using a fourth table calculation.

Thanks Adam!


However, if I do a count_distinct per each column, then I would end up counting the same values several times.


In the 3 columns that I have, the same values can be found across several columns (for example, column 1 and column 2 can include the same value on several rows). I need a way to do an aggregation of the 3 columns first and then do a count_distinct on everything.

In that case I think a way would be, if this is something you want to use quite frequently, to maybe build a custom dimension, and then use the UNION function in the SQL line to combine the three columns (in theory, although I haven’t got chance to test that at the moment!). There may be a way using table calcs but after a quick internet search I haven’t been able to find a function.

Great! Thanks!


Do you have any other information about how to use the Union function?


Unfortunately I could find information about this function in Looker only here: UNION & UNION ALL in Looker.


Thanks again for your help! 🙂

I know this is a bit old but it still ranks highly on google

if you dialect supports it

measure: my_distinct_count {
    type: count_distinct
    sql:  ${col_1}, ${col_2}, ${col_3} ;;
  }

if not try to concatenate the columns with explicit sql
 

measure: my_distinct_count {
type: number
sql: (SELECT COUNT(DISTINCT col_1 || col_2 || col_3)) ;;
}

 

Reply