Distinct count or select across several columns

miruna_a
Participant I

Hi!

I have a table with three columns. I want to be able to count all the unique values across the 3 columns. Basically, in the end, I would like to have a column with each ID found in the 3 previous ones, but only once (thus, only have unique values) - a column with the total number of unique values, from the 3 initial ones.

Any ideas on how to do that or whether it’s possible to do it?

Thanks a lot!

0 6 8,393
6 REPLIES 6

adstott90
Participant III

Hi Miruna,

If you simply want to count the number of unique items in the columns I would use the count_distinct function in the calculation to do something like:

count_distinct(column 1) + count_distinct(column 2) + count_distinct(column 3)

Alternatively, you could possibly create a separate calculation for each column (if you want to see the count of unique values per column) and then add these together using a fourth table calculation.

miruna_a
Participant I

Thanks Adam!

However, if I do a count_distinct per each column, then I would end up counting the same values several times.

In the 3 columns that I have, the same values can be found across several columns (for example, column 1 and column 2 can include the same value on several rows). I need a way to do an aggregation of the 3 columns first and then do a count_distinct on everything.

adstott90
Participant III

In that case I think a way would be, if this is something you want to use quite frequently, to maybe build a custom dimension, and then use the UNION function in the SQL line to combine the three columns (in theory, although I haven’t got chance to test that at the moment!). There may be a way using table calcs but after a quick internet search I haven’t been able to find a function.

miruna_a
Participant I

Great! Thanks!

Do you have any other information about how to use the Union function?

Unfortunately I could find information about this function in Looker only here: UNION & UNION ALL in Looker.

Thanks again for your help! 🙂

I know this is a bit old but it still ranks highly on google

if you dialect supports it

measure: my_distinct_count {
    type: count_distinct
    sql:  ${col_1}, ${col_2}, ${col_3} ;;
  }

if not try to concatenate the columns with explicit sql
 

measure: my_distinct_count {
type: number
sql: (SELECT COUNT(DISTINCT col_1 || col_2 || col_3)) ;;
}

xwb
Observer

To echo the above, this still ranks highly on google. My own problem was similar, and my solution syntax was a little different. just adding it here in case someone else (or me again) ends up needing to revisit. For context, our dialect was redshift.

measure: count_distinct_col_ {
label: "Count of distinct ids"
type: count_distinct
sql: CONCAT(${col_2},${col_1}) ;;
}
Top Labels in this Space
Top Solution Authors