Question

How to combine minor dimension values into an "Other" value

  • 15 May 2015
  • 9 replies
  • 1431 views

Userlevel 2

Is there a way to do in a Look what the following does in SQL?


select
if( c in ('Anchorage','Fairbanks','Juneau'),-- that is, the c column has one of those values
-- then
c,
-- else
'Other'
) as city,
count(*) -- or another aggregate function
from t
group by city;

This groups not by the column in the table itself but by that same column with some values combined into one.


9 replies

Userlevel 4

You probably have two options here depending on the amount of flexibility you want.


First, you could create a simple CASE WHEN dimension, as to whether city is part of the group you would like (you can use raw SQL as well, if that is easier for you): http://www.looker.com/docs/reference/field-reference#sql_case


The second alternative would be to use the design pattern Lloyd lays out, using parameters to dynamically choose the cities in a given group. This requires a touch more modeling, but would let you define the cities to compare on the fly: https://discourse.looker.com/t/cohort-analysis-ad-hoc-dimensions-pivots/703/2

Userlevel 2

Many thanks, colin. I’ll do the former for now but will bear the latter in mind for future use. Here’s the code I used, so later users can see the solution:


  - dimension: city_with_minor_collapsed
sql: |
if(${TABLE}.c in ('Anchorage','Fairbanks','Juneau')
,concat(' ',${TABLE}.c)
,'Other'
)

The space before the city name (concat(' ',${TABLE}.c)) is so that ‘Other’ alphabetizes last.

Userlevel 5
Badge

Another alternative that would help out with custom sorting is the sql_case parameter. With sql_case, it will automatically sort the buckets in the order they are defined. So something like this might work:


- dimension: city_with_minor_collapsed
sql_case:
Anchorage: ${TABLE}.c = 'Anchorage'
Fairbanks: ${TABLE}.c = 'Fairbanks'
Juneau: ${TABLE}.c = 'Juneau'
Other: true

This will sort the Other bucket after the three buckets. This is also less dynamic than Colin’s second recommendation above, but can work great for set buckets.

Userlevel 2

lindsey, thanks. I prefer my solution because it’s briefer and less redundant. The workaround of adding a space is a small cost in my opinion. Would


- dimension: city_with_minor_collapsed
sql_case:
${TABLE}.c: ${TABLE}.c in('Anchorage','Fairbanks','Juneau')
Other: true

work (and sort ‘Other’ last)? That would be even better.

Userlevel 5
Badge

sql_case only works with explicit strings to the left of the colons. This is due to that fact that it only takes string values, whereas a field could have any type. So in your case, a CASE WHEN would probably be the best way to go!

Michael, what SQL dialect are you working in? I haven’t seen an Excel-like IF statement in a SQL dialect I’ve worked with yet, so I’m curious.

Userlevel 2

MySQL.

Userlevel 2

I should note that the code I used above


- dimension: city_with_minor_collapsed
sql: |
if(${TABLE}.c in ('Anchorage','Fairbanks','Juneau')
,concat(' ',${TABLE}.c)
,'Other'
)

exposes a Looker bug: when filtering (either by typing values into a filter or by drilling), the initial space is trimmed. Thus, if your city_with_minor_collapsed is ’ Anchorage’ (with an initial space) and you attempt to filter on that value, no rows will be returned.


For this reason, I gave up on the initial space and wound up instead using:


    - dimension: city_with_minor_collapsed
sql: |
if(${TABLE}.c in ('Anchorage','Fairbanks','Juneau')
,${TABLE}.c
,'Other'
)

This no longer alphabetizes ‘Other’ last; if that’s important to you, see @lindsey’s reply above.

Userlevel 5
Badge

Our filters and drills trim extra spaces for usability, since often when filter values are written with commas, a space is added after the comma that shouldn’t be in the filter value.


In order to custom sort on this field while also being able to use normal filters and drills, I would suggest either using sql_case or checking out the order_by_field parameter. This enables you to sort a field by the order of another field. So in this case, you could hide the field with spaces, and create another field without spaces that is ordered by the field with spaces.


Alternatively, selecting matches(advanced) when filtering will let you filter on values with leading spaces by surrounding the string with double quotes.

Reply