Create an "Other" bucket in a chart

April 2016 Update: You can also add an “Other” bucket [using Table Calculations] 

https://community.looker.com/technical-tips-tricks-1021/creating-a-pie-chart-with-over-50-rows-using...

It is common to have a few customers or products that make up most of a companies sales, and a long list of others making up a small amount of sales. The same principal often applies to errors, brands, or any other categorical data. Exploring data from our (fictional) shoe store, it is pretty easy to see situation in a chart of shoe sales by brand:

417bb416a3d894ad4165e4a1ea2c1462ab57f53a.png

As long as it is under 50 rows, Looker will also allow for a pie chart:

a8ef158f73910ac323114f9ac830ab74e26bd208.png

Which is very tough to read. But what if we only care about our top brands, we could filter our look to give us only those with a large (over $1 Million) total revenue.

de73627ec96a8a2f612db84e4046447b5e76e517.png

However we lose any understanding of the rest of our sales. We could however show all of our brands and show only the details for the top, lets say 10, brands and bucket the rest together as"Other."

The following view uses a derive table to rank the brands, so that we can bucket an “other” group with the top_10_brand_name dimension. [This use case only has a date and brand requirements. In order to add other dimensions you would need a more complicated query]

 - view: top_brand

derived_table:

sql: |

SELECT brand,

RANK () OVER

(

ORDER BY sum(sales_revenue) DESC

) rank

FROM order_details

WHERE {% condition order_details.date %} order_details.date {% endcondition %}

GROUP BY 1



fields:



- dimension: brand_name

primary_key: true

sql: ${TABLE}.brand_name



- dimension: brand_rank

sql: ${TABLE}.rank



- measure: min_rank

type: min

sql: ${brand_rank}

hidden: true



- dimension: top_10_brand

type: yesno

sql: ${TABLE}.rank < 10





- dimension: top_10_brand_name

sql: CASE WHEN ${top_10_brand} = 'Yes' then ${brand_name} else 'Other' end

order_by_field: min_rank

This can be joined in (or extended) to your existing orders explore (this case through the ‘items’ table:

- explore: order_details

joins:

- join: top_brands

sql_on: ${order_details.brand} = ${top_brands.brand_name}

Here is the output in explore:

361be28d707a54164e3d364bcfd3181f88b74f30.png

Or as a column chart:

26bae9b9dfb6669b59b503f87dc94ddadaf73ca8.png

Add a quick table calculation and make this a Pareto chart: (More on these coming soon)

running_total(${order_details.total_revenue})/sum(${order_details.total_revenue})

7e225b5d0c66acddacd82b63c38583f5634aca76.png

value_format in this graph is [<=1]#%;$#,“M”

8 6 3,890
6 REPLIES 6

How do you actually add the pareto chart series and include in the graph? Thanks!

Jason,

That is just a simple running_total table calculation and divide that by the sum of the whole column:

49608eed77557fbd7205ad23fa408421f36a18d7.png

You can use percent_0 as the format or a custom format as shown in the image.

The calculation is here:

running_total(${order_details.total_revenue})/sum(${order_details.total_revenue})

Don’t want to create a derived table for brand ranking? Consider creating a dimension which uses a subquery: Capturing “top” entities/values, using a dimension

Rich1000
Participant III

I didn’t know you could do that, very nice!

Not sure if anyone monitors these threads anymore but it seems like Zev’s post (https://discourse.looker.com/t/other-buckets-using-table-calculations/2419never got transferred to the new Google Community.  I’d love to see the table calc version of this.

Measure min_rank references another measure brand_rank (which uses sum). This keeps giving warnings! Measure referencing a measure.

Top Labels in this Space
Top Solution Authors