Create an "Other" bucket in a chart

  • 10 July 2015
  • 5 replies
  • 963 views

Userlevel 4

April 2016 Update: You can also add an “Other” bucket [using Table Calculations] (https://discourse.looker.com/t/other-buckets-using-table-calculations/2419)


It is common to have a few customers or products that make up most of a companies sales, and a long list of others making up a small amount of sales. The same principal often applies to errors, brands, or any other categorical data. Exploring data from our (fictional) shoe store, it is pretty easy to see situation in a chart of shoe sales by brand:



As long as it is under 50 rows, Looker will also allow for a pie chart:



Which is very tough to read. But what if we only care about our top brands, we could filter our look to give us only those with a large (over $1 Million) total revenue.



However we lose any understanding of the rest of our sales. We could however show all of our brands and show only the details for the top, lets say 10, brands and bucket the rest together as"Other."


The following view uses a derive table to rank the brands, so that we can bucket an “other” group with the top_10_brand_name dimension. [This use case only has a date and brand requirements. In order to add other dimensions you would need a more complicated query]


 - view: top_brand
derived_table:
sql: |
SELECT brand,
RANK () OVER
(
ORDER BY sum(sales_revenue) DESC
) rank
FROM order_details
WHERE {% condition order_details.date %} order_details.date {% endcondition %}
GROUP BY 1

fields:

- dimension: brand_name
primary_key: true
sql: ${TABLE}.brand_name

- dimension: brand_rank
sql: ${TABLE}.rank

- measure: min_rank
type: min
sql: ${brand_rank}
hidden: true

- dimension: top_10_brand
type: yesno
sql: ${TABLE}.rank < 10


- dimension: top_10_brand_name
sql: CASE WHEN ${top_10_brand} = 'Yes' then ${brand_name} else 'Other' end
order_by_field: min_rank

This can be joined in (or extended) to your existing orders explore (this case through the ‘items’ table:


- explore: order_details
joins:
- join: top_brands
sql_on: ${order_details.brand} = ${top_brands.brand_name}

Here is the output in explore:



Or as a column chart:



Add a quick table calculation and make this a Pareto chart: (More on these coming soon)


running_total(${order_details.total_revenue})/sum(${order_details.total_revenue})


value_format in this graph is [<=1]#%;$#,“M”


5 replies

How do you actually add the pareto chart series and include in the graph? Thanks!

Userlevel 4

Jason,



That is just a simple running_total table calculation and divide that by the sum of the whole column:





You can use percent_0 as the format or a custom format as shown in the image.



The calculation is here:



running_total(${order_details.total_revenue})/sum(${order_details.total_revenue})

Userlevel 1

Don’t want to create a derived table for brand ranking? Consider creating a dimension which uses a subquery: Capturing “top” entities/values, using a dimension

Userlevel 3


I didn’t know you could do that, very nice!

Not sure if anyone monitors these threads anymore but it seems like Zev’s post (https://discourse.looker.com/t/other-buckets-using-table-calculations/2419never got transferred to the new Google Community.  I’d love to see the table calc version of this.

 

 

Reply