Create an "Other" bucket in a chart

eFein · 07-10-2015 02:07 PM

April 2016 Update: You can also add an “Other” bucket [using Table Calculations]

https://community.looker.com/technical-tips-tricks-1021/creating-a-pie-chart-with-over-50-rows-using...

It is common to have a few customers or products that make up most of a companies sales, and a long list of others making up a small amount of sales. The same principal often applies to errors, brands, or any other categorical data. Exploring data from our (fictional) shoe store, it is pretty easy to see situation in a chart of shoe sales by brand:

As long as it is under 50 rows, Looker will also allow for a pie chart:

Which is very tough to read. But what if we only care about our top brands, we could filter our look to give us only those with a large (over $1 Million) total revenue.

However we lose any understanding of the rest of our sales. We could however show all of our brands and show only the details for the top, lets say 10, brands and bucket the rest together as"Other."

The following view uses a derive table to rank the brands, so that we can bucket an “other” group with the top_10_brand_name dimension. [This use case only has a date and brand requirements. In order to add other dimensions you would need a more complicated query]

 - view: top_brand

   derived_table:

    sql: |

     SELECT brand,

      RANK () OVER

      (

       ORDER BY sum(sales_revenue) DESC

      ) rank

     FROM order_details

     WHERE {% condition order_details.date %} order_details.date {% endcondition %}

     GROUP BY 1



    fields:



    - dimension: brand_name

      primary_key: true

      sql: ${TABLE}.brand_name



    - dimension: brand_rank

      sql: ${TABLE}.rank



    - measure: min_rank

      type: min

      sql: ${brand_rank}

      hidden: true



    - dimension: top_10_brand

      type: yesno

      sql: ${TABLE}.rank < 10





    - dimension: top_10_brand_name

      sql: CASE WHEN ${top_10_brand} = 'Yes' then ${brand_name} else 'Other' end

      order_by_field: min_rank

This can be joined in (or extended) to your existing orders explore (this case through the ‘items’ table:

- explore: order_details

  joins:

    - join: top_brands

      sql_on: ${order_details.brand} = ${top_brands.brand_name}

Here is the output in explore:

Or as a column chart:

Add a quick table calculation and make this a Pareto chart: (More on these coming soon)

running_total(${order_details.total_revenue})/sum(${order_details.total_revenue})

value_format in this graph is [<=1]#%;$#,“M”

Jason_Shoemaker

How do you actually add the pareto chart series and include in the graph? Thanks!

eFein

Jason,

That is just a simple running_total table calculation and divide that by the sum of the whole column:

You can use percent_0 as the format or a custom format as shown in the image.

The calculation is here:

running_total(${order_details.total_revenue})/sum(${order_details.total_revenue})

anika1

Don’t want to create a derived table for brand ranking? Consider creating a dimension which uses a subquery: Capturing “top” entities/values, using a dimension

Rich1000

I didn’t know you could do that, very nice!

duke_of_data

Not sure if anyone monitors these threads anymore but it seems like Zev’s post (https://discourse.looker.com/t/other-buckets-using-table-calculations/2419) never got transferred to the new Google Community. I’d love to see the table calc version of this.

divya_bennett

Measure min_rank references another measure brand_rank (which uses sum). This keeps giving warnings! Measure referencing a measure.