[Analytic Block] Peer Comparison: Compare Against ...

shohei · 10-15-2015 12:18 PM

About This Block

Note: you can find an alternative pattern for multi-level peer comparison and “share of wallet” comparison here

Peer comparison analysis shows you how one group, categorized by an individual or set of determinants, is performing against another group. You can take this one step further, and compare that first group against of the rest of the population of users/items. This helps companies get a comprehensive view of an individual item’s performance based on any number of attributes that you collect about that product, user, etc. Peer comparison has helped companies answer many questions, such as:

How is Sales Rep A performing compared to others in his/her geography? Compared to the national average?
How is the average weekly growth of Stock A trending compared to the rest of the portfolio?
What is the customer conversion rate for visitors referred from Google and Facebook compared to all other channels?
What is the most popular or profitable category of item sold by Brand A, compared to every other brand?

In this example, we will demonstrate how you can implement peer comparison, or Rest of the Population, analysis in Looker using a sample eCommerce dataset. In the process, we will uncover two key takeaways:

How average profit dollars vary between item categories across brands
Which item category is the most popular in one brand compared, to all other brands

Ideal Data Types

This Block is best suited for datasets containing individual or groups of populations such as brands, sales representatives, regions, etc. This data may come from Salesforce, web analytics tools (Google Analytics, Segment, etc.), and other transactional data tied to groups that can be singled out for comparison against everybody else.

In terms of KPIs, averages or percent-of-total calculations work best, as these KPIs enable you to look at apples-to-apples comparisons across groups. On the flip side, totals and cumulative calculations will skew the results towards larger groups, especially for the Rest of the Population category. We will explore both results to show you what kind of insights you should be looking for in peer comparisons.

Expected Output

Peer comparison analysis allows you to see how one individual / group is performing compared to the rest of the population in one or more KPIs.

This sample image compares the number of Calvin Klein products with the rest of the products offered by this retailer. With this info in hand, a brand manager may decide to focus on marketing campaigns to increase sales for Calvin Klein by focusing on the items with the highest product count.

Try it yourself!

How It’s Done

This is a very simple and easy-to-implement pattern utilizing both filter-only fields and Liquid parameters to specify individual groups to compare with the rest of the population. Part of the beauty of this approach of using Liquid in Looker is that it allows end-users to make the item selection straight from the user interface, which is then safely inserted as SQL into our optimizer.

We will first start with creating a filter-only field in LookML. Filter-only fields are created using the following syntax:

 filter: insert filter field name here {

  suggest_dimension: (insert dimension name where potential values come from)

}

The suggest_dimension parameter enables you to use an existing dimension from which to select your desired individual / group. While the use of suggest_dimension is optional, it is highly recommended, especially if you have dimensions referring to brand names, sales representative names, region / state names, etc., to ensure users select the correct values rather than mistyping them using free form text.

For our example, we’ll want to define a filter-only field to select specific brand names as described below:

 filter: brand_select {

  suggest_dimension: brand.name

  }

The second field we’ll define is a comparitor dimension. This is where Liquid parameters come into play:

 dimension: brand_comparitor {

    type: string

  sql: 

    CASE

      WHEN {% condition brand_select %} ${brand_name} {% endcondition %}

        THEN ${brand_name}

      ELSE 'Rest of Population'

    END ;;

  }

What’s happening here is the following:

The brand_comparitor dimension reads the values specified in the brand_select filter
For each brand_name entry in the data table, if the brand_name value matches the value in brand_select, that same brand_name is returned.
If the brand_name value does not match what’s in brand_select, it gets grouped into “Rest of Population.”

Now you can select any measure and slice it by this brand_comparitor to see the results grouped by each value captured in brand_select, as well as all other values captured in “Rest of Population”!

With this Looker Analytics Block under your belt, you’re on your way to creating more peer comparison patterns. Some common combinations of filter-only fields and comparitor dimensions include:

Sales reps
Stock and other securities
Product categories
SKUs in inventory management
etc.

Compare to Entire Population

This design can also be used to compare groups to the entire population just by tweaking the LookML logic. Let’s take a look at the following LookML:

**Model File**

explore: products {

  hidden: true



   join: num {

    type: cross

    relationship: one_to_many

    }



}



**View File**

view: num {

  derived_table : {

    sql: SELECT 1 as n UNION SELECT 2 ;;

}

  dimension: n {

    hidden: true

}

Notice the cross join to num containing the values 1 and 2; we’ve just created a fanout in order to obtain two rows per entry in products so we can aggregate data for selected groups using num.n = 2. Though other groups will each have two rows, the one_to_many relationship lets Looker know to be aware of this fanout in order to avoid double-counting, and correctly calculate symmetric aggregates.

Next, we’ll edit the brand_comparitor dimension to reflect the change in business logic:

dimension: brand_comparitor {

type: string

    sql: 

      CASE

        WHEN {% condition brand_select %} ${brand_name} {% endcondition %} AND ${num.n} = 1 

          THEN ${brand_name}

        ELSE 'Total Of Population'

      END ;;

}

This CASE statement runs through the following steps:

For each row, look for brand_name that matches brand_select, and where num.n = 1
If brand_name indeed matches brand_select, that entry gets grouped by brand_name
If the brand_name doesn’t match brand_select or num.n = 2, that entry gets thrown into the Total of Population category

Due to the one_to_many relationship, Looker will use COUNT DISTINCT instead of COUNT to accurately aggregate whatever measure you’re using.

The following is an example of what the results look like:

Tanmmay92

Hi! Let’s say we want to include 5 more brands in the comparitor pivot with “Calvin Klein” along with the rest of the population, how would we go about doing that?

GenDemo

If your ‘Compare to Entire Population’ section, point 4: wouldn’t this be double counting? The case statement will select where the brand_name AND n = 1. But if it is one or the other, or neither, then it will all flow into ‘Total of Population’?

GenDemo

I did an extension on this - use case: we have agents operating in different states. We want to be able to compare an agent to the state they are in, not compare them to the national figure.

So after surprisingly little trial and error, I got this working - please let me know if there is a more elegant way to do it.

# Filters for Peer Group Comparison --------------------------------------------------------------------------------------
  filter: agent_select {
    suggest_dimension: latest_primary_agent_name
  }

  dimension: agent_select_state {
    type: string
    hidden: yes
    sql: 
      (select address_state
       from (
        select distinct ${TABLE}.address_state, count(*)
        from dw.${TABLE}
        where ({% condition agent_select %} ${primary_agent_name} {% endcondition %})
        group by ${TABLE}.address_state
        order by count(*) desc
        limit 1
        ))
        ;;
  }


  dimension: agent_peer_state_group {
    group_label: "Peer Groups"
    type: string
    sql: case
            when {% condition agent_select %} ${primary_agent_name} {% endcondition %} 
              and ${address_state} = ${agent_select_state}
              then ${primary_agent_name}
            when ${address_state} = ${agent_select_state}
              then 'State Peer Group'
           else 'Rest of the Population'
          end
        ;;
  }
# ---------------------------------------------------------------------------------------

Then in the Looker tiles, I just filter out the ‘Rest of the Population’.