Using parameters to find the top N items versus th...

jax1 · ‎07-18-2022

Looker will not be updating this content, nor guarantees that everything is up-to-date.

The Looker parameter field lets you inject a value directly into the SQL.

Various Looker pages describe how to use this feature:

The Templated filters and parameters documentation page, which discusses creating the field and applying the user input
The parameter reference page for this field type and its allowed_value subparameter
The Field parameters page, which includes the parameter field type and shows which LookML parameters can be subparameters of the parameter field
The Liquid variable reference documentation page, which mentions the {% parameter %} Liquid variable
The Dimension, filter, and parameter types documentation page, which indicates the LookML type parameter values that can be used with parameter fields

In this article, we discuss how to use parameter to let users find the top N items versus the rest of the population in Explores.

The problem

Users often want to find out the top n performing products and compare those with other products.

Without a parameter, you would have to hard code certain buckets (for example, top 5, 10, 20, and so on). However, with a parameter, we can let the user determine n, and we can label the remainder as Other.

The solution

In the context of an e-commerce store, where we want to find out how well the top n products are doing, we would start with a derived table where we rank each product based on the number of sales:

view: brand_rankings {
  derived_table: {
    sql:
    SELECT brand, count(*) as count, RANK() OVER(ORDER BY COUNT(*) DESC) as rank
    FROM order_items LEFT JOIN inventory_items on order_items.inventory_item_id=inventory_items.id
    LEFT JOIN products ON inventory_items.product_id = products.id
    GROUP BY 1 ;;
  }

  dimension: brand {
    type: string
    primary_key: yes
    sql: ${TABLE}.brand ;;
  }

  dimension: rank_raw {
    type: number
    sql: ${TABLE}.rank ;;
  }
}

Using a parameter, we let users specify how many of the top brands they're interested in:

parameter: max_brands {
  type: number
}

By referencing the parameter inside the sql parameter of our rank dimension, we can then create the Other bucket on the fly:

dimension: rank {
  type: string
  sql: CASE WHEN ${rank_raw} <= {% parameter max_brands %} THEN RIGHT('00' + CAST(${rank_raw} AS VARCHAR), 2) ELSE 'Other' END;;
}

dimension: rank_and_brand {
  type: string
  sql: CASE WHEN ${rank} = 'Other' THEN 'Other' ELSE ${rank} || '-' || ${brand} END;;
}

This results in the user's top n items having their own row and the rest being grouped together as Other:

Additionally, we can create a ranking of countries we have sold to, set that ranking based on a metric not shown in the visualization, and select a date range different from what is shown in the results.

To do this, create a view like this:

view: top_5_countries {
  label: "Countries Ranking"
  derived_table: {
    sql:
      select
        country_id as country_code,
        {% parameter country_name_criteria %} as country_rank
      from
      (
        select country_code,
          rank() over(order by count(*) desc) as visitorCount,
          rank() over(order by sum(orders.gross_revenue) desc) as totalGrossRevenue,
          rank() over(order by avg(orders.gross_revenue) desc) as averageGrossRevenue
        FROM orders
        WHERE
          {% condition rank_date_range %}created_at {% endcondition %}
        group by country_code
      ) AS country_summary ;;
    }

  filter: rank_date_range {
    type: date
    description: "Select a range within which you are ranking the ordering of countries by metric selection. E.g. the rank of countries who had the top 10 highest revenue in May"
  }

  dimension: country_code {
    primary_key: yes
    hidden: yes
    type: string
    sql: ${TABLE}.country_code ;;
  }

  parameter: country_name_criteria {
    label: "Ranking Criteria"
    description: "Specify which metric to order the ranking by"
    type: unquoted
    default_value: "totalGrossRevenue"
    allowed_value: {
      label: "Total Gross Revenue"
      value: "totalGrossRevenue"
    }
    allowed_value: {
      label: "Average Gross Revenue"
      value: "averageGrossRevenue"
    }
    allowed_value: {
      label: "Visitor Count"
      value: "visitorCount"
    }
  }

  #### This parameter will allow a user to select a Top N ranking limit for bucketing the countries, almost like parameterizing the Row Limit in the UI
  parameter: country_rank_limit {
    label: "Rank Limit"
    description: "Specify the cutoff for overall rank"
    type: unquoted
    default_value: "5"
    allowed_value: {
      label: "Top 5"
      value: "5"
    }
    allowed_value: {
      label: "Top 10"
      value: "10"
    }
    allowed_value: {
      label: "Top 20"
      value: "20"
    }
    allowed_value: {
      label: "Top 50"
      value: "50"
    }
  }

  dimension: country_rank_top_N {
    hidden: yes
    description: "Rank within the range selected and list of countries based on metric selected. Useful for sorting visualization based on ranking."
    label_from_parameter: country_name_criteria
    label: "Country Code"
    type: number
    sql: case when ${TABLE}.country_rank<={% parameter country_rank_limit %} then ${TABLE}.country_rank else null end ;;
  }

  dimension: country_name_top_N {
    description: "Name of the Country within the rank selection."
    label: "Country Name (Top N)"
    type: string
    sql: case when ${TABLE}.country_rank<={% parameter country_rank_limit %} then ${TABLE}.country_code else 'other' end ;;
  }
}

And then join it to the base Explore:

explore: orders {
  join: top_5_countries {
    type: inner
    sql_on: ${top_5_countries.country_name_top_N} = ${orders.country_code} ;;
    relationship: many_to_one
  }
}

This allows us to answer such edge-case questions as "Show me the top 10 countries and their Gross Revenue for the last 7 days, based on how many unique visitors those countries had in the 7 days prior, and display them in order of that visitor count."

Using parameters to find the top N items versus the rest of the population

The problem

The solution