Naming Fields for Readability

Brett_Sauve · 04-16-2015 01:26 PM

Dimension and measure names can get confusing, especially as models grow. Making the proper field references while writing LookML, and choosing the correct field in the field picker, requires thoughtful naming. One approach that will help with this is a consistent naming methodology. The following conventions are those often used internally at Looker.

“Count” for Count Measures

type: count measures are called [Filter] Count. For example:

- view: users
  fields:
  - measure: count
    type: count

  - measure: active_count
    type: count
    filters:
      status: 'active'

This will give names in the Looker UI like:

USERS Count
USERS Active Count

These field names read naturally, although business users may not initially know to search for “count” as opposed to “number of” or “total”. However, with some very simple training, this seems to be the best option.

We don’t use “number of” or “total” for counts, because we want to reserve those words for other cases.

“Unique Count” for Count Distinct Measures

type: count_distinct measures are called Unique [Filter] [Entity] Count. For example:

- view: users
  fields:
  - measure: unique_user_count
    type: count_distinct
    sql: ${id}

  - measure: unique_active_user_count
    type: count_distinct
    sql: ${id}
    filters:
      status: 'active'

  - measure: unique_name_count
    type: count_distinct
    sql: ${name}

This will give names in the Looker UI like:

USERS Unique User Count
USERS Unique Active User Count
USERS Unique Name Count

“Total” for Sum Measures

type: sum measures are called Total [Filter] [Entity]. For example:

- view: orders
  fields:
  - measure: total_revenue
    type: sum
    sql: ${TABLE}.revenue

  - measure: total_chicago_revenue
    type: sum
    sql: ${TABLE}.revenue
    filters:
      city: 'Chicago'

This will give names in the Looker UI like:

ORDERS Total Revenue
ORDERS Total Chicago Revenue

“Number of” for Quantity Dimensions

type: number dimensions that represent a quantity are called Number of [Entity]. For example:

- view: orders
  fields:
  - dimension: number_of_items
    type: number
    sql: ${TABLE}.items

This will give names in the Looker UI like:

ORDERS Number of Items

Using the word “count” or “total” in this case can be quite confusing, because you will not know if you are working with a dimension or measure.

“Average” for Average Measures

type: average measures are called Avg [Filter] [Entity]. For example:

- view: orders
  fields:
  - measure: avg_revenue
    type: average
    sql: ${TABLE}.revenue

  - measure: avg_chicago_revenue
    type: average
    sql: ${TABLE}.revenue
    filters:
      city: 'Chicago'

This will give names in the Looker UI like:

ORDERS Avg Revenue
ORDERS Avg Chicago Revenue

“Over” or “Per” For Ratios

Measures and dimensions that are based on ratios between two other fields are called [Numerator] over [Denominator] or [Numerator] per [Denominator]. For example:

- view: orders
  fields:
  - dimension: revenue_per_number_of_items
    type: number
    sql: ${revenue} / ${number_of_items}

  - measure: total_revenue_per_unique_customer
    type: number
    sql: ${total_revenue} / ${unique_customer_count}

This will give names in the Looker UI like:

ORDERS Revenue Per Number Of Items
ORDERS Total Revenue Per Unique Customer

If you were to call the first dimension something like Avg Revenue Per Item it would be less clear that you were using a type: number dimension instead of a type: average measure.

Brett_Sauve

“Is” or “Has” for Yes / No Dimensions

type: yesno dimensions are called Is [Condition] or Has [Condition]. For example:

- view: orders
  fields:
  - dimension: is_sold
    type: yesno
    sql: ${TABLE}.status = 'sold'

This will give names in the Looker UI like:

ORDERS Is Sold

ogkushy00

That's perfect well done.

Dillon_Morrison

Great stuff here Brett! This is a great overview of naming / formatting individual dimensions and measures.

Here’s a few additional best practices around naming explore views and eliminating redundant or unnecessary dimensions and measures:

##Dashboards

Dashboards are accessed through “Spaces”, which are essentially user/department folders within Looker. In Spaces, you can create folders for each of your organization’s departments, then house your dashboards within the most appropriate folder(s). With this set-up, when you wants to navigate to, say, a Revenue Year-over-Year dashboard, they’ll simply need to navigate to your “Revenue” folder, then select the desired dashboard.

There are some circumstances in which someone may not be exposed to Spaces prior to viewing a dashboard, such as downloading the dashboards as a PDF (for view-only users), embedding your dashboards, etc., For additional clarity in these instances, you can name the dashboard according to the folder in which it belongs by prefacing your dashboard titles with the department or use of the dashboard. For example, in your case, “Product”, “Revenue”, and “Marketing” are likely the best groupings. For example, I might label your “Channel Overview” dashboard “Revenue - Channel Overview”, and so on. Please note that this is just an additional layer of clarity on top of the Spaces categorization, and would be redundant to most normal users.

##Model & Explore Views

Hovering over the “Explore” tab will prompt Looker to display all the tables that can be used as starting points to begin the data exploration process. The scroll pane will group these tables according to the “models” they belong to.

In addition to this high level categorization, we can further organize the tables still exposed in the scroll pane. We’ll want to name these views with a title that conveys their intended use. In some cases, it might make sense to group the views by the type of information the view holds, such as “revenue drivers”, “customer information”, “vendor history”, etc… In other cases, it might be easier to group the views by the department that will be using them, such as “marketing”, “product”, “finance”, etc… Think about how your business users will be interacting with the application. How are your internal business users grouped? What sort of questions will they be asking? Use this to guide your naming conventions and organization.

In your case, for example, you might expose your “Sales” table, which holds all information around your sales metrics and locations, and includes all joined dimensions and measures from the associated tables. You would then hide your “Regions”, “Countries”, “Buyer Loyalty”, and any other supplemental or non-core tables from view, since these tables are already available for exploration from your Sales Explore view.

Individual Dimensions and Measures

Dimensions and measures are displayed in a the format of “TABLE field”. For example, the “count” measure from the “User” tables would read as USER count. These names likely make sense to your database analysts and developers, but they probably aren’t as intuitive for business users. For these individual dimensions and measures, I’d recommend that naming the table clause something that would be understandable to business users. For example, rather than displaying USER ORDER FACTS lifetime orders, we can label the dimension USER lifetime orders. This ensures that all dimensions and measures related to users are displayed with a “User” prefix rather than a “user_order_facts” prefix, which wouldn’t carry any meaning for a business user. Some sample code for this is below:

  - view: user_order_facts
    derived_table: ...
    ...
  - dimension: user.lifetime_orders  #The “user.” prefix declares a new display table name
    sql: ${TABLE}.lifetime_orders

In those instances in which you have multiple dimensions that display the same data, best practice is to hide all of the dimensions except the most fitting/intuitive dimension. For example, if you have a column for region name in your “Region”, “Country”, and “Sales” tables, you should hide the COUNTRY region name and the SALES region name dimensions, leaving only the REGION name dimension. This helps reduce redundancy and clutter.

   - view: country
     fields:
 
     - dimension: region_name       
       hidden: true                       #The hidden parameter ensures the dimension 
       sql: ${TABLE}.region_name            #exists, but is not exposed

It also helps to hide any dimensions that won’t be used by any of the business users. Fields like REGION region code, which is a randomly assigned integer, won’t have any significance to a business user, so there probably isn’t much use in exposing it in the Explore window.

Happy organizing!

rtblair

@Dillon_Morrison is this now obviated by the view_label feature within models?

Brett_Sauve

That’s correct @rtblair, using a period in a dimension name is now deprecated (see this post). If you wanted to do this type of thing you would use view_label.

Justin_DICE

This could be a minority opinion, but it would be GREAT to have the option to switch off ‘full field names’ as the default. Admittedly a less sustainable option in terms of model clarity, for business users it’s a lot easier.