Question

Organizing your LookML into layers with our new refinement syntax

  • 29 April 2020
  • 9 replies
  • 2597 views

Userlevel 6
Badge

With the latest Looker 7.6 release, we are introducing the concept of LookML refinements. In this article, I’ll briefly cover how they work, and then illustrate how they can be used to organize or “layer” your LookML code so that related code stays together.


Refinements at a Glance


Refinements are similar to extends in that they allow you to make changes to a base LookML object, but they make that change “in-place”, overwriting the already existing object, rather than requiring you to create a new name for a new object.


Let’s take a look at a basic example. Notice the plus signs:



view: fact {
sql_table_name: fact ;;
}

view: +fact {
label: "Facts"
dimension: date { type:date }
}

view: +fact {
label: "Quantified Facts"
dimension: amount { type:number }
}


These 3 separate declarations effectively get combined, using the same merging rules as with extends, and are equivalent to:


view: fact {
sql_table_name: fact ;;
label: "Quantified Facts"
dimension: date { type:date }
dimension: amount { type:number }
}

This is simultaneously a very small change but also a very big change! Refinements will have a number of uses, like making it easier to author and consume LookML blocks (pre-defined LookML modules) or cross-project imports. But, here we’re focusing on one particularly exciting and broadly-applicable use: helping a team of developers to better organize their LookML project.


LookML Layering


Traditionally, LookML has required you to organize your code according to Looker-centric objects: views, models, etc.


For example, if you wanted to put two related dimensions next to each other so that you could clearly read their shared logic, it was only possible if those dimensions were in the same view.


Now, with refinements, we have a lot more flexibility. We can define our views in one place, and then in another place add dimensions, even if they span multiple existing views.


In practice, this means organizing our code into a number of “layers”:


image




As a quick aside - curious about that `.layer.lkml` file type? (Click to expand)

Within the .lkml file types, other than changing the icon used in the field picker, only the .model.lkml file type has a concrete function. Any other .lkml file type you may see, including .view.lkml is purely for organization.


We chose .layer.lkml to represent the layering of LookML functionality via potentially multiple layers of refinement. However, there is no magic associated with this file type - it would work with any other .lkml type! That said, we recommend using it.



Now, when applying the concept of layers, there are lots of perfectly valid file naming conventions and folder structures, so let’s dive in to the “why” of each layer so you can apply the concept in a way that makes sense for your project.


The model file (or files)


Looker will always require a model file to know where to start from. The file needn’t contain much beyond a series of includes to the remaining files.




If you have multiple models files that should share code... (Click to expand)

Whether due to working with multiple customers, multiple departments, or multiple databases, some Looker instances will want to share some LookML across multiple models.


There’s an easy way to do this, and although not a new concept, I mention it here since it lines up well with the philosophy of layering and makes great use of refinements. You just move the declarations that would have normally gone directly in your model into a layer file and include it into each model.


For example:



# customer_a.model.lkml

include: "all_models.layer.lkml"

+view: facts { sql_table_name: customer_a.facts ;; }



# customer_b.model.lkml

include: "all_models.layer.lkml"

+view: facts { sql_table_name: customer_b.facts ;; }



# _all_models.layer.lkml

include: "_base.layer.lkml"

include: "_basic.layer.lkml"

include: "..."



The “base” or “raw” layer


This layer is dedicated to holding any machine-generated LookML. This way, when changes happen in the database schema, you can simply re-run the LookML generator to get all of these updates, without affecting any hand-written changes you will have put into subsequent layers.


Although the layer could be composed of many files, we think it’s quite convenient for all the machine-generated code to be in a single file. To better enable this workflow, we are working on the ability to have our LookML generator output its LookML into a single file instead of into one file per view.


File naming conventions may vary, but I recommend naming it with a leading underscore so it sorts before other layers in the field picker.


The “basic” or “standard” layer


In this layer, we use refinements to enrich our generated LookML with any standard declarations that follow the structure from the generated schema. The idea here is to only include things that are more closely related to the structure of the data than they are to specific business logic or use cases. Things like primary key declarations, hiding non-business fields, basic labeling & descriptions, etc. Even basic explores with many_to_one joins on foreign keys can live here.


As far as one vs multiple files, both are valid. You could maintain one monolithic file, you could separate them out into multiple files by “schema” or “dataset”, or you could keep to the traditional one-file-per-view approach. If you’re going to use more than one file, a folder can help keep these together.


Logical layers


Here is where the magic happens. We can have a number of layers, each containing related business logic.


Since this concept is pretty new, an example would probably go a long way!


Let’s say we wanted to define profit for our orders as price minus cost. Since this dimension does not directly follow from the database schema, I would break it out from the “standard” layer, and add it to a separate layer, with any related logic. And, we could even define related logic across multiple views in one file.



# Profit Logic

view: +orders {
dimension: profit { type:number sql: ${price} - ${cost} ;; }
measure: total_profit { type:sum sql: ${profit} ;; }
}

view: user_profit {
derived_table:{
explore_source: orders {
column: user_id {field: orders.user_id}
column: user_profit {field: orders.total_profit}
}
}
dimension: user_id { primary_key: yes hidden: yes }
dimension: user_profit { hidden: yes }
}

explore: +users {
join: user_profit {
view_label: "Users"
sql_on: ${user_profit.user_id} = ${users.id} ;;
relationship: one_to_one
}
}




See & compare a full example project... (Click to expand)

Running with the same users and orders schema, let’s see how an entire project might be organized under a traditional approach, and under layering.


Note in particular how the profit and customer concerns are separated, but the multiple objects adapted for each concern are grouped.


For brevity, the LookML is written in a more compact style, and a fairly minimal schema is assumed. Also, the machine-generated “base” layer is just an example and may be different depending on which process you used to generate it.


Traditional

# thelook.model.lkml

connection: "..."

include: "users.view.lkml"
include: "orders.view.lkml"
include: "user_summaries.view.lkml"

explore: orders {
join: users {
sql_on: ${users.id} = ${orders.user_id} ;;
relationship: many_to_one
}
join: user_summaries {
sql_on: ${user_summaries.user_id} = ${users.id} ;;
relationship: one_to_one
}
}

explore: users {
join: user_summaries {
sql_on: ${user_summaries.user_id} = ${users.id} ;;
relationship: one_to_one
}
always_filter: {
filters: [user_summaries.is_customer: "Yes"]
}
}

explore: _orders {
from: orders
view_name: orders
hidden: yes # Un-joined version just used for NDTs
}

# users.view.lkml

view: users {
sql_table_name: thelook.users ;;

dimension: id {primary_key: yes }
dimension: created { type:date }
}

# orders.view.lkml

view: orders {
sql_table_name: thelook.orders ;;

dimension: id { primary_key: yes }
dimension: user_id { hidden: yes }
dimension: cost { type:number }
dimension: price { type:number }
dimension: profit { type:number sql:${price} - ${cost};; }
dimension: date { type:date }

measure: count { type:count }
measure: total_revenue { type:sum sql:${price};; }
measure: total_cost { type:sum sql:${cost};; }
measure: total_profit { type:sum sql:${profit};; }
measure: total_margin { type:number sql:${total_profit}/${total_revenue};; }
measure: any_orders { type:yesno sql:MAX(${id} IS NOT NULL);; }
}

# user_summaries.view.lkml

view: user_summaries {
derived_table: {
explore_source: _orders {
column: user_id { field: orders.user_id }
column: is_customer { field: orders.any_orders }
column: user_profit { field: orders.total_profit }
}
}
dimension: user_id { primary_key:yes hidden: yes}
dimension: is_customer { type:yesno }
dimension: user_profit { type:number }
measure: total_user_profit { type:sum sql:${user_profit};;}
measure: any_customer { type:yesno sql:MAX(${is_customer});; }
}

Layered

# thelook.model.lkml

connection: "..."

include: "_basic.layer.lkml"
include: "profit.layer.lkml"
include: "customer.layer.lkml"

# _base.layer.lkml

# Machine-generated. Do not edit by hand.

explore: orders { hidden:yes }
view: orders {
sql_table_name: thelook.orders ;;
dimension: id { type:string}
dimension: user_id { type:string}
dimension: cost { type:number }
dimension: price { type:number }
dimension: date { type:date }
measure: count {type:count}
}

explore: users { hidden:yes }
view: users {
sql_table_name: thelook.users ;;
dimension: id { type:string }
dimension: created { type:date }
}

# _basic.layer.lkml

include: "_base.layer.lkml"

explore: +orders {
hidden:no
join: users {
sql_on: ${users.id} = ${orders.user_id} ;;
relationship: many_to_one
}
}

explore: +users { hidden:no }

view: +orders {
dimension: id { primary_key:yes }
dimension: user_id { hidden:yes }
}

# customer.layer.lkml

include: "_basic.layer.lkml"

view: +orders {
measure: any_orders { type:yesno sql:MAX(${id} IS NOT NULL);; }
}

view: user_is_customer {
derived_table: {
explore_source: orders {
column: user_id { field: orders.user_id }
column: is_customer { field: orders.any_orders }
}
}
dimension: user_id { primary_key:yes hidden: yes}
dimension: is_customer { type:yesno }
measure: any_customer { type:yesno sql:MAX(${is_customer});; }
}

explore: +users {
join: user_is_customer {
view_label: "Users"
sql_on: ${user_is_customer.user_id} = ${users.id} ;;
relationship: one_to_one
}
always_filter: {
filters: [user_is_customer.is_customer: "Yes"]
}
}

explore: +orders {
join: user_is_customer {
view_label: "Users"
sql_on: ${user_is_customer.user_id} = ${orders.user_id} ;;
relationship: one_to_one
}
}

# profit.layer.lkml

include: "_basic.layer.lkml"

view: +orders {
dimension: profit { type:number sql:${price}-${cost};; }
measure: total_revenue { type:sum sql:${price};; }
measure: total_cost { type:sum sql:${cost};; }
measure: total_profit { type:sum sql:${profit};; }
measure: total_margin { type:number sql:${total_profit}/${total_revenue};; }
}

view: user_profit {
derived_table:{
explore_source: orders {
column: user_id {field: orders.user_id}
column: user_profit {field: orders.total_profit}
}
}
dimension: user_id { primary_key: yes hidden: yes }
dimension: user_profit { type: number}
measure: total_user_profit { type:sum sql:${user_profit};; }
}

explore: +users {
join: user_profit {
view_label: "Users"
sql_on: ${user_profit.user_id} = ${users.id};;
relationship: one_to_one
}
}

explore: +orders {
join: user_profit {
view_label: "Users"
sql_on: ${user_profit.user_id} = ${orders.user_id} ;;
relationship: one_to_one
}
}


The Call to Action


By now, you have hopefully realized that this is a fundamental shift, powered by a small but significant new syntax.


As quite a novel approach, not all LookML developer teams will embrace it right away. However, I do see compelling parallels to similar shifts I’ve seen in other frameworks and languages, for example React’s rationale for moving from class-based declarations to hook-based declarations.


In short, I’d encourage everyone to try out layering and see how it can help their project!


So, what do you think of this approach? Will you try it out?


9 replies

Userlevel 2

@fabio: This is huge – can’t believe I’ve missed it till now.


The detailed example is especially helpful, but maybe I’m missing something…why are the view elements in _base.layer.lkml wrapped in hidden (and seemingly redundant) explores? Is that just an automatic aspect of machine-generated views in LookML?

Userlevel 6
Badge

Hey @LStanevich!


The hidden explores are the result of my personal preference. I think that for troubleshooting purposes and for power users, all views should have a hidden explore by default. (Unless there are specific security requirements to the contrary.) So, I just took to a convention of putting those in the base layer as a matter of course. However, it’s not required for any particular refinements-related reason 🙂 In retrospect, it is maybe a bit of an extraneous detail for this article’s purpose!

Userlevel 6
Badge +1

I used refinements recently. I had a lot of financial metrics added to my aggregation explore and decided to put it in a separate file, in order to have better visibility (the main explore has 100+ fields).


Very happy with how it works! I would be ecstatic if we could bring it to fields as well!

@fabio T

This write-up of the new feature is great! 😃👌 I like how you wrap the example refinements into the more holistic concept of layering.


It’s given me all kinds of crazy ideas to re-shape our LookML architecture … paradigm-shifting 🤪

Userlevel 6
Badge +1

I just realised that if I use view_label in refinement, it doesn’t work.


I split my financial information into a +view_name but I wanted to group all the fields under different section in the explore but the parameter view_label has no impact unless I add it to all the fields.

Userlevel 6
Badge

Hmm, that’s strange. Are you sure it isn’t being overwritten by a join/explore-level view_label?


Also, if you want a set of fields to have a different view_label than other fields in your original view, perhaps you want a field-only view (example in section 1 here) for those instead of a refinement?

Userlevel 6
Badge

Hi @fabio, I wanted to get your thoughts on something or at least let you know a scenario which I find annoying.

Ideally I am wanting to exclude a refinement...let me explain.

I have a model with 2 explores. Each explore makes use of a block (PoP and a TopN block).

The PoP block refines the main view that is used in both explores and it edits the main date dimension to include some specific PoP alterations. Given both explores live in the same model and the include statements for this model pull in the PoP refinements, the explore that doesnt use PoP (the TopN explore) errors since a view it is using is referencing PoP fields which are not joined into the explore.

I know I can split these explores into different models and then not include refinements not but wanted to chat about it. I could also probably extend the view for PoP to make my changes rather than refine which would not affect the TopN. Working through this problem I just thought it would be nice to exclude a specific refinement at the explore level or at least reference a specific refinement rather than all.

Userlevel 6
Badge

Hey @IanT 

it would be nice to exclude a specific refinement at the explore level

 

^ This sounds a lot like the extends mechanism though… I guess it’s including rather than excluding, but the point is that the explore is doing the declaration

Anyway, it seems like the friction you’re experiencing is because the blocks are using refinements within them? And perhaps refinements aren’t the ideal way for a block to be exposing its logic… I haven’t looked at these blocks recently, so I’m not 100% sure on that though. Does that sound right?

Userlevel 6
Badge

Yeh I agree with that comment, we have tailored some blocks and are advertising them internally as plug and play so whilst I wouldn’t implement them like that we need to package them like that for ease of implementation, it’s because of this I want to stay away from multiple models and extension as well as the people we are trying to push this to implement for their own teams don’t have the time and expertise to unpick some of the more complicated bits. I will probably just put the pop stuff into the topn block so they play nice with each other.

Reply