With the latest Looker 7.6 release, we are introducing the concept of LookML refinements. In this article, Iโll briefly cover how they work, and then illustrate how they can be used to organize or โlayerโ your LookML code so that related code stays together.
Refinements are similar to extends in that they allow you to make changes to a base LookML object, but they make that change โin-placeโ, overwriting the already existing object, rather than requiring you to create a new name for a new object.
Letโs take a look at a basic example. Notice the plus signs:
view: fact {
sql_table_name: fact ;;
}
view: +fact {
label: "Facts"
dimension: date { type:date }
}
view: +fact {
label: "Quantified Facts"
dimension: amount { type:number }
}
These 3 separate declarations effectively get combined, using the same merging rules as with extends, and are equivalent to:
view: fact {
sql_table_name: fact ;;
label: "Quantified Facts"
dimension: date { type:date }
dimension: amount { type:number }
}
This is simultaneously a very small change but also a very big change! Refinements will have a number of uses, like making it easier to author and consume LookML blocks (pre-defined LookML modules) or cross-project imports. But, here weโre focusing on one particularly exciting and broadly-applicable use: helping a team of developers to better organize their LookML project.
Traditionally, LookML has required you to organize your code according to Looker-centric objects: views, models, etc.
For example, if you wanted to put two related dimensions next to each other so that you could clearly read their shared logic, it was only possible if those dimensions were in the same view.
Now, with refinements, we have a lot more flexibility. We can define our views in one place, and then in another place add dimensions, even if they span multiple existing views.
In practice, this means organizing our code into a number of โlayersโ:
Within the .lkml
file types, other than changing the icon used in the field picker, only the .model.lkml
file type has a concrete function. Any other .lkml
file type you may see, including .view.lkml
is purely for organization.
We chose .layer.lkml
to represent the layering of LookML functionality via potentially multiple layers of refinement. However, there is no magic associated with this file type - it would work with any other .lkml
type! That said, we recommend using it.
Now, when applying the concept of layers, there are lots of perfectly valid file naming conventions and folder structures, so letโs dive in to the โwhyโ of each layer so you can apply the concept in a way that makes sense for your project.
Looker will always require a model file to know where to start from. The file neednโt contain much beyond a series of includes to the remaining files.
Whether due to working with multiple customers, multiple departments, or multiple databases, some Looker instances will want to share some LookML across multiple models.
Thereโs an easy way to do this, and although not a new concept, I mention it here since it lines up well with the philosophy of layering and makes great use of refinements. You just move the declarations that would have normally gone directly in your model into a layer file and include it into each model.
For example:
# customer_a.model.lkml
include: "all_models.layer.lkml"
+view: facts { sql_table_name: customer_a.facts ;; }
# customer_b.model.lkml
include: "all_models.layer.lkml"
+view: facts { sql_table_name: customer_b.facts ;; }
# _all_models.layer.lkml
include: "_base.layer.lkml"
include: "_basic.layer.lkml"
include: "..."
This layer is dedicated to holding any machine-generated LookML. This way, when changes happen in the database schema, you can simply re-run the LookML generator to get all of these updates, without affecting any hand-written changes you will have put into subsequent layers.
Although the layer could be composed of many files, we think itโs quite convenient for all the machine-generated code to be in a single file. To better enable this workflow, we are working on the ability to have our LookML generator output its LookML into a single file instead of into one file per view.
File naming conventions may vary, but I recommend naming it with a leading underscore so it sorts before other layers in the field picker.
In this layer, we use refinements to enrich our generated LookML with any standard declarations that follow the structure from the generated schema. The idea here is to only include things that are more closely related to the structure of the data than they are to specific business logic or use cases. Things like primary key declarations, hiding non-business fields, basic labeling & descriptions, etc. Even basic explores with many_to_one joins on foreign keys can live here.
As far as one vs multiple files, both are valid. You could maintain one monolithic file, you could separate them out into multiple files by โschemaโ or โdatasetโ, or you could keep to the traditional one-file-per-view approach. If youโre going to use more than one file, a folder can help keep these together.
Here is where the magic happens. We can have a number of layers, each containing related business logic.
Since this concept is pretty new, an example would probably go a long way!
Letโs say we wanted to define profit for our orders as price minus cost. Since this dimension does not directly follow from the database schema, I would break it out from the โstandardโ layer, and add it to a separate layer, with any related logic. And, we could even define related logic across multiple views in one file.
# Profit Logic
view: +orders {
dimension: profit { type:number sql: ${price} - ${cost} ;; }
measure: total_profit { type:sum sql: ${profit} ;; }
}
view: user_profit {
derived_table:{
explore_source: orders {
column: user_id {field: orders.user_id}
column: user_profit {field: orders.total_profit}
}
}
dimension: user_id { primary_key: yes hidden: yes }
dimension: user_profit { hidden: yes }
}
explore: +users {
join: user_profit {
view_label: "Users"
sql_on: ${user_profit.user_id} = ${users.id} ;;
relationship: one_to_one
}
}
Running with the same users and orders schema, letโs see how an entire project might be organized under a traditional approach, and under layering.
Note in particular how the profit
and customer
concerns are separated, but the multiple objects adapted for each concern are grouped.
For brevity, the LookML is written in a more compact style, and a fairly minimal schema is assumed. Also, the machine-generated โbaseโ layer is just an example and may be different depending on which process you used to generate it.
# thelook.model.lkml
connection: "..."
include: "users.view.lkml"
include: "orders.view.lkml"
include: "user_summaries.view.lkml"
explore: orders {
join: users {
sql_on: ${users.id} = ${orders.user_id} ;;
relationship: many_to_one
}
join: user_summaries {
sql_on: ${user_summaries.user_id} = ${users.id} ;;
relationship: one_to_one
}
}
explore: users {
join: user_summaries {
sql_on: ${user_summaries.user_id} = ${users.id} ;;
relationship: one_to_one
}
always_filter: {
filters: [user_summaries.is_customer: "Yes"]
}
}
explore: _orders {
from: orders
view_name: orders
hidden: yes # Un-joined version just used for NDTs
}
# users.view.lkml
view: users {
sql_table_name: thelook.users ;;
dimension: id {primary_key: yes }
dimension: created { type:date }
}
# orders.view.lkml
view: orders {
sql_table_name: thelook.orders ;;
dimension: id { primary_key: yes }
dimension: user_id { hidden: yes }
dimension: cost { type:number }
dimension: price { type:number }
dimension: profit { type:number sql:${price} - ${cost};; }
dimension: date { type:date }
measure: count { type:count }
measure: total_revenue { type:sum sql:${price};; }
measure: total_cost { type:sum sql:${cost};; }
measure: total_profit { type:sum sql:${profit};; }
measure: total_margin { type:number sql:${total_profit}/${total_revenue};; }
measure: any_orders { type:yesno sql:MAX(${id} IS NOT NULL);; }
}
# user_summaries.view.lkml
view: user_summaries {
derived_table: {
explore_source: _orders {
column: user_id { field: orders.user_id }
column: is_customer { field: orders.any_orders }
column: user_profit { field: orders.total_profit }
}
}
dimension: user_id { primary_key:yes hidden: yes}
dimension: is_customer { type:yesno }
dimension: user_profit { type:number }
measure: total_user_profit { type:sum sql:${user_profit};;}
measure: any_customer { type:yesno sql:MAX(${is_customer});; }
}
# thelook.model.lkml
connection: "..."
include: "_basic.layer.lkml"
include: "profit.layer.lkml"
include: "customer.layer.lkml"
# _base.layer.lkml
# Machine-generated. Do not edit by hand.
explore: orders { hidden:yes }
view: orders {
sql_table_name: thelook.orders ;;
dimension: id { type:string}
dimension: user_id { type:string}
dimension: cost { type:number }
dimension: price { type:number }
dimension: date { type:date }
measure: count {type:count}
}
explore: users { hidden:yes }
view: users {
sql_table_name: thelook.users ;;
dimension: id { type:string }
dimension: created { type:date }
}
# _basic.layer.lkml
include: "_base.layer.lkml"
explore: +orders {
hidden:no
join: users {
sql_on: ${users.id} = ${orders.user_id} ;;
relationship: many_to_one
}
}
explore: +users { hidden:no }
view: +orders {
dimension: id { primary_key:yes }
dimension: user_id { hidden:yes }
}
# customer.layer.lkml
include: "_basic.layer.lkml"
view: +orders {
measure: any_orders { type:yesno sql:MAX(${id} IS NOT NULL);; }
}
view: user_is_customer {
derived_table: {
explore_source: orders {
column: user_id { field: orders.user_id }
column: is_customer { field: orders.any_orders }
}
}
dimension: user_id { primary_key:yes hidden: yes}
dimension: is_customer { type:yesno }
measure: any_customer { type:yesno sql:MAX(${is_customer});; }
}
explore: +users {
join: user_is_customer {
view_label: "Users"
sql_on: ${user_is_customer.user_id} = ${users.id} ;;
relationship: one_to_one
}
always_filter: {
filters: [user_is_customer.is_customer: "Yes"]
}
}
explore: +orders {
join: user_is_customer {
view_label: "Users"
sql_on: ${user_is_customer.user_id} = ${orders.user_id} ;;
relationship: one_to_one
}
}
# profit.layer.lkml
include: "_basic.layer.lkml"
view: +orders {
dimension: profit { type:number sql:${price}-${cost};; }
measure: total_revenue { type:sum sql:${price};; }
measure: total_cost { type:sum sql:${cost};; }
measure: total_profit { type:sum sql:${profit};; }
measure: total_margin { type:number sql:${total_profit}/${total_revenue};; }
}
view: user_profit {
derived_table:{
explore_source: orders {
column: user_id {field: orders.user_id}
column: user_profit {field: orders.total_profit}
}
}
dimension: user_id { primary_key: yes hidden: yes }
dimension: user_profit { type: number}
measure: total_user_profit { type:sum sql:${user_profit};; }
}
explore: +users {
join: user_profit {
view_label: "Users"
sql_on: ${user_profit.user_id} = ${users.id};;
relationship: one_to_one
}
}
explore: +orders {
join: user_profit {
view_label: "Users"
sql_on: ${user_profit.user_id} = ${orders.user_id} ;;
relationship: one_to_one
}
}
By now, you have hopefully realized that this is a fundamental shift, powered by a small but significant new syntax.
As quite a novel approach, not all LookML developer teams will embrace it right away. However, I do see compelling parallels to similar shifts Iโve seen in other frameworks and languages, for example Reactโs rationale for moving from class-based declarations to hook-based declarations.
In short, Iโd encourage everyone to try out layering and see how it can help their project!
So, what do you think of this approach? Will you try it out?
@fabio1: This is huge โ canโt believe Iโve missed it till now.
The detailed example is especially helpful, but maybe Iโm missing somethingโฆwhy are the view
elements in _base.layer.lkml
wrapped in hidden (and seemingly redundant) explore
s? Is that just an automatic aspect of machine-generated view
s in LookML?
Hey @LStanevich!
The hidden explores are the result of my personal preference. I think that for troubleshooting purposes and for power users, all views should have a hidden explore by default. (Unless there are specific security requirements to the contrary.) So, I just took to a convention of putting those in the base layer as a matter of course. However, itโs not required for any particular refinements-related reason ๐ In retrospect, it is maybe a bit of an extraneous detail for this articleโs purpose!
I used refinements recently. I had a lot of financial metrics added to my aggregation explore and decided to put it in a separate file, in order to have better visibility (the main explore has 100+ fields).
Very happy with how it works! I would be ecstatic if we could bring it to fields as well!
@fabio1 T
This write-up of the new feature is great! ๐๐ I like how you wrap the example refinements into the more holistic concept of layering.
Itโs given me all kinds of crazy ideas to re-shape our LookML architecture โฆ paradigm-shifting ๐คช
I just realised that if I use view_label
in refinement, it doesnโt work.
I split my financial information into a +view_name
but I wanted to group all the fields under different section in the explore but the parameter view_label
has no impact unless I add it to all the fields.
Hmm, thatโs strange. Are you sure it isnโt being overwritten by a join/explore-level view_label?
Also, if you want a set of fields to have a different view_label than other fields in your original view, perhaps you want a field-only view (example in section 1 here) for those instead of a refinement?
Hi @fabio1, I wanted to get your thoughts on something or at least let you know a scenario which I find annoying.
Ideally I am wanting to exclude a refinement...let me explain.
I have a model with 2 explores. Each explore makes use of a block (PoP and a TopN block).
The PoP block refines the main view that is used in both explores and it edits the main date dimension to include some specific PoP alterations. Given both explores live in the same model and the include statements for this model pull in the PoP refinements, the explore that doesnt use PoP (the TopN explore) errors since a view it is using is referencing PoP fields which are not joined into the explore.
I know I can split these explores into different models and then not include refinements not but wanted to chat about it. I could also probably extend the view for PoP to make my changes rather than refine which would not affect the TopN. Working through this problem I just thought it would be nice to exclude a specific refinement at the explore level or at least reference a specific refinement rather than all.
Hey @IanT
it would be nice to exclude a specific refinement at the explore level
^ This sounds a lot like the extends mechanism thoughโฆ I guess itโs including rather than excluding, but the point is that the explore is doing the declaration
Anyway, it seems like the friction youโre experiencing is because the blocks are using refinements within them? And perhaps refinements arenโt the ideal way for a block to be exposing its logicโฆ I havenโt looked at these blocks recently, so Iโm not 100% sure on that though. Does that sound right?
Yeh I agree with that comment, we have tailored some blocks and are advertising them internally as plug and play so whilst I wouldnโt implement them like that we need to package them like that for ease of implementation, itโs because of this I want to stay away from multiple models and extension as well as the people we are trying to push this to implement for their own teams donโt have the time and expertise to unpick some of the more complicated bits. I will probably just put the pop stuff into the topn block so they play nice with each other.
Hey Fabio,
This has all been super helpful. Is there an update on simplifying the base/raw workflow by allowing the LookML generator output to be consolidated in a single file?
Thanks!