Announcing: LookML Generation for Python

  • 25 April 2017
  • 9 replies
  • 549 views

We’re announcing a new project that we’ve just open sourced: LookML-Gen for Python


##Overview


If you have dynamic data definitions or if aspects of your system are driven by changing metadata, reporting on it is often cumbersome and not at all user friendly. Entity-Attribute-Value (EAV) models are a classic example of this. (See this Looker discource post for background.)


As adopters of Looker know, LookML provides a powerful way to describe data. The Looker platform processes LookML and allows end users to explore data according to the descriptions and relationships found in LookML files.


LookML-Gen allows Python developers to examine dynamic data definitions then programmatically generate LookML to describe that data. By outputting LookML, end users get the benefit of reporting in a way that makes sense to them, picking dimensions and measures with meaningful names.


Simple Example Code


Install it:


pip install lookml-gen

Use it:


from lookmlgen.view import View
from lookmlgen.field import Dimension, DimensionGroup, Measure
from lookmlgen.base_generator import GeneratorFormatOptions

view_name = 'my_view'
view = View(view_name, sql_table_name='my_table')
view.add_field(Dimension('id', type='number', primary_key=True))
view.add_field(DimensionGroup('created'))
view.add_field(Dimension('name'))
view.add_field(Dimension('quantity', type='number'))
view.add_field(Measure('total_quantity', sql='${TABLE}.quantity', type='sum'))

with open('%s.view.lkml' % view_name, 'w') as file:
view.generate_lookml(file, GeneratorFormatOptions(view_fields_alphabetical=False))

After execution, the file ‘my_view.view.lkml’ will contain:


# STOP! This file was generated by an automated process.
# Any edits you make will be lost the next time it is
# re-generated.
view: my_view {
sql_table_name: my_table ;;

dimension: id {
type: number
primary_key: yes
sql: ${TABLE}.id ;;
}

dimension_group: created {
type: time
timeframes: ["time", "date", "week", "month"]
datatype: datetime
sql: ${TABLE}.created ;;
}

dimension: name {
sql: ${TABLE}.name ;;
}

dimension: quantity {
type: number
sql: ${TABLE}.quantity ;;
}

measure: total_quantity {
type: sum
sql: ${TABLE}.quantity ;;
}
}

##Real-World Use


Generating LookML for actual dynamic data would examine definitions found in the database and create fields in a view, roughly like this:


from lib.lookml_gen.view import View, DerivedTable
from lib.lookml_gen.field import Dimension, DimensionGroup, Measure
# imports for database access not shown...

view_name = 'my_dynamic_view'
view = View(name=view_name, label='My Dynamic View')

view.add_field(Dimension(name='entity_id', type='number', primary_key=True))

# Query for data definitions that are relevant to this view
query_results = db_connection.query('SELECT id, name FROM attribute_definition') #' Comment to fix markdown formatting

pdt_sql_lines = ['SELECT attribute_value.entity_id,']
for row in query_results:
pdt_sql_lines.append('\n MAX(IF(attribute_definition.id = {0}, attribute_value.value, NULL)) AS `{1}`'.format(row['id'], get_column_name(row['name']))
view.add_field(Dimension(row['name']))

pdt_sql_lines.append(
'\nFROM attribute_definition'
'\nLEFT JOIN attribute_value ON attribute_definition.id = attribute_value.attribute_definition_id'
'\nGROUP BY attribute_value.entity_id') #' Comment to fix markdown formatting

pdt_sql = ''.join([line for line in pdt_sql_lines])

dt = DerivedTable(pdt_sql, sql_trigger_value="SELECT DATE(CONVERT_TZ(NOW(), 'UTC', 'US/Pacific'))",
indexes=['entity_id'])
view.set_derived_table(dt)

with open('%s.view.lkml' % view_name, 'w') as file:
view.generate_lookml(file, GeneratorFormatOptions(newline_between_items=False, omit_default_field_type=False))

This would generate a LookML View for a Persistent Derived Table (PDT) that lays out EAV data in a one-entity per row fashion, with column names that match the attribute names, like this:


# STOP! This file was generated by an automated process. Any edits you make
# will be lost the next time it is re-generated.
view: my_dynamic_view {
label: "My Dynamic View"

derived_table: {
sql:
SELECT attribute_value.entity_id,
MAX(IF(attribute_definition.id = 1000000075, attribute_value.value, NULL)) AS `business_region`,
MAX(IF(attribute_definition.id = 1000000076, attribute_value.value, NULL)) AS `follow_up_requested`
FROM attribute_definition
LEFT JOIN attribute_value ON attribute_value ON attribute_definition.id = attribute_value.attribute_definition_id
GROUP BY attribute_value.entity_id ;;
sql_trigger_value: SELECT DATE(CONVERT_TZ(NOW(), 'UTC', 'US/Pacific')) ;;
indexes: ["entity_id"]
}

dimension: entity_id {
primary_key: yes
hidden: yes
type: number
sql: ${TABLE}.entity_id ;;
}

dimension: business_region {
label: "Business Region"
sql: ${TABLE}.business_region ;;
}

dimension: follow_up_requested {
label: "Follow-up Requested"
sql: ${TABLE}.follow_up_requested ;;
}
}

This example code assumes all values are string types, but it could easily be enhanced to examine a type column from the data definitions and generate LookML ‘type: number’ or dimension_group with ‘type: time’. (The implementation for the get_column_name() function isn’t shown, but it would replace spaces with underscores in attribute names and convert/remove any characters that shouldn’t appear in a column name.)


On top of this, you could write code to automate updating the repo where your models are stored. At SymphonyRM, we have periodic jobs that:



  • Clone a repo containing LookML models

  • Generate LookML views for our dynamic data (using LookML-Gen)

  • Check if the views have changed

  • If so, create a new branch, commit and push the changes, then create a Github pull request


Our team gets an email notification for the pull request, we do a quick review before merging, and then we activate the change in our production Looker instance. At some point, we’ll fully automate this process.


##Status


The early 0.1 version of LookML-Gen is available now. Full LookML support is far from complete: at the moment only basic aspects of Views and Fields are supported and there is no support for Explores yet. However, it does cover much of the common functionality, including Persistent Derived Tables, and we’ll be continuing to enhance it. The code can easily be extended and we’d love to get pull requests to fill out additional functionality.


The open source project is available at: https://github.com/symphonyrm/lookml-gen

Documentation is here: http://lookml-gen.readthedocs.io/


##Coming Soon


We’ll be adding support for generating sections of explores, i.e. joins. (Joins for dynamic data often comprise only part of an explore so the automated generation should process an existing model file and replace only the relevant joins, perhaps indicated by start/end marker comments.)


##Feedback


We’d love to get thoughts and feedback from the Looker community. You can reply to this thread, add issues on the Github project, or email jschmid at symphonyrm dot com


9 replies

Userlevel 3

If you’re looking for a LookML automation toolkit check out this discourse on pyLookml. Has a bunch of convenient methods and will continue to get extended. There is additional documentation here

Hi @Joe_Schmid,


Thanks for the detailed article. I am in search different use case now.


I have the data in SQL server and that server is connected to Looker. I have to create around 20 models and I have to join all model into a dashboard programatically. This job would run evevryday.


So, Just checking if you can share any working example to create a model by creating a view. I can automate this process. Also just checking if there any working example to merge different models into a dashboard.


Thanks

Vaibhav

Hi @Joe_Schmid,


I have seen your comments in this discourse community. I am also looking for same kind of working prototype, where we can create a view and then a model using python. I have to create around 7 models programatically and then join all the models into a dashboard and then schedule it… I was just looking for some working example . Just checking if you can help me with this ?

Userlevel 4

This is cool! Is there a way to get the LookML as a string object vs a file?

Looker can generate LookML from the DB schema, through the UI. In this code, you explicitly define the schema.


Is there a function to generate the LookML, given a database connection and schema/table name?

@Alexandre_Martins_de thank you for letting me know. Fixed!

Great post!


I believe your first code snippet should read view.generate_lookml instead of v.generate_lookml.

@Mintz thanks for the thoughts! We’re really excited about the project and are looking forward both to enhancing it ourselves and getting contributions from the community. We’d obviously welcome it if any sort of participation from Looker happened to make sense.


That’s a great idea on using Extends to isolate machine-generated Views and Explores from manually created ones. We use Extends extensively (!) for a lot of our manual LookML, but I hadn’t thought of using it for this separation of machine vs manual. That approach would make it really clean.

Userlevel 3

This is fantastic! This is something I’ve been playing with as well, but I love that you guys have actually wrapped it up into a library.


The things I’ve done have been more bespoke projects, mainly translating old-school Census codebooks into LookML. You can see some of the scripts I’ve done here: https://github.com/looker/census_looker


One tip that I figured out early on that you may already have figured out as well: The Extends properties of LookML are fantastically useful with machine-generated code.


Because any changes you make by hand will be wiped out next time you run your scripts, you can’t make edits directly to them. But if instead you treat the machine-generated code as a base and then extend it into other views, you can make changes by hand to those views that won’t get wiped out and are easy to keep up to date.

Reply