When you have tables and table schemas that are constantly in flux, but still want your LookML models to be kept up to date with the changing schemas, this article is for you. The following are basic instructions on how to deploy an AWS Lambda function and set up your Looker instance to poll for and initiate changes to your LookML model through the Lambda function.
With no modifications and access to the necessary components listed below in Requirements, setup takes less than 15 minutes. Then your LookML will always be kept in sync with your frequently changing tables. 💥
This is intended to be a basic generation of LookML. The package is open-sourced so you can modify it as needed, and instructions for doing so are in development. Modifications may include more advanced LookML generation logic or deployment on other serverless platforms like Google Cloud Functions, Azure Functions, or IBM OpenWhisk.
This is in active development, please submit any issues or requests on the project’s GitHub.
Create the GenerateLookML Lambda function from the AWS Serverless Application Repository
Click Here to Create from the AWS Serverless Application Repository
Deploy the Lambda function through the AWS Console. This will create a CloudFormation Stack with the following:
Lambda Function
Environment Variable in the Lambda function called project_config
An API endpoint allowing a POST Method
An IAM Role to access the necessary resources
You can check which resources were created and what permissions were granted on the CloudFormation page for the stack.
After it's deployed, open up the GenerateLookML Lambda function in the console. We will need the API endpoint to put into Looker, but first we need to modify the environment variable.
project_config
This section has details on the necessary values to update the project_config environment variable. Example of the deployed default is here and a full example is here.
In the console for the function, you can find the environment variable here:
Found in your Looker Admin Panel in Admin > General > Settings. This is the top-level key in the project_config.
If you have multiple Looker instances, you can put in each one as a key.
This function supports many projects on the same instance by providing an array of project metadata.
This is the name of the project as seen in the Develop dropdown.
The GitHub repository that is attached to the project_id, in the format organization/repository. If you need to find out what it is, you can find it in the project settings Remote Origin URL. For example: https://customer.looker.com/projects/<project_id>/edit
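If your Remote Origin URL is in SSH or HTTPS form, a small helper can derive the organization/repository slug from it. This is a hypothetical convenience sketch (the function name and error handling are assumptions, not part of the package):

```python
import re

def repo_slug(remote_origin_url):
    """Extract "organization/repository" from a git remote origin URL.

    Handles both SSH (git@github.com:org/repo.git) and HTTPS
    (https://github.com/org/repo.git) forms.
    """
    match = re.search(r"github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?$", remote_origin_url)
    if not match:
        raise ValueError(f"Unrecognized remote origin URL: {remote_origin_url}")
    return f"{match.group(1)}/{match.group(2)}"
```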
You can have Looker validate the headers of incoming deploy webhooks by providing the project metadata with a value for X-Looker-Deploy-Secret. If you have it enabled in your project settings (https://customer.looker.com/projects/<project_id>/edit), place its value in this key.
The Lambda function communicates with GitHub's API, and a Personal API Token is required to make modifications to the repository. Details on obtaining an API token are here. The user must have the ability to commit and push to master on the repository.
Using the information collected above, update the project_config environment variable; an example of a completed config is here.
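Pulling the pieces above together, a completed project_config might look like the sketch below. The exact key names (such as github_token) are assumptions here; verify them against the linked example config before deploying.

```json
{
  "customer.looker.com": {
    "github_token": "<personal-api-token>",
    "projects": [
      {
        "project_id": "thelook",
        "repository": "organization/repository",
        "X-Looker-Deploy-Secret": "<deploy-secret>"
      }
    ]
  }
}
```

The top-level key is your Looker host; add one key per instance if you run several, and one entry in the projects array per LookML project.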
Now that everything is set up on the AWS side, we can focus on the Looker side. First we need to create an Explore for querying the information schema; then we need to schedule Looker to run it periodically and fire off the webhook.
Place generate_lookml.model.lkml in its own project or any currently deployed project. Then deploy the new model to production.
Note: Different dialects have different datatypes, so please validate dimension: type_convert {...} (link to GitHub here) and make any changes necessary to map your database's datatypes to Looker datatypes.
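The kind of mapping type_convert performs can be sketched in Python. The entries below are illustrative, not the package's actual mapping; your dialect's datatype names will differ:

```python
# Hypothetical mapping from common database column types to Looker
# dimension types; the real mapping lives in the type_convert dimension
# in the LookML model and varies by SQL dialect.
LOOKER_TYPE_MAP = {
    "varchar": "string",
    "text": "string",
    "integer": "number",
    "bigint": "number",
    "numeric": "number",
    "double precision": "number",
    "boolean": "yesno",
    "date": "date",
    "timestamp": "time",
}

def looker_type(db_type):
    """Return the Looker dimension type for a database column type,
    falling back to string for anything unmapped."""
    return LOOKER_TYPE_MAP.get(db_type.lower(), "string")
```

Unmapped types fall back to string here, which is a safe default since any column can be rendered as a string dimension.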
You will need to tell Looker that this model is allowed to query a connection. To do that, add a Model Configuration.
Once you’ve deployed to production and added a model configuration, you can explore the dataset by going to this link. https://<your host>/explore/generate_lookml/schema_table_search?fields=schema_table_search.required*
Enter your schema and table into the Explore filters and validate the results. You can then Save and Schedule the Explore from the gear icon in the top right; give it a Title and put it in a Space.
The Looker scheduler modal will open, and there are a few changes you will need to make.
Change the delivery method to Webhook.
In the Webhook URL, you are going to use the API Endpoint provided in AWS found within the function.
Add query parameters to your webhook URL; the Lambda function requires two - project_id and filename. Your Webhook URL will look like this:
https://<your API slug>.amazonaws.com/Prod/schema_table_search?project_id=thelook&filename=users_autogen
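The URL above is just the API endpoint plus two query parameters. A small helper can build it safely (the function name is illustrative, and URL-encoding matters if your project or file names ever contain special characters):

```python
from urllib.parse import urlencode

def webhook_url(api_base, project_id, filename):
    """Build the scheduler Webhook URL from the API endpoint base plus
    the two query parameters the Lambda function expects."""
    params = urlencode({"project_id": project_id, "filename": filename})
    return f"{api_base}/schema_table_search?{params}"
```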
Change Format Data As in the scheduler to JSON - Simple.
Set up your trigger schedule; you can have Looker poll every five minutes if you wish.
Open Advanced Options, change Send this schedule if to there are results, and check the box and results changed since last run. This will have Looker send the webhook only if the results of the SQL query have changed.
Send a Test before saving. You can check Admin > Scheduled Jobs for success or failure messages. You can also navigate to the project and check for the new .view.lkml file in your LookML.
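If the test fails, it can help to reproduce the webhook call outside Looker. The sketch below builds (but does not send) the kind of POST the scheduler issues; the payload shape shown is an assumption for illustration, not the exact JSON - Simple structure, so substitute a real payload captured from a test run:

```python
import json
import urllib.request

def build_test_request(url, payload):
    """Build a POST request with a JSON body, mimicking a scheduler
    webhook delivery. Send it with urllib.request.urlopen(req)."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```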
Example Scheduler:
You can now duplicate this schedule in Looker's UI for as many table/file combinations as you want to keep updated. Change the schema/table name in the filters and the filename/project_id in the webhook URL for each new table/file pair.
Now that you have a LookML file that’s being auto-generated, you will want to make use of it. You will more than likely want to add dimensions and measures, or add properties to auto-generated LookML dimensions. In either case you will make use of Looker’s ability to extend view files. You can check out examples of extended views here.