Best practice: writing views in BigQuery vs Looker

  • 30 August 2022
  • 1 reply

We have a daily data feed to BigQuery for most of our data tables. In the past we used SQL Runner in Looker to query the data from BigQuery tables and then created views in Looker by saving these queries to the project. The views were then added to a datagroup to be refreshed daily as PDTs. 

Since then I found it was much easier to write these views in BigQuery instead. The query editor is a lot more user-friendly and I can write and save sub-queries as separate views under different datasets to be reused later. I kept replicating these queries in SQL Runner to maintain the former practice and keep the Looker model in a consistent format, however I wonder if it would be easier altogether to simply read the final view created in BigQuery with SQL Runner and then turn it into PDT.

So essentially I want to write all my views in BigQuery and then import a final view I want to visualize into Looker by querying it with SQL Runner as SELECT * FROM BigQuery_View, and saving it to the model from there, with auto-generated LookML.

My question is, are there any computational/procedural/organizational advantages of writing these views in Looker vs BigQuery? As a side note, most of our model is written in SQL, we only use LookML to create aggregate measures from the dimensions generated by SQL. 


Best answer by Dawid 30 August 2022, 22:17

View original

This topic has been closed for comments

1 reply

Userlevel 7
Badge +1

I don’t see any advantages. For Looker to give us ability to write PDTs and build views in SQL is great, as sometimes it comes handy. In my case it’s useful for proof of concepts or aggregation over different periods of time (instead of keeping daily agg, monthly agg, etc.)


But the SQL is in no way checked by Looker. It will only be checked when the query is run not at the point of committing a change to a repository. 


That’s the practice I follow. If I can create views/models in the Data Model (be it BQ views, dbt, or any other model) then I do it there unless having the SQL/PDT/NDT gives me something like in case of my afrementioned aggregations - I don’t have to keep multiple models (I can use Liquid to count distinct Ids on the fly for the timeframe that has been requested).