Question

How to have one view for multiple database tables

  • 22 December 2016
  • 13 replies
  • 3872 views

Our database is set up with a different schema for each customer, but the table structures are identical.

For example, we have many tables named customerA.data, customerB.data, … customerZ.data.


Is it possible to access any one of these tables without creating 26 identical views?

So one approach might be a filter in an Explore or Dashboard where I would enter “K” and it would then generate and send the SQL:

“select * from customerK.data”

How would that be set up?


13 replies

Userlevel 2

two ideas:




  1. you could wrap all of these tables into a view with derived_table using union/union all like so, but this will require you to update this definition each time a new customer table is added:


     view: customertables {
    derived_table: {
    sql:
    select customerID, someothercolumn from customerA
    union all
    select customerID, someothercolumn from customerB
    union all
    select customerID, someothercolumn from customerZ
    ;;
    }

    dimension: customerID {
    type: string
    sql: ${TABLE}.customerID ;;
    }


    dimension: someothercolumn {
    type: string
    sql: ${TABLE}.someothercolumn ;;
    }
    }

    explore: customertables {}



  2. some sort of liquid substitution but i havent been able to get this to work personally…would be great if someone from looker with more liquid experience could weigh in here 🙂


     view: customertables2 {
    derived_table: {
    sql:
    #NOTICE THE TABLE NAME HERE
    select customerID, someothercolumn from customer{{customerIDFilter}}
    ;;
    }

    filter: customerIDFilter {
    type: string
    # sql: ${TABLE}.customerID ;;
    }

    dimension: customerID {
    type: string
    sql: ${TABLE}.customerID ;;
    }


    dimension: someothercolumn {
    type: string
    sql: ${TABLE}.someothercolumn ;;
    }
    }

    explore: customertables2 {}


Yeah, we have a similar setup/requirement to have all the data separate. I used the extends feature and assigned the sql_table there. Then I used the “from” feature in the model to rename it back. This way I can reuse LookML dashboards. If your interested in this approach I can write a more comprehensive discourse article on it. It’s been in production for about 8 months now.

Userlevel 3

The implementation here depends on whether this is an external use case (utilizing SSO Embed) or an internal one (for dashboard filtering).


For an external case, you would want to use access filters:


{{ _access_filters["customer.schema"] }}

For an internal use case, the solution depends on a number of these:




  1. If we are talking about just a few, then extends is likely the right approach since it will allow you some customization down the road.




  2. if the number of these customers is large, then you need to query the schema dynamically using one of the way @mplooker described. I personally like the UNION approach because you can have end-user friendly customer names this way. However should you decide to use filters, the logic will be:




view: table {
derived_table: {
sql:
select * from customer {% parameter customerIDFilter %}.table
;;
}

filter: customerIDFilter {
type: number
}

Thanks Segah (and others),


I do have thousands of customers, so I like the simplicity of the Filter approach which would require just a single section of code.

As the only user, I can enter the customerID into a filter field in the Explore or the Dashboard and this gets passed into the sql table name. Then I should have easy access to all of the tables from one location.


But I’m new to Looker, and struggling to get your sample code to work.

When I Validate I get the error: “sql_render template processing error: variable not found ‘parameter’”.

Removing the word “parameter” does pass the Validate so I tried that:


view: table1 {
derived_table: {
sql:
select * from customer{{customerIDFilter}}.table1
;;
}

filter: customerIDFilter {
type: string
}

measure: count {
type: count
}
}

I also added to the model a simple explore:


explore: table1 {
always_filter: {
filters: {
field: customerIDFilter
value: "X"
}
}
}

But then when I go to Explore, this filter value is ignored and nothing is passed to the derived_table sql (e.g. it just becomes select * from customer.table1 instead of select * from customerX.table1).

Am I using the filter incorrectly? Do I need to set something else up also?


(side question: is there somewhere in Looker that describes how to use {{ }} like we are doing here?)

Userlevel 4
Badge

@JamesM @segahm - The function of parameters are in alpha in Looker. There is no documentation on them as we plan to expand upon them, and we may possibly completely change the way they work.


The above example will not work with parameters, because we render any value passed to a parameter filter as a string. I’ll correct your LookML from above in my below example for further explanation:


  view: table1 {
derived_table: {
sql:
select * from customer{% parameter customeridfilter %}.table1
;;
}

filter: customeridfilter {
type: string
}

measure: count {
type: count
}
}

The only problem with this is that values from parameters are passed in as strings values. Thus, any table reference will be a broken reference when the sql is written. For example, if we use the value of "X" for the above filter, the sql written will look like,


SELECT *
FROM customer'X'.table1

Which is not a valid table name. As of now the only supported way to apply a different table/schema name per customer is with access_filter_fields as @segahm mentioned above. This is a solid use case for parameters that we’ll consider as we continue to refine how they work in their future development.


Keep in mind the other approach (which may definitely replace access_filter_fields in the future) would be to use user attributes to determine the table name per user.


The method of using access_filter_fields or User attributes is also more secure, as the previous method (if it did work) would give users the opportunity to access any table on the database simply by guessing filter values.

Userlevel 2

This clarification is helpful, @Zam; thank you.


I think the access_filter_field approach mentioned is a solution to a slightly different problem, one where the actual looker user is the customer whose specific we want to query from (or at least where there is a specific and predefined relationship between the Looker User and the Customer they query set in User Attributes).


Instead, I think the original use case is something more like the following:



Looker User ‘A’ wants to query the data about Customer 1 from table ‘Customer1’ by entering ‘1’ in customeridfilter… BUT




Looker User ‘A’ also wants to be able to query the data about Customer 2 from table ‘Customer2’ by entering ‘2’ in customeridfilter.



Seems like there could be a function similar to parameters but instead of substituting with a quoted string can just be the raw text (ie no quotes).

Userlevel 4
Badge

@mplooker - Thanks for the clarification on the use case. As we build out the functionality further we’ll keep in mind the ability to have a user defined table statement. The issue with direct sql injection is that it opens up the ability for an uninformed user to blow out a database.


Your use case is a great one, but we need to make sure we do it correctly to ensure safety for our customers.

Thanks for explaining that Zam.


So are you saying it is not possible in Looker for a single user to access different tables from one view?

As mentioned, we have 1000+ customer schemas, so I’d rather not have 1000+ view definitions, and filtering on a giant UNION ALL gives performance problems.

If the user could only choose from a pre-defined list (such as fixed suggestions in a filter) then that might avoid the security issue…

Userlevel 4
Badge

@JamesM there may be some additional work we could do to get this to work as you’re projecting it to. We’re going to reach out to you personally to get this solved.

Userlevel 1

We’re interested in hearing about different ideas for this, as we are in a similar situation. I understand that Looker has lots of different development patterns with different pros and cons, and we’ve thought about a couple of options and are still considering if the path we’ve chosen is the right one.


In our case, we’re not dealing with 1,000+ database schemas representing different customers, but a handful (3 today, but maybe ~10 in the future) of different database schemas, each schema representing data from a different city, and each of the schemas contain ~100 Fact/Dim tables. Those ~100 tables usually have the same structure, but occasionally differ since updates are rolled out across them on independent schedules.


We’ve gone down the following development / modeling path, sounding similar to what @mplooker suggested and maybe what @Jerome_Myers1 has implemented.




  1. A set of “base” Looker Views which include references to the database schemas using liquid, as {{_model._name}}




  2. One model named base where the explores and join criteria between our base Looker Views are defined




  3. A carefully named model for each city, with model names reflecting the same convention used in the database schemas, so the nyc model relates to the nyc_prod and nyc_mapping database schemas, etc. Each city-specific model extends the base model




Given that setup, every explore can be quickly swapped over to a different city by altering the model name in the URL, and all LookML Dashboards can be viewed across models.


As @segahm alluded to, we can also introduce some city customization, for example create a ‘Central Park’ dimension in our nyc model, but make sure that dimension doesn’t show up in the boston model, by introducing city-specific views which extend the base views. We have done some of that as well, but are mostly taking advantage of that ability so we can persist PDTs on different schedules (objective: keep them as fresh as possible by triggering them based on successful data replication tasks which are city-specific).


I see the advantage in terms of reduced Looker model complexity going the union-all-data-into-one-table route, but my untested hypothesis is that expressing the union statements as a non-persisted derived table would result in performance issues, and trying to persist the data at the same interval as we currently do (once an hour) would overload the database. That leads me to believe we would need to accomplish this outside of Looker by re-jiggering our ETL process, which seems like a pretty major overhaul.

To conclude the original thread, I talked further with Looker privately in January.

But the final outcome was that Looker cannot support accessing multiple ‘identical’ schemas.


There are workarounds, such as extending to each schema manually which it sounds like you’ve done.

But for large companies like mine we don’t want to create 1000 copies of every Dashboard and Look and Explore etc, so we’re having to look for a new solution, which is a shame because Looker works so well on a smaller scale.

James,


@JamesM Good news! We actually can do this for you! Also it’s something we’re particularly good at. So I think there may have been some miscommunication on our part, which I want to remedy straight away. This post details an example for parametrizing connections based on user attributes: RETIRED: Parameterizing Connections with User Attributes.


Additionally, you can accomplish something similar by referencing a user attribute using liquid templating: By referencing the user attribute when you set sql_table_name.

E.g. sql_table_name: {{ _user_attributes['schema'] }}.my_sweet_table.


I’ll reach out over e-mail to walk through this for you guys!

Jeff

@Jeff_Garcia , We are facing the same issue. We have created multiple views files using derived table from a master database Table 1 in the model. Now, we have an another database Table 2 that contains similar columns as the Table 1 in the same model and we have to create the similar views from this table as well. So, can we use the previous views for Table 2 as well that were created for Table 1. Please help. 

Reply