Question

Looker Self-Hosting Knowledge Share

  • 23 September 2020


Hello wonderful Looker people,


Is there anyone else out there who is self-hosting their Looker instances and willing to share best practices, horror stories, or anything else about the challenges of self-hosting? Remember, this is for posterity.


I’ll kick it off. Right now I’m reviewing our scalability. We repackage Looker into our final product, and we have quadrupled our customer base in the last 2 years. We are starting to run into performance issues. Before I share our current dilemma, it is important to note that I’m a data analyst put in charge of taking care of our Looker nodes, not a proper DevOps person. And Looker staff, feel free to correct me if I get this wrong (please, it would help me learn).


The current struggle I’m working through is getting more CPUs per node. Currently, we have 4 nodes, each with 8 CPUs. It seems that when a customer creates a dashboard with many tiles (25+) and schedules it to be rendered, all of the rendering jobs get assigned to the same node. So we see that node maxed out for 2-15 minutes while the job completes. And I believe that if a user is operating off the same node, their experience is horrific. I’m requesting that we slowly add more CPUs to our nodes, while exploring enforcing a limit on queries per dashboard. Of course, the latter is difficult because there doesn’t seem to be a built-in option.
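
Since there doesn’t seem to be a built-in limit, one starting point is simply auditing which dashboards go over a tile threshold. Below is a minimal sketch in Python, assuming the dashboard IDs and tile counts have already been exported somehow (for example through the Looker API). The function name, the threshold of 25, and the sample data are all hypothetical:

```python
from typing import Dict, List, Tuple


def dashboards_over_limit(
    tile_counts: Dict[str, int], max_tiles: int = 25
) -> List[Tuple[str, int]]:
    """Return (dashboard_id, tile_count) pairs above max_tiles, largest first."""
    over = [(dash_id, n) for dash_id, n in tile_counts.items() if n > max_tiles]
    return sorted(over, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data: dashboard ID -> number of tiles.
    sample = {"sales_overview": 12, "ops_wall": 40, "exec_summary": 26}
    for dash_id, n in dashboards_over_limit(sample):
        print(f"{dash_id}: {n} tiles")
```

This only reports offenders; it does not enforce anything. It could feed a conversation with the dashboard owners before any hard limit is considered.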


Anyone else out there willing to share?


3 replies


I didn’t know this article existed. I thought I’d link it here in case others need it.


@thomas_brittain We are in the same boat as you at my current organization. We are just onboarding with Looker and looking for best practices and tips for a self-hosted Looker installation. Like you, we are a team of analysts, but we are trying to work with our DevOps team to set this up. They are clueless and we are even more clueless!

Documentation seems to be very scarce and leaves a lot of questions unanswered. It all seems to assume that the instance will be Google/Looker hosted, which is bad!

For example, here are some of the questions we have. Is it recommended to set up the MySQL database during initial installation? What would be a typical scenario in which the Hyper-SQL database grows beyond 600MB and performance issues start cropping up: how many concurrent users, or how many dashboards being accessed? Also, what tables does this metadata database contain? The documentation says configurations, users and other data. What is this "other data"? Is this Hyper-SQL/MySQL DB the place to look for monitoring license usage? Is there a schema diagram or documentation available for this DB?

This and many more such questions. Is there some group of Looker Administrators? (a rare breed I would think considering many seem to be taking the Looker hosted installation approach!)

 

 


Hi @thomas_brittain, this is a great question. I specifically wanted to pull out this portion:

The current struggle I’m working through is getting more CPUs per node. Currently, we have 4 nodes, each with 8 CPUs. It seems that when a customer creates a dashboard with many tiles (25+) and schedules it to be rendered, all of the rendering jobs get assigned to the same node. So we see that node maxed out for 2-15 minutes while the job completes. And I believe that if a user is operating off the same node, their experience is horrific. I’m requesting that we slowly add more CPUs to our nodes, while exploring enforcing a limit on queries per dashboard. Of course, the latter is difficult because there doesn’t seem to be a built-in option.

 

One potential option is to create a set of dedicated renderer (or scheduler) nodes. You can move those render-specific nodes out from behind your load balancer (so that general traffic cannot hit these VMs) and set the renderer threads on the nodes that stay behind the ELB (which I will call the UI nodes) to 0 in the lookerstart.cfg file.

We then set the render threads on the render nodes to the default number of threads, or you could potentially increase it, provided the render nodes have enough memory allocated. This way render tasks will get directed to your renderer nodes, which should remove some of the memory overhead from the UI nodes.
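
To make the split concrete, here is a rough sketch in Python of what the per-role line in lookerstart.cfg might look like. It assumes that "render threads" corresponds to the `--concurrent-render-jobs` startup option passed through `LOOKERARGS`; please check that against the startup options documentation for your Looker version before relying on it. The role names and the helper function are hypothetical, and a real lookerstart.cfg will usually carry other arguments as well:

```python
from typing import Optional

# Assumed flag name for "render threads"; verify it for your Looker version.
RENDER_FLAG = "--concurrent-render-jobs"


def looker_args(role: str, render_jobs: Optional[int] = None) -> str:
    """Build a LOOKERARGS line for a node role ("ui" or "render")."""
    if role == "ui":
        # UI nodes stay behind the load balancer and take no render jobs.
        return f'LOOKERARGS="{RENDER_FLAG}=0"'
    if role == "render":
        if render_jobs is None:
            # No override: leave the flag out so the default applies.
            return 'LOOKERARGS=""'
        if render_jobs < 1:
            raise ValueError("render nodes need at least one render job")
        return f'LOOKERARGS="{RENDER_FLAG}={render_jobs}"'
    raise ValueError(f"unknown role: {role!r}")


if __name__ == "__main__":
    print(looker_args("ui"))
    print(looker_args("render"))
    print(looker_args("render", render_jobs=4))
```

Generating the line per role keeps the UI and render nodes from drifting apart when the configuration is rolled out by hand.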

 

Thanks,

Eric
