Hello wonderful Looker people,
Is there anyone else out there who is self-hosting their Looker instances and willing to share best practices, horror stories, or anything else about the challenges of self-hosting? Remember, this is for posterity.
I’ll kick it off. Right now I’m reviewing our scalability. We repackage Looker into our final product, and we have quadrupled our customer base in the last 2 years. We are starting to run into performance issues. Before I share our current dilemma, it is important to note that I’m a data analyst put in charge of taking care of our Looker nodes, not a proper devops person. And Looker staff, feel free to correct me if I get this wrong (please, it would help me learn).
The current struggle I’m working through is getting more CPUs per node. Currently, we have 4 nodes, each with 8 CPUs. It seems that when a customer creates a dashboard with many tiles (25+) and schedules it to be rendered, all of the rendering jobs get assigned to the same node. So we see that node maxed out for 2-15 minutes while the job completes. And I believe that if a user is operating off the same node during that time, their experience is horrific. I’m requesting that we slowly add more CPUs to our nodes, while exploring how to enforce a limit on the number of queries per dashboard. Of course, the latter is difficult because there doesn’t seem to be a built-in option.
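Since there is no built-in limit, one stopgap I can imagine is auditing for oversized dashboards ourselves and following up with the owners. Below is a minimal sketch of that idea, not anything Looker provides. The `DashboardSummary` shape, the `find_heavy_dashboards` helper, and the sample data are all hypothetical, and the threshold of 25 tiles is just the number mentioned above. How the tile counts get collected (for example from the Looker API or the system activity data) is left out here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DashboardSummary:
    # Hypothetical record; fill it from whatever dashboard inventory you have.
    dashboard_id: str
    title: str
    tile_count: int


def find_heavy_dashboards(
    dashboards: List[DashboardSummary], max_tiles: int = 25
) -> List[DashboardSummary]:
    """Return dashboards with at least max_tiles tiles, largest first."""
    heavy = [d for d in dashboards if d.tile_count >= max_tiles]
    return sorted(heavy, key=lambda d: d.tile_count, reverse=True)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        DashboardSummary("101", "Weekly KPIs", 12),
        DashboardSummary("102", "Everything Board", 41),
        DashboardSummary("103", "Ops Overview", 25),
    ]
    for d in find_heavy_dashboards(sample):
        print(f"{d.dashboard_id}\t{d.title}\t{d.tile_count} tiles")
```

A report like this would not stop anyone from building a 40-tile dashboard, but it would at least tell us which scheduled dashboards are likely to pin a node.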
Anyone else out there willing to share?