Question

Status Page (Looker hosted)

  • 4 June 2020
  • 8 replies
  • 9862 views

Userlevel 2

Hi Everyone,


I’m curious whether Looker-hosted installs have an SLA yet, and whether there’s a dedicated status page or RSS feed for outages in the Looker infrastructure. Knowing when Looker is having an outage is important when the application is critical to the business.


Thanks


8 replies

Userlevel 7
Badge +1

This is a great question. Looker-hosted installs do have an SLA, which may vary based on your specific contract. The exact language should be in your contract; if not, you could ask your account team about it.


The status page topic is a bit more complex. It’s something we are grappling with and have had a few internal discussions about very recently, actually. The key complexity comes down to the fact that Looker-hosted instances are single tenant, meaning one Looker instance could suffer an outage for any number of reasons without impacting a single other instance. Also, many customers host their own instances of Looker “on-premise” and we don’t have insight into all of them.


Would we send an RSS update for every individual outage? Only outages of > x Lookers? Only major, system/infra level outages? If the latter, it might be confusing for someone whose Looker instance went down for a non system-wide issue, and is seeing the status page all green despite their Looker instance not working. Or, if a customer is hosted on-premise but unaware of it, they might experience the same confusion. These are all questions we are ironing out before creating a status page, but the need is very well understood. We totally appreciate how important it is to monitor Looker uptime.

Userlevel 3

I'm sure you all have your circumstances, but I just want to know the status of our Looker.

We had issues yesterday, so a status page would be extremely helpful.

Userlevel 2

+1

We’ve seen outages the last few weeks, so knowing that it is a widespread outage would be helpful.

https://status.looker.com/ does not work

We’ve had several outages as well. It is surprising Looker (hosted instances) does not provide a status page or an incidents page. Certainly not what one would expect for a service that expensive.

Userlevel 2

Hi team, 

You can test your own instance with this guide:

https://docs.looker.com/setup-and-management/on-prem-install/monitoring-instance

 

 


Hi. That documentation corresponds to on-prem (i.e. hosted by the customer) instances, not the ones hosted by Looker. Still, we have tested the URLs mentioned there and the same error is returned, even when the instance is working correctly!

 

Userlevel 2

  • https://{instance_name}.looker.com/availability
  • https://{instance_name}.looker.com/alive

 

The “availability” URL does work for my instance, which is not on-prem: the page shows “Looker is up”. However, the “alive” URL returns a blank page.
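For anyone who wants to poll those two URLs automatically, here is a minimal Python sketch. It only uses the URL patterns quoted in this thread. The function names, the timeout, and the rule "HTTP 200 means up" are my own assumptions, not documented Looker behaviour, and as noted above the “alive” URL may return a blank page even when the instance is fine.

```python
import urllib.error
import urllib.request


def health_urls(instance_name):
    """Build the two URLs quoted in this thread for a given instance name."""
    base = f"https://{instance_name}.looker.com"
    return {"availability": f"{base}/availability", "alive": f"{base}/alive"}


def probe(url, timeout=10, opener=urllib.request.urlopen):
    """Fetch url and return (ok, detail).

    Assumption: any HTTP 200 response counts as "up". The body is not
    inspected beyond its length, since an "up" page may be blank.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read()
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, timeouts, DNS failures, refused connections.
        return False, f"unreachable: {exc}"
    return status == 200, f"HTTP {status}, {len(body)} bytes"
```

Calling `probe(health_urls("your-instance")["availability"])` from a scheduled job and alerting when `ok` is false would give a basic up/down signal for a single instance; `your-instance` is a placeholder.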


 

Alternative solutions include the System Activity explores. Looker has created numerous explores that give back useful information, such as “Queries” and “Dashboards”. There’s even an “API Usage” explore for monitoring how many requests have been made per day to each API endpoint, although it doesn’t get any more detailed than that.

 

However, I still find it hard to determine when our Looker instance is up but the service is degraded (has slowed down a good bit). It’d be very useful to be able to track runtime on dashboards and API requests so that we can monitor health, not just whether the instance is up or not.
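One way to approximate this from the outside is to time a probe request yourself and compare recent timings against a baseline. The sketch below is an illustration only: `timed`, `health_state`, the baseline value, and the 3x slowdown factor are all hypothetical choices, and it measures whatever call you pass in rather than actual dashboard runtime inside Looker.

```python
import statistics
import time


def timed(fn, clock=time.perf_counter):
    """Run fn() and return (result, elapsed_seconds)."""
    start = clock()
    result = fn()
    return result, clock() - start


def health_state(latencies, baseline, slow_factor=3.0):
    """Classify recent probe latencies (in seconds).

    A latency of None marks a probe that failed outright.
    Assumption: "degraded" means the median successful latency is
    more than slow_factor times the baseline.
    """
    succeeded = [x for x in latencies if x is not None]
    if not succeeded:
        return "down"
    if statistics.median(succeeded) > baseline * slow_factor:
        return "degraded"
    return "up"
```

For example, with a baseline of 0.2 seconds, recent samples of `[1.0, 1.2, None]` would be reported as degraded, while `[None, None]` would be reported as down. Using the median rather than the mean keeps one slow request from flipping the state.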

 

 
