Prefetching (3.36+)

What is prefetching?

Prefetching is an API-only system through which a customer can pre-run certain dashboards for particular filter sets. This will make the dashboards load as if from cache, even on the first load of the day.

When should I use a prefetch?

Only use a prefetch if the following conditions are met:

  • You know all the possible filter values that will be used on the dashboard, and you’re comfortable with the load it will put on your database to run all the variations of the dashboard in sequence. You must be okay with filters outside this set being slow.

  • You have truly realtime data. This matters if you want other places (e.g., Explore pages) to be realtime, but you don’t want Dashboards to be re-run. It would also matter if it’s really important for you to have all the tiles in the dashboard be from the same time (e.g., all from 6am), and you have experienced situations where one tile is out of sync because it was refreshed with newer data at a later point in the day.

Example good use case:

You embed dashboards for 10 clients. Each dashboard only gets filtered for each of those 10 clients, and no other filter values are being used. You update all the data for those dashboards at the same time each night. In this case, you could set up the prefetch to run before the day starts, so that there’s never a first load experience for any user.

WARNING: Based on how the system matches prefetches, not having a prefetch present for a dashboard that expects it or not having a prefetch available for a certain filter value can result in very slow dashboards, as Looker goes through an entire process to attempt to match the prefetch before starting over and running the dashboard as if the user had just loaded the page. That’s why we only recommend it in unique cases.

Configuring Prefetches

Note: This feature requires technical understanding. A script is required to make API calls and create the prefetches for each dashboard.

Some important notes when prefetching:

  • The dashboard’s load_configuration must be set to prefetch_cache_run in order for the dashboard to look for a prefetch.
    • This is set on a LookML dashboard in the LookML with the parameter, load_configuration: prefetch_cache_run.
  • When the dashboard opens it will search for an existing prefetch for the given set of access_filters and dashboard filters. If a prefetch is not found, then the dashboard will search in cache, and if it is not in cache then it will run the dashboard.

###Check if prefetch is available for a dash

To test to see if a prefetch is available for a dashboard use the dashboard_prefetch (sdk) or /dashboards/{dashboard_id}/prefetch (http) request.

If not available it will 404:

###Set the load configuration
From here you’ll want to set the load_configuration to prefetch_cache_run, because by default dashboards are set to cache_run. To do so use the update_dashboard (sdk) or /dashboards/{dashboard_id} (http) method.

###Create the prefetch

Once the dashboard’s load_configuration is set appropriately you’ll be able to create the prefetch for the dashboard. You’ll want to use the create_dashboard_prefetch (sdk) or /dashboards/{dashboard_id}/prefetch (http) method.

An example body (w/ access filters and dashboard filters) can be found in the swagger API documentation (the below example sets it at 30 seconds):

###Confirm the prefetch

Once you’ve created the prefetch you’ll want to test to make sure that prefetch is now available. Keep in mind that the body needs to be congruent with the body of the create_dashboard_prefetch method:

5 22 2,439
22 REPLIES 22

What would be the cause of a 405 return code (during update_dashboard to change ‘load_configuration’) ? I already executed a create_dashboard_prefetch call without error.

It seems to only happen with LookML dashboards. I was able to get it to work with a ‘normal’ dashboard. Can you confirm and if applicable provide timeline for LookML support?

Thanks

Thanks Abby! We’ll test this out and provide feedback. Just to confirm, there should be no downside to setting this for all dashboards since it will just use cache or run the report if there is no prefetch…

What is the expected load time from prefetch? I’m testing a dashboard and it is averaging around 38 seconds of wall time. Is there anyway to speed it up?

I’m guessing this means that the prefetch was being used:
2016-02-23 00:12:59.723 +0000 [INFO|758b1|prefetch] :: UPDATE "PREFETCH" SET "HIT_COUNT" = ("HIT_COUNT" + 1)

Hi abbywest, tagging along on this thread, how do I reference a lookml dashboard? what would be the dashboard_id?
from the UI I only see the named url path, instead of /dashboard/id that user dashboards generate.

Thank you!
-john

MikeD
Participant II

Here is a handy shell script that can be used to set up a prefetch from a command line.

#!/bin/bash

CLIENT_ID=YOUR_CLIENT_ID_HERE
CLIENT_SECRET=YOUR_CLIENT_SECRET_HERE
END_POINT=https://learn.looker.com:19999/api/3.0
DASHBOARD=73
DASHBOARD_FILTER='[ { "name": "Date", "value": "20 days" } ]'

jq_location=`which jq`
status=$?

if [ $status -gt 0 ]
  then
    echo "This utility requires the jq command line utility to parse JSON data. Find more information here: https://stedolan.github.io/jq/"
    echo "On a mac with the brew utility installed, you should be able to install jq with the command \"brew install jq\"."
    exit 255
fi 

echo "Logging in"
response=`curl --request POST --fail --silent --show-error -d "client_id=${CLIENT_ID}&client_secret=${CLIENT_SECRET}" "${END_POINT}/login"`
status=$?

if [ $status -gt 0 ]
  then
    echo "Failed logging in"
    exit 1
fi 

ACCESS_TOKEN=`echo $response | jq --raw-output '.access_token'`
echo "Got access token \"${ACCESS_TOKEN}\""

echo "Setting the load_configuration"
response=`curl --request PATCH --fail --silent --show-error --header "Authorization: token  ${ACCESS_TOKEN}" --data "{ \"load_configuration\": \"prefetch_cache_run\" }" "${END_POINT}/dashboards/${DASHBOARD}"`
status=$?

if [ $status -gt 0 ]
  then
    echo "Failed setting load_configuration"
    exit 2
fi 

echo "Creating the prefetch"
response=`curl --request POST --fail --silent --show-error --header "Authorization: token  ${ACCESS_TOKEN}" --data "{ \"ttl\": 6000, \"dashboard_filters\": ${DASHBOARD_FILTER} }" "${END_POINT}/dashboards/${DASHBOARD}/prefetch"`
status=$?

if [ $status -gt 0 ]
  then
    echo "Failed creating prefetch"
    exit 3
fi 

echo "Requesting the prefetch"
response=`curl --request GET --fail --silent --show-error --header "Authorization: token  ${ACCESS_TOKEN}" --data-urlencode "dashboard_filters=${DASHBOARD_FILTER}" "${END_POINT}/dashboards/${DASHBOARD}/prefetch"`
status=$?

if [ $status -gt 0 ]
  then
    echo "Failed getting prefetch"
    exit 4
fi

echo "First 20 lines of response"
echo $response | jq '.' | head -20

echo "Loggin out"
curl --request DELETE --header "Authorization: token ${ACCESS_TOKEN}" "${END_POINT}/logout"

I have added Load configuration : prefetch_catch_run in lookml . In setting up load confugration (From here you’ll want to set the load_configuration to prefetch_cache_run, because by default dashboards are set to cache_run. To do so use the update_dashboard (sdk) or /dashboards/{dashboard_id} (http) method.)
As mentioned in document, I am getting response code as 405? if someone can help

MikeD
Participant II

The load_configuration needs to be set through the API, not in the LookML.

If I have a Prefetch created for a Dashboard, will I be able to clear the cache for the dashboard and refresh the data from the Looker UI?

Hey @bbbill - if you have prefetch set on a Dashboard, you won’t be able to clear cache & refresh on the dashboard from the Looker UI.

Okay, so if I have a Prefetch created for my dashboard with a ttl that will cause it to expire in, say, 8 hours, the only way for me to get updated information into that dashboard is to delete the Prefetch through the API?

EDIT: Actually, it looks like there is no delete prefetch API call. So it looks like once you have a prefetch, you’re stuck with it until it expires?

I could very much see an instance where you’ve prefetched a dashboard, but want to refresh the data. But it seems like that’s impossible?

That sound about right @bbbill. It sounds like for your use case you might rather use persist_for: 8 hours in your model or explore to keep results in cache for 8 hours. That way, where you previously prefetched, you can instead schedule a report to warm the cache. And then you’ll be able to refresh the data if you feel like it.

Hi, I’m attempting to use the prefetching functionality to preload dashboards before people get into work, so they load instantly in the morning. I’m successfully updating the load configuration of a dashboard I’m testing on to 'prefetch_cache_run. I’m also successfully creating a prefetch on a dashboard. I can also get the dashboard prefetch through the api, and I see the “hit_counter” increment by 1 each time I fetch it.

What confuses me is when I view the dashboard in a browser. It doesn’t seem to use the prefetched cache to load (and I can clear the cache and refresh unlike what is discussed above). Is there something I’m missing with prefetched dashboards when viewing from the browser as a non-technical user? Thank you!

Hi @psingman,

There are a few things that can cause a prefetch to get invalidated and cause the dashboard to rerun upon opening:

  1. The underlying SQL of any of the tiles changes. This could be any changes to the LookML, or different filters being used via user attributes or changes made on the front end.
  2. The dashboard itself changes (any tiles are rearranged, created, deleted, or edited)
  3. The time to live (TTL) that is declared when creating the dashboard prefetch expires.

Hope this helps!

Best,
Ryan

@rdunlavy, I noticed the prefetch API is deprecated now in the API docs. What is the suggested replacement instead of using prefetching if you want to cache a dashboard with a few commonly chosen filters.

Hey @grantnicholas

In order to cache a dashboard with certain dashboard filters, we recommend using datagroups in conjunction with schedules. The persist_with parameter can be used at the model or explore level along with a datagroup to set the caching policy for that explore or model.

Because a scheduled query follows the same caching policies as a query by a user, if a schedule is set up with the filter values you would like to cache that triggers with the same datagroup as the model or explore, then that query would be cached.

Hope this helps!

@rdunlavy Just to clarify what you are saying, to pre-cache a dashboard you are suggesting to schedule a dashboard export with a datagroup. When the datagroup triggers, the dashboard will run which will cache all of the looks inside the dashboard. To pre-cache multiple different filters on that dashboard, just schedule multiple exports each with a different combination of filters.

@grantnicholas Yes, that is a great summation!

If the schedule runs on the same datagroup that the explores in the different tiles use, then the dashboard will always pull from cache when viewed in the front end since the previous cache would be busted and a new query would be run and cached simultaneously.

Thanks! That makes sense.

Tommy_Wu
Participant I

I saw the above comment that pre-fetch api is being deprecated and that scheduling with data groups is the suggested solution.

Is it possible to trigger a schedule with the api, so that we have the same side-effects as prefetching did, but not to actually generate a PDF/report or send it anywhere?

Btw the api demo super nifty to play around with, glad yall made it and hope it remains maintained:
https://demo.looker.com:19999/api-docs/index.html

Tommy_Wu
Participant I

So we ended up solving for the no-emailing, no sftp constraint by setting up a webhook to just listen and not do anything with the information it gets.

Hello Tommy_Wu, I hope you’re doing well.
Could you please give me some tips how to set up a webhook to just listen and not do anything with the information it gets ?
I wanna create a prefetching as well.
Thanks a lot.
Best Regards.