What does `Error: Your scheduled job failed because the results stored in cache were expired or invalid. To resolve, please retry your schedule or wait until the next scheduled job.` mean on my failed schedule delivery?

TL:DR: The cached results for the queries expired before Looker was able to send out the results. This commonly occurs when lots of schedules are set to run at the same time, are particularly large deliveries, or to more complex endpoints, like Google Sheets. Staggering the schedules over a longer time period should help.
 

To understand what it going on with this error it will be useful to look into the entire schedule delivery process, which is more comprehensively outlined in this Community post but key points highlighted here. All schedule jobs that run will go through 4 phases, with 2 of them having concurrency limits:

  1. Enqueue for execution - waiting in line for query executing for the schedule

  2. Execute - actually running the queries against the database or verifying valid cache for the queries which is default limited to 10 concurrent threads. 

  3. Enqueue for delivery - waiting in line for the cached results from the Execute stage to be sent off

  4. Delivery - actually sending this content to its defined destination like email, Google Sheet, S3, etc. which has a hard limit set of 3 thread

The reason we end up seeing this error is essentially when step 2 gets too far ahead of step 4. That is, the ability to deliver the results of the schedules falls behind the ability of the queries to be executed on the database. The queries that are run by the execute threads will follow the caching policy defined on the explore that generated them, which by default is 60 minutes, or 12 hours for System Activity. If the cache for these queries expires before they are able to be picked up by one of the 3 delivery threads, we will get the `Your scheduled job failed because the results stored in cache were expired or invalid. To resolve, please retry your schedule or wait until the next scheduled job.` error message. This commonly will occur when many schedules are triggered to run at the same time and are delivering to places that carry more computational weight, such as Google Sheets or some other integration that utilizes the Action Hub. 

An audit of schedule job timing by hour (with average run times) found in the Instance Performance Dashboard from the Admin > System Activity panel can be useful tool to see if there is a large number of schedules set for the same time. There is no hard limit set for what is too many schedules for a given hour, as the details of the queries, database, and destination all play a part in what can be processed without error. If you come across this error and also see that the scheduled delivery time seems to be a hotspot in the heatmap, spacing these out would be recommended. 

7 0 4,635