At what rate does Google Pub/Sub deliver messages to a listener?

Hello,

I am trying to understand the policy Google Pub/Sub uses to deliver messages to a listener.

- Is each message delivered as soon as it becomes available, or is there some kind of batching? How can I control these settings?

- Does the answer change if we use a Pub/Sub listener directly vs. a push subscription to a Spring (Java) service?

- Let's say there is a topic and it has 1000 messages. When I add a new subscription, will I get the messages that were in the topic before my subscription was created?

- There could be millions of messages accumulated before I added my subscription. Is there a setting to control the window of past messages? (Past messages = messages published before the subscription was added.)

- Will the past messages be delivered all at once in a burst, or is there a policy controlling the rate of delivery?

- Again, do the answers change if using a listener directly vs. a push subscription to a Spring Java service?

Sorry if this is documented somewhere but I could not find it.

Thanks!

ACCEPTED SOLUTION

Hello Morpheus,

  • Technically, it depends on the type of subscription you create. For push subscriptions, and likewise BigQuery subscriptions, messages are pushed as soon as they arrive. For pull subscriptions, the subscriber/listener needs to request the messages. If you are using Dataflow with a pull subscription, the messages are streamed/requested immediately. So, in essence, if you are using Google's built-in subscribers, delivery is effectively "real-time". I have tested this with timestamps, and the difference is a matter of milliseconds.
  • If you are using an external process to retrieve messages via a pull subscription, the rate depends on how frequently your app makes requests. For push, again, Pub/Sub sends immediately. If there are rate limits in your app or destination, or there are network delays, you can control how delivery is retried with the retry policy at the bottom of the subscription creation form: Retry Immediately OR Retry with Exponential Backoff.
  • Whether messages are stored depends on a setting you choose when creating the topic: "Enable message retention". You can choose a duration from minutes to days; the maximum is 31 days. If this is not enabled on the topic, messages published before your subscription existed cannot be recovered. A subscription attached to the topic also has its own message retention setting, which defaults to 7 days. Usually, when you create a subscription, retention is enabled by default; on the topic it is not.
  • Past messages will be attempted more or less all at once, but if the endpoint pushes back, delivery will depend on the retry policy you set up above.
  • Again, it depends on the subscription. If pull, your app decides. If push, it is "Release the Kraken" immediately.
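
To make the exponential backoff option a bit more concrete, here is a minimal sketch of how such a schedule might look. It is only an illustration: the function name `backoff_schedule` is made up, the 10-second and 600-second bounds are assumed to match the retry policy's minimum and maximum backoff defaults, and the plain doubling is an assumption rather than the exact algorithm Pub/Sub uses.

```python
def backoff_schedule(attempts, min_backoff=10.0, max_backoff=600.0):
    """Return the wait (in seconds) before each retry, doubling until capped.

    Hypothetical helper: the bounds are assumed defaults and the doubling
    rule is a simplification, not Pub/Sub's documented behaviour.
    """
    if attempts < 0 or min_backoff <= 0 or max_backoff < min_backoff:
        raise ValueError("invalid backoff parameters")
    delays = []
    delay = min_backoff
    for _ in range(attempts):
        delays.append(delay)
        # Double the wait for the next attempt, but never exceed the cap.
        delay = min(delay * 2, max_backoff)
    return delays


print(backoff_schedule(8))
# [10.0, 20.0, 40.0, 80.0, 160.0, 320.0, 600.0, 600.0]
```

The point is that with backoff, a struggling endpoint sees retries spread further and further apart instead of an immediate re-send of every failed message.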

A lot of this is documented in little help snippets when you try to create a topic or subscription. The rest is from Google's documentation here: 
https://cloud.google.com/pubsub/docs/subscriber

Regards.

Regarding your questions about messages that pre-date the subscription creation, quoting from the doc at https://cloud.google.com/pubsub/docs/subscriber#what-subscription:

"Only messages published to the topic after the subscription is created are available to subscriber clients. However, you can also enable topic retention to allow a subscription attached to the topic to seek back in time and replay previously published messages." 

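As a rough way to picture the replay window described in that quote, the sketch below models which previously published messages a seek back in time could reach. It is a simplified model, not the Pub/Sub API: `replayable_messages` and its parameters are hypothetical names, and it assumes topic retention simply keeps everything published within the retention duration.

```python
from datetime import datetime, timedelta, timezone


def replayable_messages(publish_times, now, topic_retention, seek_to):
    """Return the publish times a seek to `seek_to` could replay.

    Hypothetical model: a message is replayable only if it was published
    at or after the seek target AND is still inside the retention window.
    """
    oldest_retained = now - topic_retention
    start = max(seek_to, oldest_retained)
    return [t for t in publish_times if start <= t <= now]


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
# Three messages published 9, 5 and 1 day(s) before "now".
published = [now - timedelta(days=d) for d in (9, 5, 1)]

# Seek back 30 days with 7 days of topic retention.
replayed = replayable_messages(
    published, now, timedelta(days=7), now - timedelta(days=30)
)
print([(now - t).days for t in replayed])  # [5, 1]
```

Even though the seek target is 30 days back, the 9-day-old message is outside the 7-day retention window in this model, so only the two newer messages come back.
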
Thanks a lot to both of you. Happy to receive more replies and insights into this complex topic.