GCP Datastream and Postgres CDC integration: changing the file rotation interval in Cloud Storage

I changed the file rotation interval for the Cloud Storage destination and the update is accepted. However, objects are not created at the interval I set; they are still created every 60s.

This is my code: @etaim

{
    "file_rotation_mb": 20,
    "file_rotation_interval": "480s",
    "avro_file_format": {
        "schemaFileFormat": "AVRO_SCHEMA_FILE",
        "compression": "NO_COMPRESSION"
    }
}


gcloud datastream streams update prueba-ser --location=central-1 --gcs-destination-config=/home/bi/GCS_DESTINATION_CONFIG.json

Result: an object is still created every 60s.


The maximum file rotation interval is 60 seconds. This used to appear in the docs, but I can't find it anywhere now. I'll make sure we add it back.

Can you explain the use-case / reasoning for setting it to a higher value?

Thanks for answering.

I am currently building a pipeline for 32 tables. The architecture is Datastream, Cloud Storage, BigQuery, but the ETL is done with a Cloud Function.
Through query orchestration, the Avro files are loaded from Cloud Storage and then I start the transformations. However, I have a limit of 10,000 loads per project and 1,000 loads per table. If I increase the rotation time, I can stay under these limits.


If I use the direct integration with BigQuery, I lose the history.
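
To make the quota concern concrete, here is a rough back-of-the-envelope calculation in Python. It is only a sketch under stated assumptions: one load job per rotated file, every table producing a file in every rotation interval, and the limits quoted above being daily limits. The function and constant names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def loads_per_day(interval_s: int, tables: int = 1) -> int:
    """Upper bound on daily load jobs if every rotated file gets its own job."""
    return (SECONDS_PER_DAY // interval_s) * tables


# Limits as quoted in this thread (assumed here to be per day).
PER_TABLE_LIMIT = 1_000
PER_PROJECT_LIMIT = 10_000

for interval in (60, 480):
    per_table = loads_per_day(interval)
    per_project = loads_per_day(interval, tables=32)
    table_status = "over" if per_table > PER_TABLE_LIMIT else "within"
    project_status = "over" if per_project > PER_PROJECT_LIMIT else "within"
    print(
        f"{interval}s rotation: {per_table} loads/table ({table_status} limit), "
        f"{per_project} loads/project ({project_status} limit)"
    )
```

Under these assumptions, a 60s rotation gives up to 1,440 loads per table and 46,080 per project per day, while a 480s rotation would give 180 and 5,760.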

Thanks for clarifying, I understand how longer file rotations would help. I'm afraid we can't currently support a file rotation interval longer than 60 seconds, and there is no plan to change this behavior.
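
Since the rotation interval cannot go above 60 seconds, one possible workaround (not something confirmed in this thread) is to load several rotated files in a single load job instead of one job per file. Below is a minimal Python sketch of the grouping step only. It assumes you already know each object's URI, destination table, and creation time. The `StoredObject` type, the `batch_by_window` helper, and the example bucket name are hypothetical, and the actual load call is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StoredObject:
    uri: str           # e.g. "gs://example-bucket/orders/0.avro" (hypothetical)
    table: str         # destination table this file belongs to
    created: datetime  # object creation time (timezone-aware)


def batch_by_window(objects, window_s=480):
    """Group objects per table into fixed windows of `window_s` seconds.

    Returns {(table, window_start_epoch): [uri, ...]} so that each list of
    URIs can be loaded with one job instead of one job per file.
    """
    batches = defaultdict(list)
    for obj in sorted(objects, key=lambda o: o.created):
        epoch = int(obj.created.timestamp())
        window_start = epoch - (epoch % window_s)
        batches[(obj.table, window_start)].append(obj.uri)
    return dict(batches)


# Eight one-minute files for the same table fall into a single 480s window.
start = datetime(2023, 1, 1, 0, 0, tzinfo=timezone.utc)
files = [
    StoredObject(f"gs://example-bucket/orders/{i}.avro", "orders",
                 start + timedelta(minutes=i))
    for i in range(8)
]
batches = batch_by_window(files)
print(len(files), "files ->", len(batches), "batch(es)")
```

Each list of URIs could then be passed as the source URIs of one BigQuery load job, since load jobs accept more than one source URI. Whether this fits depends on how much latency the downstream transformations can tolerate.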

We *ARE* planning to support append-only mode so that all events are kept in BQ (ETA H2'23). Will this address your needs? You mentioned performing transformations on the data; in this case you'll need to do those after the data is in BigQuery...

Hi @etaim, regarding the quote below, is this still on the roadmap? We are interested in this feature. We want to use Datastream to create an append-only audit log that tracks all changes and versions of each record from the Postgres database.

We *ARE* planning to support append-only mode so that all events are kept in BQ (ETA H2'23). Will this address your needs? You mentioned performing transformations on the data; in this case you'll need to do those after the data is in BigQuery...

Hi @chack, append-only is still on the roadmap, but unfortunately the timelines have shifted somewhat. I will reply here when we have a confirmed ETA.

Thanks so much for getting back to me, @etaim. Please keep us posted on the ETA.

 

Hi @etaim, hope you are well. Not to pester, but any update on the ETA for this feature? We are very interested in it.

Thanks in advance for any info!

All the best,

Charlie Hack
MANTL

Hi @chack, I don't have an updated ETA at this time. I'll update this thread as soon as I do.