Your favourite data ingestion tool for GCP? Easy extract/load for raw data

So I need to select a data ingestion tool for a data platform based on GCP.

At first glance - Cloud Data Fusion makes sense - it has pre-built connectors. Easy, just extract and sink it. And we just need to ingest raw data, no transformations, but from various sources, like SAP and other DBs. No-code raw data ingestion makes sense.

However, CDF has some annoying bits:

  • Loads of add-ons (orchestration, metadata), which duplicate other tools already selected, like Composer.

  • Instances, licences, updates, plugin updates - quite a bit of management/maintenance required

  • Just reading the network options/configs is giving me a headache...

  • I also don't like the heavy focus on UI / no-code in general, but it's OK here, as the plan is only to use it for ingestion.

So what's your go-to data ingestion tool on GCP, and why?

2 REPLIES

Based on your requirements, some good options for a data ingestion tool on Google Cloud are:

  • Storage Transfer Service: A fully managed service for transferring data between online data sources and Cloud Storage (GCS). It's a good option for ingesting large amounts of raw data into GCS without writing any code.

  • Cloud Composer: Composer can orchestrate data ingestion tasks from a variety of sources, including APIs, databases, and messaging systems. It's especially useful if you have complex workflows or need to coordinate multiple tasks. 

  • Dataflow: Dataflow is a fully managed service that runs batch and streaming pipelines. It's suitable for ingesting data that requires complex transformations and can handle data from a variety of sources, including streaming sources.

If you're seeking a no-code solution for raw data ingestion, Storage Transfer Service is a solid choice. For more flexibility and the ability to handle data from diverse sources and perform transformations, consider Apache Airflow (via Cloud Composer) or Apache Beam with Cloud Dataflow.
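
To make the "raw extract and load" idea a bit more concrete, here is a minimal sketch of what a single code-based ingestion step could look like, whether it ends up wrapped in a Composer task or a Dataflow job. It is only an illustration under some assumptions: `sqlite3` stands in for the real source database, the `orders` table and the `extract_table_to_ndjson` helper are made up for the example, and the upload to GCS is left out. Rows are written unchanged as newline-delimited JSON, which is a format BigQuery load jobs accept.

```python
import io
import json
import sqlite3


def extract_table_to_ndjson(conn, table, out):
    """Dump every row of `table` as newline-delimited JSON, untransformed.

    Hypothetical helper for illustration. Returns the number of rows written.
    """
    # Table names cannot be bound as query parameters, so validate first.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    count = 0
    for row in cursor:
        # One JSON object per line, keyed by the source column names.
        out.write(json.dumps(dict(zip(columns, row))) + "\n")
        count += 1
    return count


def demo():
    # Assumption: in-memory SQLite stands in for the real source (SAP, other DBs).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "acme", 10.5), (2, "globex", 99.0)],
    )
    # Assumption: in practice this would be a file that is then uploaded to GCS.
    buffer = io.StringIO()
    rows = extract_table_to_ndjson(conn, "orders", buffer)
    conn.close()
    return rows, buffer.getvalue()


if __name__ == "__main__":
    rows, ndjson = demo()
    print(f"wrote {rows} rows")
    print(ndjson, end="")
```

The point of the sketch is that a raw extract/load step is small enough that the real effort tends to go into connectivity, scheduling and retries, which is where Composer or Dataflow would come in.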

Ultimately, the best data ingestion tool will align with your specific requirements.

 

You can also use the Application Integration service to connect to third-party SaaS.

Otherwise, for traditional data transfer I prefer Dataflow, with orchestration using Cloud Composer.