BigQuery can't connect to Cloud SQL

Please help

Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2013): Lost connection to MySQL server during query at [1:15]

Region: Instance and Connection = asia-southeast1

Already set the instance to Public IP

Already granted BigQuery Admin and Connection Admin

Already enabled the API

Solved

15 REPLIES

To resolve this issue, you can follow these steps:

  1. Ensure that you have used valid credentials when creating the connection for Cloud SQL.
  2. Check if the service account associated with the Cloud SQL connection has the Cloud SQL Client (roles/cloudsql.client) role. The service account follows the format service-PROJECT_NUMBER@gcp-sa-bigqueryconnection.iam.gserviceaccount.com.
  3. Verify that the connection details for the Cloud SQL instance are correctly specified in the EXTERNAL_QUERY function.
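As a hedged sketch of step 2, granting that role from the command line might look like the following (the project ID and project number are hypothetical placeholders, and the command is echoed rather than executed so the sketch runs anywhere):

```shell
# Grant the BigQuery connection service agent the Cloud SQL Client role.
# PROJECT_ID and PROJECT_NUMBER are hypothetical placeholders.
PROJECT_ID="my-project"
PROJECT_NUMBER="123456789012"
SA="service-${PROJECT_NUMBER}@gcp-sa-bigqueryconnection.iam.gserviceaccount.com"

# Echoed, not run, so the sketch has no cloud dependencies:
echo gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
  --member="serviceAccount:${SA}" \
  --role="roles/cloudsql.client"
```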

I've already checked all of these, and after adding the Cloud SQL Client role I get this error instead (caching_sha2_password):

Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2059): Authentication plugin 'caching_sha2_password' cannot be loaded: /usr/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory at [1:15]

You will need to set the default_authentication_plugin database flag on your Cloud SQL instance to caching_sha2_password.

Please refer to https://cloud.google.com/sql/docs/mysql/flags#list-flags-mysql
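For reference, setting that flag from the CLI might look like the following hedged sketch (the instance name is a placeholder, and the command is echoed rather than executed):

```shell
# Set default_authentication_plugin on a Cloud SQL for MySQL instance.
# INSTANCE is a hypothetical placeholder; the command is echoed, not run.
INSTANCE="my-cloudsql-instance"
CMD="gcloud sql instances patch ${INSTANCE} --database-flags=default_authentication_plugin=caching_sha2_password"
echo "${CMD}"
```

Note that patching database flags may restart the instance.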


Thank you!!! It works after adding default_authentication_plugin and changing the credential to root.

Hey, I am facing this error:
Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2059): Authentication plugin 'mysql_clear_password' cannot be loaded: /usr/lib/plugin/mysql_clear_password.so: cannot open shared object file: No such file or directory at [1:15]

I have added the flag as you mentioned above, but I am still facing the same error. Are there any other possible problems that might be causing it?

The error indicates a mismatch between the authentication plugin your MySQL client expects and the one your Cloud SQL instance uses.

Potential Causes and Troubleshooting Steps:

  1. Authentication Plugin Mismatch

    • MySQL 8.0 and beyond: Ensure your MySQL client libraries and tools support caching_sha2_password or mysql_native_password. Update them if necessary.
    • Server Configuration: If needed, adjust the default_authentication_plugin on your Cloud SQL instance, but proceed cautiously and research implications as this is not always the first course of action.
  2. Permissions and Connectivity

    • Service Account Permissions: Verify your service account has the "Cloud SQL Client" role.
    • Firewall Rules: Ensure network and firewall rules allow communication between BigQuery and Cloud SQL. Check both Public IP and Private IP configuration (including VPC peering).
  3. Propagation Delay and Restart

    • Wait for Changes: Allow time for configuration changes to take effect across your instance.
    • Consider a Restart: Restarting your Cloud SQL instance can sometimes force updated settings to fully propagate.
  4. SSL Configuration

    • Enforce SSL: For security reasons, configure your connection to use SSL, especially if your current setup requires the less secure mysql_clear_password.
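One quick server-side check for item 1: from any machine that can reach the instance, you can list which authentication plugin each database user is configured with. A hedged sketch (the host is a placeholder, and the command is echoed rather than executed):

```shell
# List each MySQL user's authentication plugin; <CLOUD_SQL_IP> is a
# placeholder. The command is echoed here rather than executed.
CMD='mysql --host=<CLOUD_SQL_IP> --user=root -p -e "SELECT user, host, plugin FROM mysql.user;"'
echo "$CMD"
```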

Additional Tips:

  • Gather Details: Provide your MySQL client version, Cloud SQL version, and connection method when seeking further assistance.
  • Documentation: Refer to the official Google Cloud documentation for in-depth guidance.
  • Support: If still facing issues, contact Google Cloud Support for personalized help.

Hi,

Thanks for your inputs.

Is there any way to bypass the requirement to have Public IP enabled on the Cloud SQL instance? Our application team has decided to disable this option, considering it a security risk, and since our pipeline was using EXTERNAL_QUERY to source Cloud SQL data into BigQuery, our pipelines will break going forward.

Appreciate any workarounds or help in this regard.

Certainly! Disabling the Public IP on the Cloud SQL instance is a common security practice, and there are ways to connect to Cloud SQL from BigQuery without using a Public IP. Here are some alternatives:

1. Use Cloud SQL Proxy:

Cloud SQL Proxy provides a secure way to connect to Cloud SQL instances without a Public IP. You can set up a Cloud SQL Proxy in a Compute Engine instance or a GKE cluster within the same VPC as your Cloud SQL instance.

Here's a high-level overview of how to set it up:

  • Deploy Cloud SQL Proxy on a Compute Engine instance.
  • Configure the proxy to connect to your Cloud SQL instance.
  • Update your BigQuery connection to use the proxy's internal IP.

Detailed instructions can be found in the Cloud SQL Proxy documentation.
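A hedged sketch of the proxy invocation itself (v1 binary flags; the instance connection name is a placeholder, and the command is echoed rather than executed):

```shell
# Start the Cloud SQL Auth Proxy (v1 syntax) listening on TCP 3306 and
# connecting over the instance's private IP. CONN is a placeholder.
CONN="my-project:asia-southeast1:my-instance"
echo ./cloud_sql_proxy \
  -instances="${CONN}"=tcp:3306 \
  -ip_address_types=PRIVATE
```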

2. Use Private IP with VPC Peering:

You can enable Private IP on the Cloud SQL instance and set up VPC peering between the Cloud SQL network and the network where BigQuery is running.

Here's how to do it:

  • Enable Private IP on the Cloud SQL instance.
  • Set up VPC peering between the Cloud SQL network and the BigQuery network.
  • Update your BigQuery connection to use the Private IP.

More information on this approach can be found in the Private IP documentation.
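As a hedged sketch, the private services access that Cloud SQL Private IP depends on is typically set up like this (the network name is a hypothetical placeholder, and the commands are echoed rather than executed):

```shell
# Reserve an IP range and create the servicenetworking peering that
# Cloud SQL Private IP uses. NETWORK is a hypothetical placeholder.
NETWORK="my-vpc"
RANGE="google-managed-services-${NETWORK}"
echo gcloud compute addresses create "${RANGE}" \
  --global --purpose=VPC_PEERING --prefix-length=16 \
  --network="${NETWORK}"
echo gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges="${RANGE}" --network="${NETWORK}"
```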

3. Use Cloud SQL Federation with BigQuery:

Cloud SQL Federation allows you to query Cloud SQL data directly from BigQuery without copying the data into BigQuery. This can be done using Private IP.

Here's how to set it up:

  • Enable Private IP on the Cloud SQL instance.
  • Create a connection in BigQuery using the Private IP.
  • Use the EXTERNAL_QUERY function in BigQuery to query the Cloud SQL data.

More details can be found in the Cloud SQL Federation documentation.

4. Consider Using Data Transfer Service (DTS):

If the above options are not suitable, you might consider using Google's Data Transfer Service to move data from Cloud SQL to BigQuery. This can be configured to use Private IP and provides a managed way to transfer data.

Each of these approaches has its own considerations in terms of complexity, performance, and security. It's essential to evaluate them in the context of your specific requirements and infrastructure.

Thanks for your prompt response. This is very helpful !

In the above post from 08-03-2023, you mention several times:

@ms4446 wrote:

Create a connection in BigQuery using the Private IP.

This seems to indicate an External Connection can be made using an IP (and presumably a port), but I have only found references to using a Connection ID in the console, bq CLI, and REST API. I feel like I must have overlooked something. Could you point me in the right direction?

Thanks

-Kevin

You are correct in noting that when creating a connection in BigQuery to access external data sources like Cloud SQL, you typically use a Connection ID rather than directly specifying an IP address and port in your queries. Let me clarify.

Connecting BigQuery to Cloud SQL Using Private IP

Securely access your Cloud SQL data from BigQuery using Private IP connections. Here's a step-by-step guide:

1. Enable Private IP on Cloud SQL:

  • In the Google Cloud Console, go to your Cloud SQL instance settings.
  • Ensure "Private IP" is enabled.
  • If needed, configure a VPC network for your instance.

2. Create a BigQuery Connection Resource:

  • In the BigQuery UI, navigate to your project's "Connections."
  • Click "+ CREATE CONNECTION" and choose "Cloud SQL connection."
  • Fill in these details:
    • Connection ID: A unique, descriptive name (e.g., "my-project-to-cloudsql").
    • Database type: MySQL or PostgreSQL (match your Cloud SQL instance).
    • Connection info: Enter your Cloud SQL instance's Private IP and database name.
    • Credentials: Provide the database username and password.

3. Use the Connection in BigQuery Queries:

  • Use the EXTERNAL_QUERY function, referencing your Connection ID:
SELECT * FROM EXTERNAL_QUERY("project_id.region.connection_id", "SELECT * FROM your_table");
  • Replace placeholders with your actual values.
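As a small illustration of the three-part Connection ID format, here is a hedged Python sketch that composes an EXTERNAL_QUERY statement from its pieces (all names are placeholders):

```python
def external_query_sql(project: str, region: str, connection_id: str,
                       inner_sql: str) -> str:
    """Compose a BigQuery EXTERNAL_QUERY statement from a three-part
    Connection ID (project.region.connection_id) and an inner query."""
    conn = f"{project}.{region}.{connection_id}"
    return f'SELECT * FROM EXTERNAL_QUERY("{conn}", "{inner_sql}");'

# Placeholders throughout; prints the statement you would run in BigQuery.
print(external_query_sql("my-project", "asia-southeast1",
                         "my-cloudsql-connection",
                         "SELECT * FROM your_table"))
```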

Security Considerations:

  • VPC Network: Ensure traffic is allowed between BigQuery and your Cloud SQL instance. If they're in different VPCs, set up VPC Network Peering.
  • IAM Permissions: Grant the BigQuery service account the roles/cloudsql.client role on your Cloud SQL instance.

Benefits of Using Private IP:

  • Enhanced Security: Data stays within Google's internal network, reducing exposure.
  • Improved Performance: Potential for lower latency and faster queries.

Additional Tips:

  • Troubleshooting: If you encounter issues, check firewall rules, VPC peering, and IAM permissions.
  • Performance: Consider using a connection pool for frequent queries to improve efficiency.

Example:

SELECT * FROM EXTERNAL_QUERY("my-project.us-central1.my-cloudsql-connection", "SELECT name, age FROM customers WHERE age > 30");

Thank you for responding. This feels so close. I feel like I'm not looking in quite the right spot for 2. Create BigQuery Connection Resource: because what I see doesn't quite match up with your explanation. I'll include an image below, but here is how I got there:

  1. Choose BigQuery from the "hamburger" menu next to Google Cloud. BigQuery Studio is the default.
  2. Open the project where I am interested in adding the connection (dataanalytics-dev-330120 in the picture)
  3. Tap + ADD next to Explorer (this is where I seem to be going off your script)
  4. Tap Connections to external data sources 
  5. This is what I see after that.  I see I can name this Connection ID, but I only see Cloud SQL connection name below, no place to add the private IP (which would be the private IP of a Compute Engine instance running cloud_sql_proxy in my case)

Perhaps I don't have some service enabled on my project, or misread your instructions.

-Kevin

koconnorampion_0-1712861303306.png

Hi @koconnor-ampion ,

Thanks for providing the screenshot. The steps you've taken have been in BigQuery Studio, which is a newer interface and differs slightly from the classic BigQuery UI. While the overall concepts remain the same, the exact navigation and options can vary.

Scenario 1: Using BigQuery Studio

1. Enable Private IP on Cloud SQL:

  • Navigate to your Cloud SQL instance settings in the Google Cloud Console.

  • Ensure that "Private IP" is enabled, which may require configuring a VPC network if not already set up.

2. Create BigQuery Connection Resource:

  • In BigQuery Studio:

    • Click "ADD DATA" instead of "+ ADD" next to Explorer.

    • Select "Google Cloud" followed by "Cloud SQL".

    • Provide the following details:

      • Connection ID: A unique name for your connection.

      • Cloud SQL Instance Name: Select your instance from the dropdown list.

      • Database: Name of the database within your Cloud SQL instance.

      • Credentials: Input the username and password.
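The same connection can also be created from the bq CLI; a hedged sketch (all values are placeholders, and the command is echoed rather than executed):

```shell
# Create a CLOUD_SQL connection with bq mk. All values are placeholders;
# the command is echoed here rather than executed.
PROPS='{"instanceId":"my-project:asia-southeast1:my-instance","database":"mydb","type":"MYSQL"}'
echo bq mk --connection \
  --connection_type=CLOUD_SQL \
  --properties="${PROPS}" \
  --connection_credential='{"username":"root","password":"REDACTED"}' \
  --project_id=my-project --location=asia-southeast1 \
  my-cloudsql-connection
```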

Scenario 2: Using Cloud SQL Proxy

1. Enable Private IP on Cloud SQL:

  • Same steps as in Scenario 1.

2. Configure Cloud SQL Proxy:

  • Deploy the Cloud SQL Proxy on a Compute Engine instance within the same VPC or a connected network environment.

  • Configure the proxy to use the Private IP of the Cloud SQL instance.

  • Note the port number on which the proxy is listening (default ports are typically 3306 for MySQL, 5432 for PostgreSQL).

3. Connect Applications to Cloud SQL Proxy:

  • Ensure that applications connecting to the Cloud SQL through the proxy are on the same network or connected via VPC peering.

  • Direct connection settings in BigQuery Studio to the Cloud SQL instance do not apply here; instead, focus on application-level configurations that use the proxy setup.

Important Notes

VPC Peering: If your Compute Engine instance (running the Cloud SQL Proxy) and your BigQuery environment are in different VPCs, you'll need to establish VPC Network Peering.

IAM Permissions: The BigQuery service account should be granted the roles/cloudsql.client role on the Cloud SQL instance to facilitate interactions.

Documentation Link: For more detailed guidance, refer to Connecting to Cloud SQL using the Cloud SQL Proxy.

Sigh, you must be thinking that I am rather dense, or the interface for BigQuery Studio has changed again. Perhaps your directions work for a brand-new BigQuery Studio, but I do not find an ADD DATA link that then allows me to select Google Cloud followed by Cloud SQL. I've been looking at some of the release docs, and they lead me to believe that I should find a + ADD DATA button on the Welcome Page. Mine, unfortunately, does not match the one in the docs. I have tried the links under Add your own data (see my screenshot again), but those add individual tables to a dataset.

Sorry to keep bugging you about this, but it really sounds like it should allow what I am trying to achieve.

Thanks again

-Kevin

koconnorampion_0-1712867444906.png

The service account service-PROJECT_NUMBER@gcp-sa-bigqueryconnection.iam.gserviceaccount.com is attached to the connection, but it does not appear in the project's service account list.