Ray on Vertex AI: Head Node not reachable

I am spinning up a Ray on Vertex AI cluster and trying to connect to the cluster from Colab Enterprise.

Trying to connect using:
ray.init(address='vertex_ray://projects/my-project-id/locations/us-central1/persistentResources/test-ray')
 
I am seeing this error:
[Ray on Vertex AI]: Cluster State = State.RUNNING
ValueError                                Traceback (most recent call last)
<ipython-input-2-5a46410e358a> in <cell line: 6>()
      4 
      5 import ray
----> 6 ray.init(address='vertex_ray://projects/my-project-id/locations/us-central1/persistentResources/test-ray')

/usr/local/lib/python3.10/dist-packages/google/cloud/aiplatform/preview/vertex_ray/client_builder.py in __init__(self, address)
     95         if address is None:
     96             persistent_resource_id = self.resource_name.split("/")[5]
---> 97             raise ValueError(
     98                 "[Ray on Vertex AI]: Ray Cluster ",
     99                 persistent_resource_id,

ValueError: ('[Ray on Vertex AI]: Ray Cluster ', 'test-ray', ' Head node is not reachable. Please ensure that a valid VPC network has been specified.')

I have set up VPC peering following the directions here: https://cloud.google.com/vertex-ai/docs/general/vpc-peering
 
I have been stuck on this for more than a week. Any help is appreciated.
1 REPLY

The error message indicates that the client could not reach the head node of the Ray cluster, even though the cluster itself reports State.RUNNING. There are several possible causes to check:

  • Ensure that the VPC peering setup is correctly configured. Double-check that the VPC network specified for the Ray cluster allows traffic from the Colab Enterprise environment.
  • Verify that the firewall rules allow incoming and outgoing traffic for the necessary ports used by Ray.
  • Make sure that the service account used by your Colab environment has the necessary permissions to access resources in the specified project and location.
  • Ensure that the Ray cluster on Vertex AI is in a healthy state and all necessary components are running.
  • Confirm that the address provided to ray.init() is correct and matches the address of the Ray cluster on Vertex AI.
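The network and firewall points above can be probed directly from the notebook with a plain TCP check. This is only a minimal sketch using the Python standard library, not part of the Vertex AI SDK. The helper name can_connect is hypothetical, and which IP address and ports the head node actually exposes is an assumption you need to confirm for your cluster (open-source Ray defaults to 10001 for the client server and 8265 for the dashboard).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration: it only tests raw TCP reachability
    and knows nothing about Ray or Vertex AI.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, calling can_connect("<head-node-internal-ip>", 10001) from Colab Enterprise and getting False would point at the peering or firewall configuration rather than at Ray itself. A True result only shows that the port is open, not that the Ray client handshake will succeed.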

In short, here is a checklist to work through:

  • Double-check the VPC peering setup and firewall rules to ensure they are configured correctly.
  • Verify the permissions of the service account used by Colab Enterprise.
  • Ensure the Ray cluster on Vertex AI is running and accessible.
  • Confirm the correctness of the cluster address provided to ray.init().
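For the last item, a small sanity check on the address string can rule out typos before calling ray.init(). The sketch below assumes the address always has the form used in the question (vertex_ray://projects/.../locations/.../persistentResources/...); the SDK may accept other forms, and the function name parse_vertex_ray_address is hypothetical.

```python
import re

# Pattern inferred from the address in the question; this is an assumption,
# not the SDK's own validation logic.
_VERTEX_RAY_ADDRESS = re.compile(
    r"^vertex_ray://projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/persistentResources/(?P<cluster>[^/]+)$"
)


def parse_vertex_ray_address(address: str) -> dict:
    """Split a vertex_ray:// address into project, location, and cluster name.

    Raises ValueError if the address does not match the assumed format.
    """
    match = _VERTEX_RAY_ADDRESS.match(address)
    if match is None:
        raise ValueError(f"Unexpected address format: {address!r}")
    return match.groupdict()
```

Comparing the returned project, location, and cluster values against what the Cloud console shows for the persistent resource makes it easy to spot a wrong region or a misspelled cluster name.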