How to run the tutorial on TPUs?

Apologies for the beginner's questions. I just wanted to get started with Cloud TPUs, but nothing I try seems to work.

I've been trying to follow https://cloud.google.com/tpu/docs/run-calculation-jax, but it fails at the first command to create a TPU VM.

I bumped into several issues:

Attempt 1:

$ gcloud compute tpus tpu-vm create gomlx-tpu \
--zone=europe-west4-a \
--accelerator-type=v2-8 \
--version=tpu-ubuntu2204-base
Create request issued for: [gomlx-tpu]
Waiting for operation [projects/gomlx-392709/locations/europe-west4-a/operations/operation-1690529132849-60186fc722abe-8f3e7b49-d7def2be] to complete...failed.
ERROR: (gcloud.compute.tpus.tpu-vm.create) {
"code": 7,
"message": "User does not have permission to access the OS image used by this Cloud TPU runtime version. [EID: 0x2770e45e99305f9e]"
}

What is the issue with permission to use the Ubuntu image? Where do I get that permission?

Attempts 2, 3 and 4:

$ gcloud compute tpus tpu-vm create gomlx-tpu \
--zone=europe-west4-a \
--accelerator-type=v2-8 \
--version=tpu-ubuntu2204-base
Create request issued for: [gomlx-tpu]
Waiting for operation [projects/gomlx-392709/locations/europe-west4-a/operations/operation-1690529420564-601870d985980-063be324-16c82c83] to complete...failed.
ERROR: (gcloud.compute.tpus.tpu-vm.create) {
"code": 8,
"message": "There is no more capacity in the zone \"europe-west4-a\"; you can try in another zone where Cloud TPU Nodes are offered (see https://cloud.google.com/tpu/docs/regions) [EID: 0x7f281e57cca4ac51]"
}

Then I tried different regions, with similar results. Do I need to try every combination of TPU type and zone myself? Couldn't there be a page that lists what is available instead?
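
For what it's worth, the manual retrying can at least be scripted. The following is only a sketch: the function name and the GCLOUD override are made up for illustration, and the zone list in the usage comment is an assumption that should be replaced with zones that actually offer the chosen accelerator type. It runs the same create command as above once per zone and stops at the first zone where it succeeds.

```shell
# Overridable so the loop can be dry-run (e.g. GCLOUD=true) without gcloud.
GCLOUD="${GCLOUD:-gcloud}"

# Hypothetical helper: try each zone in turn, print the zone that worked.
# Arguments: TPU name, accelerator type, runtime version, then zones.
create_tpu_in_first_available_zone() {
  local name="$1"
  local accel="$2"
  local version="$3"
  shift 3
  local zone
  for zone in "$@"; do
    echo "Trying ${zone} ..." >&2
    # Send gcloud's own output to stderr so stdout carries only the zone.
    if "$GCLOUD" compute tpus tpu-vm create "$name" \
         --zone="$zone" \
         --accelerator-type="$accel" \
         --version="$version" >&2; then
      echo "$zone"
      return 0
    fi
  done
  echo "No zone in the list had capacity." >&2
  return 1
}

# Usage (zone list is an assumption, adjust as needed):
#   create_tpu_in_first_available_zone gomlx-tpu v2-8 tpu-ubuntu2204-base \
#     us-central1-b us-central1-c europe-west4-a
```

This does not tell you in advance where capacity exists; it only automates the trial and error, and a permission or quota error will fail in every zone the same way.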

Note that the link given (https://cloud.google.com/tpu/docs/regions) doesn't list availability.
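
As far as I can tell, gcloud can list which accelerator types each zone offers, although that is not the same as live capacity. A minimal sketch, assuming the `gcloud compute tpus accelerator-types list` and `gcloud compute tpus locations list` commands are available in your gcloud version (the function name and the GCLOUD override are hypothetical):

```shell
# Overridable so the function can be dry-run (e.g. GCLOUD=echo) without gcloud.
GCLOUD="${GCLOUD:-gcloud}"

# Hypothetical helper: print the accelerator types offered in each given zone.
list_tpu_types() {
  local zone
  for zone in "$@"; do
    echo "== ${zone} =="
    "$GCLOUD" compute tpus accelerator-types list --zone="$zone"
  done
}

# Usage for a couple of zones:
#   list_tpu_types us-central1-b europe-west4-a
# Usage for every TPU location (the locationId field name is an assumption):
#   list_tpu_types $("$GCLOUD" compute tpus locations list --format="value(locationId)")
```

A zone that lists v2-8 can still return the "no more capacity" error above, so this narrows the search rather than answering it.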

Attempts 5, 6, 7, ...:

So I manually created a TPU "something" in the console UI (a "TPU Node"? What is this? The term is not linked in the console). But the "SSH" link just says "This TPU's architecture is not TPU VM". So how do I SSH to it? Or how else do I interact with it to continue with the tutorial?

So I tried the command line again:

$ gcloud compute tpus tpu-vm ssh gomlx-tpu --zone=us-central1-b
ERROR: (gcloud.compute.tpus.tpu-vm.ssh) Invalid value for [TPU]: this command is only available for Cloud TPU VM nodes. To access this node, please see https://cloud.google.com/tpu/docs/creating-deleting-tpus.

Following the documentation, I tried:

$ gcloud compute ssh gomlx-tpu --zone=us-central1-b
ERROR: (gcloud.compute.ssh) Could not fetch resource:
- The resource 'projects/gomlx-392709/zones/us-central1-b/instances/gomlx-tpu' was not found

But it does exist: I can see it in my console.

Thanks in advance for any pointers!

3 REPLIES

Check your quotas. In my case I had to request TPU quota by emailing Google Cloud at cloud-tpu-pm-team@google.com

Regarding your question about SSH - Cloud TPU has two different VM architectures: TPU Nodes and TPU VMs. It sounds like you created a TPU with the Node architecture in the console. Newer TPU versions don't support the TPU Node architecture.

SSHing to a TPU with the Node architecture is not supported in the console. You can SSH using the command line, but the command is different for TPU Nodes: gcloud compute ssh <tpu-name>. See Connecting to a Cloud TPU for more info.
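
If the Node architecture is not actually needed, one possible way out is to delete the Node and recreate it as a TPU VM, so that the tutorial's `tpu-vm` commands apply. This is only a sketch under that assumption: the function name and the GCLOUD override are hypothetical, and the arguments in the usage comment are the values from the question.

```shell
# Overridable so the function can be dry-run (e.g. GCLOUD=echo) without gcloud.
GCLOUD="${GCLOUD:-gcloud}"

# Hypothetical helper: delete an existing TPU Node, then create a TPU VM
# with the same name in the same zone. Stops if the delete fails.
recreate_as_tpu_vm() {
  local name="$1"
  local zone="$2"
  local accel="$3"
  local version="$4"
  # --quiet skips the interactive confirmation prompt.
  "$GCLOUD" compute tpus delete "$name" --zone="$zone" --quiet &&
  "$GCLOUD" compute tpus tpu-vm create "$name" \
    --zone="$zone" \
    --accelerator-type="$accel" \
    --version="$version"
}

# Usage:
#   recreate_as_tpu_vm gomlx-tpu us-central1-b v2-8 tpu-ubuntu2204-base
#   gcloud compute tpus tpu-vm ssh gomlx-tpu --zone=us-central1-b
```

The create step can still hit the capacity or permission errors shown in the question; the sketch only changes which architecture is requested.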

 

Hey @jan0000, I am facing the same issue while creating a TPU v5 VM in the us-west4-a zone. Did you solve it? Please let me know.