Traffic split setting: no match found

Hi all,

I am currently struggling to deploy my endpoint. I always get the following error: 
{
  "error": {
    "code": 400,
    "message": "Endpoint projects/helical-history-370511/locations/europe-west1/endpoints/6313430951061880832 doesn't have traffic_split.",
    "status": "FAILED_PRECONDITION"
  }
}

 

When I try to set the traffic split, it seems that the deployed model is not matched by the command below:

gcloud ai endpoints deploy-model $NEW_ENDPOINT \
--project='helical-history-370511' \
--region='europe-west1' \
--model='1799311196436824064' \
--display-name='stable-diffusion-v2' \
--machine-type='n1-standard-8' \
--accelerator='type=nvidia-tesla-t4,count=1' \
--deployed-model-id='stable-diffusion-v2' \
--service-account='3XXXXXXXXX8-compute@developer.gserviceaccount.com' \
--traffic-split=['stable-diffusion-v2'=100]
 
Does anybody know what I am doing wrong? The docs at https://cloud.google.com/sdk/gcloud/reference/ai/endpoints/deploy-model state that the DEPLOYED_MODEL_ID needs to be the same for both --deployed-model-id and --traffic-split, which it is. It also matches the name of the model in the model registry.
 
Any help is greatly appreciated 😄 
Cheers,
Friedi

 


Hi @fschestag,

Welcome to Google Cloud Community!

It looks like you are trying to deploy an endpoint using the gcloud command-line tool, and you are getting an error that says the endpoint doesn't have a traffic split.
 
The `--traffic-split` flag specifies the traffic split for the endpoint, which determines the percentage of traffic that will be routed to each of the deployed models. The value of the `--traffic-split` flag should be a list of model-id=percentage pairs, separated by commas.
 
In your command, you are specifying a traffic split of ['stable-diffusion-v2'=100], which means that 100% of the traffic will be routed to the model with ID 'stable-diffusion-v2'. This model ID should match the model ID that you specified with the `--deployed-model-id` flag.
 
If the model ID specified in the `--traffic-split` flag does not match the model ID specified in the `--deployed-model-id` flag, the deployment will fail with the error message you provided.
 
To fix this issue, make sure that the model ID specified in the `--traffic-split` flag is the same as the model ID specified in the `--deployed-model-id` flag. If you want to route all of the traffic to this model, you can set the traffic split to ['stable-diffusion-v2'=100].
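For reference, here is a minimal sketch of the same deployment with the traffic split written without literal brackets. As far as I can tell, the square brackets in the gcloud reference (`--traffic-split=[DEPLOYED_MODEL_ID=VALUE,...]`) are notation for a list, not characters to type. Typed literally, zsh would try to expand them as a glob pattern, and its "no matches found" message may be what the thread title refers to (that part is a guess). The sketch makes a few assumptions: the key `0` is used because the Vertex AI API describes it as the placeholder for the model being deployed, `--deployed-model-id` and `--service-account` are left out, and the endpoint ID is taken from the error message above. The `RUN` variable is a hypothetical dry-run switch, not a gcloud feature.

```shell
# Dry run by default: prints the command instead of calling gcloud.
# Start the script with RUN= (set but empty) to execute it for real.
RUN="${RUN-echo}"

# Assumption: the endpoint ID from the error message in the question.
ENDPOINT_ID='6313430951061880832'
# Assumption: key "0" stands for the model being deployed by this command.
TRAFFIC_SPLIT='0=100'

$RUN gcloud ai endpoints deploy-model "$ENDPOINT_ID" \
  --project='helical-history-370511' \
  --region='europe-west1' \
  --model='1799311196436824064' \
  --display-name='stable-diffusion-v2' \
  --machine-type='n1-standard-8' \
  --accelerator='type=nvidia-tesla-t4,count=1' \
  --traffic-split="$TRAFFIC_SPLIT"
```

Whether a custom `--deployed-model-id` can also be used as the key in the same call is not something verified here, so the sketch lets the service assign the ID.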
 
Thanks

@christianpaula Can you please help with the error below?
ERROR: (gcloud.ai.endpoints.deploy-model) argument --traffic-split: Invalid value [100]]
Command used:
gcloud ai endpoints deploy-model XX --project=XX --region=asia-northeast1 --model=XX --accelerator=type=nvidia-tesla-t4,count=1 --machine-type="n1-highmem-2" --display-name=fine-tuned-flan5 --deployed-model-id=fine-tuned-flan5 --traffic-split=['fine-tuned-flan5'=100]
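
The "Invalid value [100]]" message looks consistent with the brackets being passed through literally: the shell removes the single quotes, and a KEY=VALUE split then leaves `100]` as the value, which is not an integer. The sketch below only imitates that split in plain shell. The `show_split` helper is hypothetical and gcloud's real parser is not shown, so treat it as an illustration of the likely cause rather than a reproduction.

```shell
#!/bin/sh
set -f  # disable globbing so the brackets reach the function untouched

# Hypothetical helper: shows how a naive KEY=VALUE split would see
# one --traffic-split argument after the shell has processed it.
show_split() {
  value="${1#--traffic-split=}"   # drop the flag name
  key="${value%%=*}"              # text before the first "="
  pct="${value#*=}"               # text after the first "="
  case "$pct" in
    ''|*[!0-9]*) verdict="not an integer" ;;
    *)           verdict="ok" ;;
  esac
  printf 'key=%s value=%s (%s)\n' "$key" "$pct" "$verdict"
}

# The form used in the command above: quotes vanish, brackets stay.
bad=$(show_split --traffic-split=['fine-tuned-flan5'=100])
# The bracket-free form, using "0" as the key for the new model.
good=$(show_split --traffic-split=0=100)

printf '%s\n%s\n' "$bad" "$good"
```

If that reading is right, dropping the brackets and quotes (for example `--traffic-split=0=100`) should at least get past argument parsing.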