Solved: Traffic split setting: no match found

fschestag · 12-22-2022 04:53 AM

Hi all,

I am currently struggling to deploy my endpoint. I always get the following error:
{
  "error": {
    "code": 400,
    "message": "Endpoint projects/helical-history-370511/locations/europe-west1/endpoints/6313430951061880832 doesn't have traffic_split.",
    "status": "FAILED_PRECONDITION"
  }
}

When I try to set the traffic split, it seems that the deployed model is not matched by the command below:

gcloud ai endpoints deploy-model $NEW_ENDPOINT \
  --project='helical-history-370511' \
  --region='europe-west1' \
  --model='1799311196436824064' \
  --display-name='stable-diffusion-v2' \
  --machine-type='n1-standard-8' \
  --accelerator='type=nvidia-tesla-t4,count=1' \
  --deployed-model-id='stable-diffusion-v2' \
  --service-account='3XXXXXXXXX8-compute@developer.gserviceaccount.com' \
  --traffic-split=['stable-diffusion-v2'=100]

Does anybody know what I am doing wrong? The docs at https://cloud.google.com/sdk/gcloud/reference/ai/endpoints/deploy-model state that the DEPLOYED_MODEL_ID needs to be the same for both --deployed-model-id and --traffic-split, which it is. It also matches the name of the model in the model registry.

Any help is greatly appreciated 😄

Cheers,

Friedi

christianpaula

Hi @fschestag,

Welcome to Google Cloud Community!

It looks like you are trying to deploy an endpoint using the gcloud command-line tool, and you are getting an error that says the endpoint doesn't have a traffic split.

The `--traffic-split` flag specifies the traffic split for the endpoint, which determines the percentage of traffic that will be routed to each of the deployed models. The value of the `--traffic-split` flag should be a list of model-id=percentage pairs, separated by commas.
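To make the pair format concrete, here is a minimal sketch of a value for an endpoint that splits traffic between two deployed models. The helper name `build_traffic_split` and the IDs `model-a` and `model-b` are hypothetical, chosen only for illustration. The check that the percentages add up to 100 mirrors the rule in the Vertex AI API reference for `trafficSplit`; the sketch only builds and prints the flag, it does not call gcloud.

```shell
#!/bin/sh
# build_traffic_split is a hypothetical helper, not part of gcloud.
# It joins "id=percent" pairs with commas and checks that the
# percentages add up to 100.
build_traffic_split() {
  total=0
  result=''
  for pair in "$@"; do
    percent=${pair#*=}                  # text after the first "="
    total=$((total + percent))
    result="${result:+$result,}$pair"   # add a comma between pairs
  done
  if [ "$total" -ne 100 ]; then
    echo "traffic percentages sum to $total, expected 100" >&2
    return 1
  fi
  printf '%s\n' "$result"
}

# model-a and model-b are made-up deployed model IDs.
SPLIT=$(build_traffic_split 'model-a=80' 'model-b=20')
echo "--traffic-split=$SPLIT"
# prints: --traffic-split=model-a=80,model-b=20
```

Note that the resulting value contains no square brackets and no quotes around the IDs.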

In your command, you are specifying a traffic split of ['stable-diffusion-v2'=100], which means that 100% of the traffic will be routed to the model with ID 'stable-diffusion-v2'. This model ID should match the model ID that you specified with the `--deployed-model-id` flag.

If the model ID specified in the `--traffic-split` flag does not match the model ID specified in the `--deployed-model-id` flag, the deployment will fail with the error message you provided.

To fix this issue, make sure that the model ID specified in the `--traffic-split` flag is the same as the model ID specified in the `--deployed-model-id` flag. If you want to route all of the traffic to this model, you can set the traffic split to ['stable-diffusion-v2'=100].
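One caveat worth checking: the square brackets in the reference syntax `--traffic-split=[DEPLOYED_MODEL_ID=VALUE,…]` look like documentation notation for a list rather than characters to type. The sketch below reuses the values from the original post and passes the split without brackets. It uses the key `0`, which the Vertex AI deployment docs use in their example (`--traffic-split=0=100`) to refer to the model being deployed by the same command. Whether the custom ID (`stable-diffusion-v2=100`) is also accepted as a key is an assumption I have not verified. `DRY_RUN` and the `ENDPOINT_ID` fallback are hypothetical conveniences so the command can be printed before it is run.

```shell
#!/bin/sh
# NEW_ENDPOINT comes from the original post; ENDPOINT_ID is only a placeholder
# fallback so this sketch can be printed without a real endpoint.
NEW_ENDPOINT="${NEW_ENDPOINT:-ENDPOINT_ID}"

# "0" refers to the model deployed by this very command (per the Vertex AI
# docs example). No square brackets, no quotes around the key.
TRAFFIC_SPLIT='0=100'

# By default the command is only printed. Run with DRY_RUN= (empty) to execute.
DRY_RUN="${DRY_RUN-echo}"

$DRY_RUN gcloud ai endpoints deploy-model "$NEW_ENDPOINT" \
  --project='helical-history-370511' \
  --region='europe-west1' \
  --model='1799311196436824064' \
  --display-name='stable-diffusion-v2' \
  --machine-type='n1-standard-8' \
  --accelerator='type=nvidia-tesla-t4,count=1' \
  --deployed-model-id='stable-diffusion-v2' \
  --service-account='3XXXXXXXXX8-compute@developer.gserviceaccount.com' \
  --traffic-split="$TRAFFIC_SPLIT"
```

As a guess only: the "no match found" in the thread title resembles what zsh reports when an unquoted `[...]` pattern matches no files, which would also go away once the brackets are dropped.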

Thanks

jayadevi

@christianpaula Can you please help with the error below?
ERROR: (gcloud.ai.endpoints.deploy-model) argument --traffic-split: Invalid value [100]]
Command used:
gcloud ai endpoints deploy-model XX --project=XX --region=asia-northeast1 --model=XX --accelerator=type=nvidia-tesla-t4,count=1 --machine-type="n1-highmem-2" --display-name=fine-tuned-flan5 --deployed-model-id=fine-tuned-flan5 --traffic-split=['fine-tuned-flan5'=100]
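A possible reading of this error, offered as a sketch rather than a confirmed diagnosis: the shell removes the inner single quotes but keeps the square brackets, so gcloud receives `[fine-tuned-flan5=100]` as the flag value. If gcloud splits that on `=` (an assumption about its parser), the value part is `100]`, which is not a number, and that would match `Invalid value [100]]` if the message wraps the offending value in brackets. The snippet below only imitates that split with shell parameter expansion; it does not call gcloud.

```shell
#!/bin/sh
# The flag value as gcloud likely receives it: inner quotes removed by the
# shell, square brackets still present.
arg='[fine-tuned-flan5=100]'

key=${arg%%=*}    # everything before the first "="
value=${arg#*=}   # everything after the first "="
echo "key:   $key"      # prints: key:   [fine-tuned-flan5
echo "value: $value"    # prints: value: 100]

# Strip one leading "[" and one trailing "]" to get a plain ID=PERCENT pair.
fixed=${arg#\[}
fixed=${fixed%\]}
echo "--traffic-split=$fixed"   # prints: --traffic-split=fine-tuned-flan5=100
```

If that reading is right, passing `--traffic-split=fine-tuned-flan5=100`, or `--traffic-split=0=100` as in the Vertex AI docs example, should at least get past argument parsing.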