I am using Vertex AI online prediction with a custom container. To save on autoscaling time, I am using the mutateDeployedModel API (https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints/mutateDeploye...).
The strange issue I am facing is that I can change the minimum replica count a few times, and then it stops working.
The API call succeeds and replica_target increases, but the actual replica count does not increase, as shown in the screenshot below:
Thanks in advance!
Regards,
Anil
It is possible that the deployment received so many requests that the replica count change is timing out. I also believe the actual replica count depends on the resources available at that moment, so it may not be reaching the target because of resource availability. Even so, it will keep trying to reach the target.
I would recommend filing a support case here, as the support team has a better view of your logs and resource usage.
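If it helps when opening the case, here is a minimal sketch of how the mutateDeployedModel request for a minimum replica change could be put together, so you can compare it with what you are actually sending. It only builds the URL and JSON body and does not send anything. The function name and all the IDs are placeholders I made up, and the field names (`deployedModel`, `dedicatedResources.minReplicaCount`, `updateMask`) follow my reading of the REST reference, so please verify them against the docs.

```python
import json


def build_mutate_request(project, location, endpoint_id, deployed_model_id,
                         min_replicas, max_replicas=None):
    """Build (url, body) for a mutateDeployedModel call.

    Hypothetical helper for illustration only; it does not send the request.
    """
    endpoint = f"projects/{project}/locations/{location}/endpoints/{endpoint_id}"
    url = (f"https://{location}-aiplatform.googleapis.com/v1/"
           f"{endpoint}:mutateDeployedModel")

    # Only the fields named in updateMask are expected to be changed.
    resources = {"minReplicaCount": min_replicas}
    mask = ["dedicatedResources.minReplicaCount"]

    if max_replicas is not None:
        if max_replicas < min_replicas:
            raise ValueError("max_replicas must be >= min_replicas")
        resources["maxReplicaCount"] = max_replicas
        mask.append("dedicatedResources.maxReplicaCount")

    body = {
        "deployedModel": {
            "id": deployed_model_id,
            "dedicatedResources": resources,
        },
        # In the JSON encoding a field mask is a comma-separated string.
        "updateMask": ",".join(mask),
    }
    return url, body


if __name__ == "__main__":
    # All values below are placeholders, not real resources.
    url, body = build_mutate_request(
        "my-project", "us-central1", "1234567890", "9876543210",
        min_replicas=3, max_replicas=5)
    print(url)
    print(json.dumps(body, indent=2))
```

Logging the exact body of each call next to the replica_target and actual replica count you see afterwards should give support a clear timeline of when the calls stopped taking effect.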