Welcome to the

Google Cloud Community

Meet industry peers, ask questions, collaborate to find answers, and connect with Googlers who are making the products you use every day.

Bronze 1
Since 03-17-2024
04-11-2024

My Stats

  • 5 Posts
  • 0 Solutions
  • 0 Likes given
  • 6 Likes received

Yash2384's Bio

Badges Yash2384 Earned

Recent Activity

I was looking into the code:

    # Set docker and quantization for AWQ quantized models
    VLLM_DOCKER_URI = "us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20231127_0916_RC00"
    quantized_model_id = "TheBloke/Llama-2-70B-chat...
I am using this library to make a prediction request to the model deployed on Vertex AI. I am getting a timeout exception, and I'm not sure whether I need to increase the timeout or to what value. Also, what is the default value? I can find nothing in the do...
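Client-side timeouts like the one described above can be reasoned about independently of the SDK. Below is a minimal, library-agnostic sketch of enforcing a deadline around a blocking prediction call; `fake_predict` and the deadline values are illustrative stand-ins, not Vertex AI defaults (many Google Cloud Python client methods do accept a per-call `timeout=` argument, which is the simpler route when available):

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def predict_with_deadline(predict_fn, payload, timeout_s=30.0):
    """Run a blocking prediction call; raise TimeoutError past timeout_s."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(predict_fn, payload)
        try:
            return future.result(timeout=timeout_s)
        except FutureTimeout:
            raise TimeoutError(f"prediction exceeded {timeout_s}s deadline")

# Stand-in for the real client call (hypothetical, for illustration only).
def fake_predict(payload):
    return {"predictions": [payload["text"].upper()]}

result = predict_with_deadline(fake_predict, {"text": "hello"}, timeout_s=5.0)
```

Note one caveat of this pattern: on timeout the worker thread keeps running in the background, since Python cannot forcibly cancel a blocking call; only the caller stops waiting.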
I've integrated an LLM into the Model Registry using a custom Docker container. The model is hosted correctly, and I can consistently execute prediction requests. However, I occasionally encounter a '503 Service Unavailable' error. This issue be...
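Intermittent 503s are typically transient, and the usual client-side mitigation is to retry with exponential backoff and jitter. A minimal sketch of that pattern, with `TransientServerError` and `flaky_predict` as hypothetical stand-ins for the client library's error type and the real prediction call:

```python
import random
import time

class TransientServerError(RuntimeError):
    """Stand-in for the client library's 503-style error type."""

def call_with_retry(fn, max_attempts=4, base_delay=0.5):
    """Call fn(), retrying transient errors with exponential backoff + jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientServerError:
            if attempt == max_attempts:
                raise
            # Sleep base_delay * 2^(attempt-1), jittered to avoid retry storms.
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5))

# Demo: a call that fails twice with a 503-like error, then succeeds.
calls = {"n": 0}
def flaky_predict():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientServerError("503 Service Unavailable")
    return "ok"

result = call_with_retry(flaky_predict, base_delay=0.01)
```

Retrying blindly is only safe for idempotent requests like predictions; capping attempts (here at 4) keeps a genuinely unavailable backend from tying up the caller indefinitely.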
I've deployed a container hosting a customized model in Vertex AI. I encounter connection timeout exceptions, particularly when there are 5 or more concurrent requests. I'm exploring an alternative approach that is cost-effective and capable of autosc...