Vertex AI RESOURCE_EXHAUSTED error while trying the MedLM model

Hi, I want to try out the new MedLM model. However, when I run the API code, I receive this error:

{'error': {'code': 429, 'message': 'Quota exceeded for aiplatform.googleapis.com/online_prediction_requests_per_base_model with base model: MedLM. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/quotas.', 'status': 'RESOURCE_EXHAUSTED'}}

 


I am also having this issue. I cannot submit a quota increase request (I can't find the MedLM models). Furthermore, the page https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai states that the rate limit should be 3 requests per minute in us-central1.

How did you get access to MedLM? Can you please guide me?

What code are you using?

It looks like you've hit the quota limit for making requests to the MedLM model via the API. This means you've exceeded the maximum number of requests allowed within a certain timeframe. To resolve this, you'll need to request a quota increase from Google Cloud Platform (GCP).

1. Visit the GCP Console, navigate to the quotas page for your project, and confirm the current quota limits for the MedLM model.
2. Submit a quota increase request through the GCP Console.
3. Provide the reasoning for your increase request, detailing why you need more requests for the MedLM model.

Once you've submitted the request, it may take some time for Google to review and approve it. They'll assess the request based on your usage patterns and the available resources.

While waiting for the quota increase, you might want to optimize your usage or temporarily switch to another model if available within your existing quota to continue your work.
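One way to "optimize your usage" in the meantime is to retry quota errors with exponential backoff instead of failing on the first 429. The following is only a minimal sketch, not an official Vertex AI client: `send_request` is a hypothetical zero-argument function that you would write yourself to call the MedLM endpoint and return the parsed JSON response as a dict, and the 20-second base delay is an assumption derived from the documented limit of 3 requests per minute. Retrying will not help if the project has no MedLM quota at all.

```python
import random
import time


def is_quota_error(response):
    """Return True if the response dict looks like the 429 error quoted in this thread."""
    error = response.get("error") if isinstance(response, dict) else None
    if not error:
        return False
    return error.get("code") == 429 or error.get("status") == "RESOURCE_EXHAUSTED"


def call_with_backoff(send_request, max_attempts=5, base_delay=20.0, sleep=time.sleep):
    """Call send_request() until it stops returning a quota error or attempts run out.

    send_request is a hypothetical callable (an assumption, not a Vertex AI API)
    that performs one prediction request and returns the parsed JSON as a dict.
    """
    response = None
    for attempt in range(max_attempts):
        response = send_request()
        if not is_quota_error(response):
            return response
        if attempt < max_attempts - 1:
            # Wait base_delay, 2x, 4x, ... plus up to 1 second of random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)
    # Every attempt hit the quota; hand back the last error for the caller to inspect.
    return response
```

If the last response is still a quota error after all attempts, the limit is probably not a per-minute rate issue, and a quota increase or allow-listing is the remaining route.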

Remember, GCP's quotas are in place to ensure fair usage and to manage resources effectively. If you consistently need more quota, demonstrating why and how you're using the service might help expedite the approval process for an increased limit.

How do I get access to MedLM in the first place? Please help.

 

Can you possibly help me with this request? I've gotten a response from @Roderick where he mentioned getting me in touch with the right folks. 

https://www.googlecloudcommunity.com/gc/AI-ML/MedLM-Access/m-p/684309

Any luck?

 

Similar to the original poster, I get this error on my very first request. The docs state that the quota should be 3 requests/minute, but this does not seem to be true.

I've been in touch with the GCP team (product manager + engineer). They linked me to this resource. Apparently the models are not open source at all: you need to get allow-listed, but this is very selective and is contingent on an intensive review of how you will be using the models.