> ## Documentation Index
> Fetch the complete documentation index at: https://phidatainc-agui.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

# Tokens-per-minute rate limiting

<img src="https://mintcdn.com/phidatainc-agui/z86_O3EeJ5wD0p21/images/tpm_issues.png?fit=max&auto=format&n=z86_O3EeJ5wD0p21&q=85&s=019b8e444c3b5a2f3ceb8be563d0bd81" alt="Chat with pdf" width="698" height="179" data-path="images/tpm_issues.png" />

If you face any problems with proprietary models (like OpenAI models) where you are rate limited, we provide the option to set `exponential_backoff=True` and to change `delay_between_retries` to a value in seconds (defaults to 1 second).

For example:

```python theme={null}
from agno.agent import Agent
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    description="You are an enthusiastic news reporter with a flair for storytelling!",
    markdown=True,
    exponential_backoff=True,
    delay_between_retries=2
)
agent.print_response("Tell me about a breaking news story from New York.", stream=True)
```

See our [models documentation](/models/overview) for specific information about rate limiting.

In the case of OpenAI, they have tier based rate limits. See the [docs](https://platform.openai.com/docs/guides/rate-limits/usage-tiers) for more information.
