Rate Limits

At AI Ark, we manage API usage through rate limits to ensure fair access and optimal performance for all users. This document outlines the key concepts and terminology behind how our system handles API requests.

Rate Limits

Rate limits control the number of API requests a user can make per second. This prevents abuse and ensures our services remain responsive and available.

At AI Ark, we use the term concurrency interchangeably with rate limit. Note that this is not the same as concurrency in programming, even though the two ideas are related. In this context, concurrency refers to a per-second rate limit enforced by our API gateway, not to the number of requests in flight at the same time.

Each service has its own default concurrency (rate limit), ensuring tailored performance and reliability. These rate limits are customizable based on user needs and usage patterns.
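
A client can stay under a per-second limit by spacing out its own requests. The following is a minimal sketch, not an official AI Ark client: the class name PerSecondThrottle and the injectable clock and sleep parameters are assumptions made for illustration, and the limit you pass in should be the value shown for your service in the developer portal.

```python
import time


class PerSecondThrottle:
    """Spaces calls at least 1/limit seconds apart (hypothetical helper)."""

    def __init__(self, limit, clock=time.monotonic, sleep=time.sleep):
        if limit <= 0:
            raise ValueError("limit must be positive")
        # Minimum gap between two consecutive requests, in seconds.
        self.min_interval = 1.0 / limit
        self._clock = clock
        self._sleep = sleep
        # Earliest time the next request may start; no restriction at first.
        self._next_allowed = float("-inf")

    def wait(self):
        """Block until the next request may start; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot relative to when this request starts.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

To use it, create one throttle per service, for example PerSecondThrottle(4) if your limit were 4 requests per second, and call wait() immediately before each API request. Even spacing is a conservative choice: it gives up short bursts in exchange for never sending more requests in a second than the configured limit.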

Default Concurrency

Every API service under AI Ark has a different default concurrency limit. These limits are set to balance performance and availability. You can find the specific concurrency limit for each service in the service quotas, under the billing section of the developer portal.

Important Notes on Usage

  • Rate Limits: If you exceed your allocated quota, API calls will return a 429 Too Many Requests error.
  • Rate Limit Reset: Your rate limits reset every 60 seconds.
  • Higher Limits: Need more capacity? Contact our team to discuss upgrading your plan.
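
Because a 429 Too Many Requests response signals a temporary condition, a client can usually recover by waiting and retrying. The sketch below shows one common approach, exponential backoff. It is an illustration under stated assumptions rather than AI Ark's recommended client: call_with_retry and its send argument are hypothetical names, and send stands in for whatever function performs your actual API request and returns a (status_code, body) pair. The 60-second cap on the wait is chosen to match the reset interval noted above.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    Returns the last (status_code, body) pair received.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    # Every attempt was rate limited; hand the final 429 back to the caller.
    return status, body
```

With the defaults, a request that keeps receiving 429 is retried after waits of 1, 2, 4, and 8 seconds before the final 429 is returned to the caller. If retries are exhausted regularly, that is a sign the workload needs a higher limit rather than more retries.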