A guide to managing REST API rate limits
API providers use REST API rate limits to control the frequency of client requests to their web servers. This allows the providers to maintain their server's reliability and efficiency and to distribute resources equally among users.
REST API limits differ from API throttling, a more dynamic form of control. While rate limiting sets fixed caps on the number of requests over predetermined intervals, API throttling adjusts the flow of requests based on the current load.
In this article, we'll explore REST API rate limiting use cases, how providers implement them, and the best practices for managing these limits as a consumer.
How do REST API rate limits work?
API rate limits are enforced by tracking requests via identifiers like API keys or IP addresses. When a client exceeds its limit, the API typically responds with a 429 "Too Many Requests" status code.
API rate limits can come in different forms:
- Fixed-window rate limiting limits the number of requests you can make over a specified period, such as a day or a month, to manage the overall load on a server. It's useful for long-term usage control and preventing abuse over extended periods. This method is primarily adopted by cloud services and third-party API providers with tiered usage plans.
- Concurrent rate limiting caps the number of requests a user can have in flight at once. It's useful in scenarios where high levels of parallel processing could overload the system. It's mainly adopted by web servers handling parallel requests.
- Geographic rate limiting limits requests based on geographic location, which can be useful for complying with legal regulations or managing server load. It's usually adopted by content streaming services with regional restrictions.
- Token bucket rate limiting involves allocating a certain number of tokens in a bucket. Each request costs a token, and the bucket is refilled steadily. This approach allows short bursts of high traffic followed by lower sustainable rates. It's usually adopted by streaming services, file downloads, or network routers.
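To make the token bucket model concrete, here's a minimal sketch in Python. The capacity and refill rate are illustrative, not tied to any particular provider:

```python
import time

class TokenBucket:
    """Minimal token bucket: holds up to `capacity` tokens,
    refilled continuously at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start full, allowing an initial burst
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Spend one token if available; return whether the request may proceed."""
        now = self.clock()
        # Refill in proportion to the time elapsed since the last check
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because the bucket starts full, a client can burst up to `capacity` requests at once, but sustained traffic is capped at `refill_rate` requests per second, which is exactly the burst-then-steady behavior described above.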
Related: Everything you need to know about REST API authentication
Why REST API providers use rate limits
Even though a rate limiting notification might be frustrating for those on the receiving end, rate limits are important to ensure that the API you're using can perform reliably.
Rate limits ensure that resources are distributed fairly among all users. If a few users consume excessive resources, the API service could degrade for others. Moreover, by avoiding excessive loads from simultaneous requests, rate limits help servers perform optimally. This reduces latency and ensures the API remains efficient and responsive during high demand so that API providers can meet SLA commitments regarding uptime and response times.
Rate limits also safeguard an API from exploitation for malicious purposes, such as abusive data scraping and denial-of-service (DoS) attacks. These attacks often involve making a large number of requests in a short time. Hence, by implementing rate limiting, API providers can effectively prevent these attacks from overloading their servers and disrupting their services.
Lastly, implementing rate limits helps manage the costs of data transmission and server maintenance, which can be significant at high volumes. By preventing infrastructure overload during demand spikes, rate limits let API providers handle service growth predictably. Usage data from these spikes also reveals patterns in user requests, which helps providers plan capacity, allocate resources more efficiently, and scale smoothly.
Examples of REST API rate limits
Many popular B2B software solutions implement API rate limits. Let's take a look at how three popular solutions—QuickBooks, HubSpot, and Salesforce—do so.
QuickBooks Online
QuickBooks—a platform that provides online accounting software for small businesses—implements fixed-window rate limiting to ensure fair usage among its users.
The platform tracks requests by realm ID and allows only 500 requests per minute per ID for requests made to the QuickBooks Online API endpoints. If you access other endpoints, the API limits your requests after a combined 500 requests per minute across all endpoints or 500 requests per minute to a single endpoint, whichever occurs first.
HubSpot
HubSpot—a customer platform that offers a variety of products for your GTM teams—combines frequency-based limiting and fixed-window limiting to prevent spikes in traffic and ensure a steady, manageable API load despite the large volume of real-time data it deals with.
HubSpot has a 10-second limit that ranges from 100 to 200 requests and a daily limit that ranges from 250,000 to 1,000,000. Your plan determines where you fall in the range. The free starter plan, for instance, would be at the lowest point in this range.
Salesforce
Salesforce—best known for its widely used CRM software—provides a baseline of 15,000 API requests per 24-hour period for most Salesforce editions, with the ability to purchase additional capacity.
That said, Salesforce uses a flexible and sophisticated approach to rate limiting: Its API continuously recalculates the number of available requests based on usage in the past 24 hours.
This approach provides more flexibility than a strict daily cap as it can accommodate bursts of activity as long as they are balanced by quieter periods. Accommodating varying levels of demand also makes the API suitable for a broad spectrum of businesses with different sizes and needs.
Related: Examples of REST API integration
Best practices for managing REST API rate limits
Here are some best practices to help you manage rate limits effectively:
Understand the rate limits
Efficient API management requires studying the relevant platform's API documentation to understand its rate-limiting rules.
As you've seen, APIs have different rate limits, and limits can even vary between different API endpoints. For instance, an API might allow more requests for data retrieval than actions that alter data. Some APIs also use dynamic rate limiting influenced by subscription levels.
Understanding these details from the get-go helps ensure that your applications' requests comply with the limits.
Use HTTP headers
Most APIs include HTTP headers with details about your current rate limit status. Using these response headers lets you get real-time rate limit information.
Headers such as X-RateLimit-Limit inform you of the maximum number of requests allowed in a given time frame, while X-RateLimit-Remaining shows the number of requests you can still make before hitting the limit. The X-RateLimit-Reset header is particularly useful as it indicates the time when the rate limit will reset, allowing you to plan your subsequent requests accordingly.
Closely monitoring these headers allows you to dynamically adjust your API use to stay within the limits.
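As a sketch, a small helper can pull these headers out of a response and flag when you're approaching the limit. Header names vary by provider; the `X-RateLimit-*` names below follow the convention described above, and the threshold is an assumed safety margin:

```python
def rate_limit_status(headers):
    """Extract common rate-limit headers from a response's header mapping.
    Missing headers are returned as None rather than raising."""
    limit = headers.get("X-RateLimit-Limit")
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")  # often a Unix timestamp
    return {
        "limit": int(limit) if limit is not None else None,
        "remaining": int(remaining) if remaining is not None else None,
        "reset_at": int(reset) if reset is not None else None,
    }

def should_pause(status, threshold=1):
    """Pause sending when the remaining quota drops below a safety threshold."""
    remaining = status["remaining"]
    return remaining is not None and remaining < threshold
```

In practice you'd call `rate_limit_status(response.headers)` after each request and, when `should_pause` returns true, wait until the time indicated by `reset_at` before resuming.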
Related: Best practices for managing API rate limits
Categorize endpoints by their rate limits
Tailoring your request frequency to each API endpoint's specific needs and limitations gives you granular control over your API requests and lets you optimize your API interactions.
As you've learned, not all endpoints are created equal. Some might have tighter rate limits due to the nature of the data they provide or the computational load they carry. For instance, an endpoint providing real-time data might have stricter limits than one delivering less time-sensitive information.
You can minimize the risk of hitting these limits by categorizing endpoints based on their rate limit constraints and adjusting your request strategy accordingly.
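One lightweight way to apply this categorization is a lookup table of per-endpoint limits, from which you derive the minimum spacing between requests. The endpoints and numbers below are hypothetical; real values come from your provider's documentation:

```python
# Hypothetical per-endpoint limits, in requests per minute
ENDPOINT_LIMITS = {
    "/realtime/quotes": 60,    # stricter: real-time data
    "/reports/monthly": 600,   # looser: less time-sensitive
}
DEFAULT_LIMIT = 300  # assumed fallback for uncategorized endpoints

def min_interval(endpoint):
    """Seconds to wait between requests so an endpoint's limit is never exceeded."""
    per_minute = ENDPOINT_LIMITS.get(endpoint, DEFAULT_LIMIT)
    return 60.0 / per_minute
```

A request scheduler can then consult `min_interval` per endpoint instead of applying one global delay, so fast endpoints aren't slowed down by the strictest limit.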
Handle rate limits gracefully
Create mechanisms in your application to respond when a rate limit is reached.
For instance, most APIs return specific HTTP status codes, such as 429 Too Many Requests, to signal that a rate limit has been exceeded. When you detect such a code, pause further requests to that endpoint. Ideally, wait for the period given by the Retry-After header, which indicates, in seconds, how long you should wait before retrying.
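That retry logic can be sketched as a small wrapper. The `send` callable stands in for whatever HTTP call your application makes (any function returning an object with `status_code` and `headers`), and the exponential backoff fallback is an assumption for APIs that omit Retry-After:

```python
import time

def request_with_retry(send, max_retries=3, default_backoff=1.0, sleep=time.sleep):
    """Call send(); on a 429 response, wait for Retry-After seconds
    (or an exponential backoff when the header is absent) and retry."""
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 429:
            return response
        retry_after = response.headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)  # header value is in seconds
        else:
            delay = default_backoff * (2 ** attempt)  # 1s, 2s, 4s, ...
        if attempt < max_retries:
            sleep(delay)
    return response  # still rate limited after all retries
```

Injecting `sleep` keeps the logic testable; in production you'd simply use the default `time.sleep`. Note that Retry-After may also arrive as an HTTP date rather than seconds, which this sketch doesn't handle.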
Approach requests strategically
Understanding the time window associated with the rate limit (e.g. per hour, per day) allows you to plan your requests accordingly.
For APIs that reset their limits at a specific time, schedule your high-volume requests soon after the reset to maximize usage. In cases where limits are rolling (refreshing continuously over a period), distribute your requests evenly to avoid sudden spikes.
Additionally, if the API provides real-time feedback on limit usage via headers, dynamically adjust your request frequency based on this data.
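For rolling limits, spreading requests evenly comes down to spacing them by `window / limit` seconds. Here's a minimal sketch with an injectable `sleep` so the pacing is testable; the limit values are illustrative:

```python
import time

def paced_calls(call, n, limit, window_seconds, sleep=time.sleep):
    """Run `call(i)` for i in 0..n-1, spaced evenly so no more than
    `limit` calls ever fall within any rolling `window_seconds` window."""
    interval = window_seconds / limit
    results = []
    for i in range(n):
        results.append(call(i))
        if i < n - 1:
            sleep(interval)  # no trailing sleep after the final call
    return results
```

For example, with a limit of 60 requests per 60-second window, requests are spaced one second apart, which avoids the sudden spikes a naive loop would produce.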
Frequently monitor where you stand against the limits
Regularly checking the number of requests made against limits and understanding how close you are to reaching them helps you stay compliant.
Use monitoring tools or built-in analytics provided by the API to get real-time insights into your API consumption and alerts when you're nearing your limit. Some providers even allow you to set up automated alerts that are sent to your preferred platform, such as email or Slack.
If your application's needs evolve and you consistently hit rate limits, it might be time to renegotiate them with the API provider.
Related: The top best practices for building and maintaining API integrations
Final thoughts
When it comes to product integrations, it can be extremely time-consuming and technically complex to build and maintain customer-facing integrations that comply with third-party API providers' varying rate limits.
Merge, the leading product integration platform, helps simplify this work at scale; by building to Merge's Unified API, you can offer hundreds of integrations across key software categories—from HRIS to ATS to CRM—and only have to follow one rate limit.
To learn more about how Merge's Unified API handles rate limiting, among other key items, you can schedule a demo with one of our integration experts.