7 API rate limit best practices worth following

As your organization builds an integration with an API, you’ll need to ensure that it abides by the API provider’s rate limit policies.

Violating them can lead the provider to throttle your requests, block you from accessing their endpoints for a defined period of time, and/or charge you more.

To help you avoid these consequences, we’ll break down several best practices for staying within your rate limit for a given window.

Understand your specific rate limits 

Rate limits don’t just vary from application to application; they can also vary based on the endpoints you’re interested in, whether your requests are authenticated or not, the subscription you’re on, etc.

GitHub, a developer platform, states in their API docs that authenticated requests have a higher rate limit than unauthenticated requests

Given all the variables that can influence a provider’s rate limits, it’s worth taking a close look at their API documentation to understand your specific situation.


Use webhooks when they’re available

Polling API endpoints to see if a resource has been modified, added, or removed can lead to countless wasted API calls. This is especially true when the requested resources change infrequently.

Webhooks let you avoid these calls: the application sends you a notification in real time once the event you care about (e.g., a new lead gets created in a CRM) is detected.

Note that not all applications support webhooks for the resources you care about. So, as with the previous best practice, review the application’s documentation to confirm which events its webhooks cover.

GitHub documents its webhooks extensively, so API consumers can confirm which events are supported and learn how to create and troubleshoot webhooks effectively
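To make the contrast with polling concrete, here is a minimal sketch of the receiving side of a webhook. It assumes a hypothetical JSON payload with `event` and `data` fields and a hypothetical `handle_webhook` helper; real providers define their own payload shapes, and most also expect you to verify a signature, which this sketch leaves out.

```python
import json


def handle_webhook(body, handlers):
    """Dispatch one webhook delivery to the handler registered for its event.

    Assumes a hypothetical payload shaped like
    {"event": "lead.created", "data": {...}}; real providers differ.
    Returns an HTTP-style status code for the receiver to send back.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return 400  # Body is not valid JSON (or not valid UTF-8).
    if not isinstance(payload, dict):
        return 400
    handler = handlers.get(payload.get("event"))
    if handler is None:
        return 204  # An event we do not care about: acknowledge and ignore.
    handler(payload.get("data", {}))
    return 200
```

Your code only runs when the provider reports a change, so no requests are spent checking for updates that have not happened.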

Adopt exponential backoffs to minimize your rate limit errors

Exponential backoffs, which lengthen the wait before each retry every time a rate limit error occurs, aren’t a foolproof solution. That said, they can help minimize the number of times that you exceed an API provider’s limits in a given window.

You can start by doubling the waiting period after every failed attempt. If that proves insufficient, you can use a larger multiplier so that the delay grows faster.
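As a rough illustration of the doubling strategy, the sketch below retries a request with growing delays. `RateLimitError`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and the defaults (1 second base, a factor of 2, a 60 second cap) are assumptions you would tune per provider. The optional random jitter is an addition that is not described above; it is a common way to keep many clients from retrying at the same moment.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error a client raises when the API answers HTTP 429."""


def backoff_delays(base=1.0, factor=2.0, max_delay=60.0, retries=5):
    """Return the wait (in seconds) before each retry: base, base*factor, ..."""
    return [min(max_delay, base * factor ** attempt) for attempt in range(retries)]


def call_with_backoff(request, retries=5, base=1.0, factor=2.0,
                      max_delay=60.0, sleep=time.sleep, jitter=True):
    """Call request() and retry with growing delays on RateLimitError."""
    for delay in backoff_delays(base, factor, max_delay, retries):
        try:
            return request()
        except RateLimitError:
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, delay) if jitter else delay)
    # Final attempt: let the error propagate if the limit is still exceeded.
    return request()
```

Raising `factor` above 2 gives the more aggressive growth mentioned above, and `max_delay` keeps a long run of failures from producing very long waits.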


Track your API usage through the provider’s dashboard and/or endpoint

You can proactively stay on top of your rate limits in a given window by using the API providers’ dashboards. Among other insights, they can show you how many API requests you’ve made in a given timeframe, along with how many requests you have available.

Similarly, some API providers let you call specific endpoints to get in-depth insights on your rate limit for a certain window. 

For example, you can make the following API call to GitHub’s /rate_limit endpoint with cURL to see the following for the current window: the number of requests you’ve used, the number of requests remaining, your total request limit, and when the window resets.


curl -L \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer <YOUR-TOKEN>" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://api.github.com/rate_limit
  

Note: Many API providers report the API calls remaining in the current rate limit window, when the window resets, and the total number of requests allowed during a window within their response headers. When that's the case, you usually won't need to call a dedicated endpoint like the one referenced above.
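Here is a small sketch of reading those response headers to decide whether to pause. It assumes GitHub-style header names (`x-ratelimit-remaining`, and `x-ratelimit-reset` as a UTC epoch timestamp); other providers may use different names or formats, and `seconds_until_reset` is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to pause before the next request, based on headers.

    Assumes GitHub-style names: x-ratelimit-remaining and x-ratelimit-reset
    (reset is a UTC epoch timestamp). Other providers may differ.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = int(lowered.get("x-ratelimit-remaining", 1))
    reset_at = int(lowered.get("x-ratelimit-reset", 0))
    if remaining > 0:
        return 0.0  # Requests are still available in this window.
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)
```

Calling this after every response lets you wait out the window before you send a request that would be rejected.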

Cut down on your API requests by batching them

You can make fewer requests while still performing the same set of actions on resources by grouping, or batching, your requests.

However, not all API providers support batch requests, so you’ll need to review their API documentation to confirm that they do. In addition, API providers might have specific requirements, such as a limit on the number of requests per batch or a maximum payload size per batch. Research and understand these requirements for each API provider before you try this approach.
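As a sketch of how batching cuts down on calls, the helpers below split a list of items into groups that respect a per-batch limit. The limit of 100 is an assumption, and `send_batch` stands in for whatever batch endpoint call your provider offers; both `chunk` and `send_in_batches` are hypothetical names.

```python
def chunk(items, max_batch_size):
    """Split items into consecutive lists of at most max_batch_size elements."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [items[i:i + max_batch_size]
            for i in range(0, len(items), max_batch_size)]


def send_in_batches(items, send_batch, max_batch_size=100):
    """Call send_batch once per chunk and return the number of calls made."""
    batches = chunk(list(items), max_batch_size)
    for batch in batches:
        send_batch(batch)  # send_batch is a hypothetical batch-endpoint call
    return len(batches)
```

With these assumptions, updating 250 records takes 3 API calls instead of 250.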


Cache responses for data that changes infrequently 

In some cases, you’ll need to fetch the same resources over time and they won’t change often. This can take the form of an image, details on a product, information on an employee, etc.

Whenever that’s the case for your integration(s), you can cache the initial API response, allowing you to fetch it in the future quickly and without having to make additional API calls. Moreover, you can adopt caching mechanisms that use expiration policies (based on how often the data typically changes), and make a new API request only once a cached entry expires.
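A minimal sketch of such an expiration policy is shown here. `TTLCache` is a hypothetical in-memory helper, and the time-to-live you pick is an assumption that should reflect how often the underlying data typically changes.

```python
import time


class TTLCache:
    """Cache values for ttl_seconds, then refetch on the next access."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() only if expired."""
        entry = self._entries.get(key)
        now = self.clock()
        if entry is not None and now < entry[0]:
            return entry[1]  # Still fresh: no API call needed.
        value = fetch()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

For example, `cache.get("employee:42", fetch_employee)` would call the hypothetical `fetch_employee` at most once per expiration period, no matter how often the record is read.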

Avoid concurrent requests

Concurrent requests, or requests that overlap in timing for a given API, can also lead you to hit your rate limit quickly, preventing you from accessing the data you need on a steady cadence over time.

To help prevent concurrent requests, you can adopt any number of approaches, from using request queues to implementing request throttling to leveraging caching (see previous best practice).
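One way to sketch request throttling is a small wrapper that lets only one request run at a time and spaces out the start of consecutive requests. `Throttle` and its `min_interval` parameter are hypothetical, and a production setup might use a proper queue or a library instead.

```python
import threading
import time


class Throttle:
    """Serialize calls and space their start times by min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = None

    def call(self, request):
        """Run request() once the previous call's interval has elapsed."""
        with self._lock:  # Only one request is in flight at a time.
            now = self.clock()
            if self._next_allowed is not None and now < self._next_allowed:
                self.sleep(self._next_allowed - now)
                now = self._next_allowed
            self._next_allowed = now + self.min_interval
            return request()
```

Sharing one `Throttle` instance across threads means overlapping callers wait their turn instead of hitting the API simultaneously.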

Integrate with the applications you care about without worrying about their rate limits with Merge

Merge, the leading unified API solution, lets you add hundreds of integrations to your product by integrating with its unified API. 

Merge's unified approach to integration removes the complexities associated with building and maintaining individual customer-facing API integrations. In other words, you don’t need to worry about individual API providers’ rate limits or other provider-specific details, like how they approach pagination and authentication.

You can learn more about Merge and its unified API by scheduling a demo with one of our integration experts.

API rate limit FAQ

In case you have any more questions on API rate limits, we’ve addressed several more below.

What is API rate limiting?

It’s the maximum number of requests that an API provider permits a client to make within a given time window. This limit is imposed to prevent the provider’s server from getting overwhelmed, avoid security incidents, allow for equal usage across consumers, and more.

What is a typical API rate limit?

There isn’t a typical API rate limit; every API provider imposes unique limits on their endpoints. With this in mind, you’ll need to carefully review each API provider’s documentation to better understand the limits for the endpoints you’re planning to use.

How long does a rate limit last?

There isn’t a hard-and-fast rule for a rate limit window’s duration. It depends on the API provider and the specific endpoint you’re planning to use, so you should review the API provider's documentation to get a better idea of a given endpoint's rate limit window.

What is the difference between API rate limiting and API throttling?

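The two terms are often used interchangeably, and definitions vary by provider. When a distinction is drawn, rate limiting usually refers to capping the number of requests a client can make in a given window, while throttling refers to slowing down or queuing requests that exceed a limit instead of rejecting them outright. Algorithms such as the Token Bucket Algorithm or the Leaky Bucket Algorithm are common ways to implement either.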
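To give a sense of how the Token Bucket Algorithm works, here is a minimal sketch. The `TokenBucket` class and its `capacity` and `refill_rate` parameters are illustrative assumptions, not any particular provider's implementation.

```python
import time


class TokenBucket:
    """Allow bursts up to capacity, refilling at refill_rate tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self):
        """Spend one token if available; return False when the bucket is empty."""
        now = self.clock()
        elapsed = now - self.updated_at
        # Add the tokens earned since the last check, up to the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Each request spends one token, so a client can burst up to `capacity` requests and is then held to roughly `refill_rate` requests per second.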

What is the main goal of an API rate limit?

It’s to control the volume of requests clients make to an endpoint over time. By doing so, the API provider can provide a consistent level of service to consumers, ensure access to more consumers, avoid security issues (e.g., denial-of-service attacks), control computational costs, and more.

What is the problem with rate limiting?

It can prevent consumers from accessing a resource on time, which, depending on the scenario, can have negative consequences for clients. 

For instance, if a client is trying to get warm leads from a marketing automation system and they have to wait until the rate limit window resets, enough time may pass such that the leads go cold and have a lower chance of responding.

What is a good rate limit for an API?

It depends on the type of resource you’re looking to access and your specific syncing needs. In some cases, a good rate limit can be thousands of requests per hour while in other cases it can be tens of requests within that same timeframe.
