Introduction


A rate limiter is a software component that controls the rate at which requests are processed. It is commonly used to protect systems from overload and denial-of-service attacks. Rate limiters can be implemented in a variety of ways, but the basic principle is to allow only a certain number of requests to be processed within a given period of time.

Rate limiters are used in a wide range of applications, including:

  • Web APIs: To prevent excessive usage of APIs and protect systems from overload.

  • Microservices: To manage communication between microservices and prevent cascading failures.

  • Messaging systems: To control the rate at which messages are published and consumed.


Rate Limiter Algorithms:


There are several different algorithms for rate limiting, each with its own advantages and disadvantages. Some of the most common algorithms include:




  • Token Bucket: This algorithm works by imagining a bucket that is filled with tokens. Each request consumes a token; if no tokens are available, the request is rejected. The bucket is refilled at a constant rate, which determines the sustained rate at which requests can be processed, while the bucket capacity bounds the size of bursts (a minimal sketch follows this list).

  • Leaky Bucket: In this algorithm, incoming requests are added to a bucket (a queue) that drains at a constant rate. If the bucket is full, new requests are rejected. Because the outflow rate is fixed, bursts of traffic are smoothed into a steady stream instead of overwhelming the system.

  • Fixed Window Counter: This algorithm tracks the number of requests that have been processed in a fixed window of time. If the number of requests exceeds the limit, all subsequent requests are rejected until the window has expired.

  • Sliding Window Log: This algorithm maintains a timestamped log of all requests processed in the recent past. A new request is allowed only if the number of logged requests within the current window is below the limit, and entries older than the window are discarded. This gives very accurate rate limiting, at the cost of storing one entry per request.

  • Sliding Window Counter: This algorithm is a hybrid of the fixed window counter and sliding window log algorithms. It tracks the number of requests that have been processed in a sliding window of time, but it also accounts for the weighted value of the previous window's request rate. This helps to smooth out bursts of traffic and provides a more accurate measurement of the current request rate.
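
To make the most common of these concrete, here is a minimal, illustrative token bucket in Java. This is a sketch of the general algorithm, not code from any particular library; the class and method names are our own.

/** Minimal token bucket: refills continuously, allows bursts up to capacity. */
public class TokenBucket {

    private final long capacity;        // maximum burst size
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;              // currently available tokens
    private long lastRefill;            // timestamp of the last refill, in nanoseconds

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefill = System.nanoTime();
    }

    /** Returns true if a token was available and the request may proceed. */
    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        // Refill proportionally to the elapsed time, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}

For example, new TokenBucket(30, 3.0) allows roughly 30 requests every 10 seconds, with bursts of at most 30, which matches the limits used in the samples later in this blog.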


Here is a table that summarizes the key characteristics of each algorithm:

Algorithm | Advantages | Disadvantages
Token Bucket | Simple to implement; accurate rate limiting | Allows bursts up to the bucket capacity, so traffic is not smoothed
Leaky Bucket | Smooths bursts of traffic into a steady outflow | Can be less accurate at rate limiting; more complex to implement
Fixed Window Counter | Simple to implement; efficient | Bursts at window boundaries can briefly let through up to twice the limit
Sliding Window Log | More accurate at rate limiting; handles bursts well | More complex to implement; can be less efficient, since it stores an entry per request
Sliding Window Counter | Combines the advantages of the fixed window counter and sliding window log algorithms | More complex to implement than the fixed window counter algorithm

In general, the token bucket and leaky bucket algorithms are the most widely used rate-limiting algorithms: both are relatively simple to implement and provide accurate rate limiting. The window-based algorithms are used less often, but they can be useful in certain situations, for example when simplicity and efficiency matter most (fixed window counter) or when a very accurate measurement of the current request rate is needed (sliding window log).

Level of Implementation:


The best approach to choose depends on your specific needs. If you need to implement a simple rate-limiting strategy that applies to all requests to your microservices, then implementing rate limiting at the reverse proxy layer is a good option. If you need to implement a more complex rate-limiting strategy, or if you need to implement rate limiting at the granular level of specific resources or endpoints, then implementing rate limiting at the microservice level is a better option.


Here are some additional factors to consider when choosing between the two approaches:




  • Performance: If performance is a critical concern, then implementing rate limiting at the reverse proxy layer may be a better option, since excess requests are rejected before they ever reach your microservices and no per-service rate-limiting overhead is added.

  • Scalability: If you need to scale your microservices horizontally, then implementing rate limiting at the microservice level is a better option, as it can be implemented independently of the reverse proxy.

  • Complexity: If you need to implement a complex rate-limiting strategy, then implementing rate limiting at the microservice level is a better option, as it allows for more granular control.


Here are some examples of when you might choose to implement rate limiting at the reverse proxy layer or at the microservice level:




  • Implementing rate limiting to protect your microservices from DDoS attacks: In this case, you would want to implement rate limiting at the reverse proxy layer, so that all requests to your microservices are rate limited.

  • Implementing rate limiting to control the number of requests that a particular user can make to a specific endpoint: In this case, you would want to implement rate limiting at the microservice level so that you can control the rate limit for each user and endpoint independently.

  • Implementing rate limiting to apply different rate limits to different resources: In this case, you would want to implement rate limiting at the microservice level, so that you can apply different rate limits to different resources, such as database queries or file downloads.


Ultimately, the best way to decide which approach to choose is to carefully consider your specific needs and requirements.



Code Implementation on Reverse Proxy Level:


In the following sample, we will be using SAP Approuter as the reverse proxy.
All requests arriving at the approuter endpoint can be tracked, and the rate limit can be applied to the relevant paths. Here is the official documentation for injecting a middleware into all incoming requests:

Extending the Application Router | SAP Help Portal


approuter-extend.js
// Extend the standard approuter with a custom middleware.
const approuter = require("@sap/approuter");
const { rateLimiter } = require("./rate-limiter");

const ar = approuter();
// Run the rate limiter before any incoming request is handled.
ar.beforeRequestHandler.use("/", rateLimiter);
ar.start();
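
Note that for this extension to take effect, the application's start command must run approuter-extend.js instead of starting @sap/approuter directly (typically via the start script in package.json).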


In the following implementation, we have used the express-rate-limit library, which fits our use case; there are multiple packages available on npm (npmjs.com) that can be chosen depending on the algorithms and features they provide.

rate-limiter.js


const rateLimit = require("express-rate-limit");

const WHITELISTED_PATH = "/skipThisPath";

const rateLimiter = rateLimit({
  windowMs: 1 * 10 * 1000, // 10 seconds
  max: 30, // limit each user ID to 30 requests per 10 seconds
  standardHeaders: false,
  legacyHeaders: false,
  // Rate limit per user rather than per IP address.
  keyGenerator: (req) => {
    return req.user.name;
  },
  // Skip whitelisted paths entirely.
  skip: (req) => {
    if (req.url.includes(WHITELISTED_PATH)) {
      console.log("whitelisted path called");
      return true;
    }
    return false;
  },
  // Return a 429 response with the number of seconds until the window resets.
  handler: (req, res, next, options) => {
    const now = new Date();
    const timeDiff = req.rateLimit.resetTime.getTime() - now.getTime();
    const seconds = Math.round(timeDiff / 1000);
    res.writeHead(429, { "Content-Type": "application/json" });
    res.end(
      JSON.stringify({
        error: {
          code: "429",
          time: {
            value: seconds,
          },
        },
      })
    );
  },
});

module.exports = { rateLimiter };


Code Implementation on Microservice Level:


Library Used:


In the following sample, we have used the resilience4j rate limiter library, which fits our use case, but there are other rate limiter libraries available that can be used depending on your requirements.

The following are some common rate limiter libraries for Java Spring Boot apps:




  • Bucket4j is a Java rate-limiting library based on the token-bucket algorithm. It is a thread-safe library that can be used in either a standalone JVM application or a clustered environment. It also supports in-memory or distributed caching via the JCache (JSR107) specification (a short usage sketch follows this list).

  • Guava RateLimiter is a rate limiter implementation from the Guava library. It is a simple and lightweight library, but it does not support all of the features of Bucket4j, such as distributed caching.

  • Resilience4j RateLimiter is a rate limiter implementation from the Resilience4j library. It is more feature-rich than Guava RateLimiter, but also more complex to use, and it integrates with the rest of the Resilience4j toolkit, such as retry and fallback.
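
As a point of comparison with the resilience4j configuration used below, here is a minimal sketch of the same 30-requests-per-10-seconds limit expressed with Bucket4j (assuming a Bucket4j 8.x dependency; this library is not part of the sample project in this blog):

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;

import java.time.Duration;

public class Bucket4jSketch {
    public static void main(String[] args) {
        // Token bucket: capacity 30, refilled with 30 tokens every 10 seconds.
        Bucket bucket = Bucket.builder()
                .addLimit(Bandwidth.simple(30, Duration.ofSeconds(10)))
                .build();

        if (bucket.tryConsume(1)) {
            System.out.println("request allowed");
        } else {
            System.out.println("rate limit exceeded");
        }
    }
}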


Which rate limiter library you choose will depend on your specific needs and requirements. If you need a simple and lightweight library, then Guava RateLimiter is a good option. If you need a more feature-rich library, then Resilience4j RateLimiter or Bucket4j are good options.




  1. pom.xml:
    <dependency>
        <groupId>io.github.resilience4j</groupId>
        <artifactId>resilience4j-ratelimiter</artifactId>
    </dependency>


  2. Rate Limiter Configuration: We can provide the necessary rate limiter configurations in this class and create a RateLimiterRegistry bean using this configuration.

    1. Refresh period: 10 seconds

    2. Limit for Period: 30 requests

    3. Timeout duration for acquiring permission: 10ms

    4. Level: User ID




package *;

import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class RateLimiterConfiguration {

    @Autowired
    RateLimiterConfigProperties config;

    @Bean
    public RateLimiterConfig rateLimiterConfig() {
        return RateLimiterConfig.custom()
                .limitRefreshPeriod(Duration.ofSeconds(config.getRefreshPeriod())) // limit refresh period
                .limitForPeriod(config.getLimitForPeriod()) // requests allowed per refresh period
                .timeoutDuration(Duration.ofMillis(config.getTimeoutDuration())) // timeout for acquiring a permission
                .build();
    }

    @Bean
    public RateLimiterRegistry rateLimiterRegistry() {
        return RateLimiterRegistry.of(rateLimiterConfig());
    }
}
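
The RateLimiterConfigProperties class injected above is not shown in the original sample. A minimal sketch of what it could look like follows; the property prefix, names, and defaults are assumptions for illustration:

package *;

import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

// Hypothetical properties holder; binds e.g. rate-limiter.refresh-period=10
// from application.properties or application.yaml.
@Configuration
@ConfigurationProperties(prefix = "rate-limiter")
public class RateLimiterConfigProperties {

    private long refreshPeriod = 10;   // seconds
    private int limitForPeriod = 30;   // requests per refresh period
    private long timeoutDuration = 10; // milliseconds

    public long getRefreshPeriod() { return refreshPeriod; }
    public void setRefreshPeriod(long refreshPeriod) { this.refreshPeriod = refreshPeriod; }
    public int getLimitForPeriod() { return limitForPeriod; }
    public void setLimitForPeriod(int limitForPeriod) { this.limitForPeriod = limitForPeriod; }
    public long getTimeoutDuration() { return timeoutDuration; }
    public void setTimeoutDuration(long timeoutDuration) { this.timeoutDuration = timeoutDuration; }
}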


  3. Rate Limiter Filter: This class filters all requests hitting the microservice endpoints and applies the rate limit as per the configuration. The paths that need to be whitelisted from the rate limit are also defined here.
import com.sap.cds.services.request.UserInfo;
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;
import io.github.resilience4j.ratelimiter.RequestNotPermitted;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

@Component
@Order(1)
public class RateLimiterFilter extends OncePerRequestFilter {

    private static final Logger LOGGER = LoggerFactory.getLogger(RateLimiterFilter.class);

    private final RateLimiterRegistry rateLimiterRegistry;
    private final Set<String> filteredPaths = new HashSet<>();

    @Autowired
    UserInfo userInfo;

    @Autowired
    public RateLimiterFilter(RateLimiterRegistry rateLimiterRegistry) {
        this.rateLimiterRegistry = rateLimiterRegistry;
        this.filteredPaths.addAll(Arrays.asList("/samplePath1", "/samplePath2"));
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {
        // Let whitelisted paths pass without rate limiting.
        if (request.getRequestURI() != null && filteredPaths.contains(request.getRequestURI())) {
            filterChain.doFilter(request, response);
            return;
        }
        // One rate limiter per user ID; all limiters share the registry's configuration.
        String userId = userInfo.getId();
        RateLimiter rateLimiter = rateLimiterRegistry.rateLimiter(userId);
        // Acquire a permission for this user; proceed if it succeeds, otherwise send a 429 error.
        try {
            if (rateLimiter.acquirePermission()) {
                filterChain.doFilter(request, response);
                return;
            }
            response.sendError(429, "Rate limit exceeded");
        } catch (RequestNotPermitted ex) {
            response.sendError(429, "Rate limit exceeded");
        }
    }
}
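
One caveat, as an assumption about a typical setup rather than something covered in the sample: this filter must run after authentication so that UserInfo is populated with the current user's ID, so the @Order value may need to be adjusted to fit your security filter chain.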

Caching: 

As can be seen from the code sample above, the RateLimiterRegistry creates a new RateLimiter for each user, since we key limiters at the user level (this could also be done at the tenant level). These per-user entries accumulate over time, so we also need a caching or eviction mechanism to clear stale entries from the registry. We have not covered cache handling as part of this blog, but it needs to be addressed for production systems (illustrative sketches follow the two options below).

Let's discuss two popular ways to handle caching:




  1. Redis:
    Redis is a distributed in-memory data store that is known for its speed and scalability. It is a good choice for storing rate limiter user info because it can handle a large volume of concurrent requests and can be easily scaled up or down as needed. Additionally, Redis supports a variety of data structures, including sorted sets, which are ideal for implementing rate-limiting algorithms.


  2. Caching on the application layer:
    Caching on the application layer involves storing rate limiter user info in the application's memory. This approach is simpler to implement than using Redis, but it also has some drawbacks. First, caching on the application layer is not as scalable as Redis. If the application receives a large volume of traffic, the cache can quickly become overwhelmed. Second, caching on the application layer is not as reliable as Redis. If the application crashes or restarts, the cache will be lost.
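
To illustrate the Redis option, here is a minimal sliding-window-log limiter built on Redis sorted sets, as mentioned above. This sketch assumes the Jedis client, which is not part of the sample project, and it omits the atomicity (e.g. Lua scripting or MULTI/EXEC) a production implementation would need:

import redis.clients.jedis.Jedis;

import java.util.UUID;

public class RedisSlidingWindowLimiter {

    private final Jedis jedis = new Jedis("localhost", 6379);
    private final int limit = 30;         // max requests per window
    private final long windowMs = 10_000; // 10-second window

    /** One sorted-set entry per request, scored by its timestamp. */
    public boolean allow(String userId) {
        String key = "rate:" + userId;
        long now = System.currentTimeMillis();
        // Drop entries that have slid out of the window.
        jedis.zremrangeByScore(key, 0, now - windowMs);
        // Reject if the window is already full.
        if (jedis.zcard(key) >= limit) {
            return false;
        }
        // Record this request and let the key expire if the user goes idle.
        jedis.zadd(key, now, now + "-" + UUID.randomUUID());
        jedis.expire(key, windowMs / 1000);
        return true;
    }
}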
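
And to illustrate application-layer caching, here is a minimal sketch that evicts idle per-user limiters using the Caffeine cache. Caffeine is an assumption, not part of the sample project, and the sketch assumes a resilience4j version whose registry supports remove:

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;

import java.time.Duration;

public class RateLimiterCache {

    private final RateLimiterRegistry registry;
    private final Cache<String, RateLimiter> cache;

    public RateLimiterCache(RateLimiterRegistry registry) {
        this.registry = registry;
        // Evict limiters unused for 10 minutes and drop them from the registry too.
        this.cache = Caffeine.newBuilder()
                .expireAfterAccess(Duration.ofMinutes(10))
                .removalListener((String userId, RateLimiter limiter, RemovalCause cause) ->
                        registry.remove(userId))
                .build();
    }

    /** Creates the limiter on first access, reuses it afterwards. */
    public RateLimiter rateLimiterFor(String userId) {
        return cache.get(userId, registry::rateLimiter);
    }
}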


Which approach should you choose?


The best approach for storing rate limiter user info depends on your specific needs. If you need a scalable and reliable solution, then Redis is a good choice. If you need a simpler solution, then caching on the application layer may be sufficient.


Here is a table that summarizes the pros and cons of each approach:

Approach | Pros | Cons
Redis | Scalable, reliable | More complex to implement
Caching on the application layer | Simpler to implement | Not as scalable or reliable as Redis





Recommendation


If you are expecting a large volume of traffic or need a highly reliable solution, then I recommend using Redis to store rate limiter user info. However, if you are on a tight budget or need a simpler solution, then caching on the application layer may be sufficient.
Ultimately, the best way to decide which approach is right for you is to experiment and see what works best for your application.



Conclusion


In conclusion, rate limiting is a powerful technique for protecting your API from abuse and ensuring that your users have a good experience. There are many different algorithms and implementations for rate limiters, but the basic concept is the same: to limit the number of requests that a particular client can make within a certain period of time.

Rate limiters can be implemented at different levels of your API stack, from the load balancer to the individual application server. The best approach for you will depend on your specific needs and architecture.
Rate limiting is an essential part of any API security strategy.