Adaptive Rate Limiting¶
Cello's adaptive rate limiter dynamically tightens capacity when error rates climb above a configurable threshold and relaxes it again once the service recovers. This helps protect downstream services without requiring manual tuning during traffic spikes.
The example below starts with a bucket of 100 tokens, refills at 100 tokens per second, and shrinks to a minimum capacity of 5 when errors exceed 20 % of requests.
Features Demonstrated¶
- Token-bucket rate limiter configured with
RateLimitConfig.adaptive() - Dynamic capacity adjustment driven by real-time error rate
min_capacityfloor to preserve a small throughput window even under high error conditionserror_thresholdcontrolling when adaptive reduction kicks in- Enabling the limiter globally via
app.enable_rate_limit(config)
Complete Source Code¶
from cello import App, RateLimitConfig, Response
import time
import threading
app = App()
config = RateLimitConfig.adaptive(
capacity=100,
refill_rate=100,
min_capacity=5,
error_threshold=0.20
)
app.enable_rate_limit(config)
@app.get("/")
def home(request):
return {"status": "ok", "message": "System is healthy"}
@app.get("/trigger-errors")
def trigger_errors(request):
return Response.json({"error": "Simulated Warning"}, status=500)
if __name__ == "__main__":
app.run(port=8080)
Running This Example¶
# Check healthy endpoint
curl http://localhost:8080/
# Trigger errors to drive up the error rate
curl http://localhost:8080/trigger-errors
# Flood the server — watch for 429 responses as capacity shrinks
for i in $(seq 1 120); do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/; done
Key Concepts¶
- Token bucket algorithm — each request consumes one token; tokens refill at
refill_rateper second up tocapacity - Adaptive reduction — when the proportion of 5xx responses exceeds
error_threshold, the effective capacity is reduced towardmin_capacity - Automatic recovery — as the error rate falls back below the threshold the limiter gradually restores full capacity
- HTTP 429 Too Many Requests — clients receive this status code when the bucket is exhausted, allowing them to back off and retry
min_capacitysafety floor — prevents capacity from reaching zero so the service never becomes completely unavailable