# Rate Limiting

Rate limiting controls how fast jobs are processed, protecting downstream services from overload. OJS supports three complementary strategies: concurrency limiting, window-based rate limiting, and throttling. A job can declare any combination of the three in its `rate_limit` block:

```json
{
  "type": "api.call",
  "args": ["https://api.example.com/data"],
  "rate_limit": {
    "key": "api.example.com",
    "concurrency": 5,
    "rate": {
      "limit": 100,
      "period": "PT1M"
    },
    "throttle": {
      "interval": "PT1S"
    },
    "on_limit": "wait"
  }
}
```
| Field | Type | Description |
| --- | --- | --- |
| `key` | string | Grouping key for rate limit buckets |
| `concurrency` | integer | Max simultaneous active jobs per key |
| `rate.limit` | integer | Max jobs per time window |
| `rate.period` | string | ISO 8601 duration for the window |
| `throttle.interval` | string | Minimum time between job starts |
| `on_limit` | string | Action when the limit is reached: `"wait"`, `"reschedule"`, or `"drop"` |

Concurrency limiting caps the number of simultaneously active jobs sharing the same key:

```json
{
  "rate_limit": {
    "key": "stripe-api",
    "concurrency": 10
  }
}
```

When a job is fetched and the concurrency limit is reached for its key, the job remains in the queue until a slot opens. This is the simplest and most common form of rate limiting.
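A backend might enforce this with a per-key count of active jobs that is checked at fetch time and decremented on completion. The sketch below is a minimal in-memory illustration; the names (`ConcurrencyLimiter`, `try_acquire`, `release`) are invented for the example, not part of the OJS spec, and a real backend would keep this state in its own shared store (e.g. Redis or SQL):

```python
import threading
from collections import defaultdict

class ConcurrencyLimiter:
    """Per-key count of active jobs. Illustrative only; a real
    backend would keep this state in shared storage."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = defaultdict(int)  # key -> jobs currently running

    def try_acquire(self, key: str, limit: int) -> bool:
        """Claim a slot for `key`; False means the job stays queued."""
        with self._lock:
            if self._active[key] >= limit:
                return False
            self._active[key] += 1
            return True

    def release(self, key: str) -> None:
        """Free the slot when the job finishes, success or failure."""
        with self._lock:
            self._active[key] -= 1

limiter = ConcurrencyLimiter()
if limiter.try_acquire("stripe-api", limit=10):
    try:
        pass  # run the job
    finally:
        limiter.release("stripe-api")
```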

Window-based limiting caps the total number of jobs processed within a time period:

```json
{
  "rate_limit": {
    "key": "email-provider",
    "rate": {
      "limit": 1000,
      "period": "PT1H"
    }
  }
}
```

Backends MAY implement fixed windows (reset at period boundaries) or sliding windows (rolling count over the period). Sliding windows provide smoother throughput but are more complex to implement.
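To make the sliding variant concrete, here is a sketch that keeps a rolling record of start times and refuses a start once the count in the window reaches the limit. Everything here (the class name, in-process storage, the `time.monotonic` clock) is an assumption for illustration, not spec behavior:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Rolling count of job starts over the last `period_s` seconds."""

    def __init__(self, limit: int, period_s: float):
        self.limit = limit
        self.period_s = period_s
        self._starts = deque()  # timestamps of recent job starts

    def allow(self) -> bool:
        now = time.monotonic()
        # Evict starts that have aged out of the window.
        while self._starts and now - self._starts[0] >= self.period_s:
            self._starts.popleft()
        if len(self._starts) >= self.limit:
            return False  # window is full; the job must wait
        self._starts.append(now)
        return True

# The "email-provider" example above: 1000 jobs per PT1H window.
limiter = SlidingWindowLimiter(limit=1000, period_s=3600)
```

A fixed window replaces the deque with a single counter that resets at each period boundary, which is cheaper to store but permits bursts around the boundary.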

Throttling ensures even spacing between job executions:

```json
{
  "rate_limit": {
    "key": "webhook-delivery",
    "throttle": {
      "interval": "PT0.1S"
    }
  }
}
```

This allows at most one job start every 100 ms for the `webhook-delivery` key, regardless of how many jobs are queued.
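A throttle reduces to tracking the earliest time the next job may start. A minimal sketch, assuming a single-process backend; the names are illustrative:

```python
import time

class Throttle:
    """Enforces a minimum interval between job starts for one key."""

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self._next_allowed = 0.0  # monotonic time of next permitted start

    def delay(self) -> float:
        """Returns 0.0 if a job may start now, else seconds to wait."""
        now = time.monotonic()
        if now >= self._next_allowed:
            self._next_allowed = now + self.interval_s
            return 0.0
        return self._next_allowed - now

throttle = Throttle(interval_s=0.1)  # PT0.1S, as in the example above
```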

When a rate limit is reached, the `on_limit` field determines what happens:

| Action | Behavior |
| --- | --- |
| `wait` | Hold the job in the queue until the limit allows processing (default) |
| `reschedule` | Return the job to the queue with a delay based on the rate limit |
| `drop` | Discard the job silently |
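In a backend's fetch path, the three actions might be dispatched like this; the returned labels are invented for the sketch and are not part of the OJS wire format:

```python
def resolve_on_limit(on_limit: str, retry_delay_s: float):
    """Decide what to do with a job that hit its rate limit.
    Sketch only; the (action, delay) tuples are illustrative."""
    if on_limit == "wait":
        return ("keep_queued", None)  # leave the job where it is
    if on_limit == "reschedule":
        return ("requeue_with_delay", retry_delay_s)  # delay derived from the limit
    if on_limit == "drop":
        return ("discard", None)  # silently remove the job
    raise ValueError(f"unknown on_limit action: {on_limit!r}")
```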

Workers can signal rate limit changes by returning a 429-equivalent response with rate limit information. The backend dynamically adjusts the limit:

```json
{
  "error": {
    "type": "rate_limited",
    "retry_after": 30
  }
}
```
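One way a backend might honor such a signal is to pause fetching for the affected key until `retry_after` has elapsed. A sketch, assuming the backend tracks a resume timestamp per key (a shape OJS does not prescribe):

```python
import time

paused_until: dict[str, float] = {}  # key -> monotonic resume time

def apply_worker_backoff(key: str, error: dict) -> None:
    """Pause a key when a worker reports a rate_limited error."""
    if error.get("type") == "rate_limited":
        retry_after = float(error.get("retry_after", 0))
        paused_until[key] = time.monotonic() + retry_after

def key_is_paused(key: str) -> bool:
    """Fetch loops would skip jobs whose key is still paused."""
    return time.monotonic() < paused_until.get(key, 0.0)

apply_worker_backoff("api.example.com",
                     {"type": "rate_limited", "retry_after": 30})
```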
How rate limits interact with other job semantics:

- **Priority:** Rate limits take precedence over priority ordering; within a rate limit, higher-priority jobs are processed first.
- **Retry:** Retried jobs count against rate limits; new and retried jobs share the same rate limit bucket.
- **Multi-tenancy:** Rate limits can be partitioned by tenant using composite keys (e.g., `tenant_123:api.example.com`), as in the sketch below.
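For the multi-tenancy case, a composite key is just string concatenation; the colon separator below follows the example above and is a convention, not a requirement:

```python
def tenant_key(tenant_id: str, resource: str) -> str:
    """Build a per-tenant rate-limit key, e.g. 'tenant_123:api.example.com'."""
    return f"{tenant_id}:{resource}"

assert tenant_key("tenant_123", "api.example.com") == "tenant_123:api.example.com"
```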
Rate limit state can be inspected and overridden at runtime through the backend's HTTP API:
```sh
# Inspect current rate limit state
curl http://localhost:8080/ojs/v1/rate-limits/stripe-api

# Override a rate limit dynamically
curl -X PUT http://localhost:8080/ojs/v1/rate-limits/stripe-api \
  -H "Content-Type: application/json" \
  -d '{"concurrency": 20}'
```