Global Rate Limiting in Istio with Envoy Filters
A practical, production-ready guide to protecting your services with Envoy’s global rate limit service
You don’t need a malicious attacker to take down your services. A bug in a single internal client, a misconfigured CronJob, or an over-eager retry policy can generate enough traffic to overload your mesh.
In this post, we’ll build a global rate limiting setup in Istio using Envoy Filters + a Redis-backed rate limit service. By the end, you’ll have:
- A working global rate limiter on your ingress gateway
- Configurable path-based limits
- A mental model of how Envoy’s HTTP filter chain and the rate limit service fit together
What is Rate Limiting?
Rate limiting is a mechanism to prevent services from being overwhelmed with requests. Think of it as traffic shaping for your APIs — it controls how many requests a client can make within a time window.
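As a concrete mental model, here's a minimal fixed-window counter in Python. This is a sketch of the idea only, not how Envoy implements it; production limiters (like the one we deploy below) keep these counters in Redis with TTLs so old windows expire.

```python
import time

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_seconds`, tracked per key
    (e.g. a client IP). A sketch only: real limiters store counters in Redis
    with a TTL so stale windows expire instead of accumulating in memory."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # (key, window index) -> request count

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        bucket = (key, int(now // self.window))  # which window this request falls in
        self.counters[bucket] = self.counters.get(bucket, 0) + 1
        return self.counters[bucket] <= self.limit

limiter = FixedWindowLimiter(limit=5, window_seconds=60)
print([limiter.allow("1.2.3.4", now=0) for _ in range(7)])
# [True, True, True, True, True, False, False]
```

Note the trade-off baked into the fixed window: a client can burst up to 2x the limit across a window boundary, which is usually acceptable for protective (rather than billing-grade) rate limiting.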
Why Rate Limiting Is Critical (The Stakes)
Without rate limiting, you’re vulnerable to:
Accidental DDoS attacks: A buggy client with an infinite retry loop can generate more traffic than a malicious actor. I’ve seen a single misconfigured Kubernetes CronJob bring down a production cluster serving millions of users.
Cascading failures: When one service gets overwhelmed, it starts timing out. Those timeouts trigger retries. Retries create more load. More load causes more timeouts. Before you know it, your entire service mesh is in a death spiral—and your on-call engineer is having a very bad day.
Resource exhaustion: That expensive AI inference endpoint you’re running? Without rate limits, a single user can rack up your cloud bill by thousands of dollars in minutes.
The noisy neighbor problem: In a multi-tenant environment, one customer’s traffic spike shouldn’t degrade service for everyone else. But without rate limiting, that’s exactly what happens.
The Solution Landscape (Picking Your Layer)
Rate limiting can be implemented at multiple layers of your stack:
- CDN/Edge layer (Cloudflare, AWS WAF): Great for public APIs, but doesn’t help with internal service-to-service traffic
- API Gateway (Kong, Tyk): Effective, but creates a single point of failure and doesn’t integrate with your service mesh
- Application layer (custom code): Maximum flexibility, maximum maintenance burden—and every team implements it differently
- Service mesh layer (Istio/Envoy): The sweet spot for microservices architectures
Local vs. Global Rate Limiting
Envoy supports two kinds of rate limiting: local and global.
- Local rate limiting enforces limits independently at each proxy instance, with no coordination between proxies.
- Global rate limiting uses a global gRPC rate limiting service to provide rate limiting for the entire mesh. Local rate limiting can be used in conjunction with global rate limiting to reduce load on the global rate limiting service.
Istio Local vs. Global Rate Limiting: Which One Should You Choose?
| | Local | Global |
|---|---|---|
| Reduce load per pod/proxy | ✅ | ❌ |
| Cheaper & reliable | ✅ | ❌ |
| Rate limiting based on client IP | ❌ | ✅ |
| Rate limiting on wildcard basis | ❌ | ✅ |
| Easier path or header-based rate limiting | ❌ | ✅ |
Today, we’re diving deep into Istio’s global rate limiting with Envoy Filters—a solution that gives you centralized control, consistent enforcement across your mesh, and the flexibility to rate limit based on any request attribute you can imagine.
Global Rate Limiter Architecture
```mermaid
flowchart LR
    %% Define nodes
    Client([Client]) -- request --> Gateway
    subgraph IstioGateway["Istio Gateway"]
        Gateway["Envoy Proxy<br/>(EnvoyFilter)"]
    end
    subgraph ServicePod["Pod"]
        PodEnvoy["Envoy Proxy"]
        Service["Service"]
        PodEnvoy --> Service
    end
    subgraph RateLimiterPod["Global Rate Limiter"]
        GRL["Global rate limiter"]
        Redis[(Redis)]
        GRL <--> Redis
    end
    %% Define connections between main components
    Gateway --> PodEnvoy
    Gateway --> GRL
    %% Styling with solid colors and transparent borders for pods
    classDef envoyStyle fill:#8a63d2,stroke:#8a63d2,color:white
    classDef serviceStyle fill:#4682b4,stroke:#4682b4,color:white
    classDef grlStyle fill:#9370db,stroke:#9370db,color:white
    classDef podStyle fill:none,stroke-dasharray:5 5,stroke:#ccc,stroke-opacity:0.5
    classDef redisStyle fill:#ff6b6b,stroke:#ff6b6b,color:white
    classDef clientStyle fill:#555555,stroke:#555555,color:white
    class Gateway,PodEnvoy envoyStyle
    class Service serviceStyle
    class GRL grlStyle
    class IstioGateway,ServicePod,RateLimiterPod podStyle
    class Redis redisStyle
    class Client clientStyle
```
What Is an Istio EnvoyFilter?
- The Swiss Army knife for customizing the Istio proxy (Envoy) configuration
- For when the native Istio CRDs (VirtualService, DestinationRule) don't provide enough control over the Envoy configuration
Under the hood, Istio generates Envoy configs for you. An EnvoyFilter is a way to patch those generated configs, so you can:
- Insert new HTTP filters (like the global rate limit filter)
- Modify routes, clusters, listeners, etc. without abandoning the rest of Istio’s abstractions.
Prerequisites
- Set up Istio in a Kubernetes cluster by following the instructions in the Installation Guide.
- Deploy the httpbin sample application to test rate limiting against.
- Create an Istio Gateway and VirtualService to route traffic to the httpbin service:
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: istio-system
spec:
  hosts:
  - "*"
  gateways:
  - gateway
  http:
  - route:
    - destination:
        port:
          number: 8000
        host: httpbin.default.svc.cluster.local
```
Implementing Global Rate Limiting
Step 1: Configure the Rate Limit Rules
Use the following ConfigMap to configure the reference implementation to rate limit requests to the path /get at 5 requests/minute per client IP, and the path / at 3 requests/minute per client IP. The ConfigMap must exist before you deploy the rate limit service, since the deployment mounts it as its configuration.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit-config
data:
  config.yaml: |
    domain: ratelimit
    descriptors:
      - key: remote_address
        descriptors:
          - key: PATH
            value: "/get"
            rate_limit:
              unit: minute
              requests_per_unit: 5
          - key: PATH
            value: "/"
            rate_limit:
              unit: minute
              requests_per_unit: 3
```
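To make the nested-descriptor semantics concrete, here's a small Python sketch (hypothetical, not the actual ratelimit service code) of how a descriptor list is resolved against this configuration. The outer remote_address entry has no value, so it matches any client IP; the nested PATH entries match specific paths.

```python
# Mirrors the ConfigMap above: remote_address is the parent key
# (no value = match any IP), PATH entries are nested beneath it.
CONFIG = {
    ("remote_address", None): {
        "children": {
            ("PATH", "/get"): {"rate_limit": (5, "minute")},
            ("PATH", "/"):   {"rate_limit": (3, "minute")},
        }
    }
}

def find_rule(descriptor, tree=CONFIG):
    """Walk (key, value) entries down the config tree; return the matched limit."""
    node, rule = {"children": tree}, None
    for key, value in descriptor:
        children = node.get("children", {})
        # Try an exact (key, value) match first, then a value-less catch-all.
        node = children.get((key, value)) or children.get((key, None))
        if node is None:
            return None  # no matching entry: the request is not limited
        rule = node.get("rate_limit", rule)
    return rule

print(find_rule([("remote_address", "1.2.3.4"), ("PATH", "/get")]))
# (5, 'minute')
```

Note that the lookup is order-sensitive: a descriptor sent as [PATH, remote_address] would find no rule at all, which is exactly the mismatch described in the pitfalls section below.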
Step 2: Deploy the Global Rate Limit Service
Create a global rate limit service that implements Envoy's rate limit service protocol. As a reference, a demo configuration can be found here, based on the reference implementation provided by Envoy.
```bash
curl -O https://raw.githubusercontent.com/istio/istio/refs/heads/release-1.28/samples/ratelimit/rate-limit-service.yaml
kubectl apply -f rate-limit-service.yaml
```
Step 3: Apply the Envoy Filter
Apply an EnvoyFilter to the ingressgateway to enable global rate limiting using Envoy’s global rate limit filter.
The patch inserts the envoy.filters.http.ratelimit filter into the HTTP filter chain (HTTP_FILTER). The rate_limit_service field points the filter at the external rate limit service cluster, outbound|8081||ratelimit.default.svc.cluster.local in this case.
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    # select by label in the same namespace
    labels:
      istio: ingressgateway
  configPatches:
    # The Envoy config you want to modify
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: "envoy.filters.network.http_connection_manager"
              subFilter:
                name: "envoy.filters.http.router"
      patch:
        operation: INSERT_BEFORE
        # Adds the Envoy Rate Limit Filter to the HTTP filter chain.
        value:
          name: envoy.filters.http.ratelimit
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
            # The domain can be anything, but it must match the
            # domain in the rate limit service config
            domain: ratelimit
            failure_mode_deny: true
            timeout: 10s
            rate_limit_service:
              grpc_service:
                envoy_grpc:
                  cluster_name: outbound|8081||ratelimit.default.svc.cluster.local
                  authority: ratelimit.default.svc.cluster.local
              transport_api_version: V3
```
Apply a second EnvoyFilter to the ingressgateway that defines the route configuration to rate limit on. It adds rate limit actions to every route in the virtual host: the remote_address action captures the client IP, and the request_headers action emits the request path under the descriptor key PATH. Together these match the descriptor hierarchy defined in the rate limit service ConfigMap.
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    - applyTo: VIRTUAL_HOST
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: ""
            route:
              action: ANY
      patch:
        operation: MERGE
        # Applies the rate limit actions to every route in the virtual host.
        value:
          rate_limits:
            - actions:
                - remote_address: {}
                - request_headers:
                    header_name: ":path"
                    descriptor_key: "PATH"
```
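For intuition, here's a small hypothetical Python sketch of the descriptor these two actions produce for a given request (the request fields are illustrative, not Envoy's internal representation):

```python
def build_descriptor(request):
    """Sketch of the descriptor the two actions above generate:
    remote_address contributes the client IP, and the request_headers
    action copies the :path pseudo-header under the key PATH."""
    return [
        ("remote_address", request["client_ip"]),  # from: remote_address: {}
        ("PATH", request["headers"][":path"]),     # from: request_headers ':path'
    ]

req = {"client_ip": "1.2.3.4", "headers": {":path": "/get"}}
print(build_descriptor(req))
# [('remote_address', '1.2.3.4'), ('PATH', '/get')]
```

This ordered list is what the gateway sends to the rate limit service on each request, and it must line up with the descriptor hierarchy in the ConfigMap from Step 1.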
Testing Your Rate Limits
Let’s verify everything works. Get your ingress gateway’s external IP:
```bash
export GATEWAY_URL=$(kubectl get -n istio-system svc istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
```
Test the /get endpoint (limited to 5 requests/minute):
```bash
for i in {1..10}; do
  curl -s -o /dev/null -w "Request $i: %{http_code}\n" http://$GATEWAY_URL/get
  sleep 1
done
```
You should see the first 5 requests return 200 OK, and subsequent requests return 429 Too Many Requests.
Check the rate limiter logs to see it in action:
```bash
kubectl logs -n istio-system -l app=ratelimit -f
```
How a Rate-Limited Request Flows
```mermaid
sequenceDiagram
    participant Client
    participant Gateway as Istio Gateway<br/>(Envoy)
    participant RLS as Rate Limit Service
    participant Redis
    participant Service as httpbin
    Client->>Gateway: GET /get
    Gateway->>RLS: Check rate limit<br/>(IP: 1.2.3.4, PATH: /get)
    RLS->>Redis: INCR counter for<br/>1.2.3.4:/get
    Redis-->>RLS: Current count: 3
    RLS-->>Gateway: OK (under limit)
    Gateway->>Service: Forward request
    Service-->>Gateway: 200 OK
    Gateway-->>Client: 200 OK
    Note over Client,Service: 6th request within 1 minute
    Client->>Gateway: GET /get
    Gateway->>RLS: Check rate limit<br/>(IP: 1.2.3.4, PATH: /get)
    RLS->>Redis: INCR counter for<br/>1.2.3.4:/get
    Redis-->>RLS: Current count: 6
    RLS-->>Gateway: OVER_LIMIT
    Gateway-->>Client: 429 Too Many Requests
```
Understanding Envoy Filter Chain
```mermaid
flowchart LR
    %% Main flow
    Downstream([Client]) --> Gateway
    subgraph IstioIngressGateway["Istio Ingress Gateway"]
        Gateway["Envoy Proxy"]
        subgraph NetworkFilterChain["Network Filter Chain"]
            direction LR
            HttpConnManager["HTTP Connection Manager"]
            subgraph HttpFilterChain["HTTP Filter Chain"]
                direction LR
                %% The first EnvoyFilter is inserted here
                RateLimitFilter["RateLimit Filter<br/>(filter-ratelimit)"]
                RouterFilter["Router Filter"]
                RateLimitFilter --> RouterFilter
            end
            HttpConnManager --> HttpFilterChain
        end
        subgraph VirtualHostConfig["Virtual Host Configuration"]
            direction LR
            %% The second EnvoyFilter applies here
            RateLimitActions["Rate Limit Actions<br/>(filter-ratelimit-svc)"]
        end
        Gateway --> NetworkFilterChain
        HttpFilterChain -.-> VirtualHostConfig
    end
    Service["Upstream Service"]
    RouterFilter --> Service
    %% Styling
    classDef gatewayStyle fill:#6baed6,stroke:#6baed6,color:white
    classDef networkStyle fill:#9e9ac8,stroke:#9e9ac8,color:white
    classDef httpFilterStyle fill:#bcbddc,stroke:#bcbddc,color:white
    classDef rateLimitStyle fill:#ff9896,stroke:#ff9896,color:white
    classDef routerStyle fill:#c49c94,stroke:#c49c94,color:white
    classDef clientStyle fill:#525252,stroke:#525252,color:white
    classDef serviceStyle fill:#74c476,stroke:#74c476,color:white
    classDef configStyle fill:#ffbb78,stroke:#ffbb78,color:white
    classDef subgraphStyle fill:none,stroke-dasharray:5 5,stroke:#ccc,stroke-opacity:0.7
    class Gateway gatewayStyle
    class HttpConnManager networkStyle
    class RateLimitFilter rateLimitStyle
    class RouterFilter routerStyle
    class Downstream clientStyle
    class Service serviceStyle
    class RateLimitActions configStyle
    class NetworkFilterChain,HttpFilterChain,IstioIngressGateway,VirtualHostConfig subgraphStyle
```
Common Pitfalls (Learn from My Mistakes)
Pitfall #1: Forgetting the ConfigMap. The rate limit service won't start without the ConfigMap mounted, so always apply the ConfigMap before deploying the service.
Pitfall #2: Wrong workload selector. If your EnvoyFilter selects on name: gateway but your ingress gateway is labeled istio: ingressgateway, the filter won't apply. Verify the labels with: kubectl get pods -n istio-system --show-labels | grep ingressgateway
Pitfall #3: Domain mismatch. The domain in your EnvoyFilter must exactly match the domain in your ConfigMap. A mismatch means no rate limiting happens, and no error messages either.
Pitfall #4: Descriptor order mismatch. The descriptors in your EnvoyFilter's rate_limits.actions must match the descriptor hierarchy in your ConfigMap in the exact same order.
For example, if your EnvoyFilter sends descriptors in order [remote_address, PATH], your ConfigMap must have remote_address as the parent key with PATH nested inside. If the order doesn't match or keys are missing, the rate limit service won't find a matching rule and won't enforce limits. Always verify the descriptor chain matches between both configurations.
Pitfall #5: Not monitoring the rate limiter. The rate limit service is now a critical component, so monitor its availability and latency. If it goes down with failure_mode_deny: true, all traffic gets blocked.
In production, you may want to start with failure_mode_deny: false while you harden and monitor the rate limit service. That way, a failure in the rate limit service fails open (no rate limiting, but requests still flow) instead of closed (all traffic blocked).
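Switching to fail-open is a one-field change in the typed_config from Step 3 (fragment shown here; the rest of the EnvoyFilter is unchanged):

```yaml
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
  domain: ratelimit
  failure_mode_deny: false  # fail open: if the RLS is unreachable, let traffic through
  timeout: 10s
```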
Performance Considerations
Redis is your bottleneck: Every request triggers a Redis query. For high-traffic scenarios:
- Use Redis Cluster for horizontal scaling
- Consider Redis persistence (RDB/AOF) to survive restarts
- Monitor Redis memory usage - rate limit counters accumulate
Wrapping Up
We started with a simple goal: protect the entire mesh from bad or bursty traffic without rewriting every service.
You now have:
- A global rate limit service wired into the Istio ingress gateway via EnvoyFilters
- Path-based limits (/get vs /) enforced centrally
- A clear picture of how Envoy's HTTP filter chain and rate limit descriptors work together
To turn this from “PoC” into “production”:
- Add observability: Export rate limit metrics to Prometheus and create Grafana dashboards
- Implement dynamic limits: Use Envoy’s runtime configuration to adjust limits without redeploying
- Test failure scenarios: What happens when Redis goes down? When the rate limit service is slow?
- Scale Redis appropriately: Use Redis Cluster or a managed service and monitor CPU, memory, and key cardinality