2025-04-15 · Hünkar Döner

WebSocket Traffic Management on EKS

WebSocket, EKS, ALB, Real-time

WebSocket provides a persistent, bidirectional connection between client and server. Unlike the short-lived request-response exchanges of plain HTTP, these long-lived connections need special attention when it comes to load balancing and scaling.

Challenges and solutions when managing WebSocket traffic on Amazon EKS:

1. Load Balancer Setting (Stickiness)

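A WebSocket connection starts with a handshake: the client sends an HTTP request with an Upgrade header, and the server answers with 101 Switching Protocols. The whole exchange runs over a single TCP connection, so the load balancer cannot move a plain WebSocket to another pod midway. Problems appear when the client library makes several separate HTTP requests before or instead of upgrading (for example, a long-polling fallback such as Socket.IO's) and those requests land on different pods that do not share session state. In that case the connection fails or drops.

  • Solution: Enable Sticky Sessions (Session Affinity) on the Application Load Balancer (ALB) so that a client's HTTP requests always reach the same pod. For a pure WebSocket client, stickiness is not needed: once established, the connection is a single TCP connection that stays on one pod.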
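To make the handshake concrete, here is a minimal sketch of the server's half of it as defined in RFC 6455: the server derives the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key and replies with a single 101 response. The function names are illustrative, not taken from any particular framework; real servers do this inside their WebSocket library.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing the accept key.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the single HTTP response that completes the upgrade."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

Because the request and this response travel on the same TCP connection, whichever pod receives the upgrade request is the pod that holds the WebSocket for its whole lifetime.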

2. Connection Timeouts

Load balancers usually have an idle timeout; on an ALB the default is 60 seconds. If no data flows over a WebSocket connection for that long, the load balancer closes it.

  • Solution: Increase the ALB idle timeout (e.g., to 3600 seconds; the maximum is 4000). Also keep the connection alive with a "Heartbeat" (Ping/Pong) mechanism in your application, sent at an interval shorter than the idle timeout.
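The heartbeat logic can be sketched independently of any WebSocket library. The class below is a hypothetical helper, not part of a real package: it only decides when to send a ping and when to give up on a silent peer. The 25-second interval and 10-second pong timeout are assumed values, chosen to stay well under a 60-second idle timeout.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Heartbeat:
    """Decides when to ping a peer and when to treat it as dead."""

    ping_interval: float = 25.0  # assumed: well below a 60 s idle timeout
    pong_timeout: float = 10.0   # assumed: how long to wait for a reply
    clock: Callable[[], float] = time.monotonic
    last_activity: float = field(init=False, default=0.0)
    ping_sent_at: Optional[float] = field(init=False, default=None)

    def __post_init__(self) -> None:
        self.last_activity = self.clock()

    def on_traffic(self) -> None:
        """Call for every frame received, including a pong."""
        self.last_activity = self.clock()
        self.ping_sent_at = None

    def next_action(self) -> str:
        """Return 'ping', 'close', or 'wait' for the current moment."""
        now = self.clock()
        if self.ping_sent_at is not None:
            # A ping is outstanding: close if the peer stayed silent too long.
            if now - self.ping_sent_at >= self.pong_timeout:
                return "close"
            return "wait"
        if now - self.last_activity >= self.ping_interval:
            self.ping_sent_at = now
            return "ping"
        return "wait"
```

A connection handler would call `next_action()` on a timer, send a ping frame on `"ping"`, and close the socket on `"close"`. Many WebSocket libraries ship their own ping settings, in which case configuring those is usually simpler than hand-rolling this.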

3. Scaling Difficulty

Because connections are live, it is hard to shut down (scale down) a WebSocket server (Pod A) with 10k users and move those users to Pod B.

  • Solution: When the pod is shutting down (receives the SIGTERM signal), it should stop accepting new connections and then either wait for existing connections to finish or steer them to other pods by sending clients a "Disconnect and Reconnect" message.
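The shutdown sequence above can be sketched with asyncio. Everything here is illustrative: `ConnectionRegistry`, the JSON reconnect message, and the connection objects (assumed to expose async `send()` and `close()` methods) are hypothetical stand-ins for whatever your WebSocket framework provides. The grace period should be shorter than the pod's `terminationGracePeriodSeconds` (30 seconds by default), otherwise Kubernetes kills the pod before draining finishes.

```python
import asyncio
import signal

# Hypothetical message format; clients must be written to understand it.
RECONNECT_MESSAGE = '{"type": "reconnect"}'


class ConnectionRegistry:
    """Tracks live connections and drains them on shutdown."""

    def __init__(self) -> None:
        self.accepting = True
        self.connections = set()

    def register(self, conn) -> bool:
        """Add a connection; returns False once shutdown has started."""
        if not self.accepting:
            return False
        self.connections.add(conn)
        return True

    def unregister(self, conn) -> None:
        """Call when a connection closes for any reason."""
        self.connections.discard(conn)

    async def drain(self, grace_seconds: float, poll: float = 0.1) -> int:
        """Stop accepting, ask clients to reconnect, then force-close stragglers.

        Returns the number of connections that had to be force-closed.
        """
        self.accepting = False
        for conn in list(self.connections):
            await conn.send(RECONNECT_MESSAGE)
        loop = asyncio.get_running_loop()
        deadline = loop.time() + grace_seconds
        while self.connections and loop.time() < deadline:
            await asyncio.sleep(poll)
        leftovers = list(self.connections)
        for conn in leftovers:
            await conn.close()
            self.unregister(conn)
        return len(leftovers)


def install_sigterm_handler(registry: ConnectionRegistry, grace_seconds: float = 25.0) -> None:
    """Start draining when SIGTERM arrives (Unix only; call inside a running loop)."""
    loop = asyncio.get_running_loop()
    loop.add_signal_handler(
        signal.SIGTERM,
        lambda: loop.create_task(registry.drain(grace_seconds)),
    )
```

With thousands of clients, it helps if clients wait a short random delay before reconnecting, so they do not all hit the remaining pods at the same instant.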

4. Ingress Annotations

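If you use the NGINX Ingress Controller, add these annotations (values are in seconds) so that NGINX does not close idle WebSocket connections after its default 60-second proxy timeouts: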

nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"

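WebSocket is indispensable for real-time features on the modern web. With stickiness, timeouts, heartbeats, and graceful shutdown configured correctly, a WebSocket service on EKS can scale to a very large number of concurrent connections.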