2025-02-20 · Hünkar Döner
EKS Network Troubleshooting Guide: How to Solve Problems?
EKS, Networking, Troubleshooting, Debug
EKS Network Troubleshooting Guide
The "Pod is running but cannot access the network" issue is one of the most common and difficult problems to solve in Kubernetes. You must follow a systematic path to debug network issues on Amazon EKS.
1. DNS Issues (CoreDNS)
If the pod cannot access google.com or database-service, the first suspect is DNS.
- Test: Enter the pod and run `nslookup google.com`.
- Solution: Check if the `coredns` pods are running (`kubectl get pods -n kube-system`).
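As a rough sketch of the DNS check above, the following shell functions compare the resolver a pod is actually using with the cluster DNS Service. The pod name and namespace are hypothetical placeholders. The sketch assumes the default EKS setup, in which CoreDNS is exposed through the `kube-dns` Service in `kube-system`. Clusters using NodeLocal DNSCache or a custom `dnsPolicy` will legitimately report a different nameserver.

```shell
# Sketch only: pod and namespace arguments are placeholders for your own workload.

# Print the first nameserver found in resolv.conf content read from stdin.
first_nameserver() {
  awk '$1 == "nameserver" { print $2; exit }'
}

# Compare the resolver configured inside the pod with the cluster DNS Service IP.
check_pod_dns() {
  local pod="$1" ns="${2:-default}"
  local pod_dns svc_ip
  pod_dns=$(kubectl exec -n "$ns" "$pod" -- cat /etc/resolv.conf | first_nameserver)
  # Assumption: CoreDNS is fronted by the kube-dns Service in kube-system.
  svc_ip=$(kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}')
  if [ "$pod_dns" = "$svc_ip" ]; then
    echo "OK: pod resolves via cluster DNS ($svc_ip)"
  else
    echo "MISMATCH: pod uses $pod_dns, cluster DNS is $svc_ip"
  fi
}
```

For example, `check_pod_dns my-app-pod default` could be run before `nslookup`: a mismatch points at the pod's DNS configuration, while a match with failing lookups points at CoreDNS itself or at the network path to it.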
2. Security Group Issues
In EKS, Security Groups are applied at the Node level by default and, when the Security Groups for Pods feature is enabled, at the Pod level as well.
- Scenario: Pod cannot connect to RDS database (Timeout).
- Solution: Check that the Node's Security Group allows outbound traffic to the RDS Security Group on the database port, and that the RDS Security Group allows inbound traffic from the Node's Security Group on that port.
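The inbound half of this check can be sketched with the AWS CLI. The security group IDs and the port are hypothetical placeholders, and the sketch only looks at rules that reference a source security group with an explicit port range. Rules based on CIDR blocks or on "all traffic" are not covered and would need a separate look.

```shell
# Sketch only: security group IDs and the port are placeholders.

# Print the source security groups that RDS_SG admits on PORT (group-referenced rules only).
allowed_source_groups() {
  local rds_sg="$1" port="$2"
  aws ec2 describe-security-groups \
    --group-ids "$rds_sg" \
    --query "SecurityGroups[0].IpPermissions[?FromPort<=\`${port}\` && ToPort>=\`${port}\`].UserIdGroupPairs[].GroupId" \
    --output text
}

# Succeed (exit 0) if NODE_SG is among the groups allowed into RDS_SG on PORT.
node_can_reach_rds() {
  local node_sg="$1" rds_sg="$2" port="${3:-5432}"
  allowed_source_groups "$rds_sg" "$port" | tr '\t' '\n' | grep -qxF -- "$node_sg"
}
```

A possible usage is `node_can_reach_rds sg-0123456789abcdef0 sg-0fedcba9876543210 5432 && echo "inbound rule present"`. If the rule is present and the connection still times out, the outbound rules of the Node's Security Group are the next thing to inspect.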
3. VPC CNI Issues
If Pods cannot get an IP address and remain stuck in the "ContainerCreating" state, there might be an issue with the VPC CNI plugin.
- IP Exhaustion: There might be no free IP addresses left in your subnet.
- Logs: Check the logs of the `aws-node` pod (`kubectl logs -n kube-system -l k8s-app=aws-node`).
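To make the IP exhaustion check concrete, here is a minimal sketch that lists the free IP count per subnet and flags the ones running low. The VPC ID and the threshold are hypothetical values chosen for illustration, and the sketch assumes the AWS CLI is configured for the account and region that host the cluster.

```shell
# Sketch only: the VPC ID and the threshold are placeholders.

# Print "subnet-id<TAB>free-ip-count" for every subnet in the given VPC.
subnet_free_ips() {
  local vpc_id="$1"
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=${vpc_id}" \
    --query 'Subnets[].[SubnetId,AvailableIpAddressCount]' \
    --output text
}

# Read "subnet-id free-ip-count" lines on stdin and report subnets below MIN_FREE.
low_ip_subnets() {
  local min_free="${1:-10}"
  awk -v min="$min_free" '$2 + 0 < min { print $1 " has only " $2 " free IPs" }'
}
```

Running `subnet_free_ips vpc-0123456789abcdef0 | low_ip_subnets 20` would then print only the subnets with fewer than 20 free addresses. An empty result suggests looking at the `aws-node` logs rather than at subnet capacity.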
4. Tools
- Netshoot: Launch a temporary pod containing all network tools like `dig`, `curl`, `tcpdump`: `kubectl run tmp-shell --rm -i --tty --image nicolaka/netshoot`
- VPC Reachability Analyzer: A great tool in the AWS console that simulates whether there is network access between two points (e.g., Node and RDS).
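For one-off checks, the Netshoot command can be wrapped in a small helper that runs a single tool and removes the pod afterwards. This is only a sketch: the function name `netshoot_run` and the hostnames in the usage examples are hypothetical, and it assumes your cluster is allowed to pull the `nicolaka/netshoot` image.

```shell
# Sketch only: netshoot_run is a hypothetical helper name.

# Run a single command in a throwaway netshoot pod; the pod is deleted when the command exits.
netshoot_run() {
  # $$ (the shell's PID) keeps the pod name reasonably unique between invocations.
  kubectl run "tmp-shell-$$" --rm -i --restart=Never \
    --image nicolaka/netshoot -- "$@"
}
```

Possible invocations include `netshoot_run dig +short kubernetes.default.svc.cluster.local` to test in-cluster DNS, or `netshoot_run nc -zv -w 5 mydb.example.internal 5432` to test a TCP path to a database. A timeout in the second case usually points at Security Groups or routing rather than at the application.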
Network issues are complex, but can be solved with the right tools (and a little patience).