istio troubleshooting: issues in the ingress controller vs the virtual service (destination rule + service and pod level)

When trying to hit your Istio-enabled Kubernetes cluster, you might need to do some troubleshooting to make sure traffic is flowing in correctly. Here are some of the steps that I used.

First, check whether traffic or a request is coming in at all. This can be a lot of log output, but you can narrow it down. For example, in my setup I have httpbin configured, and if a request reaches the pod, it will be logged here :- 
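A minimal way to watch for that, assuming the standard Istio httpbin sample (the app=httpbin label is an assumption from that sample; adjust it to your own deployment):

# Tail the sidecar access log of the httpbin pod to see incoming requests
k logs -l app=httpbin -c istio-proxy -f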

Turn on debug logging at the Istio pod (sidecar) level too :- 

istioctl proxy-config log POD --level=debug
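Full debug output is very noisy. If you prefer, you can raise only specific Envoy loggers instead; a small sketch (POD is a placeholder for your actual pod name):

# Check the current logger levels first
istioctl proxy-config log POD

# Raise only the HTTP and router loggers instead of everything
istioctl proxy-config log POD --level http:debug,router:debug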

And then tail the logs on the ingress gateway as well :- 

k logs -l app.kubernetes.io/name=istio-ingressgateway  -n istio-system -f

Let's look at the logs for the ingress gateway in detail :- 


You can see in the red circle that I am getting a 503 error, and this means:

1. Traffic or the request is coming in.

2. You're getting a 503 because your virtual service is not configured correctly.

3. A 503 can also mean that your service has been deleted (or is not configured correctly), or that your pod is crashing.

4. So the first thing we should check is whether a request ACTUALLY comes in. 

5. If you have a badly configured virtual service or service, it will show up here and not in the pod-level logs (a known-good sketch follows this list).
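
For reference, here is a minimal sketch of a correctly wired Gateway and VirtualService for this setup. The names, hosts, and port 8000 are assumptions based on my httpbin example, so adjust them to your own services:

kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway   # binds to the default ingress gateway pods
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        # must resolve to a real Service, otherwise the gateway logs 503 NC
        host: httpbin.default.svc.cluster.local
        port:
          number: 8000
EOF

If the destination host or port here does not match an existing Service, the gateway has no cluster to send the request to, which is exactly the NC / cluster_not_found error shown below.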


Looking at logs and tracing in the Istio Ingress Gateway

Error log for a service that is not working :-

[2026-05-02T20:48:27.139Z] "GET /status/200 HTTP/1.1" 503 NC cluster_not_found - "-" 0 0 0 - "192.168.65.3" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36" "f383fe30-6cb0-90f6-aa26-466500870891" "localhost" "-" - - 10.1.4.177:8080 192.168.65.3:57410 - -

Log Segment | Value | Meaning
Timestamp | [2026-05-02T20:48:27.139Z] | When the request was received.
Request | "GET /status/200 HTTP/1.1" | The HTTP method, path, and protocol used by the client.
Status Code | 503 | The HTTP status returned to the client (Service Unavailable).
Response Flags | NC | No Cluster: the upstream cluster was not found for this request.
Response Code Details | cluster_not_found | The exact reason Envoy failed to route the request.
Upstream Service | - | No upstream service was reached (because of the NC flag).
Downstream IP | "192.168.65.3" | The IP address of the client making the request.
User Agent | "Mozilla/5.0 ..." | The client's browser/agent string.
Request ID | "f383fe30-...-466500870891" | The Envoy request ID (useful for tracing).
Host Header | "localhost" | The host requested by the client.
Upstream Host | - | No upstream host received the traffic.
Gateway Listen IP | 10.1.4.177:8080 | The internal IP and port of the Istio Ingress Gateway pod that handled the request.
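
When you see NC / cluster_not_found, the gateway has no route or cluster for the request. A quick way to confirm that, assuming the standard istioctl tooling (the label selector below matches my install; yours may use app=istio-ingressgateway instead):

# Grab the ingress gateway pod name
INGRESS_POD=$(kubectl get pod -n istio-system \
  -l app.kubernetes.io/name=istio-ingressgateway \
  -o jsonpath='{.items[0].metadata.name}')

# Does the gateway have a route for your host and path?
istioctl proxy-config routes $INGRESS_POD -n istio-system

# Does a cluster for your service exist at all?
istioctl proxy-config clusters $INGRESS_POD -n istio-system | grep httpbin

# Let Istio itself flag misconfigured VirtualServices and Gateways
istioctl analyze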

Service not available 

When your services are deleted or not configured correctly, this is what your Kiali graph looks like. As you can see, the ingress controller doesn't know where to route the traffic, so it goes into "unknown". 




This is what the logs look like when the pod is unavailable :- 

[2026-05-02T22:32:36.941Z] "GET / HTTP/1.1" 503 UH no_healthy_upstream - "-" 0 19 1 - "192.168.65.3" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36" "0f2827ba-0d23-9b98-a22b-ac811778fad2" "localhost" "-" outbound|8000||httpbin.default.svc.cluster.local - 10.1.4.177:8080 192.168.65.3:40344 - -
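
UH / no_healthy_upstream means routing worked (notice the cluster name outbound|8000||httpbin.default.svc.cluster.local does appear in the log this time) but there were no healthy pods behind the service. A sketch to confirm that, reusing the $INGRESS_POD variable from above and assuming httpbin in the default namespace:

# Does the Kubernetes Service have any ready endpoints?
kubectl get endpoints httpbin -n default

# What endpoints does the gateway's Envoy see for that cluster?
istioctl proxy-config endpoints $INGRESS_POD -n istio-system \
  --cluster "outbound|8000||httpbin.default.svc.cluster.local"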

And this is what it looks like in Kiali (notice it stops at the service)


And this is what a successful connection looks like in the logs :-

[2026-05-02T20:44:29.129Z] "GET /html HTTP/1.1" 200 - via_upstream - "-" 0 3742 3 2 "192.168.65.3" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36" "2d31fe66-2a5e-9ba1-85a5-4821fbccbc98" "localhost" "10.1.4.184:8080" outbound|8000||httpbin.default.svc.cluster.local 10.1.4.177:36060 10.1.4.177:8080 192.168.65.3:57410 - -

Log Segment | Value | Meaning
Timestamp | [2026-05-02T20:44:29.129Z] | When the Ingress Gateway received the request.
Request | "GET /html HTTP/1.1" | The HTTP method, URI path, and HTTP protocol version.
Status Code | 200 | The backend successfully handled the request (OK).
Response Flags | - | No error flags (the request completed successfully).
Response Code Details | via_upstream | The 200 status code was returned directly by the backend service.
Upstream Service Flag | "-" | N/A (no specific upstream flag was triggered).
Bytes Received | 0 | The client sent no body payload in this GET request.
Bytes Sent | 3742 | The size of the response returned to the client (~3.7 KB).
Total Duration | 3 | Total time in milliseconds from receiving the request to completing it.
Upstream Time | 2 | Time in milliseconds the backend spent processing the request.
Downstream IP | "192.168.65.3" | The client's IP address.
User Agent | "Mozilla/5.0 ..." | The client's browser and operating system details.
Request ID | "2d31fe66-...-4821fbccbc98" | Unique tracking ID generated by Envoy.
Host Header | "localhost" | The host requested by the client.
Upstream Host IP | "10.1.4.184:8080" | The internal IP and port of the backend pod (httpbin).
Upstream Cluster | outbound|8000||httpbin.default.svc.cluster.local | The exact Istio service and port that the gateway routed the request to.
Downstream Local IP | 10.1.4.177:36060 | The internal IP of the gateway making the call.
Gateway Listen IP | 10.1.4.177:8080 | The gateway pod's IP and port that received the traffic (this is your ingress gateway's IP address).
Downstream Remote IP | 192.168.65.3:57410 | The client's IP and ephemeral source port.
SNI / Router Name | - - | No SNI was used (this was plain HTTP, not HTTPS).

And you can see it in action here :- 



Tracking by trace ID

Istio generates a trace ID at the ingress controller level, which can be traced all the way to the istio-proxy on the pod.

For example, here is the output for an ingress log with trace ID '38a0ba34-99c6-923e-ae5b-95cbceac71dd' :-
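
If you want to isolate a single request, you can filter the gateway logs by that trace ID (same label selector as before):

# Show only the gateway log lines for this one request
k logs -l app.kubernetes.io/name=istio-ingressgateway -n istio-system \
  | grep 38a0ba34-99c6-923e-ae5b-95cbceac71dd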




Then you can run this command to see it all the way through at the pod level (do not just run kubectl logs on its own; target the istio-proxy container). Our intention is to get the STREAM_ID; with the stream ID, we can trace the lifecycle of this request. Note that STREAM_ID is not the same as trace_id. 

kubectl logs $POD -c istio-proxy  
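
One way to pull out the stream ID is to grep the sidecar's debug logs for the trace ID from above; with debug logging enabled, Envoy's HTTP log lines carry a [C<connection>][S<stream>] prefix. A sketch (here $POD is your application pod, and S12345 is a placeholder for whatever stream ID you actually find):

# Find the lines that logged this request; note the [C<n>][S<n>] prefix
kubectl logs $POD -c istio-proxy | grep 38a0ba34-99c6-923e-ae5b-95cbceac71dd

# Then grep for that stream ID to see the request's whole lifecycle
kubectl logs $POD -c istio-proxy | grep 'S12345'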

And it should look like this :-


It is also important to differentiate the scenario where we do in fact get a 4xx or 5xx from the application itself. In that case, you will notice the entire flow is red, like here :- 



[2026-05-02T23:18:11.478Z] "GET /status/418 HTTP/1.1" 418 - via_upstream - "-" 0 13 3 1 "192.168.65.3" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36" "f79f5b87-1a8e-902d-9326-dbe698a8463c" "localhost" "10.1.4.186:8080" inbound|8080|| 127.0.0.6:48943 10.1.4.186:8080 192.168.65.3:0 outbound_.8000_._.httpbin.default.svc.cluster.local default

Notice the response flags are "-" and the detail is via_upstream: Envoy routed the request fine, and the 418 came back from the application itself rather than from the mesh.

Conclusion 

To debug traffic in Istio, if we are trying to determine whether an actual request came in, look at the ingress controller level. That is also where a badly configured virtual service, service, or destination rule will show up. 

Once we have made that confirmation, we can start looking at the pod-level istio-proxy to see the traffic there. 


