One of the cool things about Istio is that it automatically exposes metrics, and if you have Prometheus you can easily query them.
Let's start by setting up your mesh:
istioctl install --set profile=ambient --skip-confirmation
Install the Kubernetes Gateway API CRDs if they are not already present:
kubectl get crd gateways.gateway.networking.k8s.io &> /dev/null || \
kubectl apply --server-side -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/experimental-install.yaml
Set up the Bookinfo app
Follow the steps at https://istio.io/latest/docs/ambient/getting-started/deploy-sample-app/
Configure your namespace to use ambient mode
kubectl label namespace default istio.io/dataplane-mode=ambient
Deploy Prometheus and Kiali (run these from the root of the Istio release directory):
kubectl apply -f samples/addons/prometheus.yaml
kubectl apply -f samples/addons/kiali.yaml
Make sure you have updated your mesh configuration to expose additional Envoy stats:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      proxyStatsMatcher:
        inclusionRegexps:
          - ".*outlier_detection.*"
          - ".*upstream_rq_retry.*"
          - ".*upstream_rq_pending.*"
          - ".*upstream_cx_.*"
        inclusionSuffixes:
          - upstream_rq_timeout
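One way to apply an overlay like this is to pass it to istioctl install together with the ambient profile. The sketch below assumes the YAML is saved as mesh-stats.yaml (a hypothetical filename); apply_mesh_stats and restart_gateway are illustrative helper names, not Istio commands. Envoy proxies that are already running, such as the gateway, generally need a restart before they pick up the new stats matcher, and the gateway Deployment is assumed to be called bookinfo-gateway-istio in the default namespace.

```shell
# Sketch: re-run the ambient install with the stats overlay applied.
# Assumption: the IstioOperator YAML is saved to the file passed as $1.
apply_mesh_stats() {
  overlay="${1:-mesh-stats.yaml}"   # hypothetical filename
  istioctl install --set profile=ambient -f "$overlay" --skip-confirmation
}

# Sketch: restart the gateway so its Envoy picks up the new proxy config.
# Assumption: the gateway Deployment is bookinfo-gateway-istio in "default".
restart_gateway() {
  kubectl rollout restart deployment/bookinfo-gateway-istio -n default
}

# Usage (needs a live cluster):
#   apply_mesh_stats mesh-stats.yaml && restart_gateway
```

Keeping --set profile=ambient on the command line matters here because the overlay itself does not name a profile.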
Some of the key metrics for basic traffic are listed below.
Generic traffic metrics (TCP level)
istio_tcp_sent_bytes_total
istio_tcp_connections_opened_total
istio_tcp_received_bytes_total
Circuit breaker
envoy_cluster_upstream_rq_pending_overflow
envoy_cluster_upstream_cx_overflow
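As a rough illustration of how these metrics can be queried, the sketch below builds a PromQL rate expression and sends it to the Prometheus HTTP API with curl. It assumes Prometheus has been port-forwarded to localhost:9090 (for example with kubectl -n istio-system port-forward svc/prometheus 9090:9090). PROM_URL, rate_query and prom_query are illustrative names, not part of Istio or Prometheus.

```shell
# Assumption: Prometheus is reachable here, e.g. through a port-forward.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# rate_query METRIC WINDOW
# Prints a PromQL expression: per-second rate over WINDOW, summed over all series.
rate_query() {
  printf 'sum(rate(%s[%s]))\n' "$1" "$2"
}

# prom_query PROMQL
# Sends an instant query to /api/v1/query and prints the raw JSON response.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Usage (needs a reachable Prometheus):
#   prom_query "$(rate_query istio_tcp_sent_bytes_total 5m)"
#   prom_query "$(rate_query envoy_cluster_upstream_rq_pending_overflow 5m)"
```

Using curl's --data-urlencode avoids hand-escaping the brackets and parentheses in the PromQL expression.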
We can also change this behaviour with the Telemetry API. For example, the following YAML removes the response_code label from the request metrics:
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: remove-response-code
  namespace: default
spec:
  selector:
    matchLabels:
      service.istio.io/canonical-name: bookinfo-gateway-istio
  metrics:
    - providers:
        - name: prometheus
      overrides:
        - match:
            metric: REQUEST_COUNT
          tagOverrides:
            response_code:
              operation: REMOVE
        - match:
            metric: REQUEST_DURATION
          tagOverrides:
            response_code:
              operation: REMOVE
        - match:
            metric: REQUEST_SIZE
          tagOverrides:
            response_code:
              operation: REMOVE
        - match:
            metric: RESPONSE_SIZE
          tagOverrides:
            response_code:
              operation: REMOVE
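A minimal sketch of applying and later removing this resource with kubectl is shown below. It assumes the YAML is saved as remove-response-code.yaml (a hypothetical filename); apply_telemetry and remove_telemetry are illustrative helper names.

```shell
# Sketch: create or update the Telemetry resource from a local file.
# Assumption: the Telemetry YAML is saved to the file passed as $1.
apply_telemetry() {
  manifest="${1:-remove-response-code.yaml}"   # hypothetical filename
  kubectl apply -f "$manifest"
}

# Sketch: delete the Telemetry resource by name to restore the default labels.
remove_telemetry() {
  kubectl delete telemetries.telemetry.istio.io remove-response-code -n default
}

# Usage (needs a live cluster):
#   apply_telemetry remove-response-code.yaml
#   kubectl get telemetries.telemetry.istio.io -n default
#   remove_telemetry
```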
You will then see that this change is propagated by istiod.
After hitting productpage a couple of times, we can query Prometheus; as you can see, response_code is missing from the upper row.
If you remove the Telemetry resource above and hit the productpage workload 100 times, you will see that the row with response_code starts to increase.
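The experiment above can be approximated from the command line. The sketch below sends a fixed number of requests to productpage and prints a PromQL expression that groups istio_requests_total by response_code. It assumes the gateway is port-forwarded to localhost:8080 and Prometheus to localhost:9090, and that the series carry destination_service_name="productpage"; hit_productpage and requests_by_code_query are illustrative names.

```shell
# hit_productpage URL COUNT
# Sends COUNT sequential GET requests to URL, discarding the response bodies.
hit_productpage() {
  url="$1"
  count="${2:-100}"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null "$url"
    i=$((i + 1))
  done
}

# requests_by_code_query
# Prints a PromQL expression that groups productpage requests by response_code.
# Series whose response_code label was removed are grouped into a single
# result that has no response_code label.
requests_by_code_query() {
  printf 'sum by (response_code) (istio_requests_total{destination_service_name="productpage"})\n'
}

# Usage (needs a live cluster with both port-forwards running):
#   hit_productpage http://localhost:8080/productpage 100
#   curl -sG http://localhost:9090/api/v1/query \
#     --data-urlencode "query=$(requests_by_code_query)"
```

Running the query before and after deleting the Telemetry resource should show which of the two groups is growing.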
The code that triggers these metrics lives in pilot/pkg/networking/core/listener_builder.go, which calls generateStatsConfig in pilot/pkg/model/telemetry.go to build the final Envoy configuration for the istio_requests_total metric (generating the metric itself is delegated to Envoy).