MeshMetric
This policy uses the new policy matching algorithm. Do not combine it with Traffic Metrics.
Kong Mesh facilitates consistent traffic metrics across all data plane proxies in your mesh.
You can define metrics configuration for a whole Mesh, and optionally tweak certain parts for individual data plane proxies. For example, you might need to override the default metrics port if it’s already in use on the specified machine.
Kong Mesh provides full integration with Prometheus:
- Each proxy can expose its metrics in Prometheus format.
- Kong Mesh exposes an API called the monitoring assignment service (MADS), which exposes proxies configured by MeshMetric.
Moreover, Kong Mesh provides integration with OpenTelemetry:
- Each proxy can publish its metrics to an OpenTelemetry collector.
To collect metrics from Kong Mesh, you need to expose metrics from proxies and applications.
In the rest of this page we assume you have already configured your observability tools to work with Kong Mesh. If you haven’t already, read the observability docs.
TargetRef support matrix
To learn more about the information in this table, see the matching docs.
Configuration
There are three main sections of the configuration: sidecar, applications, and backends.
The first two define how to scrape parts of the mesh (the sidecar and the underlying applications). The third defines what to do with the data: for Prometheus it specifies the address to scrape, and for OpenTelemetry it defines where to push the data.
In contrast to Traffic Metrics, all configuration is dynamic and no restarts of the data plane proxies are needed. You can define the configuration refresh interval by using the KUMA_DATAPLANE_RUNTIME_DYNAMIC_CONFIGURATION_REFRESH_INTERVAL env var or the kuma.dataplaneRuntime.dynamicConfiguration.refreshInterval Helm value.
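As a rough sketch, the dotted Helm value above expands to the nested values file below. The 5s interval is only an illustrative assumption, not a recommended or default setting.

```yaml
# values.yaml fragment (sketch): nested form of
# kuma.dataplaneRuntime.dynamicConfiguration.refreshInterval
kuma:
  dataplaneRuntime:
    dynamicConfiguration:
      refreshInterval: 5s # assumed example value
```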
Sidecar
This part of the configuration applies to scraping the data plane proxy. If you don’t want to retrieve all of Envoy’s metrics, you can filter them.
Below are different methods of filtering. The order of the operations is as follows:
- Unused metrics
- Profiles
- Exclude
- Include
Unused metrics
By default, metrics that were not updated won’t be published.
You can set the includeUnused flag to return all metrics from Envoy.
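A minimal sketch of the flag in context, assuming it sits directly under the sidecar section of the policy:

```yaml
sidecar:
  includeUnused: true # also publish metrics that were never updated
```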
Profiles
Profiles are predefined sets of metrics with manual include and exclude functionality.
There are 3 sections:
- appendProfiles - allows combining multiple predefined profiles of metrics. Right now you can only define one profile, but this might change in the future (for example, there might be feature-related profiles like “Fault injection profile” and “Circuit Breaker profile” so you can mix and match the ones you need based on the features you use). Today only 3 profiles are available: All, Basic, and None. The All profile contains all metrics produced by Envoy. The Basic profile contains all metrics needed by Kong Mesh dashboards and the golden 4 signals metrics. The None profile removes all metrics.
- exclude - after profiles are applied, you can manually exclude metrics on top of profile filtering.
- include - after exclude is applied, you can manually include metrics.
Examples
Include unused metrics of only Basic profile with manual exclude and include
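A sketch of what this could look like. The matcher types (Regex, Exact) and the metric names are illustrative assumptions, so adjust them to the metrics you actually need.

```yaml
sidecar:
  includeUnused: true
  profiles:
    appendProfiles:
      - name: Basic
    exclude:
      - type: Regex # assumed matcher type
        match: "envoy_cluster_external_upstream_rq_.*" # illustrative metric pattern
    include:
      - type: Exact # assumed matcher type
        match: "envoy_cluster_default_total_match_count" # illustrative metric name
```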
Include only manually defined metrics
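One possible shape, assuming the None profile is used to drop everything before the manual include is applied. The matcher type and the metric pattern are illustrative.

```yaml
sidecar:
  profiles:
    appendProfiles:
      - name: None # start from an empty set of metrics
    include:
      - type: Regex # assumed matcher type
        match: "envoy_cluster_external_upstream_rq_.*" # illustrative metric pattern
```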
Exclude all metrics apart from one manually added
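A sketch that relies on the ordering described earlier (exclude runs before include). The catch-all regular expression and the metric name are assumptions chosen for illustration.

```yaml
sidecar:
  profiles:
    exclude:
      - type: Regex # assumed matcher type
        match: ".*" # drop every metric
    include:
      - type: Exact # assumed matcher type
        match: "envoy_cluster_default_total_match_count" # illustrative metric name
```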
Applications
Metrics exposed by the application need to be in Prometheus format for the data plane proxy to be able to parse them and expose them to either a Prometheus or an OpenTelemetry backend.
In addition to exposing metrics from the data plane proxies, you might want to expose metrics from applications running next to the proxies.
Kong Mesh allows scraping Prometheus metrics from the application’s endpoint running in the same Pod or VM.
Later those metrics are aggregated and exposed at the same port/path as data plane proxy metrics.
It is possible to configure it at the Mesh level, for all the applications in the Mesh, or just for specific applications.
Here are reasons why you might want to use this feature:
- Application metrics are labelled with your mesh parameters (tags, mesh, data plane name…). This means that in mixed Universal and Kubernetes mode, metrics are reported with the same types of labels.
- Both application and sidecar metrics are scraped at the same time. This makes sure they are coherent (with 2 different scrapers they can end up scraping at different intervals, which makes metrics harder to correlate).
- If you disable passthrough, your mesh uses mTLS, and Prometheus is outside the mesh, this is the only way to retrieve these metrics, as the app is completely hidden behind the sidecar.
Example section of the configuration:
applications:
  - name: "backend" # application name used for logging and to scope OpenTelemetry metrics (optional)
    path: "/metrics/prometheus" # application metrics endpoint path
    address: # optional custom address if the underlying application listens on a different address than the Data Plane Proxy
    port: 8888 # port on which application is listening
Backends
Prometheus
backends:
  - type: Prometheus
    prometheus:
      port: 5670
      path: /metrics
This tells Kong Mesh to expose an HTTP endpoint with Prometheus metrics on port 5670 and URI path /metrics.
The metrics endpoint is forwarded to the standard Envoy Prometheus metrics endpoint and supports the same query parameters.
You can pass the filter query parameter to limit the results to metrics whose names match a given regular expression.
By default, all available metrics are returned.
Secure metrics with TLS
Kong Mesh allows configuring the metrics endpoint with TLS.
backends:
  - type: Prometheus
    prometheus:
      port: 5670
      path: /metrics
      tls:
        mode: ProvidedTLS
In addition to the MeshMetric configuration, kuma-sidecar requires a provided certificate and key for its operation.
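One way to hand kuma-sidecar the certificate and key is through environment variables that point at files available in the container. The fragment below is only a sketch: the variable names are assumed from the kuma-dp runtime configuration and the file paths are hypothetical, so verify both against the reference for your version.

```yaml
# Sketch of container env entries for kuma-sidecar (names and paths are assumptions)
env:
  - name: KUMA_DATAPLANE_RUNTIME_METRICS_CERT_PATH
    value: /kuma/server.crt # hypothetical path to the provided certificate
  - name: KUMA_DATAPLANE_RUNTIME_METRICS_KEY_PATH
    value: /kuma/server.key # hypothetical path to the provided key
```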
activeMTLSBackend
We no longer support activeMTLSBackend. If you need to encrypt and authorize the metrics, use Secure metrics with TLS in combination with one of the authorization methods.
Running multiple Prometheus deployments
If you need to run multiple instances of Prometheus and want to target different sets of data plane proxies, you can do this by using the Client ID setting in both the MeshMetric (clientId) and Prometheus (client_id) configurations.
Support for clientId was added in Prometheus version 2.50.0.
Example Prometheus configuration
Let’s assume we have two Prometheus deployments, main and secondary. We would like to use each of them to monitor a different set of data plane proxies, with different tags.
We can start by configuring each Prometheus deployment to use Kuma SD.
The Prometheus deployments are differentiated by the client_id parameter.
Main Prometheus config:
scrape_configs:
  - job_name: 'kuma-dataplanes'
    # ...
    kuma_sd_configs:
      - server: http://kong-mesh-control-plane.kong-mesh-system:5676
        refresh_interval: 60s # different from prometheus-secondary
        client_id: "prometheus-main" # Kuma will use this to pick proper data plane proxies
Secondary Prometheus config:
scrape_configs:
  - job_name: 'kuma-dataplanes'
    # ...
    kuma_sd_configs:
      - server: http://kong-mesh-control-plane.kong-mesh-system:5676
        refresh_interval: 20s # different from prometheus-main
        client_id: "prometheus-secondary"
Now we can configure the first MeshMetric policy to pick data plane proxies with the tag prometheus: main for main Prometheus discovery.
The clientId in the policy should be the same as client_id in the Prometheus configuration.
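A sketch of such a policy in Kubernetes format. The policy name, the namespace, the default mesh label, and the MeshSubset targetRef kind are assumptions here, so check the TargetRef support matrix for the kinds your version accepts.

```yaml
apiVersion: kuma.io/v1alpha1
kind: MeshMetric
metadata:
  name: prometheus-main # hypothetical policy name
  namespace: kong-mesh-system # assumed control plane namespace
  labels:
    kuma.io/mesh: default # assumed mesh
spec:
  targetRef:
    kind: MeshSubset # assumed targetRef kind
    tags:
      prometheus: main
  default:
    backends:
      - type: Prometheus
        prometheus:
          clientId: prometheus-main # must match client_id in the main Prometheus config
          port: 5670
          path: /metrics
```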
And a policy for the secondary Prometheus deployment that will pick data plane proxies with the tag prometheus: secondary.
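A matching sketch for the secondary deployment, under the same assumptions (hypothetical policy name, assumed namespace, mesh label, and targetRef kind):

```yaml
apiVersion: kuma.io/v1alpha1
kind: MeshMetric
metadata:
  name: prometheus-secondary # hypothetical policy name
  namespace: kong-mesh-system # assumed control plane namespace
  labels:
    kuma.io/mesh: default # assumed mesh
spec:
  targetRef:
    kind: MeshSubset # assumed targetRef kind
    tags:
      prometheus: secondary
  default:
    backends:
      - type: Prometheus
        prometheus:
          clientId: prometheus-secondary # must match client_id in the secondary Prometheus config
          port: 5670
          path: /metrics
```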
OpenTelemetry
backends:
  - type: OpenTelemetry
    openTelemetry:
      endpoint: otel-collector.observability.svc:4317
      refreshInterval: 60s
This configuration tells the Kong Mesh data plane proxy to push metrics to an OpenTelemetry collector.
The data plane proxy scrapes metrics from Envoy and other applications in a Pod/VM and pushes them to the configured OpenTelemetry collector, by default every 60 seconds (use refreshInterval to change it).
When you configure application scraping, make sure to specify application.name to utilize OpenTelemetry scoping.
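Putting the two pieces together, a fragment that scrapes a named application and pushes to the collector might look like the sketch below. It reuses the values from the earlier fragments on this page and assumes both sections sit under the policy’s default block.

```yaml
default:
  applications:
    - name: "backend" # used to scope the application's metrics in OpenTelemetry
      path: "/metrics/prometheus"
      port: 8888
  backends:
    - type: OpenTelemetry
      openTelemetry:
        endpoint: otel-collector.observability.svc:4317
        refreshInterval: 60s
```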
Pushing metrics from application to OpenTelemetry collector directly
Right now, if you want to expose metrics from your application to an OpenTelemetry collector, you can access the collector directly.
If you have disabled passthrough in your Mesh, you need to configure an ExternalService with your collector endpoint. Example ExternalService:
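A minimal sketch of an ExternalService for the collector, in Kubernetes format. The resource name, the service tag values, and the gRPC protocol are assumptions; the address reuses the collector endpoint from the OpenTelemetry backend above.

```yaml
apiVersion: kuma.io/v1alpha1
kind: ExternalService
mesh: default # assumed mesh
metadata:
  name: otel-collector # hypothetical resource name
spec:
  tags:
    kuma.io/service: otel-collector-grpc # hypothetical service name
    kuma.io/protocol: grpc # assumes the collector listens for OTLP over gRPC
  networking:
    address: otel-collector.observability.svc:4317
```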
Examples
With custom port, path, clientId, application aggregation and service override
The first policy defines a default MeshMetric policy for the default mesh.
The second policy creates an override for workloads tagged with framework: example-web-framework.
That web framework exposes metrics under /metrics/prometheus and port 8888.
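The two policies described above might look like the following sketch. Policy names, the namespace, the clientId value, and the MeshSubset targetRef kind are assumptions made for illustration.

```yaml
# Default policy for the whole "default" mesh
apiVersion: kuma.io/v1alpha1
kind: MeshMetric
metadata:
  name: metrics-default # hypothetical policy name
  namespace: kong-mesh-system # assumed control plane namespace
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: Mesh
  default:
    backends:
      - type: Prometheus
        prometheus:
          clientId: main-backend # hypothetical client ID
          port: 5670
          path: /metrics
---
# Override for workloads tagged framework: example-web-framework
apiVersion: kuma.io/v1alpha1
kind: MeshMetric
metadata:
  name: metrics-for-web-framework # hypothetical policy name
  namespace: kong-mesh-system
  labels:
    kuma.io/mesh: default
spec:
  targetRef:
    kind: MeshSubset # assumed targetRef kind
    tags:
      framework: example-web-framework
  default:
    applications:
      - path: "/metrics/prometheus"
        port: 8888
```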