Metrics

KubeOps emits OpenTelemetry metrics for its reconciliation pipeline through a Meter named after the operator (OperatorSettings.Name) — the same identifier used for the tracing ActivitySource.

Metrics collection is enabled by default and is virtually free when no exporter is attached. To export the data, register an OpenTelemetry exporter that subscribes to the meter.

Enabling / disabling

Metrics collection is controlled by OperatorSettings.EnableMetrics (default true). Disable it via the fluent builder:

builder.Services.AddKubernetesOperator(settings => settings
    .WithMetrics(false));

When disabled, the metrics infrastructure is not registered and the instrumentation in the watcher, queue, and reconciler is skipped entirely.
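If the toggle should differ per environment, it can be driven from configuration instead of a hard-coded value. The following is a minimal sketch: the configuration key `Operator:EnableMetrics` is a hypothetical name chosen for illustration, not a KubeOps convention, and only the `WithMetrics(bool)` call comes from the builder shown above.

```csharp
var builder = WebApplication.CreateBuilder(args);

// "Operator:EnableMetrics" is a hypothetical key (assumption for this sketch).
// The fallback of true matches the documented default of OperatorSettings.EnableMetrics.
var enableMetrics = builder.Configuration.GetValue("Operator:EnableMetrics", defaultValue: true);

builder.Services.AddKubernetesOperator(settings => settings
    .WithMetrics(enableMetrics));
```

With this in place, setting the key to `false` in an environment-specific settings file or environment variable would skip the metrics registration for that deployment only.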

Instruments

All instruments carry a kubeops.entity.type tag (the watched entity's type name, e.g. V1MyResource).

| Name | Type | Unit | Additional tags |
| --- | --- | --- | --- |
| kubeops.operator.queue.depth | ObservableGauge | {items} | kubeops.queue.state (scheduled / ready) |
| kubeops.operator.queue.enqueued | Counter | {items} | kubeops.trigger.source (api_server / operator) |
| kubeops.operator.queue.requeued | Counter | {items} | kubeops.requeue.reason (conflict / error_retry / operator_requeue) |
| kubeops.operator.queue.discarded | Counter | {items} | |
| kubeops.operator.reconciliation | Counter | {reconciliations} | kubeops.reconciliation.type (added / modified / deleted), kubeops.reconciliation.status (success / failure), error.type (on failure) |
| kubeops.operator.reconciliation.duration | Histogram | s | kubeops.reconciliation.type, kubeops.reconciliation.status, error.type (on failure) |
| kubeops.operator.watcher.events | Counter | {events} | kubeops.watcher.event.type (added / modified / deleted / bookmark) |
| kubeops.operator.watcher.reconnections | Counter | {reconnections} | |

The kubeops.operator.queue.depth gauge reports two series: scheduled (entries waiting for a delayed requeue) and ready (entries waiting to be picked up by the reconciliation loop).

Note: kubeops.operator.queue.requeued is a subset of kubeops.operator.queue.enqueued: every requeue (conflict, error-retry, or operator requeue) also increments the enqueued counter. Do not add the two together when building dashboards; use requeued only for the per-reason breakdown of requeues.

The kubeops.trigger.source tag on kubeops.operator.queue.enqueued reflects the original event source. An error-retry therefore keeps its original source (e.g. api_server) rather than operator; use kubeops.operator.queue.requeued{kubeops.requeue.reason="error_retry"} to count retries explicitly.
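The relationship between the two counters can be illustrated with a small simulation. This is a sketch only: it creates its own stand-in meter (`demo-operator`, a hypothetical name) and two counters that borrow the instrument names from the table, then increments them by hand the way the note describes. It does not run any KubeOps code, and the `long` measurement type is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Stand-in meter and counters that only borrow the instrument names from the table above.
// In a real operator, KubeOps creates and increments these instruments itself.
const string meterName = "demo-operator"; // hypothetical operator name
using var meter = new Meter(meterName);
var enqueued = meter.CreateCounter<long>("kubeops.operator.queue.enqueued", "{items}");
var requeued = meter.CreateCounter<long>("kubeops.operator.queue.requeued", "{items}");

// Sum every measurement per instrument name, ignoring tags.
var totals = new Dictionary<string, long>();
using var listener = new MeterListener();
listener.InstrumentPublished = (instrument, l) =>
{
    if (instrument.Meter.Name == meterName)
    {
        l.EnableMeasurementEvents(instrument);
    }
};
listener.SetMeasurementEventCallback<long>((instrument, value, tags, state) =>
{
    totals.TryGetValue(instrument.Name, out var current);
    totals[instrument.Name] = current + value;
});
listener.Start();

var fromApiServer = new KeyValuePair<string, object?>("kubeops.trigger.source", "api_server");
var errorRetry = new KeyValuePair<string, object?>("kubeops.requeue.reason", "error_retry");

// Three watch events arrive from the API server.
for (var i = 0; i < 3; i++)
{
    enqueued.Add(1, fromApiServer);
}

// One of them fails and is retried: both counters move,
// and the retry keeps its original trigger source.
requeued.Add(1, errorRetry);
enqueued.Add(1, fromApiServer);

var enqueuedTotal = totals["kubeops.operator.queue.enqueued"];
var requeuedTotal = totals["kubeops.operator.queue.requeued"];
Console.WriteLine(
    $"enqueued={enqueuedTotal} requeued={requeuedTotal} first-time={enqueuedTotal - requeuedTotal}");
// prints "enqueued=4 requeued=1 first-time=3"
```

Summing the two counters here would give 5, which counts the retried item twice. Subtracting requeued from enqueued recovers the three first-time enqueues.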

Note: The queue runs side-by-side with the watcher rather than strictly in front of the reconciler, so the queue instruments give a good, but not exhaustive, view of throughput. See issue #1037 for context.

The error.type attribute is only present on failed reconciliations and carries the failing exception's full type name (or _OTHER when a reconciliation reports failure without an exception). It follows the OpenTelemetry error.type convention and is bounded by the set of exception types your controllers throw.

The kubeops.operator.reconciliation.duration histogram uses second-scale bucket boundaries (5ms … 60s) tuned for typical reconcile latencies, so histogram_quantile() over kubeops_operator_reconciliation_duration_seconds_bucket yields meaningful percentiles out of the box.
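If your reconciliations routinely run longer than the default range covers, the OpenTelemetry .NET SDK lets you attach a view with your own boundaries. The sketch below assumes a `WebApplication` builder and the `AddKubeOpsInstrumentation()` registration described later on this page. The boundary values are hypothetical, and the expectation that a view takes precedence over the boundaries KubeOps configures is an assumption worth verifying against your exporter output.

```csharp
using OpenTelemetry.Metrics;

var builder = WebApplication.CreateBuilder(args);

// Hypothetical boundaries in seconds; choose values that match your own latencies.
double[] boundaries = [0.1, 0.5, 1.0, 5.0, 15.0, 60.0, 300.0];

builder.Services
    .AddOpenTelemetry()
    .WithMetrics(m => m
        .AddKubeOpsInstrumentation()
        // Assumption: the view replaces the histogram's default bucket boundaries.
        .AddView(
            "kubeops.operator.reconciliation.duration",
            new ExplicitBucketHistogramConfiguration { Boundaries = boundaries })
        .AddPrometheusExporter());
```

Changing boundaries alters the `_bucket` series that existing dashboards and alerts query, so it is usually best done before those are built.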

Prometheus exposition names

The instrument names above are the OpenTelemetry names. The Prometheus exporter translates them (dots → underscores, _total suffix for counters, unit suffix for the histogram, UCUM annotation units such as {items} dropped). The scrape endpoint therefore exposes:

| OpenTelemetry instrument | Prometheus time series |
| --- | --- |
| kubeops.operator.queue.depth | kubeops_operator_queue_depth |
| kubeops.operator.queue.enqueued | kubeops_operator_queue_enqueued_total |
| kubeops.operator.queue.requeued | kubeops_operator_queue_requeued_total |
| kubeops.operator.queue.discarded | kubeops_operator_queue_discarded_total |
| kubeops.operator.reconciliation | kubeops_operator_reconciliation_total |
| kubeops.operator.reconciliation.duration | kubeops_operator_reconciliation_duration_seconds (_bucket / _sum / _count) |
| kubeops.operator.watcher.events | kubeops_operator_watcher_events_total |
| kubeops.operator.watcher.reconnections | kubeops_operator_watcher_reconnections_total |
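The translation rules listed above can be sketched as a small helper, which may be handy when generating dashboard queries from the instrument list. `ToPrometheusName` is a hypothetical function written for this illustration. It covers only the rules named in this section and is not the exporter's actual implementation, which handles more cases.

```csharp
using System;

// Hypothetical helper: applies only the rules described in this section.
static string ToPrometheusName(string otelName, string instrumentType, string unit)
{
    // Dots become underscores.
    var name = otelName.Replace('.', '_');

    // Annotation units such as {items} are dropped; the "s" unit becomes a _seconds suffix.
    if (unit == "s")
    {
        name += "_seconds";
    }

    // Counters get a _total suffix.
    if (instrumentType == "Counter")
    {
        name += "_total";
    }

    return name;
}

Console.WriteLine(ToPrometheusName("kubeops.operator.queue.enqueued", "Counter", "{items}"));
// prints "kubeops_operator_queue_enqueued_total"
Console.WriteLine(ToPrometheusName("kubeops.operator.reconciliation.duration", "Histogram", "s"));
// prints "kubeops_operator_reconciliation_duration_seconds"
Console.WriteLine(ToPrometheusName("kubeops.operator.queue.depth", "ObservableGauge", "{items}"));
// prints "kubeops_operator_queue_depth"
```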

Exposing a Prometheus endpoint (KubeOps.Operator.Web)

Metrics export is configured through the standard OpenTelemetry pipeline, separate from the operator registration chain. KubeOps.Operator.Web provides two helpers. AddKubeOpsInstrumentation() on the MeterProviderBuilder subscribes to the operator's meter; the operator name is resolved from the registered OperatorSettings, so you don't have to repeat it. MapOperatorMetricsEndpoint() exposes the Prometheus scraping endpoint:

var builder = WebApplication.CreateBuilder(args);

builder.Services
    .AddKubernetesOperator()
    .RegisterComponents();

// NuGet: OpenTelemetry.Extensions.Hosting
builder.Services
    .AddOpenTelemetry()
    .WithMetrics(m => m
        .AddKubeOpsInstrumentation() // subscribes to the operator meter
        .AddPrometheusExporter());

var app = builder.Build();
app.UseRouting();
app.MapControllers();
app.MapOperatorMetricsEndpoint(); // exposes GET /metrics

app.Run();

Pass the name explicitly with AddKubeOpsInstrumentation(operatorName) if AddKubernetesOperator() has not run yet on the same service collection.

Manual exporter configuration

Without KubeOps.Operator.Web you can register any OpenTelemetry exporter yourself. Add the meter by the operator name (the value of OperatorSettings.Name) and pick an exporter:

// Standalone HttpListener (no ASP.NET Core)
// NuGet: OpenTelemetry.Exporter.Prometheus.HttpListener
builder.Services
    .AddOpenTelemetry()
    .WithMetrics(m => m
        .AddMeter(operatorName)
        .AddPrometheusHttpListener(o => o.UriPrefixes = ["http://+:9464/"]));
// 9464 is the conventional default port for OpenTelemetry Prometheus exporters.

// OTLP to an OpenTelemetry Collector
// NuGet: OpenTelemetry.Exporter.OpenTelemetryProtocol
builder.Services
    .AddOpenTelemetry()
    .WithMetrics(m => m
        .AddMeter(operatorName)
        .AddOtlpExporter());

Tip: If you already use .NET Aspire via KubeOps.Aspire, the meter is picked up automatically by AddKubeOpsServiceDefaults, which configures OpenTelemetry with OTLP export.