Monitoring — 21tunnel admin docs

1 The metrics endpoint

Metrics are configured with the [metrics] block in /etc/qnt/server.toml. Once enabled, qnt-server serves a Prometheus exposition endpoint — but it binds to localhost by default, so nothing is exposed to the network until you decide to front it.

The `[metrics]` block

Three fields control it: enabled (a bool), bind (a socket address, defaulting to 127.0.0.1:9090), and endpoint (the path, e.g. /metrics).

[metrics]
enabled  = true
bind     = "127.0.0.1:9090"
endpoint = "/metrics"

With those defaults the metrics live at http://127.0.0.1:9090/metrics — reachable only from the box itself.
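
To see how the two fields compose, here is a minimal sketch that builds the URL from bind and endpoint and then fetches it. The helper name metrics_url is made up for illustration; the values are the defaults shown above.

```shell
# Build the scrape URL from the two [metrics] fields.
# $1 = bind (host:port), $2 = endpoint (path starting with /).
metrics_url() {
  local bind="$1" endpoint="$2"
  printf 'http://%s%s\n' "$bind" "$endpoint"
}

# On the qnt-server host, fetch the endpoint and show the first few lines:
#   curl -fsS "$(metrics_url 127.0.0.1:9090 /metrics)" | head -n 20
```

If the curl fails with "connection refused", check that enabled is true and that qnt-server was restarted after the config change.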

Localhost-only, on purpose

The default bind is 127.0.0.1, so the endpoint is not reachable from off the host. Scrape it from a Prometheus running on the same machine, or put it behind a proxy that enforces your own auth. Don't bind it to 0.0.0.0 and call it a day: that exposes the endpoint on every network interface.
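
A quick way to confirm what the socket is actually bound to is to look at the listening address and check that it is loopback. This is a sketch only: is_loopback_bind is a hypothetical helper, and the ss line assumes a Linux host with iproute2 and the default port 9090.

```shell
# Return 0 when an "address:port" string is a loopback bind, 1 otherwise.
is_loopback_bind() {
  case "$1" in
    127.*:*|localhost:*|"[::1]":*) return 0 ;;
    *) return 1 ;;
  esac
}

# On the host, list the local address of whatever listens on port 9090
# and pass it to the check:
#   addr=$(ss -ltnH 'sport = :9090' | awk '{print $4}')
#   is_loopback_bind "$addr" && echo "loopback only" || echo "EXPOSED: $addr"
```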

2 Scrape with Prometheus

You run your own Prometheus; qnt-server just exposes the endpoint. The repo ships a starting point so you don't write the config from scratch.

What ships in the repo

deploy/prometheus.yml — a scrape config that points Prometheus at the qnt-server /metrics endpoint.
deploy/prometheus.service — a systemd unit to run Prometheus as a managed service.

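Because the endpoint is localhost-only, the simplest topology is a Prometheus on the same VM scraping 127.0.0.1:9090/metrics. Prometheus's own web listener also defaults to port 9090, so check which --web.listen-address the shipped unit sets before running both on one host.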

Install and enable

Copy the config and the unit into place, reload systemd, then enable and start Prometheus. These write under /etc, so run them as root:

cp deploy/prometheus.yml      /etc/prometheus/prometheus.yml
cp deploy/prometheus.service  /etc/systemd/system/
systemctl daemon-reload
systemctl enable --now prometheus
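
Once it's running, it's worth checking three things: the config parses, the service is active, and the qnt-server target is healthy. A minimal sketch, assuming promtool is installed alongside Prometheus and jq is available. PROM_ADDR is a placeholder for Prometheus's own listen address, which isn't stated here, and unhealthy_targets is a made-up helper name.

```shell
# Print every scrape target that Prometheus does not report as "up".
# Reads the JSON body of Prometheus's /api/v1/targets endpoint on stdin.
unhealthy_targets() {
  jq -r '.data.activeTargets[]
         | select(.health != "up")
         | "\(.scrapeUrl) \(.health) \(.lastError)"'
}

# Typical sequence on the Prometheus host (PROM_ADDR is a placeholder for
# Prometheus's own listen address, not the qnt-server bind):
#   promtool check config /etc/prometheus/prometheus.yml
#   systemctl is-active prometheus
#   curl -fsS "http://$PROM_ADDR/api/v1/targets" | unhealthy_targets
```

No output from the last command means every target is up.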

3 What to watch

These are the kinds of things worth a dashboard and an alert rule — the signals that tell you something is wrong before a customer does. Wire your alert thresholds to whichever exported series match these in your build.

The signals that matter

Request rate and latency — the headline of whether traffic is flowing and how fast.
Active tunnel count — how many tunnels are live right now.
Agent connection count — how many agents are connected.
Error rate — the share of requests failing.
DB pool saturation — connection-pool pressure against Postgres.

Treat these as categories to alert on, not exact metric names — check the live /metrics output for the series your build actually exports before writing rules against them.
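
To check what your build actually exports, strip the comments, labels and values from the exposition output and you're left with the series names. A small sketch, assuming the default endpoint; list_series is a hypothetical helper name.

```shell
# Read Prometheus exposition text on stdin and print the unique metric names.
# Drops "# HELP" / "# TYPE" comment lines and blank lines, then cuts each
# sample line at the first "{" (labels) or space (value).
list_series() {
  grep -v '^#' | grep -v '^[[:space:]]*$' | sed -E 's/[{ ].*$//' | sort -u
}

# On the qnt-server host:
#   curl -fsS http://127.0.0.1:9090/metrics | list_series
```

Match the names it prints against the categories above, then write your alert rules against those names.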

4 Logs & the journal

qnt-server emits structured logs to the systemd journal. There's no log file to rotate — you read it with journalctl.

Tail the logs

Follow the journal live, or pull the last 100 lines for a quick look:

journalctl -u qnt-server -f
journalctl -u qnt-server -n 100 --no-pager

Turn up the verbosity

Log level is set by the --log-level flag in the unit's ExecStart. Bump it from info to debug when you're chasing something, then reload and restart:

# edit ExecStart … --log-level debug
systemctl daemon-reload
systemctl restart qnt-server
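
If you'd rather preview the change before touching the unit file, something like the following rewrites only the level and leaves the rest of ExecStart alone. It's a sketch: set_log_level is a hypothetical helper, and it assumes the flag appears as --log-level followed by a space or = and a plain word.

```shell
# Read unit text on stdin and replace the value that follows --log-level
# with the level given as $1. Everything else passes through unchanged.
set_log_level() {
  sed -E "s/(--log-level[ =])[A-Za-z]+/\\1$1/"
}

# Preview the edited unit without changing anything:
#   systemctl cat qnt-server | set_log_level debug
# After the real edit and restart, confirm what systemd will run:
#   systemctl show -p ExecStart qnt-server
```

Debug output is noisy, so put the level back to info once you're done.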

5 Trace a slow request

When something feels slow, you have two angles: grep the journal for the slow lines, and pull a tunnel's recent activity from the API inspector.

Grep the journal for high latency

Pull the last ten minutes and keep only the requests whose latency value has four or more digits, i.e. 1000 or more in whatever unit your build logs:

journalctl -u qnt-server --since "10 min ago" --no-pager \
  | grep -E "finished processing request.*latency=[0-9]{4,}"
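
When the filter returns a lot of lines, ranking the values helps you find the worst offenders first. A sketch under the same assumption as the grep above, namely that each line carries latency=<integer>; top_latencies is a made-up helper name.

```shell
# Read log lines on stdin, pull out every latency=<n> value, and print
# the N largest in descending order (N defaults to 10).
top_latencies() {
  grep -oE 'latency=[0-9]+' | cut -d= -f2 | sort -rn | head -n "${1:-10}"
}

# Slowest five requests in the last ten minutes:
#   journalctl -u qnt-server --since "10 min ago" --no-pager | top_latencies 5
```

Take one of the values it prints and grep the journal for it to get the full log line back.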

Inspect a tunnel's recent activity

The API inspector returns recent requests for a tunnel. Swap <jwt> for a bearer token and <tid> for the tunnel id:

# -sS hides curl's progress meter but still prints errors
curl -sS -H "Authorization: Bearer <jwt>" \
  "https://login.example.com/api/tunnels/<tid>/inspector?limit=20" | jq .

Between the two you can usually tell whether the slowness is in qnt-server, in the upstream the tunnel points at, or in the DB.

6 Optional: OpenTelemetry

If you already run a tracing stack, qnt-server can export spans to it. This is off by default — most self-hosters never turn it on.

Enable the `[tracing]` block

Tracing is controlled by the [tracing] block in /etc/qnt/server.toml. Enable it and point the OTLP/gRPC exporter at your collector — something like http://localhost:4317, or a named collector such as http://tempo:4317.

Leave it off unless you have a collector ready to receive the spans.
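
Before you flip it on, you can check that something is listening where you plan to send spans. This sketch only tests that the TCP port accepts a connection; it does not prove the listener speaks OTLP. collector_reachable is a hypothetical helper, and it assumes bash and coreutils timeout are installed.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are needed.
collector_reachable() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Check the example collector address before enabling [tracing]:
#   collector_reachable localhost 4317 && echo "collector port open" \
#     || echo "nothing listening on localhost:4317"
```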

→ Next

Metrics flowing and logs at your fingertips. Close the loop on alerting and durability.

Alerts

Backups

Admin guide

PRODUCTION.md
