Tailscale Monitoring Stack

Unified observability across your tailnet mesh
Research Briefing — February 26, 2026
3 hosts • Prometheus + Grafana + Loki • Grafana Alloy


The Challenge

3 hosts on a Tailscale mesh, 1 central monitoring stack

Mac Mini (Ocasia)

macOS • 100.x.x.4

Central monitoring stack via Colima Docker: Prometheus, Grafana, Loki, Alertmanager, cAdvisor, node-exporter

DigitalOcean Docker

Linux • 100.x.x.45

penny-backtest, penny-web containers. No monitoring agent yet.

Raspberry Pi 5

Ubuntu 24.04 • 100.x.x.8

Brand new, 8GB RAM, nothing installed yet. Future AI agent host.

Goals

The Answer: Grafana Alloy

Verdict: Deploy Grafana Alloy on every remote host. It replaces both node-exporter AND Promtail in a single binary. Promtail hits EOL on March 2, 2026.

Before (Old Stack)

node-exporter   → Prometheus scrapes :9100
promtail        → pushes logs to Loki
                   (2 processes, 2 configs)

After (Alloy)

grafana-alloy   → pushes metrics via remote_write
                → pushes logs via loki.write
                   (1 process, 1 config)

Why Alloy Wins

Promtail is Dead

Promtail End of Life: March 2, 2026 — No new features since early 2025. Grafana Labs has officially deprecated it in favor of Alloy.

Migration Path

# Convert existing promtail config to Alloy format automatically
alloy convert --source-format=promtail --output=/etc/alloy/config.alloy promtail.yaml

Log Shipper Comparison

Agent          | RAM          | Loki Support | Verdict
Grafana Alloy  | ~100-200 MiB | Native       | Recommended
Fluent Bit     | ~10-20 MiB   | Plugin       | Fallback
Vector         | ~100-200 MiB | Native       | Good, but not Grafana-native
Promtail       | ~50 MiB      | Native       | EOL Mar 2026

If the Pi ever gets RAM-constrained, Fluent Bit (~10 MiB) for logs + a bare node-exporter binary (~15 MiB) is the ultra-lean fallback.
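That fallback can be sketched as a minimal Fluent Bit config (hypothetical; the 100.x.x.4 Loki endpoint matches the rest of this deck, and the label names are illustrative):

```ini
# Lean-fallback sketch: tail the systemd journal and push to Loki
# on the central host. Pair with a bare node-exporter for metrics.
[SERVICE]
    Flush     5

[INPUT]
    Name      systemd
    Tag       journal

[OUTPUT]
    Name      loki
    Match     journal
    Host      100.x.x.4
    Port      3100
    Labels    job=journal, host=${HOSTNAME}
```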

Alloy Resource Footprint

Official Estimates (from Grafana docs)

Workload                                  | CPU             | RAM
Metrics (per 1M active series)            | ~0.4 cores      | ~11 GiB
Logs (per 1 MiB/s throughput)             | ~1 core         | ~120 MiB
Homelab (basic metrics + low-volume logs) | 0.05-0.15 cores | 80-200 MiB

Comparison on Raspberry Pi 5 (8GB)

Alloy (unified)

~100-200 MiB RAM as systemd service

~500 MiB if run in Docker (overhead)

Run as systemd, not Docker, on Pi

Separate (old way)

node-exporter: ~15 MiB

Promtail: ~50 MiB

Total: ~65 MiB (but 2 processes)

The tradeoff: ~3x more RAM for consolidation + future-proofing. On an 8GB Pi with 7.4 GiB free, 200 MiB is nothing.

Alloy Configuration

Complete remote host agent config

// /etc/alloy/config.alloy — drop on every remote host

// ═══ METRICS (replaces node-exporter) ═══
prometheus.exporter.unix "host" {
  set_collectors = ["cpu","meminfo","diskstats",
    "filesystem","netdev","loadavg","uname","processes"]
}

prometheus.scrape "host_metrics" {
  targets    = prometheus.exporter.unix.host.targets
  forward_to = [prometheus.relabel.add_host.receiver]
  scrape_interval = "30s"
}

prometheus.relabel "add_host" {
  forward_to = [prometheus.remote_write.central.receiver]
  rule {
    target_label = "host"
    replacement  = constants.hostname
  }
}

prometheus.remote_write "central" {
  endpoint {
    url = "http://100.x.x.4:9090/api/v1/write"
  }
}

// ═══ LOGS (replaces promtail) ═══
loki.source.journal "journal" {
  max_age    = "12h"
  forward_to = [loki.write.central.receiver]
  labels     = { host = constants.hostname, job = "journal" }
}

loki.write "central" {
  endpoint {
    url = "http://100.x.x.4:3100/loki/api/v1/push"
  }
  external_labels = { host = constants.hostname }
}

Tailscale Native Metrics

Built-in since Tailscale v1.78 — zero dependencies

Each Tailscale node exposes Prometheus metrics natively. Enable with:

tailscale set --webclient    # exposes :5252/metrics over the tailnet

Available Metrics

Metric                                   | Type    | What it tells you
tailscaled_inbound_bytes_total           | counter | Inbound bytes by path (direct_ipv4, derp, etc.)
tailscaled_outbound_bytes_total          | counter | Outbound bytes by path
tailscaled_inbound_dropped_packets_total | counter | Dropped packets with reason labels
tailscaled_home_derp_region_id           | gauge   | Which DERP relay the node uses
tailscaled_health_messages               | gauge   | Health warnings (type label)
tailscaled_advertised_routes             | gauge   | Subnet routes advertised
tailscaled_approved_routes               | gauge   | Subnet routes approved

Path labels on throughput: direct_ipv4, direct_ipv6, derp — tells you if traffic is direct or relayed.
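Because the Prometheus scrape job later in this deck attaches a host label, a query like the following (illustrative PromQL) shows what share of each host's outbound traffic is being relayed:

```promql
# Fraction of outbound bytes going through DERP relays, per host
# (0 = fully direct; sustained values near 1 mean no direct path)
sum by (host) (rate(tailscaled_outbound_bytes_total{path="derp"}[5m]))
  /
sum by (host) (rate(tailscaled_outbound_bytes_total[5m]))
```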

Tailscale Exporters

Fleet-level visibility via the Tailscale API

Recommendation: Run adinhodovic/tailscale-exporter on the Mac Mini for tailnet-wide device monitoring. Combine with native client metrics on each host.
Exporter                       | What it monitors                   | Auth needed  | Status
adinhodovic/tailscale-exporter | Fleet: devices, users, keys, DNS   | OAuth client | v0.3.0, Dec 2025
Native client metrics (:5252)  | Per-host: throughput, DERP, health | None         | Built-in, v1.78+
josh/tailscale_exporter        | Device status, auth expiry         | API key      | Active
cfunkhouser/tailscalesd        | Service discovery (not metrics)    | API key      | Niche

Pre-built Grafana Dashboards

Pull vs Push Architecture

Verdict: Hybrid — Alloy pushes metrics + logs from remotes (simplest config), central Prometheus pulls Tailscale client metrics on :5252, and tailscale-exporter runs centrally.

Push (Alloy remote_write)

  • No Prometheus scrape targets to manage
  • Works through any network topology
  • Alloy does metrics + logs in one config
  • Remote hosts are self-contained

Pull (Prometheus scrape)

  • Traditional, well-understood model
  • "up" metric works (host-down alerting)
  • Tailscale eliminates the firewall objection
  • Best for Tailscale native metrics (:5252)

Why Hybrid?

Push for host metrics + logs (Alloy handles both, zero Prometheus config per host). Pull for Tailscale metrics (native :5252 endpoint, already exposed). Best of both worlds.
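One consequence of the hybrid model: the "up" metric only covers pulled targets, so host-down alerting for pushed metrics has to key off staleness instead. A sketch of both alert styles (rule names and the 10m window are illustrative; node_uname_info comes from the uname collector enabled in the Alloy config):

```yaml
groups:
  - name: host-health
    rules:
      # Pulled targets (:5252): classic "up"-based host-down alert
      - alert: TailscaleClientDown
        expr: up{job="tailscale-clients"} == 0
        for: 5m
      # Pushed metrics never flip "up" to 0 — alert when a host's
      # remote_write stream goes stale instead
      - alert: AlloyPushStale
        expr: absent_over_time(node_uname_info{host="raspberry-pi"}[10m])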

Security Model

Tailscale provides the security layer

Best Practices

Practice                  | How
Bind to Tailscale IP only | --web.listen-address=100.x.x.x:9100
Tailscale ACLs            | Restrict :9100, :5252, :3100 to the monitoring host only
No public exposure        | Never bind exporters to 0.0.0.0 on public-facing hosts
Access Grafana via tunnel | ssh -L 3000:localhost:3000 dan@mini

Example Tailscale ACL

{
  "acls": [
    // Central host scrapes exporters + client metrics on every node
    { "action": "accept",
      "src": ["100.x.x.4"],                  // Mac Mini only
      "dst": ["*:9100", "*:5252"] },         // monitoring ports
    // Remote Alloy agents push metrics + logs to the central stack
    { "action": "accept",
      "src": ["100.x.x.45", "100.x.x.8"],    // DO + Pi
      "dst": ["100.x.x.4:9090", "100.x.x.4:3100"] }
  ]
}

Target Architecture

Mac Mini — 100.x.x.4 (Central)
├── Docker (Colima):
│   ├── prometheus :9090       ← scrapes Tailscale :5252 on all hosts
│   │                          ← receives remote_write from Alloy agents
│   ├── grafana :3000          ← dashboards: node, ollama, tailscale
│   ├── loki :3100             ← receives log push from Alloy agents
│   ├── alertmanager :9093
│   ├── tailscale-exporter     ← fleet API metrics (OAuth)
│   ├── cadvisor, node-exporter, promtail (existing)
│   └── ollama-exporter :9101  ← already deployed
└── Native:
    └── tailscale client metrics :5252

DigitalOcean — 100.x.x.45
├── Docker: Grafana Alloy
│   ├── prometheus.exporter.unix → remote_write → Mac Mini :9090
│   └── loki.source.journal → loki.write → Mac Mini :3100
└── tailscale client metrics :5252

Raspberry Pi 5 — 100.x.x.8
├── Systemd: Grafana Alloy
│   ├── prometheus.exporter.unix → remote_write → Mac Mini :9090
│   └── loki.source.journal → loki.write → Mac Mini :3100
└── tailscale client metrics :5252

Prometheus Config Updates

Add to existing ~/monitoring-stack/prometheus/prometheus.yml

# Enable remote write receiver (add to docker-compose command)
# command: '--web.enable-remote-write-receiver'

scrape_configs:
  # ... existing jobs ...

  # Tailscale client metrics from all hosts
  - job_name: 'tailscale-clients'
    static_configs:
      - targets: ['host.docker.internal:5252']
        labels: { host: 'mac-mini' }
      - targets: ['100.x.x.45:5252']
        labels: { host: 'digitalocean' }
      - targets: ['100.x.x.8:5252']
        labels: { host: 'raspberry-pi' }
    scrape_interval: 30s

  # Tailscale fleet exporter (runs locally)
  - job_name: 'tailscale-fleet'
    static_configs:
      - targets: ['tailscale-exporter:9090']

Alloy pushes host metrics via remote_write — no scrape targets needed for those.
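The receiver flag mentioned in the comment above belongs in the Prometheus service definition; a sketch of what that looks like in docker-compose (the image tag and existing flags are assumptions about the current stack):

```yaml
  prometheus:
    image: prom/prometheus:latest
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      # Accept remote_write pushes from the Alloy agents
      - '--web.enable-remote-write-receiver'
```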

Grafana Dashboards

Import these dashboard IDs

ID     | Name                 | What it shows
1860   | Node Exporter Full   | CPU, memory, disk, network per host
24177  | Tailscale Overview   | Fleet: all devices, online/offline, OS, users
24178  | Tailscale Machine    | Per-device: throughput, DERP, latency
Custom | Ollama Observability | Already deployed (uid: ollama-ocasia-001)

Import via Grafana UI or API

# Via CLI (from Mac Mini): fetch the dashboard JSON from grafana.com,
# then wrap it for Grafana's import API
ID=24177   # or 24178, or 1860
curl -s "https://grafana.com/api/dashboards/${ID}/revisions/latest/download" \
  -o "/tmp/dash-${ID}.json"
jq -n --slurpfile d "/tmp/dash-${ID}.json" \
  '{dashboard: $d[0], overwrite: true, folderId: 0,
    inputs: [{name: "DS_PROMETHEUS", type: "datasource",
              pluginId: "prometheus", value: "Prometheus"}]}' \
  | curl -s -u dan:<grafana-password> \
      http://localhost:3000/api/dashboards/import \
      -H 'Content-Type: application/json' -d @-

Pi 5 Deployment Steps

Full install sequence for the Raspberry Pi

# 1. Install Grafana Alloy (ARM64)
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key \
  | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] \
  https://apt.grafana.com stable main" \
  | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install -y alloy

# 2. Deploy config (see Slide 6 for full config)
sudo tee /etc/alloy/config.alloy < alloy-config.alloy

# 3. Enable Tailscale client metrics
sudo tailscale set --webclient

# 4. Start Alloy
sudo systemctl enable --now alloy

# 5. Verify
curl http://localhost:12345       # Alloy debug UI
journalctl -u alloy -f           # Alloy logs

Note: Run Alloy as a systemd service on the Pi, not in Docker, to minimize RAM overhead (~100-200 MiB vs ~500 MiB in Docker).

DigitalOcean Deployment

Run Alloy in Docker alongside existing containers

# Add to existing docker-compose.yml or run standalone

docker run -d \
  --name alloy \
  --restart always \
  --network host \
  -v /var/log:/var/log:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /etc/machine-id:/etc/machine-id:ro \
  -v "$(pwd)/alloy-config.alloy":/etc/alloy/config.alloy \
  grafana/alloy:latest \
  run /etc/alloy/config.alloy

# Notes: bind mounts need absolute paths, hence $(pwd); the journal
# reader uses /etc/machine-id to find this host's journal; and with
# /proc and /sys remapped, set procfs_path = "/host/proc" and
# sysfs_path = "/host/sys" in prometheus.exporter.unix.

Enable Tailscale metrics

sudo tailscale set --webclient

Tailscale Exporter (fleet metrics, runs on Mac Mini)

# Add to ~/monitoring-stack/docker-compose.yml
  tailscale-exporter:
    image: ghcr.io/adinhodovic/tailscale-exporter:latest
    container_name: tailscale-exporter
    restart: always
    environment:
      - TS_CLIENT_ID=<your-oauth-client-id>
      - TS_CLIENT_SECRET=<your-oauth-secret>
      - [email protected]

Implementation Checklist

Phase 1: Enable Prometheus remote_write

  • Add --web.enable-remote-write-receiver to Prometheus in docker-compose
  • Add Tailscale client scrape jobs to prometheus.yml
  • Restart Prometheus container

Phase 2: Deploy Alloy on Pi 5

  • Install Alloy via apt
  • Deploy config pointing to Mac Mini Tailscale IP
  • Enable tailscale set --webclient
  • Verify metrics in Prometheus, logs in Loki

Phase 3: Deploy Alloy on DO host

  • Run Alloy Docker container
  • Enable tailscale set --webclient
  • Verify metrics + logs flowing

Phase 4: Fleet monitoring

  • Create Tailscale OAuth client
  • Deploy tailscale-exporter container
  • Import dashboards 24177, 24178, 1860
  • Configure Tailscale ACLs
End state: All 3 hosts shipping metrics + logs to Ocasia's Docker stack. Tailscale mesh fully monitored. 4 Grafana dashboards. One agent per remote host.