macOS • 100.x.x.4
Central monitoring stack via Colima Docker: Prometheus, Grafana, Loki, Alertmanager, cAdvisor, node-exporter
Linux • 100.x.x.45
penny-backtest, penny-web containers. No monitoring agent yet.
Ubuntu 24.04 • 100.x.x.8
Brand new, 8GB RAM, nothing installed yet. Future AI agent host.
node-exporter → Prometheus scrapes :9100
promtail → pushes logs to Loki
(2 processes, 2 configs)
grafana-alloy → pushes metrics via remote_write
→ pushes logs via loki.write
(1 process, 1 config)
prometheus.exporter.unix is node_exporter embedded inside Alloy; constants.hostname provides the hostname for auto-labeling.

```shell
# Convert an existing promtail config to Alloy format automatically
alloy convert --source-format=promtail --output=/etc/alloy/config.alloy promtail.yaml
```
| Agent | RAM | Loki Support | Verdict |
|---|---|---|---|
| Grafana Alloy | ~100-200 MiB | Native | Recommended |
| Fluent Bit | ~10-20 MiB | Plugin | Fallback |
| Vector | ~100-200 MiB | Native | Good, but not Grafana-native |
| Promtail | ~50 MiB | Native | EOL Mar 2026 |
If the Pi ever gets RAM-constrained, Fluent Bit (10 MiB) for logs + bare node-exporter (20 MiB) is the ultra-lean fallback.
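If that fallback is ever needed, a minimal Fluent Bit sketch that ships journal logs to the existing Loki could look like this (host/port match the Mac Mini stack above; the label names are illustrative):

```ini
[INPUT]
    Name    systemd
    Tag     journal

[OUTPUT]
    Name    loki
    Match   *
    Host    100.x.x.4
    Port    3100
    Labels  job=journal, host=${HOSTNAME}
```

Pair it with bare node-exporter on :9100 and the Prometheus side needs no changes.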
| Workload | CPU | RAM |
|---|---|---|
| Metrics (per 1M active series) | ~0.4 cores | ~11 GiB |
| Logs (per 1 MiB/s throughput) | ~1 core | ~120 MiB |
| Homelab (basic metrics + low-volume logs) | 0.05-0.15 cores | 80-200 MiB |
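The per-unit rows scale down linearly. A rough sketch of what they imply for a homelab-sized workload, assuming an illustrative 10k active series and 0.01 MiB/s of logs (assumptions, not measurements):

```shell
# Back-of-envelope Alloy RAM from the per-unit table
# (assumed workload: 10k active series, 0.01 MiB/s of logs; illustrative only)
awk 'BEGIN {
  series   = 10000
  log_rate = 0.01
  metrics_mib = series / 1e6 * 11 * 1024   # ~11 GiB per 1M active series
  logs_mib    = log_rate * 120             # ~120 MiB per 1 MiB/s throughput
  printf "metrics ~%.0f MiB, logs ~%.1f MiB\n", metrics_mib, logs_mib
}'
# -> metrics ~113 MiB, logs ~1.2 MiB
```

That lands inside the 80-200 MiB homelab band once baseline process overhead is included.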
~100-200 MiB RAM as systemd service
~500 MiB if run in Docker (overhead)
Run as systemd, not Docker, on Pi
node-exporter: ~15 MiB
Promtail: ~50 MiB
Total: ~65 MiB (but 2 processes)
The tradeoff: ~3x more RAM for consolidation + future-proofing. On an 8GB Pi with 7.4 GiB free, 200 MiB is nothing.
```alloy
// /etc/alloy/config.alloy — drop on every remote host

// ═══ METRICS (replaces node-exporter) ═══
prometheus.exporter.unix "host" {
  set_collectors = ["cpu", "meminfo", "diskstats",
                    "filesystem", "netdev", "loadavg", "uname", "processes"]
}

prometheus.scrape "host_metrics" {
  targets         = prometheus.exporter.unix.host.targets
  forward_to      = [prometheus.relabel.add_host.receiver]
  scrape_interval = "30s"
}

prometheus.relabel "add_host" {
  forward_to = [prometheus.remote_write.central.receiver]
  rule {
    target_label = "host"
    replacement  = constants.hostname
  }
}

prometheus.remote_write "central" {
  endpoint {
    url = "http://100.x.x.4:9090/api/v1/write"
  }
}

// ═══ LOGS (replaces promtail) ═══
loki.source.journal "journal" {
  max_age    = "12h"
  forward_to = [loki.write.central.receiver]
  labels     = { host = constants.hostname, job = "journal" }
}

loki.write "central" {
  endpoint {
    url = "http://100.x.x.4:3100/loki/api/v1/push"
  }
  external_labels = { host = constants.hostname }
}
```
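Once metrics land with the host label, a quick sanity query in Grafana confirms the relabel is working (assumes the standard node_cpu_seconds_total series emitted by the unix exporter):

```promql
# CPU busy % per host, using the label added by prometheus.relabel
100 - 100 * avg by (host) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
```

One line per host means every Alloy instance is pushing and labeling correctly.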
Each Tailscale node exposes Prometheus metrics natively. Enable with:
```shell
tailscale set --webclient   # exposes :5252/metrics over the tailnet
```
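A quick check that the endpoint answers over the tailnet (using the Pi's address from the inventory above as an example):

```shell
curl -s http://100.x.x.8:5252/metrics | grep '^tailscaled_' | head -n 5
```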
| Metric | Type | What it tells you |
|---|---|---|
| tailscaled_inbound_bytes_total | counter | Inbound bytes by path (direct_ipv4, derp, etc.) |
| tailscaled_outbound_bytes_total | counter | Outbound bytes by path |
| tailscaled_inbound_dropped_packets_total | counter | Dropped packets with reason labels |
| tailscaled_home_derp_region_id | gauge | Which DERP relay the node uses |
| tailscaled_health_messages | gauge | Health warnings (type label) |
| tailscaled_advertised_routes | gauge | Subnet routes advertised |
| tailscaled_approved_routes | gauge | Subnet routes approved |
Path labels on throughput: direct_ipv4, direct_ipv6, derp — tells you if traffic is direct or relayed.
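The path label makes relayed traffic easy to quantify. For example, the fraction of outbound bytes going through DERP per host (a sketch using the label values listed above; the host label comes from the scrape config):

```promql
sum by (host) (rate(tailscaled_outbound_bytes_total{path="derp"}[5m]))
  / sum by (host) (rate(tailscaled_outbound_bytes_total[5m]))
```

A value near 1 means that node almost never gets a direct connection and is worth investigating.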
Run adinhodovic/tailscale-exporter on the Mac Mini for tailnet-wide device monitoring. Combine it with the native client metrics on each host.
| Exporter | What it monitors | Auth needed | Status |
|---|---|---|---|
| adinhodovic/tailscale-exporter | Fleet: devices, users, keys, DNS | OAuth client | v0.3.0 Dec 2025 |
| Native client metrics (:5252) | Per-host: throughput, DERP, health | None | Built-in v1.78+ |
| josh/tailscale_exporter | Device status, auth expiry | API key | Active |
| cfunkhouser/tailscalesd | Service discovery (not metrics) | API key | Niche |
See also: the tailscale-mixin in the adinhodovic repo.

Push for host metrics + logs (Alloy handles both, zero Prometheus config per host). Pull for Tailscale metrics (native :5252 endpoint, already exposed). Best of both worlds.
| Practice | How |
|---|---|
| Bind to Tailscale IP only | --web.listen-address=100.x.x.x:9100 |
| Tailscale ACLs | Restrict :9100, :5252, :3100 to monitoring host only |
| No public exposure | Never bind exporters to 0.0.0.0 on public-facing hosts |
| Access Grafana via tunnel | ssh -L 3000:localhost:3000 dan@mini |
```jsonc
{
  "acls": [
    { "action": "accept",
      "src": ["100.x.x.4"],                         // Mac Mini only
      "dst": ["*:9100", "*:5252"] },                // monitoring ports
    { "action": "accept",
      "src": ["100.x.x.45", "100.x.x.8"],           // remote hosts
      "dst": ["100.x.x.4:9090", "100.x.x.4:3100"] } // remote_write + Loki push
  ]
}
```

The second rule matters for the push model: Alloy on each remote host must be able to reach Prometheus remote_write (:9090) and Loki (:3100) on the Mac Mini.
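A quick way to confirm the ACL behaves as intended: the same request should succeed from the Mac Mini and time out from any other node.

```shell
# From the Mac Mini (allowed): prints the first metric lines
curl -s --max-time 5 http://100.x.x.8:9100/metrics | head -n 3

# From any other node (denied): should time out
curl -s --max-time 5 http://100.x.x.8:9100/metrics || echo "blocked, as intended"
```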
~/monitoring-stack/prometheus/prometheus.yml:

```yaml
# Enable the remote write receiver (add to the docker-compose command)
# command: '--web.enable-remote-write-receiver'

scrape_configs:
  # ... existing jobs ...

  # Tailscale client metrics from all hosts
  - job_name: 'tailscale-clients'
    scrape_interval: 30s
    static_configs:
      - targets: ['host.docker.internal:5252']
        labels: { host: 'mac-mini' }
      - targets: ['100.x.x.45:5252']
        labels: { host: 'digitalocean' }
      - targets: ['100.x.x.8:5252']
        labels: { host: 'raspberry-pi' }

  # Tailscale fleet exporter (runs locally)
  - job_name: 'tailscale-fleet'
    static_configs:
      - targets: ['tailscale-exporter:9090']
```
Alloy pushes host metrics via remote_write — no scrape targets needed for those.
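To confirm the pushed series are actually arriving, list the values of the host label via the standard Prometheus HTTP API (run on the Mac Mini):

```shell
curl -s 'http://localhost:9090/api/v1/label/host/values'
# expect mac-mini, digitalocean, raspberry-pi once all hosts report in
```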
| ID | Name | What it shows |
|---|---|---|
| 1860 | Node Exporter Full | CPU, memory, disk, network per host |
| 24177 | Tailscale Overview | Fleet: all devices, online/offline, OS, users |
| 24178 | Tailscale Machine | Per-device: throughput, DERP, latency |
| Custom | Ollama Observability | Already deployed (uid: ollama-ocasia-001) |
```shell
# Via CLI (from Mac Mini)
curl -s -u dan:Sl33py!!! \
  http://localhost:3000/api/dashboards/import \
  -H 'Content-Type: application/json' \
  -d '{"dashboard":{"id":null},"overwrite":true,
       "inputs":[{"name":"DS_PROMETHEUS","type":"datasource",
                  "pluginId":"prometheus","value":"Prometheus"}],
       "pluginId":"","folderId":0,
       "gnetId": 24177}'   # or 24178, or 1860
```
```shell
# 1. Install Grafana Alloy (ARM64)
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key \
  | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" \
  | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install -y alloy

# 2. Deploy config (see Slide 6 for the full config)
sudo tee /etc/alloy/config.alloy < alloy-config.alloy

# 3. Enable Tailscale client metrics
sudo tailscale set --webclient

# 4. Start Alloy
sudo systemctl enable --now alloy

# 5. Verify
curl http://localhost:12345   # Alloy debug UI
journalctl -u alloy -f        # Alloy logs
```
Note: run Alloy as a systemd service on the Pi, not in Docker, to minimize RAM overhead (~100-200 MiB vs ~500 MiB in Docker).
```shell
# Add to the existing docker-compose.yml or run standalone
docker run -d \
  --name alloy \
  --restart always \
  --network host \
  -v /var/log:/var/log:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v ./alloy-config.alloy:/etc/alloy/config.alloy \
  grafana/alloy:latest \
  run /etc/alloy/config.alloy
```
```shell
sudo tailscale set --webclient
```
```yaml
# Add to ~/monitoring-stack/docker-compose.yml
tailscale-exporter:
  image: ghcr.io/adinhodovic/tailscale-exporter:latest
  container_name: tailscale-exporter
  restart: always
  environment:
    - TS_CLIENT_ID=<your-oauth-client-id>
    - TS_CLIENT_SECRET=<your-oauth-secret>
    - [email protected]
```
Remaining steps: add --web.enable-remote-write-receiver to the Prometheus command in docker-compose, and run tailscale set --webclient on each host.