Use case #01
Linux + Windows server monitoring without a Datadog bill
“Which tool can I use for server monitoring on Linux and Windows that doesn't cost €30 per host per month?”
One Rust binary agent on every server, one Go hub in the middle. CPU, memory, disk, network, processes — sub-second ingest, 30-day default retention. EU-hosted (Belgium).
Server monitoring + flat €3/server/month pricing → Pricing →
Use case #02
External auditor asks for evidence
“An external auditor wants compliance evidence for the last 12 months. How do I deliver it without giving them production access?”
Auditor Workbench generates one signed ZIP with all evidence packs from the period + offline verify.py + signing-keys snapshot. One-shot download URL, valid for 24h, TOTP-gated. The auditor needs no monsys credentials.
Auditor Workbench → View Auditor Workbench →
Use case #03
Auditing ChatGPT / Claude / Copilot use in the workplace
“My team uses ChatGPT and Claude for work. How do I keep an audit trail for compliance and cost control?”
Three SDKs (Python/JS/Go) ship LLM traces with PII redaction at source. Hub stores cost, refusal rate, top models, alerts on policy drift. Plus Copilot Audit + OpenAI Admin Audit as separate modules with signed evidence packs.
AI observability + Copilot/OpenAI audit modules → View AI observability →
Use case #04
MSP with 30+ customers and no overview
“I manage monitoring for 30 different customers. How do I see in one screen which customer needs urgent attention right now?”
MSP Cockpit shows every tenant on one line with Trust Score, Δ7d, open alerts, failing controls and last activity, sorted by an urgency composite so the most acute customer is on top. A separate msp_operator role gates cross-tenant access.
MSP Cockpit → View MSP Cockpit →
Use case #05
TLS certificate is about to expire
“How do I get warned before a TLS cert expires — without running openssl by hand once a day?”
CertScanWorker runs daily external TLS handshakes against your registered endpoints. The agent additionally does an internal scan of listening TCP ports. Signals: cert.expiring_soon (medium @30d, critical @7d), cert.expired, cert.weak_signature, cert.weak_key, cert.weak_cipher, cert.tls10_supported, cert.self_signed_external, cert.san_mismatch.
Certificate scanning + CT-log monitoring → Read the docs →
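The expiry thresholds above can be sketched as a small classifier. This is an illustration only, not the CertScanWorker code: the function name, the inclusive boundaries, and the severity attached to cert.expired are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Thresholds from the text: medium at 30 days, critical at 7 days.
MEDIUM_DAYS = 30
CRITICAL_DAYS = 7


def classify_expiry(not_after: datetime, now: datetime) -> Optional[tuple[str, str]]:
    """Return (signal, severity) for a certificate, or None if nothing applies."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        # Severity for an already expired cert is an assumption.
        return ("cert.expired", "critical")
    if remaining <= timedelta(days=CRITICAL_DAYS):
        return ("cert.expiring_soon", "critical")
    if remaining <= timedelta(days=MEDIUM_DAYS):
        return ("cert.expiring_soon", "medium")
    return None


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(classify_expiry(now + timedelta(days=20), now))
```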
Use case #06
Supply-chain attack via npm / PyPI / RubyGems / Composer
“How do I get warned when an npm package I depend on changes maintainer or suddenly gets deprecated?”
SupplyChainWorker runs daily registry probes for every (ecosystem, package) tuple in your inventory. Maintainer-set diff vs local cache → dep.maintainer_changed. Yanked / deprecated upstream → dep.deprecated. No external SaaS integration required.
Supply-chain monitoring → View Connected Dashboards →
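A minimal sketch of the maintainer-set diff described above. The function name and the shape of the returned record are hypothetical; only the signal name comes from the text. A package seen for the first time (no cache entry) is assumed not to raise a signal.

```python
from typing import Optional


def maintainer_signals(package: str, cached: Optional[set[str]], current: set[str]) -> list[dict]:
    """Compare the cached maintainer set with the one just fetched from the registry."""
    if cached is None or cached == current:
        # First observation or no change: nothing to report (assumption).
        return []
    return [{
        "signal": "dep.maintainer_changed",
        "package": package,
        "added": sorted(current - cached),
        "removed": sorted(cached - current),
    }]


if __name__ == "__main__":
    print(maintainer_signals("left-pad", {"alice", "bob"}, {"alice", "mallory"}))
```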
Use case #07
Controlling PII in LLM prompts
“How do I know whether my team is sending PII to OpenAI or Claude — without installing a man-in-the-middle proxy?”
The monsys SDK redacts PII inside the SDK itself (regex + entropy + custom patterns), BEFORE the trace reaches the hub. The AI Cost × PII Quadrant dashboard plots cost vs hit-rate per app; top-right = danger zone. Per-app evidence packs for your DPO.
AI observability with source-side PII redaction → View AI observability →
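To make "regex + entropy" concrete, here is a rough sketch of source-side redaction. It is not the monsys SDK: the patterns, the placeholder strings, the 20-character minimum and the entropy threshold of 4.0 bits per character are all illustrative assumptions.

```python
import math
import re
from collections import Counter

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LONG_TOKEN = re.compile(r"\S{20,}")  # candidate secrets: long unbroken strings


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def redact(prompt: str, entropy_threshold: float = 4.0) -> tuple[str, int]:
    """Return (redacted prompt, number of hits) before anything leaves the process."""
    hits = 0

    def email_sub(match: re.Match) -> str:
        nonlocal hits
        hits += 1
        return "[REDACTED_EMAIL]"

    def token_sub(match: re.Match) -> str:
        nonlocal hits
        if shannon_entropy(match.group()) >= entropy_threshold:
            hits += 1
            return "[REDACTED_SECRET]"
        return match.group()  # long but low-entropy text is left alone

    out = EMAIL.sub(email_sub, prompt)
    out = LONG_TOKEN.sub(token_sub, out)
    return out, hits


if __name__ == "__main__":
    print(redact("mail alice@example.com the key sk-9fA3xQ7LmZ2pR8vB1tY6uW4e"))
```

The hit count per prompt is the kind of number a cost-versus-hit-rate dashboard could aggregate per app.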
Use case #08
Backups are running — but do they actually work?
“How do I know whether my restic / borg backups actually ran recently and whether they go offsite?”
The agent detects restic / borg / duplicati / rclone configs (cron, systemd timers, processes). Signals: backup.no_tool_configured, backup.stale (>7d since last run), backup.unencrypted, backup.local_only (heuristic on destination). Lives in the Trust Score 'backup' category.
Backup verification (Phase 1.4) → Read the docs →
Use case #09
Quarterly report for the compliance officer
“My compliance officer wants a report once per quarter of what we monitor and what its status is. Can this be automated?”
Compliance Timeline gives a control × month heatmap with status (passing with signed pack / passing without pack / override / failing). Auditor-mode toggle hides operator UI; ready for PDF export. Auditor Workbench generates the evidence bundle alongside it.
Compliance Timeline + Auditor Workbench → View Compliance Timeline →
Use case #10
MTTR / MTTA for a SOC or board report
“What is our average time-to-acknowledge and time-to-resolve per alert severity over the last 90 days?”
The Operations / MTTR dashboard computes MTTA + MTTR p50 and p95 per (scope, severity) over a rolling 90 days. Flags 'chronically ignored' alert types where MTTA p50 > 7d. Works on both agent-side alerts and ai_alerts.
MTTR & MTTA dashboard → View Connected Dashboards →
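The percentile computation above might look roughly like this. The alert field names (scope, severity, fired_at, acked_at) and the nearest-rank percentile method are assumptions; the dashboard may use a different definition. MTTR would follow the same pattern with a resolved timestamp.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

CHRONIC_SECONDS = timedelta(days=7).total_seconds()  # "MTTA p50 > 7d" from the text


def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def mtta_report(alerts: list[dict]) -> dict:
    """Group time-to-acknowledge (seconds) per (scope, severity)."""
    groups: dict = defaultdict(list)
    for alert in alerts:
        if alert.get("acked_at") is None:
            continue  # unacknowledged alerts are skipped in this sketch
        delta = (alert["acked_at"] - alert["fired_at"]).total_seconds()
        groups[(alert["scope"], alert["severity"])].append(delta)
    return {
        key: {
            "p50": percentile(vals, 50),
            "p95": percentile(vals, 95),
            "chronically_ignored": percentile(vals, 50) > CHRONIC_SECONDS,
        }
        for key, vals in groups.items()
    }
```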
Use case #11
Identity sprawl across GitHub / OpenAI / cloud / dashboard / servers
“Who has access to what across our tooling? I don't want to auto-correlate them by email (GDPR), but I do want visibility.”
Identity Surface lets an admin manually link identities between sources (dashboard_user, copilot_seat, openai_user, inventory_user, cloud_iam). NO auto-correlation via email fingerprints. Signal identity.person_in_servers_not_dashboard flags someone with SSH access but no dashboard presence.
Identity Surface (explicit linking) → View Identity Surface →
Use case #12
Containers running as root or --privileged
“How do I know which containers in my fleet run as uid 0 or have dangerous capabilities?”
Hub-side derivation from extended_inventory. Per container: container.runs_as_root (uid 0), container.privileged (--privileged flag), container.dangerous_caps (SYS_ADMIN, NET_ADMIN, etc.), container.untrusted_registry (image from a non-allowlisted registry). Trust Score category 'process_dna'.
Container hygiene (Phase 1.7) → View Connected Dashboards →
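A sketch of how the four container signals could be derived from one inventory record. The record fields (uid, privileged, cap_add, image) are hypothetical, the capability set contains only the two names the text mentions, and the registry is naively taken as everything before the first slash of the image reference.

```python
# Only the two capabilities named in the text; the real list is longer ("etc.").
DANGEROUS_CAPS = {"SYS_ADMIN", "NET_ADMIN"}


def container_signals(container: dict, allowed_registries: set[str]) -> list[str]:
    """Return the container.* signals that apply to one inventory record."""
    signals = []
    if container.get("uid") == 0:
        signals.append("container.runs_as_root")
    if container.get("privileged"):
        signals.append("container.privileged")
    if DANGEROUS_CAPS & set(container.get("cap_add", [])):
        signals.append("container.dangerous_caps")
    # Naive parse: images without an explicit registry (e.g. "nginx:latest")
    # will not match the allowlist in this sketch.
    registry = container["image"].split("/", 1)[0]
    if registry not in allowed_registries:
        signals.append("container.untrusted_registry")
    return signals


if __name__ == "__main__":
    record = {"uid": 0, "privileged": False, "cap_add": ["NET_ADMIN"],
              "image": "ghcr.io/acme/api:1.4"}
    print(container_signals(record, {"registry.example.com"}))
```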
Use case #13
SPF / DMARC / DNSSEC checks for your domains
“How do I check whether SPF, DMARC and DNSSEC are correctly configured on every domain we manage?”
DNSCheckWorker runs daily SPF, DMARC (parse p= tag), CAA, DNSSEC (AD-bit via 1.1.1.1), MX + dangling CNAME, MTA-STS, and nameserver-set diff checks. Result lands in the dns_snapshots table + dns.* signals (spf_missing, dmarc_weak, dnssec_disabled, dangling_cname, …).
DNS hygiene (Phase 1.2) → Read the docs →
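Parsing the DMARC p= tag, as mentioned above, can be sketched without any DNS library once the TXT record is in hand. The function names are hypothetical, and treating p=none as the trigger for dns.dmarc_weak is an assumption about what "weak" means here.

```python
from typing import Optional


def parse_dmarc_policy(txt_record: str) -> Optional[str]:
    """Return the value of the p= tag, or None if this is not a usable DMARC record."""
    tags = {}
    for part in txt_record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    if tags.get("v", "").upper() != "DMARC1":
        return None
    return tags.get("p", "").lower() or None


def is_dmarc_weak(txt_record: str) -> bool:
    """Assumption: a monitoring-only policy (p=none) counts as weak."""
    return parse_dmarc_policy(txt_record) == "none"


if __name__ == "__main__":
    print(parse_dmarc_policy("v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"))
```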
Use case #14
A hundred hosts sharing one policy and SLA
“How do I bundle machines per customer or environment so owner, SLA and runbook are set once?”
Hierarchical host groups with static membership or dynamic tag-rules (role=web, env=prod). Per group: owner, on-call team, SLA target, change window, compliance scope, markdown runbook with revision history. Sub-groups inherit defaults from the parent.
Host groups + group docs → Start free →
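Dynamic tag-rules and parent inheritance can be illustrated as follows. The data shapes (a rule as a dict of required tags, a group as a dict with "parent" and "settings") are assumptions made for this sketch, not the product's schema.

```python
from typing import Optional


def matches_rule(rule: dict, host_tags: dict) -> bool:
    """A host joins a dynamic group when every tag in the rule matches."""
    return all(host_tags.get(key) == value for key, value in rule.items())


def effective_settings(group: str, groups: dict) -> dict:
    """Merge settings from the root group down, so sub-groups override parents."""
    chain = []
    current: Optional[str] = group
    while current is not None:
        chain.append(groups[current])
        current = groups[current].get("parent")
    merged: dict = {}
    for entry in reversed(chain):  # parent first, child last
        merged.update(entry.get("settings", {}))
    return merged


if __name__ == "__main__":
    groups = {
        "prod": {"parent": None, "settings": {"owner": "ops", "sla_target": 99.9}},
        "prod-web": {"parent": "prod", "settings": {"owner": "web-team"}},
    }
    print(matches_rule({"role": "web", "env": "prod"}, {"role": "web", "env": "prod", "dc": "bru"}))
    print(effective_settings("prod-web", groups))
```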
Use case #15
Uptime per application, not just per VM
“A VM can be running while my nginx or postgres is stopped — how do I measure availability at the application level?”
The SLA engine rolls up 5-minute state buckets per agent, per application (systemd unit / docker container / compose service / process) and per group, with worst-case aggregation to the group level. The UI shows observed %, target % and error-budget burn-down, and alerting is hooked in.
SLA engine with per-app rollup → Read the docs →
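Worst-case aggregation and error-budget burn can be sketched on plain lists of bucket states. The three state names, their ordering, and the burn formula (bad buckets divided by the number of bad buckets the target allows) are assumptions for illustration.

```python
# Assumed state ordering: higher rank is worse.
SEVERITY = {"up": 0, "degraded": 1, "down": 2}


def worst_case(members: list[list[str]]) -> list[str]:
    """Per 5-minute bucket, the group takes the worst state of any member."""
    return [max(states, key=SEVERITY.__getitem__) for states in zip(*members)]


def observed_pct(buckets: list[str]) -> float:
    """Share of buckets in which the target was fully up."""
    return 100 * sum(state == "up" for state in buckets) / len(buckets)


def budget_burned_pct(buckets: list[str], target_pct: float) -> float:
    """How much of the allowed non-up time has been used, in percent."""
    allowed = (100 - target_pct) / 100 * len(buckets)
    bad = sum(state != "up" for state in buckets)
    if allowed == 0:
        return 0.0 if bad == 0 else float("inf")
    return 100 * bad / allowed


if __name__ == "__main__":
    nginx = ["up", "up", "down", "up"]
    postgres = ["up", "down", "up", "up"]
    group = worst_case([nginx, postgres])
    print(group, observed_pct(group), budget_burned_pct(group, 50))
```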
Use case #16
A customer should only see their own servers
“How do I give an MSP customer or an internal tech viewer access to only their group without making them tenant admin?”
RBAC v2 layers scoped role_assignments (tenant / group / agent × viewer / editor / admin) on top of the tenant role. HasScopeAccess is the single check; EAT-triggering endpoints and CRUD mutations consult it. Group scope rolls through to agents in that group.
Scoped RBAC v2 → Start free →
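The idea of a single scope check can be sketched like this. It is modelled loosely on the HasScopeAccess name from the text, but the data classes, the role ordering (viewer < editor < admin) and the one-group-per-agent simplification are assumptions.

```python
from dataclasses import dataclass

ROLE_RANK = {"viewer": 1, "editor": 2, "admin": 3}  # assumed ordering


@dataclass(frozen=True)
class RoleAssignment:
    user_id: str
    scope_type: str  # "tenant", "group" or "agent"
    scope_id: str
    role: str


@dataclass(frozen=True)
class Agent:
    agent_id: str
    group_id: str
    tenant_id: str


def has_scope_access(assignments: list[RoleAssignment], user_id: str,
                     agent: Agent, required_role: str) -> bool:
    """True if any assignment covering this agent grants at least the required role."""
    needed = ROLE_RANK[required_role]
    # A tenant or group assignment rolls through to the agents inside it.
    covering = {
        ("tenant", agent.tenant_id),
        ("group", agent.group_id),
        ("agent", agent.agent_id),
    }
    return any(
        a.user_id == user_id
        and (a.scope_type, a.scope_id) in covering
        and ROLE_RANK[a.role] >= needed
        for a in assignments
    )
```

Keeping the decision in one function means every endpoint asks the same question in the same way, which is the design point the text makes.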
Use case #17
Who gets paged at 3am for which group?
“How do I route alerts from prod-eu to the EU on-call and prod-us to someone else?”
on_call_rotations per group (or tenant-wide fallback) with JSON schedule [{user_id, start, end, contact}]. NotifyWorker resolves the current shift for each alert, adds the on-call user to the ntfy body, and pushes an extra notification to their personal topic.
On-call rotations + alert routing → See Connected Dashboards →
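Resolving the current shift from the JSON schedule could look like the sketch below. The field names come from the text; the ISO 8601 timestamp format with explicit UTC offset, the half-open interval [start, end), and the function name are assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def current_shift(schedule_json: str, now: datetime) -> Optional[dict]:
    """Return the shift whose [start, end) window contains `now`, if any."""
    for shift in json.loads(schedule_json):
        start = datetime.fromisoformat(shift["start"])
        end = datetime.fromisoformat(shift["end"])
        if start <= now < end:
            return shift
    return None  # caller would fall back to the tenant-wide rotation


if __name__ == "__main__":
    schedule = json.dumps([
        {"user_id": "u1", "start": "2025-01-06T08:00:00+00:00",
         "end": "2025-01-13T08:00:00+00:00", "contact": "ntfy:u1"},
        {"user_id": "u2", "start": "2025-01-13T08:00:00+00:00",
         "end": "2025-01-20T08:00:00+00:00", "contact": "ntfy:u2"},
    ])
    print(current_shift(schedule, datetime(2025, 1, 14, 3, 0, tzinfo=timezone.utc)))
```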
Use case #18
Primary DB down — what topples?
“How do I know which apps go down if this database server crashes, and is the standby available?”
agent_relations models failover / replica / depends_on between hosts; app_dependencies does the same between monitored apps (depends_on / calls / reads_from / writes_to). Impact API returns per host: failover partners + status, plus all downstream apps. SVG dependency map visualises the blast radius.
Failover pairs + service dependency graph → See Connected Dashboards →
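The blast-radius question is a graph traversal. A minimal sketch, assuming dependencies are given as (app, depends_on) pairs; the function name and edge format are hypothetical and the real Impact API also reports failover partners, which this omits.

```python
from collections import defaultdict, deque


def downstream_apps(edges: list[tuple[str, str]], failed: str) -> set[str]:
    """Everything that directly or transitively depends on `failed`."""
    dependents: dict = defaultdict(set)
    for app, dependency in edges:
        dependents[dependency].add(app)

    seen: set[str] = set()
    queue = deque([failed])
    while queue:  # breadth-first walk over reversed depends_on edges
        node = queue.popleft()
        for app in dependents[node]:
            if app not in seen:
                seen.add(app)
                queue.append(app)
    seen.discard(failed)  # guard against dependency cycles
    return seen


if __name__ == "__main__":
    edges = [("api", "postgres"), ("web", "api"), ("reports", "postgres"), ("cache", "redis")]
    print(sorted(downstream_apps(edges, "postgres")))
```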
Use case #19
Upgrade npm/pip CVEs without ssh-ing to every host
“How do I upgrade 40 npm packages with CVEs in /opt/myapp without logging into the host myself?”
An OSV.dev scan per host writes into inventory_dependency_cves. The dashboard either generates a bash fix script per project or triggers an Ed25519-signed EAT that runs monsys-package-update on the host with snapshot + rollback. Auto-update-all groups updates per (project, ecosystem). Exit code and stdout are reported back on the agent detail page.
Application dependency CVE auto-fix → Start free →
Use case #20
See agent logs without ssh access
“How do I see what the monsys agent on a host did in the last 10 minutes if I can't / mustn't ssh in?”
Agent-side tracing layer filters WARN/ERROR + key INFO milestones (emergency, update, transport, inventory) and POSTs batches to the hub. Hub forwards to local Loki with {tenant_id, agent_id, level} labels. UI tab on the agent detail page shows a live tail with level filter + grep.
Agent log tail (Loki-backed) → Read the docs →
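For the hub-to-Loki hop, the sketch below builds a request body in the shape Loki's JSON push endpoint (/loki/api/v1/push) accepts: streams keyed by a label set, each with [nanosecond timestamp as string, line] pairs. The input record fields (ts_ns, message) and the function name are assumptions; only the three labels come from the text.

```python
from collections import defaultdict


def loki_payload(entries: list[dict]) -> dict:
    """Group log entries by label set into a Loki push body."""
    streams: dict = defaultdict(list)
    for entry in entries:
        labels = (entry["tenant_id"], entry["agent_id"], entry["level"])
        # Loki expects the timestamp as a string of Unix nanoseconds.
        streams[labels].append([str(entry["ts_ns"]), entry["message"]])
    return {
        "streams": [
            {
                "stream": {"tenant_id": tenant, "agent_id": agent, "level": level},
                "values": values,
            }
            for (tenant, agent, level), values in streams.items()
        ]
    }


if __name__ == "__main__":
    batch = [
        {"tenant_id": "t1", "agent_id": "a1", "level": "WARN",
         "ts_ns": 1735689600000000000, "message": "transport: reconnecting"},
        {"tenant_id": "t1", "agent_id": "a1", "level": "WARN",
         "ts_ns": 1735689601000000000, "message": "transport: connected"},
    ]
    print(loki_payload(batch))
```

Keeping the label set to three low-cardinality fields matches the labels named above; free-form data stays in the log line.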