What the Observer does
Discovers topology
Walks every reachable API (Kubernetes, AWS, GCP, systemd) to build a live map of services, nodes, and dependencies.
Collects signals
Metrics, events, logs, and state, at the rate each one changes. No polling for the sake of polling.
Streams to the Cloud
Structured events and state snapshots over HTTPS and NATS. Raw payloads stay with you unless you opt in.
Stays read-only
No mutating permissions by default. Any action that could change state goes through the Guardian, not the Observer.
Where it runs, and how small
| Environment | Install shape | Typical footprint |
|---|---|---|
| Kubernetes | Manifest applied with kubectl apply into the rubixkube-system namespace | About 255Mi RAM and under 10 millicores of CPU, combined across the Observer and the Kubernetes MCP server |
| AWS | Systemd service on any Linux host, or a fresh EC2 instance the installer creates | About 200Mi RAM |
| GCP | Systemd service on any Linux host, or a fresh GCE instance | About 200Mi RAM |
| Linux VM | Systemd service directly on the host | About 150Mi RAM |
What the Observer sees
Each environment type has a slightly different signal set, but the shape is consistent.
Kubernetes
- Pod, deployment, replicaset, statefulset, daemonset state.
- Node health, capacity, allocatable.
- Services, endpoints, ingress routes.
- Events from the cluster event bus.
- Logs from pods you scope into the Observer.
- Standard metrics (CPU, memory, network, disk) via Kubernetes APIs.
AWS
- EC2 instances, RDS databases, Lambda functions, S3 buckets, ELB health, CloudTrail events.
- CloudWatch metrics for each of the above.
- Account-level changes and IAM events that might affect reliability.
GCP
- Compute Engine instances, GKE clusters, Cloud SQL, Cloud Run, Cloud Storage, Cloud Functions.
- Monitoring metrics from Google Cloud Monitoring.
- Project-level audit logs for reliability-relevant operations.
Linux VM
- CPU per core, load averages, memory, swap.
- Disk usage per mount, I/O metrics.
- Network interface statistics, errors, drops.
- Per-process metrics, top consumers, zombie detection.
- Systemd unit state for services you care about.
How the Observer decides what is worth streaming
Not every signal is equal. The Observer uses three filters to keep the signal stream light and the Knowledge Graph useful.
Rate of change
A value that has not moved in an hour does not need to ship again. Static values stream on change, not on interval.
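The stream-on-change idea can be sketched in a few lines. This is a minimal illustration, not the Observer's actual implementation: the `ChangeFilter` class, its method name, and the sample metric key are all hypothetical.

```python
from typing import Any, Dict, Hashable


class ChangeFilter:
    """Forward a reading only when its value differs from the last one sent."""

    def __init__(self) -> None:
        # Last value shipped for each signal key (hypothetical bookkeeping).
        self._last_sent: Dict[Hashable, Any] = {}

    def should_stream(self, key: Hashable, value: Any) -> bool:
        # A key seen for the first time always streams;
        # a known key streams only when its value has changed.
        if key in self._last_sent and self._last_sent[key] == value:
            return False
        self._last_sent[key] = value
        return True


change_filter = ChangeFilter()
readings = [
    ("node-1/allocatable_cpu", 4),
    ("node-1/allocatable_cpu", 4),  # unchanged, suppressed
    ("node-1/allocatable_cpu", 8),  # changed, streamed
]
streamed = [(k, v) for k, v in readings if change_filter.should_stream(k, v)]
print(streamed)  # [('node-1/allocatable_cpu', 4), ('node-1/allocatable_cpu', 8)]
```

Keeping only the last sent value per key means memory grows with the number of distinct signals, not with the number of readings.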
Relevance to open incidents
Signals tied to resources already under investigation upgrade to higher sampling frequency automatically.
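As a rough sketch of that behavior, the sampling interval can be chosen per resource based on whether it belongs to an open investigation. The interval values and the `sampling_interval` function are assumptions made for illustration; the real intervals are not documented here.

```python
from typing import Set

# Hypothetical intervals, chosen only to make the example concrete.
BASELINE_INTERVAL_S = 60.0
INCIDENT_INTERVAL_S = 5.0


def sampling_interval(resource_id: str, under_investigation: Set[str]) -> float:
    """Return a shorter interval for resources tied to an open incident."""
    if resource_id in under_investigation:
        return INCIDENT_INTERVAL_S
    return BASELINE_INTERVAL_S


open_incident_resources = {"deployment/checkout"}
print(sampling_interval("deployment/checkout", open_incident_resources))  # 5.0
print(sampling_interval("deployment/search", open_incident_resources))    # 60.0
```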
Outbound network requirements
The Observer needs two outbound endpoints over HTTPS.
- api.rubixkube.ai:443 for control and structured events.
- nats.rubixkube.ai:443 for streaming signals.
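Before installing, it can help to confirm that the host can open outbound TCP connections to both endpoints. The helper below is an illustrative sketch, not part of the product: `check_outbound` is a hypothetical name, and the check only proves that a TCP connection opens on port 443. It does not validate TLS, proxies, or the NATS handshake.

```python
import socket
from typing import Dict, Iterable, Tuple

# Host and port pairs taken from the requirements above.
ENDPOINTS = [("api.rubixkube.ai", 443), ("nats.rubixkube.ai", 443)]


def check_outbound(
    endpoints: Iterable[Tuple[str, int]], timeout: float = 3.0
) -> Dict[str, bool]:
    """Try a plain TCP connection to each endpoint and record the outcome."""
    results: Dict[str, bool] = {}
    for host, port in endpoints:
        label = f"{host}:{port}"
        try:
            # Opens and immediately closes a TCP connection.
            with socket.create_connection((host, port), timeout=timeout):
                results[label] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[label] = False
    return results


# Example call from the host that will run the Observer:
#   for endpoint, ok in check_outbound(ENDPOINTS).items():
#       print(endpoint, "reachable" if ok else "blocked or unreachable")
```

A `False` result usually points at an egress firewall rule, a missing proxy configuration, or DNS resolution on the host.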
Common questions
Does the Observer need privileged access to my cluster or cloud?
Read-only is the default. Kubernetes installs create a ClusterRole scoped to the resources the Observer watches. AWS and GCP installers create a read-only IAM role or service account. Nothing mutating is provisioned.
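One way to verify the read-only claim for yourself is to inspect the rules of the installed ClusterRole and flag any verb other than get, list, or watch, which are the read verbs in Kubernetes RBAC. The sketch below works on rules already parsed into Python dictionaries. The `mutating_rules` helper and the example rules are hypothetical and are not the rules the installer actually ships.

```python
from typing import Dict, List

# Read verbs in Kubernetes RBAC; anything else (create, update, patch,
# delete, "*", ...) is treated as mutating in this sketch.
READ_ONLY_VERBS = {"get", "list", "watch"}


def mutating_rules(rules: List[Dict]) -> List[Dict]:
    """Return the rules that grant any verb outside get/list/watch."""
    return [r for r in rules if not set(r.get("verbs", [])) <= READ_ONLY_VERBS]


# Hypothetical rules, shaped like the `rules` field of a ClusterRole.
example_rules = [
    {"apiGroups": [""], "resources": ["pods", "nodes", "services", "events"],
     "verbs": ["get", "list", "watch"]},
    {"apiGroups": ["apps"], "resources": ["deployments", "statefulsets"],
     "verbs": ["get", "list", "watch"]},
]
print(mutating_rules(example_rules))  # []
```

An empty result means every rule in the role is limited to read verbs.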
Can I run multiple Observers in one environment?
Yes, though it is usually unnecessary. The common case is one Observer per Kubernetes cluster, one per cloud account, and one per VM.
How often does the Observer talk to the Cloud?
Continuously. The NATS channel stays open for streaming. The HTTPS control channel exchanges heartbeats every 30 seconds.
What happens if the Observer is unreachable from the Cloud?
The environment card shows a degraded state after two missed heartbeats. Collected signals queue locally, up to a bounded buffer, and catch up once the connection returns. Alerts about the Observer itself go through the Notifications channels you have configured.
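The two mechanisms described here, degraded-state detection and a bounded local buffer, can be sketched as follows. The 30-second interval and the two-missed-heartbeats threshold come from this page. Everything else is an assumption for illustration: the function and class names, the way missed heartbeats are counted, the buffer capacity, and the drop-oldest policy when the buffer is full.

```python
from collections import deque
from typing import Deque, List

HEARTBEAT_INTERVAL_S = 30.0  # heartbeat cadence stated on this page
MISSED_BEFORE_DEGRADED = 2   # threshold stated on this page


def is_degraded(last_heartbeat_at: float, now: float) -> bool:
    """Assumed rule: count whole heartbeat intervals elapsed since the last one."""
    missed = int((now - last_heartbeat_at) // HEARTBEAT_INTERVAL_S)
    return missed >= MISSED_BEFORE_DEGRADED


class BoundedBuffer:
    """Local queue that holds signals while the connection is down."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) discards the oldest item once full.
        # Drop-oldest is an assumption; the page only says the buffer is bounded.
        self._items: Deque[dict] = deque(maxlen=capacity)

    def enqueue(self, signal: dict) -> None:
        self._items.append(signal)

    def drain(self) -> List[dict]:
        """Return buffered signals in arrival order and empty the buffer."""
        drained = list(self._items)
        self._items.clear()
        return drained


buffer = BoundedBuffer(capacity=3)
for seq in range(5):
    buffer.enqueue({"seq": seq})
print([s["seq"] for s in buffer.drain()])             # [2, 3, 4]
print(is_degraded(last_heartbeat_at=0.0, now=45.0))   # False
print(is_degraded(last_heartbeat_at=0.0, now=60.0))   # True
```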
Can I upgrade the Observer without downtime?
Yes. The Kubernetes manifest uses a rolling update. The systemd installer downloads the new binary, restarts the service, and the local buffer covers the few seconds of restart.
Related concepts
The Agent Mesh
How the Observer fits with the other agents in the mesh.
Knowledge Graph
What the Observer builds and keeps current.
Safety and Guardrails
Why the Observer stays read-only by design.