Prerequisites
A RubixKube account
Free tier works for this tutorial.
At least one environment connected
Kubernetes, AWS, GCP, or a Linux VM. Any mix of these is fine.
Step 1: Read the Dashboard
The Dashboard is your daily check-in. Four tiles tell you whether the system is worth investigating right now.

| Tile | What it means | When to pay attention |
|---|---|---|
| System Health | Blended score across every connected environment | Below 95% or trending down |
| Active Insights | Count of anomalies RubixKube is watching | Any new insight since your last visit |
| Intelligent Analysis | RCA reports ready to read | New report since last session |
| Agents | Health of the observer and cloud-side agents | Anything other than all green |
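The "when to pay attention" rule for System Health can be sketched in a few lines. This is only an illustration of the rule in the table, not a RubixKube API: the function name and the list of readings are hypothetical, and "trending down" is interpreted here as three consecutive falling readings, which is an assumption.

```python
from typing import Sequence

HEALTH_THRESHOLD = 95.0  # from the table: pay attention below 95%


def needs_attention(scores: Sequence[float],
                    threshold: float = HEALTH_THRESHOLD) -> bool:
    """Return True if the latest score is below the threshold
    or the most recent readings are trending down.

    `scores` is a hypothetical list of System Health percentages,
    oldest first. It does not come from any real RubixKube API.
    """
    if not scores:
        return False
    if scores[-1] < threshold:
        return True
    # Assumption: "trending down" means the last three readings
    # are strictly decreasing.
    recent = scores[-3:]
    return len(recent) == 3 and recent[0] > recent[1] > recent[2]
```

A score that is still above 95% but has fallen on three readings in a row is flagged just like a score that has already dropped below the threshold.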
Step 2: Check your topology
Open Infrastructure Topology. You should see every resource the Observer has discovered, grouped by environment.
- Green edges: dependencies behaving as expected.
- Yellow edges: degraded signals (higher latency, elevated error rate, resource pressure).
- Red edges: active incidents.
Step 3: Tune Insights to your team
Open Magic Insights. Each insight is an anomaly the system thinks a human should know about.
Filter to your services
Use the environment and namespace filters to narrow to what your team owns. Bookmark the view.
Set the severity threshold
Start at Medium. Too many low-severity cards train people to ignore the list.
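The effect of the namespace filter and the severity threshold can be sketched as a plain filter over a list of insights. Everything here is hypothetical: the `Insight` fields, the function name, and the exact severity ladder are assumptions for illustration (this guide mentions low, Medium, and critical; "high" is assumed), and none of it reflects the product's internals.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Assumed severity ladder, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


@dataclass
class Insight:
    """Hypothetical shape of an insight card."""
    title: str
    severity: str
    namespace: str


def filter_insights(insights: Iterable[Insight],
                    min_severity: str = "medium",
                    namespaces: Optional[Set[str]] = None) -> List[Insight]:
    """Keep insights at or above `min_severity`, optionally limited
    to the namespaces a team owns."""
    floor = SEVERITY_ORDER.index(min_severity)
    return [
        i for i in insights
        # An unknown severity string raises ValueError here.
        if SEVERITY_ORDER.index(i.severity) >= floor
        and (namespaces is None or i.namespace in namespaces)
    ]
```

Starting at `"medium"` drops the low-severity cards that would otherwise crowd the list, and passing a set of namespaces narrows the view to what your team owns.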
Step 4: Connect a notification channel
Health monitoring is only useful if the right person sees the signal. Connect a channel you already live in.
Slack
Channel-level routing for insights and RCAs.
Microsoft Teams
Team-channel delivery for insights and RCAs.
PagerDuty
Promote critical insights into on-call pages.
Linear
Turn RCAs into tickets with one click.
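One way to picture how these channels divide the work is as a small routing function. This is a sketch under assumptions, not RubixKube configuration: the function, the event kinds, and the channel keys are hypothetical names. Linear is left out because ticket creation is described above as a one-click manual action.

```python
from typing import List, Set


def destinations(kind: str, severity: str, connected: Set[str]) -> List[str]:
    """Return the channels an event would be delivered to.

    `kind` is "insight" or "rca"; `connected` is the set of channels
    the team has connected. All names are illustrative assumptions.
    """
    targets: List[str] = []
    # Slack and Teams both receive insights and RCAs.
    for chat in ("slack", "teams"):
        if chat in connected:
            targets.append(chat)
    # Only critical insights are promoted to an on-call page.
    if kind == "insight" and severity == "critical" and "pagerduty" in connected:
        targets.append("pagerduty")
    return targets
```

With Slack and PagerDuty connected, a critical insight reaches both, while a medium insight or an RCA stays in Slack only.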
Step 5: Ask Chat a monitoring question
Chat is the fastest way to pull a specific view without learning a query language. A few prompts worth bookmarking:
What healthy monitoring looks like after a week
System Health is stable
Sits between 95% and 100% most of the time. Dips correlate with known events.
Insights have owners
Your team acts on, dismisses, or routes every new insight. Few cards sit stale for more than a day.
Topology reflects reality
Newly deployed services appear. Decommissioned resources drop out within the hour.
Chat answers with evidence
Your team uses Chat instead of grepping logs for quick questions.
Common questions
How fast does RubixKube detect a new issue?
Most anomalies surface within one to two minutes of the underlying signal. The OPEL loop runs continuously rather than on a fixed cron schedule.
What is the difference between an Insight and an RCA Report?
An Insight is an anomaly worth attention. An RCA Report is a full causal chain with evidence and recommended fixes. Not every insight becomes an RCA; only those that appear to have a single identifiable root cause do.
Can I monitor multiple environments at once?
Yes. Every connected environment feeds into the same Dashboard, Insights list, and knowledge graph. Use the environment filter to narrow to a single one when needed.
What if I only want alerts, not a dashboard?
Connect Slack or Teams, filter Insights to the severity you care about, and subscribe. The dashboard becomes optional.
Related guides
How to Automate Incident Remediation
The next step after monitoring: when an insight becomes an incident.
Talk to your infra
Go deeper with Chat for on-demand investigations.