# RubixKube Docs

> Site Reliability Intelligence for modern infrastructure. Detect anomalies, diagnose root cause, and resolve failures across Kubernetes, AWS, GCP, and Linux VMs.

## Docs

- [Rubix CLI command reference](https://docs.rubixkube.ai/cli/commands.md): Every top-level command and every slash command in the Rubix CLI. Generated from the CLI source so it stays accurate.
- [Rubix CLI configuration](https://docs.rubixkube.ai/cli/configuration.md): Where credentials and preferences live, how per-folder trust works, and how to switch environments and models.
- [Rubix CLI examples](https://docs.rubixkube.ai/cli/examples.md): Concrete flows for daily on-call, post-deploy verification, cost reviews, and CI automation using the Rubix CLI.
- [Rubix CLI FAQ](https://docs.rubixkube.ai/cli/faq.md): Quick answers to common questions about the Rubix CLI: scope, safety, CI usage, Windows support, and more.
- [Install the Rubix CLI](https://docs.rubixkube.ai/cli/installation.md): Install the Rubix CLI globally with npm, or run it without installing via npx. Requires Node.js 18 or later.
- [Rubix CLI overview](https://docs.rubixkube.ai/cli/overview.md): The Rubix CLI brings RubixKube's Site Reliability Intelligence to your terminal. Chat with infrastructure, investigate incidents, and run skills without leaving the command line.
- [Rubix CLI troubleshooting](https://docs.rubixkube.ai/cli/troubleshooting.md): Common issues with the Rubix CLI: login failures, stale credentials, proxy and firewall problems, and Node.js version mismatches.
- [Using the Rubix CLI](https://docs.rubixkube.ai/cli/usage.md): Launch, log in, start a chat, resume sessions, and drive Rubix from your terminal. Keyboard-first workflows for SREs and platform engineers.
- [Actions](https://docs.rubixkube.ai/concepts/actions.md): The unit of change in RubixKube. Every recommended fix is an Action with an explicit scope, blast radius, and approval path.
- [The Agent Mesh](https://docs.rubixkube.ai/concepts/agent-mesh.md): A coordinated set of specialised AI agents, each expert in one slice of reliability work. They share a Knowledge Graph and run the OPEL loop together.
- [Environments](https://docs.rubixkube.ai/concepts/environments.md): The unit of scope in RubixKube. A Kubernetes cluster, an AWS account, a GCP project, or a set of VMs. Mix any number in one workspace.
- [Safety and Guardrails](https://docs.rubixkube.ai/concepts/guardrails.md): How RubixKube stays autonomous without being reckless. Read-only by default, explicit approvals on every change, evidence-backed recommendations, and data handling you can actually audit.
- [How RubixKube works](https://docs.rubixkube.ai/concepts/how-rubixkube-works.md): The short version: a mesh of specialised AI agents runs the OPEL loop (Observe, Plan, Execute, Learn) against your infrastructure, backed by a memory that compounds over time.
- [Insights](https://docs.rubixkube.ai/concepts/insights.md): An Insight is an anomaly RubixKube thinks a human should know about. Learned baselines, scoped severity, and evidence behind every card.
- [The Knowledge Graph](https://docs.rubixkube.ai/concepts/knowledge-graph.md): A live model of your infrastructure: every resource, relationship, signal, and incident in one queryable graph.
- [The Memory Engine](https://docs.rubixkube.ai/concepts/memory-engine.md): Private, compounding memory for your infrastructure. Every incident, RCA, and correction stays in your workspace and sharpens the next investigation.
- [The Observer Agent](https://docs.rubixkube.ai/concepts/observer-agent.md): The Observer is the eyes of RubixKube. It runs near your workload, discovers topology, collects signals, and keeps the Knowledge Graph current.
- [The OPEL loop](https://docs.rubixkube.ai/concepts/opel-loop.md): Observe, Plan, Execute, Learn. The four-phase rhythm that every RubixKube agent follows, policy-driven and memory-backed.
- [Root Cause Analysis](https://docs.rubixkube.ai/concepts/root-cause-analysis.md): Evidence-linked causal chains for every investigation. How the RCA Pipeline Agent builds them, what is inside, and how to read one well.
- [Skills](https://docs.rubixkube.ai/concepts/skills.md): Structured runbooks the SRI Agent can invoke. Turn operational know-how into reusable workflows with the open Agent Skills format.
- [AWS with RubixKube](https://docs.rubixkube.ai/environments/aws.md): Account-level observation across EC2, RDS, Lambda, S3, ELB, CloudTrail, and CloudWatch. Installer creates the read-only IAM role and policy for you.
- [Azure with RubixKube](https://docs.rubixkube.ai/environments/azure.md): AKS today through the Kubernetes path. Subscription-level Azure observation for App Service, Azure VMs, and managed data is on the roadmap.
- [GCP with RubixKube](https://docs.rubixkube.ai/environments/gcp.md): Project-level observation across GCE, GKE, Cloud SQL, Cloud Run, Cloud Storage, and Cloud Functions. Installer creates the service account and attaches viewer roles.
- [Kubernetes with RubixKube](https://docs.rubixkube.ai/environments/kubernetes.md): Install the RubixKube Observer on any Kubernetes cluster. EKS, GKE, AKS, KIND, and bare-metal clusters on v1.24 or later are all supported.
- [Bare metal and Linux VMs with RubixKube](https://docs.rubixkube.ai/environments/vm.md): Host-level monitoring for any Linux VM or bare-metal server. CPU, memory, disk, network, process, and systemd signals.
- [Connect your environment](https://docs.rubixkube.ai/getting-started/connect-your-environment.md): Install the RubixKube Observer on Kubernetes, AWS, GCP, or a Linux VM. The Observer discovers your topology and streams signals to RubixKube Cloud.
- [Welcome to RubixKube](https://docs.rubixkube.ai/getting-started/overview.md): RubixKube is the Reliability Layer for the AI era. An AI-native mesh of agents that watches your infrastructure, diagnoses root cause, and keeps systems alive across Kubernetes, AWS, GCP, and Linux VMs.
- [Quickstart](https://docs.rubixkube.ai/getting-started/quickstart.md): The shortest path to RubixKube. Create an account, connect an environment, and ask your first question.
- [Atlassian Rovo integration](https://docs.rubixkube.ai/integrations/atlassian-rovo.md): Connect the Atlassian stack so RubixKube agents can create Jira tickets, publish Confluence pages, read Bitbucket deploys, and bridge with Rovo agents.
- [Custom MCP Servers](https://docs.rubixkube.ai/integrations/custom-mcp-servers.md): Extend RubixKube with your own Model Context Protocol servers. Expose internal tools to the SRI Agent with typed arguments and scoped authorisation.
- [Custom REST Integrations](https://docs.rubixkube.ai/integrations/custom-rest.md): Point RubixKube at a REST API with an OpenAPI spec. Every endpoint becomes a tool the SRI Agent can call, with scoped auth and approval gates.
- [Datadog integration](https://docs.rubixkube.ai/integrations/datadog.md): Connect Datadog to RubixKube today via the custom integrations path. Agents can query metrics, logs, monitors, and APM services to enrich investigations and RCAs.
- [Dynatrace integration](https://docs.rubixkube.ai/integrations/dynatrace.md): Connect Dynatrace to RubixKube today via the custom integrations path. Agents can pull problems, entity health, and metrics to enrich investigations and RCAs.
- [GitHub integration](https://docs.rubixkube.ai/integrations/github.md): Give RubixKube agents read and write access to your GitHub repos so they can correlate deployments with incidents, cite PRs in RCAs, and propose soft remediations as pull requests.
- [GitLab integration](https://docs.rubixkube.ai/integrations/gitlab.md): Give RubixKube agents read and write access to your GitLab repos so they can correlate CI deployments with incidents, cite MRs in RCAs, and propose soft remediations as merge requests.
- [Grafana integration](https://docs.rubixkube.ai/integrations/grafana.md): Connect Grafana to RubixKube today via the custom integrations path. Agents can read dashboards, alert state, and datasource queries to enrich investigations and RCAs.
- [Linear integration](https://docs.rubixkube.ai/integrations/linear.md): Connect Linear so RubixKube agents can read your projects, create tickets from insights and RCAs, and keep incident state in sync with the tool your team already tracks work in.
- [Microsoft Teams integration](https://docs.rubixkube.ai/integrations/microsoft-teams.md): Teams is both an interface to RubixKube and a notification channel. Talk to the bot in any channel to ask questions, trigger investigations, and approve actions. Get insights, RCAs, approvals, and digests where your team already lives.
- [New Relic integration](https://docs.rubixkube.ai/integrations/new-relic.md): Connect New Relic to RubixKube today via the custom integrations path. Agents can run NRQL, read APM service health, and pull alert state to enrich investigations and RCAs.
- [Notion integration](https://docs.rubixkube.ai/integrations/notion.md): Connect your Notion workspace so RubixKube agents can read team docs, publish RCAs as pages, and keep post-mortems in the wiki you already use.
- [OAuth and authentication types](https://docs.rubixkube.ai/integrations/oauth-and-auth-types.md): Reference for every authentication method RubixKube supports on integrations. API keys, OAuth 2.0, mTLS, signed requests, and more.
- [PagerDuty integration](https://docs.rubixkube.ai/integrations/pagerduty.md): Connect PagerDuty so RubixKube agents can trigger incidents for critical insights, sync acknowledgement and resolution, and keep the on-call picture in step with RubixKube.
- [Prometheus integration](https://docs.rubixkube.ai/integrations/prometheus.md): Connect Prometheus to RubixKube today via the custom integrations path. Agents can run PromQL and pull Alertmanager state to enrich investigations and RCAs.
- [Sentry integration](https://docs.rubixkube.ai/integrations/sentry.md): Connect Sentry to RubixKube today via the custom integrations path. Agents can read issues, events, and releases to enrich investigations and RCAs.
- [Slack integration](https://docs.rubixkube.ai/integrations/slack.md): Slack is both an interface to RubixKube and a notification channel. Talk to the bot in any channel to ask questions, trigger investigations, and approve actions. Get insights, RCAs, approvals, and digests where your team already lives.
- [Frequently Asked Questions](https://docs.rubixkube.ai/support/faq.md): Quick answers to common RubixKube questions.
- [Glossary](https://docs.rubixkube.ai/support/glossary.md): Canonical definitions for every RubixKube term: agents, environments, RCAs, skills, and the rest of the vocabulary.
- [System Status](https://docs.rubixkube.ai/support/system-status.md): Check the real-time health of all RubixKube platform services at rubixkube.ai/status. See current operational state, 7-day uptime, and 30-day incident history.
- [Troubleshooting Guide](https://docs.rubixkube.ai/support/troubleshooting.md): Diagnose and resolve common issues in RubixKube.
- [How to add custom agent skills](https://docs.rubixkube.ai/tutorials/add-custom-agent-skills.md): Turn a runbook into a reusable skill the SRI Agent can run. This tutorial covers the skill file format, allowed tools, scope, and testing.
- [How to add custom integrations](https://docs.rubixkube.ai/tutorials/add-custom-integrations.md): Connect RubixKube to your own internal tools. Covers Custom MCP servers, REST integrations, and the OAuth flows that make them safe.
- [How to automate incident remediation with RubixKube](https://docs.rubixkube.ai/tutorials/automate-incident-remediation.md): Go from detection to safe resolution. Learn how RubixKube surfaces the right RCA, recommends the fix, and how approvals keep you in control.
- [Advanced Chat: Personas and Workflows](https://docs.rubixkube.ai/tutorials/chat-advanced.md): Master Chat for different roles. SRE, DevOps, and platform engineer workflows with real examples.
- [Chat Basics: Your First Queries](https://docs.rubixkube.ai/tutorials/chat-basics.md): Learn the Chat interface with hands-on examples. Start here for your first conversation with RubixKube.
- [Cost analysis with Chat](https://docs.rubixkube.ai/tutorials/chat-cost-analysis.md): Analyse and reduce infrastructure cost using RubixKube Chat. Rightsizing, anomaly hunting, weekly budget reviews.
- [Troubleshooting with Chat](https://docs.rubixkube.ai/tutorials/chat-troubleshooting.md): Use Chat to investigate real incidents: OOMKilled, CrashLoopBackOff, ImagePullBackOff, and latency spikes.
- [How to monitor infrastructure health with RubixKube](https://docs.rubixkube.ai/tutorials/monitor-infrastructure-health.md): Set up continuous health monitoring across Kubernetes, AWS, GCP, and VMs. Learn what each dashboard surface tells you and when to act.
- [Talk to your infrastructure](https://docs.rubixkube.ai/tutorials/talk-to-infra.md): Ask your infrastructure questions in plain English. Chat is RubixKube's conversational gateway, backed by the SRI Agent and the whole Agent Mesh.
- [Action Center](https://docs.rubixkube.ai/using/action.md): Manage and track remediation actions from RCA reports and incidents in one place.
- [Agent Skills Store](https://docs.rubixkube.ai/using/agent-skills-store.md): Browse and enable system skills built by RubixKube, or create your own. Every skill is a reusable runbook the SRI Agent can run.
- [Reliability Intelligence](https://docs.rubixkube.ai/using/analytics.md): Actionable reliability metrics for engineering leadership. Time saved, MTTU, RCA coverage, risk concentration, recurrence, and resolution velocity across every environment.
- [Dashboard](https://docs.rubixkube.ai/using/dashboard.md): Your command centre for Site Reliability Intelligence. Health, insights, actions, RCA reports and infrastructure at a glance across every environment.
- [Environments](https://docs.rubixkube.ai/using/environments.md): Understand and operate Kubernetes, AWS, GCP, and VM environments in one RubixKube workspace.
- [Infrastructure Topology](https://docs.rubixkube.ai/using/infrastructure.md): Visualise and navigate your infrastructure with interactive topology views across Kubernetes, AWS, GCP, and VMs.
- [Magic Insights](https://docs.rubixkube.ai/using/insights.md): Incident detection, root cause analysis, and evidence-based troubleshooting in the RubixKube console.
- [Integrations](https://docs.rubixkube.ai/using/integrations.md): Connect RubixKube with Slack, PagerDuty, Linear, Notion, and your own private tools.
- [Notifications](https://docs.rubixkube.ai/using/notifications.md): Route insights, RCA reports, and pending actions to the channels your team already uses. Slack, Microsoft Teams, PagerDuty, email.
- [RCA Reports](https://docs.rubixkube.ai/using/rca-reports.md): Evidence-linked root cause reports for every incident. Observed conditions, causal chain, recommended actions, and verification, all in one place.
- [Rubix Chat Agent](https://docs.rubixkube.ai/using/rubix-chat-agent.md): The conversational surface of RubixKube. Ask your infrastructure questions in plain English, run skills, and launch investigations from the terminal or the console.
- [Skills](https://docs.rubixkube.ai/using/skills.md): Enable system skills or create custom ones to extend the capabilities of your agents.
- [Team and Workspace Management](https://docs.rubixkube.ai/using/workspace.md): Invite team members, manage roles, and collaborate on reliability inside a RubixKube workspace.

## Optional

- [Blog](https://rubixkube.ai/blog)
- [Console](https://console.rubixkube.ai)
- [Status](https://rubixkube.ai/status)