RubixKube is autonomous, not reckless. The system can recommend and prepare changes at machine speed, but every mutation passes through controls you run. This page is the plain-English model of how the safety layer works. In short: we watch, we analyse, we recommend. Your team decides what to do. Nothing changes without you. That is the default posture. Policies let you adjust it for lab environments, tighten it for regulated production, or route mutating changes through your existing GitOps pipeline.

Three safety layers

Least privilege, read only by default

The Observer is the only component in your environment. Read only, scoped permissions, outbound only.

Explicit approvals on every change

Two approval shapes: in-product review, or a pull request against your infrastructure repo.

Audit on everything

Every proposal, approval, payload, and outcome is recorded and exportable.

Layer 1: Least privilege

The Observer is the only RubixKube component deployed into your environment. It runs as a read-only workload. Cluster-admin is never required.

Kubernetes RBAC scope

The Observer uses a Kubernetes ServiceAccount bound to a scoped ClusterRole. The role grants get, list, and watch on exactly these resource types:
pods, nodes, namespaces,
deployments, replicasets,
services, endpoints, ingress,
events, configmaps
No write permissions. No access to secret contents. Nothing outside this list.
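As an illustration, the scope above could be expressed as a Kubernetes ClusterRole. The sketch below builds one as a plain Python dictionary and checks that it grants only read verbs. The role name, the API group assignments, and the plural resource names are assumptions based on standard Kubernetes conventions, not the manifest RubixKube ships.

```python
# Hypothetical sketch: a read-only ClusterRole matching the resource list above.
# The role name and API group mapping are assumptions, not the shipped manifest.

READ_VERBS = ["get", "list", "watch"]

observer_cluster_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "rubixkube-observer-readonly"},  # assumed name
    "rules": [
        {
            "apiGroups": [""],  # core API group
            "resources": ["pods", "nodes", "namespaces", "services",
                          "endpoints", "events", "configmaps"],
            "verbs": READ_VERBS,
        },
        {
            "apiGroups": ["apps"],
            "resources": ["deployments", "replicasets"],
            "verbs": READ_VERBS,
        },
        {
            "apiGroups": ["networking.k8s.io"],
            "resources": ["ingresses"],
            "verbs": READ_VERBS,
        },
    ],
}


def is_read_only(role: dict) -> bool:
    """Return True if every rule grants only get, list, or watch."""
    return all(set(rule["verbs"]) <= set(READ_VERBS) for rule in role["rules"])


def grants_resource(role: dict, resource: str) -> bool:
    """Return True if any rule in the role names the given resource."""
    return any(resource in rule["resources"] for rule in role["rules"])
```

A check like `is_read_only(observer_cluster_role)` is the kind of assertion a platform team might run in CI before installing any third-party role.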

Configurable visibility

You can tighten the scope further. The Observer supports:
  • Inclusion and exclusion lists by namespace, label, or resource type.
  • Redaction rules that strip sensitive patterns from signals before they leave your environment.
  • Per-environment overrides so staging and production can run different policies.
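The options above can be pictured as a small filter that decides whether a resource is visible at all. This is a minimal sketch, not the Observer's configuration format: the `VisibilityPolicy` class, its field names, and the example namespaces and labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VisibilityPolicy:
    """Hypothetical policy shape; field names are illustrative only."""
    include_namespaces: set = field(default_factory=set)  # empty means "all"
    exclude_namespaces: set = field(default_factory=set)
    exclude_labels: dict = field(default_factory=dict)
    exclude_kinds: set = field(default_factory=set)

    def allows(self, kind: str, namespace: str, labels: dict) -> bool:
        # Exclusions always win over inclusions.
        if kind in self.exclude_kinds or namespace in self.exclude_namespaces:
            return False
        if any(labels.get(k) == v for k, v in self.exclude_labels.items()):
            return False
        return not self.include_namespaces or namespace in self.include_namespaces


# Per-environment overrides: production is stricter than staging.
POLICIES = {
    "staging": VisibilityPolicy(),
    "production": VisibilityPolicy(
        include_namespaces={"payments", "checkout"},
        exclude_labels={"visibility": "private"},
        exclude_kinds={"configmaps"},
    ),
}
```

In this sketch exclusions take precedence, so a resource that matches both an inclusion and an exclusion stays hidden. That is a design choice made for the example, not documented product behaviour.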

Outbound-only network

The Observer opens exactly two outbound connections. Nothing else.
Endpoint                  Purpose                                  Protocol
nats.rubixkube.ai:4222    Signal streaming                         TCP with TLS
api.rubixkube.ai:443      Registration, control, token exchange    HTTPS
No inbound ports are opened. No VPN, tunnel, or exposed service. If either endpoint is unreachable, the Observer buffers locally and catches up when the connection returns.
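Before installing, it can help to confirm that both endpoints are reachable from inside the cluster network. The following is a minimal sketch of such a preflight check using only the Python standard library. The function name `check_endpoints` is hypothetical, and the check confirms TCP reachability only; it does not validate TLS or authentication.

```python
import socket

# Endpoints from the table above.
ENDPOINTS = [
    ("nats.rubixkube.ai", 4222),
    ("api.rubixkube.ai", 443),
]


def check_endpoints(endpoints=ENDPOINTS, connect=socket.create_connection, timeout=5.0):
    """Try a TCP connection to each endpoint and report which ones succeed.

    `connect` is injectable so the check can be exercised without a network.
    """
    results = {}
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[f"{host}:{port}"] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[f"{host}:{port}"] = False
    return results
```

Calling `check_endpoints()` with no arguments from a pod in the target namespace returns a dictionary keyed by endpoint, with `True` for each one that accepted a connection.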

Tokens and rotation

The Observer authenticates with an API key issued during registration. It exchanges that key for a short-lived JWT on every subsequent control-plane call. Tokens can be rotated or revoked at any time through the console or the API.
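One plausible way to picture the key-for-token exchange is a small cache that holds the short-lived token and refreshes it shortly before expiry. This is a sketch under assumptions: the `exchange` callable, the token lifetime, and the 30-second refresh margin are hypothetical, and the Observer's actual exchange logic is not documented here.

```python
import time


class TokenCache:
    """Caches a short-lived token and refreshes it shortly before expiry.

    `exchange` is a hypothetical callable that trades the API key for
    (token, lifetime_seconds); the real exchange endpoint is not shown here.
    """

    def __init__(self, api_key, exchange, clock=time.monotonic, skew=30.0):
        self._api_key = api_key
        self._exchange = exchange
        self._clock = clock
        self._skew = skew  # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a valid token, exchanging the API key again if needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            token, lifetime = self._exchange(self._api_key)
            self._token = token
            self._expires_at = now + lifetime
        return self._token

    def rotate(self, new_api_key):
        """Swap in a new API key and drop the cached token."""
        self._api_key = new_api_key
        self._token = None
```

Keeping the token short-lived limits the window in which a leaked token is useful, while `rotate` models replacing the long-lived key itself.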

Layer 2: Explicit approvals

RubixKube will recommend changes. It will not apply them without an explicit human approval. There are two shapes that approval can take, depending on how you operate.

In-product approvals

Every proposed action surfaces in the Action Center with its scope, reasoning, evidence, and expected blast radius. Approvers click Approve to run the change. Rejecting with a reason sends the proposal back for a rethink, and the reason feeds future recommendations.
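The approval gate can be thought of as a small state machine in which a change can only run from the approved state. The sketch below illustrates that idea. The `Proposal` class, its field names, and the status values are hypothetical and are not the Action Center's data model.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"
    APPLIED = "applied"


@dataclass
class Proposal:
    scope: list
    reasoning: str
    evidence: list
    blast_radius: str
    status: Status = Status.PROPOSED
    approver: str = ""
    rejection_reason: str = ""

    def approve(self, approver: str) -> None:
        if self.status is not Status.PROPOSED:
            raise ValueError(f"cannot approve a {self.status.value} proposal")
        self.status, self.approver = Status.APPROVED, approver

    def reject(self, approver: str, reason: str) -> None:
        if self.status is not Status.PROPOSED:
            raise ValueError(f"cannot reject a {self.status.value} proposal")
        if not reason:
            raise ValueError("a rejection needs a reason")
        self.status, self.approver = Status.REJECTED, approver
        self.rejection_reason = reason

    def apply(self, run) -> None:
        # The gate: nothing runs unless a human has approved it.
        if self.status is not Status.APPROVED:
            raise PermissionError("proposal has not been approved")
        run(self.scope)
        self.status = Status.APPLIED
```

Requiring a reason on rejection mirrors the text above, where the reason feeds future recommendations.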

Pull request based, for IaC and GitOps

If you already run infrastructure as code, RubixKube can propose changes as pull requests against your infrastructure repository instead of applying through an API call.

Supported workflows

Terraform, Helm, Kustomize, Crossplane, ArgoCD, Flux. Any repo that drives your infrastructure.

How it looks

A branch, a PR, a diff, a reviewer. Same flow your team already uses. RubixKube becomes one more contributor.
The review, the approval, and the merge happen in your pipeline. Nothing bypasses your existing checks. This is the preferred path for teams who already have a mature deployment flow.

Layer 3: Audit on everything

Every action has a complete trail:
  • Actor: which agent, skill, or user proposed the change.
  • Approver: who signed off.
  • Scope: exact resources affected.
  • Payload: the literal change (patch body, PR diff, or command).
  • Outcome: verified, partial, or reverted.
Audit records are immutable. Retention follows your plan and is configurable per workspace. Export is supported out of the box (JSON via API or on request).
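To make the shape of a trail entry concrete, here is a minimal sketch of an audit record with the five fields listed above, modelled as an immutable object that serialises to JSON. The class name, field types, and example values are assumptions for illustration; the actual export schema may differ.

```python
import json
from dataclasses import dataclass, asdict

ALLOWED_OUTCOMES = ("verified", "partial", "reverted")


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class AuditRecord:
    actor: str
    approver: str
    scope: tuple
    payload: str
    outcome: str

    def __post_init__(self):
        if self.outcome not in ALLOWED_OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome}")

    def to_json(self) -> str:
        """Serialise the record with stable key ordering."""
        return json.dumps(asdict(self), sort_keys=True)


record = AuditRecord(
    actor="agent:remediation",        # hypothetical actor id
    approver="alice@example.com",     # hypothetical approver
    scope=("deployment/checkout",),
    payload='{"spec": {"replicas": 3}}',
    outcome="verified",
)
```

A frozen dataclass only prevents in-process reassignment. Immutability of stored audit records is a property of the storage layer, which this sketch does not model.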

Evidence-backed reasoning

Every RCA report cites its evidence. Every recommended action carries a confidence score, an expected blast radius, and a reversibility plan. When confidence is low, you can rerun the analysis with additional context rather than act on shaky ground. The confidence score is visible before you approve. Treat low-confidence actions the way you would treat a junior engineer’s suggestion: plausible, worth discussing, not automatically correct.

Human in the loop

Agents stop and ask when they need help. When an investigation runs into ambiguity, when an action carries more risk than expected, or when context is thin, the system escalates to a human rather than guess. This is the default behaviour, not a failure mode.

What RubixKube sees, and what it does not

What it sees:
  • Infrastructure state: resource health, pod, deployment, node, and service status.
  • Topology: workload relationships and dependency graphs.
  • Logs relevant to anomaly detection and RCA context.
  • Metrics: CPU, memory, OpenTelemetry signals, golden metrics, health indicators.
  • Events: Kubernetes events and condition changes.
What it does not see:
  • Customer application data.
  • Personally identifiable information (PII).
  • Secret contents. The Observer sees secret metadata (names, keys, revision) only, never values.
  • Anything outside the scoped RBAC resources listed above.

Obfuscation layer

The Observer applies a configurable obfuscation layer before signals leave your environment. Patterns to redact are set per workspace and per environment. Customers running against sensitive workloads typically tune this layer during onboarding.
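Pattern-based redaction of this kind can be sketched with ordinary regular expressions. The example below is illustrative only: the pattern names, the patterns themselves, and the `[REDACTED:...]` placeholder format are assumptions, not the Observer's built-in rules.

```python
import re

# Hypothetical patterns; a real workspace would define its own.
REDACTION_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "bearer_token": re.compile(r"Bearer\s+[A-Za-z0-9._~+/-]+=*"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str, patterns=REDACTION_PATTERNS) -> str:
    """Replace every match of every pattern with a labelled placeholder."""
    for name, pattern in patterns.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text
```

For example, `redact("login failed for alice@example.com")` returns `"login failed for [REDACTED:email]"`. Labelled placeholders keep a log line readable for RCA while removing the sensitive value.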

Multi-tenancy

Isolation is a foundational design principle, enforced at every layer of the platform:
  • API. Every request is tenant-scoped and validated server-side.
  • Auth. JWT claims carry tenant id. All token validation is tenant-bounded.
  • Data stores. Every collection and query is tenant-scoped at the repository layer.
  • Knowledge graph. All graph queries anchor to the tenant node. Cross-tenant traversal is architecturally impossible.
  • Event bus. Streaming topics include a tenant segment. Consumers are tenant isolated.
  • AI agent sessions. Context, memory, and tool access are bounded to the tenant.
No tenant can see any other tenant’s signals, RCAs, or memory.
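Two of the layers above, the data store and the event bus, can be illustrated with a short sketch. The repository enforces the tenant filter itself rather than trusting callers, and topic names carry a tenant segment. The class, the function, and the `signals.<tenant>.<stream>` layout are hypothetical and stand in for the platform's real implementation.

```python
class TenantScopedRepo:
    """In-memory stand-in for a tenant-scoped data store (illustrative only)."""

    def __init__(self):
        self._rows = []

    def insert(self, tenant_id: str, doc: dict) -> None:
        # The tenant id is stamped by the repository, overriding any value in doc.
        self._rows.append({**doc, "tenant_id": tenant_id})

    def find(self, tenant_id: str, **filters) -> list:
        # The tenant filter is applied here, not left to the caller.
        return [
            row for row in self._rows
            if row["tenant_id"] == tenant_id
            and all(row.get(k) == v for k, v in filters.items())
        ]


def signal_subject(tenant_id: str, stream: str) -> str:
    """Build a topic name with a tenant segment.

    The subject layout is an assumption; the real naming scheme is not
    documented here. Wildcard and separator characters are rejected so a
    tenant id cannot widen the subject.
    """
    if not tenant_id or any(ch in tenant_id for ch in ".*> "):
        raise ValueError("tenant id must be a single plain token")
    return f"signals.{tenant_id}.{stream}"
```

Because `find` has no code path that omits the tenant filter, a query for one tenant cannot return another tenant's rows in this sketch.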

Encryption

In transit

TLS 1.2 or later on every path. NATS signal streaming is TLS encrypted. API and UI traffic is HTTPS only.

At rest

AES-256 on every durable store.

Key management

Keys are managed by the hosting hyperscaler today. Customer-managed keys (CMEK) are on the roadmap for enterprise customers.

Secrets

Platform credentials are held in a centralised secrets manager with strict access controls.

Data ownership, export, and deletion

Customer data belongs to the customer at all times. RubixKube operates as the processor, not the owner.
Tenant data is exportable in JSON via the API or on request. Use this for archival, regulator requests, or switching deployment modes.
All tenant data is permanently deleted from every storage system within 30 days of termination. Written confirmation is provided on request.

Agent-level safeguards

Network controls and RBAC protect the environment. Agent-level safeguards bound the agents themselves. RubixKube follows the practices laid out in Google’s ADK safety guidance.
  • Scoped tool surface. Each agent has an explicit list of tools it can call. The model cannot invent tools or reach beyond its declared surface.
  • In-tool policy enforcement. Tools validate their inputs against deterministic policies (tenant, environment, allowed scopes) before executing. Out-of-scope calls return an error to the agent without running.
  • Pre-execution callbacks. Every tool call is screened against the current workspace policy before it runs. A mismatch blocks the call and feeds the reason back to the agent.
  • Safety filters on model I/O. Provider-native content filters are enabled on the underlying LLMs. System prompts set scope and prohibited topics.
  • Guard model screening. A fast, light model screens inputs and tool outputs for prompt injection and jailbreak attempts before they reach the reasoning agent.
  • PII redaction at the tool boundary. PII detected in agent inputs or tool outputs is redacted before storage, logging, or onward model calls.
  • Sandboxed code execution. When an agent runs a command or a script, it runs inside a hermetic sandbox: no outbound network, no persistent state, full cleanup between runs. This prevents model-generated code from exfiltrating data or persisting across sessions.
  • Output escaping in the UI. Model-generated content is escaped before it renders in the console, so injected HTML or JavaScript cannot execute in a user’s browser.
  • Continuous evaluation. Internal eval suites score agent outputs for correctness and alignment. New model versions must clear the bar before reaching production.
Combined with least privilege, approvals, and audit, this keeps model errors recoverable rather than catastrophic.
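The first three safeguards in the list (scoped tool surface, in-tool policy enforcement, and pre-execution callbacks) share one pattern: a deterministic check runs before the tool does, and a refusal is returned to the agent as a reason. The sketch below shows that pattern in plain Python. It is not the ADK API; the `Agent` class, `ToolCallBlocked`, and the example policy are hypothetical.

```python
class ToolCallBlocked(Exception):
    """Raised when a tool call is refused; the message is fed back to the agent."""


class Agent:
    def __init__(self, name, tools, policy):
        self.name = name
        self._tools = dict(tools)  # explicit, declared tool surface
        self._policy = policy      # callable(tool_name, args) -> reason or None

    def call_tool(self, tool_name, **args):
        # Scoped tool surface: undeclared tools cannot be reached.
        if tool_name not in self._tools:
            raise ToolCallBlocked(f"{tool_name} is not in {self.name}'s tool list")
        # Pre-execution callback: screen the call against workspace policy.
        reason = self._policy(tool_name, args)
        if reason is not None:
            raise ToolCallBlocked(reason)
        return self._tools[tool_name](**args)


def production_policy(tool_name, args):
    """Hypothetical workspace policy: block scale-to-zero in production."""
    if (tool_name == "scale"
            and args.get("environment") == "production"
            and args.get("replicas") == 0):
        return "scaling to zero is not allowed in production"
    return None
```

The key property is that the checks are ordinary code, so they behave the same way regardless of what the model outputs.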

Reducing hallucinations

The agents use flagship frontier models from Anthropic, OpenAI, and Google. We layer the following on top of the base models:
  • Targeted fine-tuning on infrastructure reasoning tasks.
  • Temperature controls tuned per task through internal benchmarks.
  • Tool-call boundaries so each agent can only reach tools in its allowed list.
  • Evidence requirement on every claim, so an unfounded output cannot make it into an RCA.
No single technique is a silver bullet. The combination of a narrow agent mesh, cited evidence, confidence scores, and mandatory approvals is what keeps the system recoverable when a model is wrong.

Compliance

SOC 2 Type II

In progress. Type II report planned for Q3 2026.

GDPR

Data minimisation by design. Deletion requests honoured within 30 days.

ISO 27001 and PCI DSS

Not currently in scope.

Detailed questionnaire

For a full security questionnaire tailored to your deployment, contact security@rubixkube.ai.

Availability and incident response

  • Availability target: 99.9% uptime (SLO). Formal SLA available for enterprise agreements.
  • Status page: https://rubixkube.ai/status
  • Security contact: security@rubixkube.ai
  • Incident response: on-call rotation in place. Security incidents are triaged within 24 hours, with customer notification as part of the response flow.

Common questions

RubixKube does not apply changes without approval in the default configuration. You can opt specific low-risk actions in specific environments into auto-approval, but this is opt-in, per environment, and audited. The default remains: human review before apply.
Air-gapped deployment is not supported in the current release. The platform needs outbound reachability to api.rubixkube.ai:443 and nats.rubixkube.ai:4222. VPC-only and private-network deployments work as long as those two endpoints are reachable. A self-hosted option is on the roadmap for enterprise customers.
Data is stored on the RubixKube SaaS control plane. Primary data region is ap-south-1 (Mumbai). Regional options for enterprise customers are on the roadmap. For specifics on your deployment, contact security@rubixkube.ai.
Tokens are rotated from the console or API. Revoke the API key, issue a new one, and redeploy the Observer with the new key. The old token is invalidated immediately.
You can restrict what the Observer sees. Use inclusion and exclusion lists by namespace, label, or resource type. You can also tune the obfuscation layer to redact sensitive patterns before signals leave your environment.
After termination, all tenant data is permanently deleted from every storage system within 30 days. Written confirmation is provided on request. Export your data before termination if you need a copy.

Actions

The unit of change the approval flow evaluates.

Observer Agent

The only component deployed in your environment, and why it stays read only.

Agent Mesh

The broader system the safety layer protects.

Root Cause Analysis

How evidence-backed reasoning works end to end.