Safety Guardrails: Trust Through Control
Guardrails are RubixKube’s multi-layered safety system that ensures autonomous operations remain safe, controlled, and reversible. They’re the reason you can trust AI agents to act on your infrastructure without constant supervision.
The Core Principle: Autonomous doesn’t mean uncontrolled. Every action happens within strict safety boundaries, with human oversight available when needed.
The Trust Problem
Why Guardrails Matter
Autonomous systems without guardrails are dangerous:
Cascading Failures
One wrong decision can trigger a chain reaction across your infrastructure
Blast Radius Explosion
A cluster-wide change gone wrong affects all services, all customers
Irreversible Damage
Some actions (delete data, scale to zero) can’t be undone
Compliance Violations
Unauthorized changes can breach regulatory requirements
Guardrails prevent all of these.
The Seven Layers of Guardrails
RubixKube implements defense in depth with multiple safety layers:
1. Scope Limiting
Principle: Actions are confined to the smallest possible scope.
What It Does
Blast Radius Containment:
- Changes affect one pod → not the deployment
- Changes affect one deployment → not the namespace
- Changes affect one namespace → not the cluster
- Changes affect one cluster → not all clusters
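The containment hierarchy above can be read as an ordered comparison: an action may proceed automatically only if its blast radius is no larger than the allowed scope. The following is a minimal sketch, not RubixKube's implementation; `Scope`, `within_limit`, and the default limit of a single pod are hypothetical names and values chosen for illustration.

```python
from enum import IntEnum


class Scope(IntEnum):
    """Blast-radius levels, ordered from smallest to largest (hypothetical names)."""
    POD = 1
    DEPLOYMENT = 2
    NAMESPACE = 3
    CLUSTER = 4
    ALL_CLUSTERS = 5


def within_limit(action_scope: Scope, max_auto_scope: Scope = Scope.POD) -> bool:
    """Return True if the action's blast radius does not exceed the allowed scope."""
    return action_scope <= max_auto_scope


# A single-pod action fits the assumed auto-fix limit; a namespace-wide one does not.
print(within_limit(Scope.POD))        # True
print(within_limit(Scope.NAMESPACE))  # False
```

Using an ordered enum keeps the rule to a single comparison, so raising the limit for a less critical environment would only mean passing a different `max_auto_scope`.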
Auto-fix will do this:
    action: restart_pod
    scope: production/checkout-7f9d
    blast_radius: single_pod
Auto-fix will NEVER do this (requires approval):
    action: restart_all_pods
    scope: production/*
    blast_radius: entire_namespace
2. Action Classification
Principle: Risk-based action categorization determines autonomy.
Low Risk (Auto-Approved)
Actions that rarely cause harm:
- Restart a single failed pod
- Scale deployment within existing HPA bounds
- Update resource requests (within limits)
- Retry failed jobs
Medium Risk (Suggest + Review)
Actions that need human judgment:
- Modify deployment replicas beyond HPA range
- Change resource limits (may cause pod restarts)
- Update service configurations
- Modify network policies
High Risk (Require Explicit Approval)
Actions that could cause outages:
- Delete resources
- Modify RBAC permissions
- Change namespace quotas
- Update critical system pods
- Multi-service changes
Forbidden (Always Blocked)
Actions never allowed autonomously:
- Delete namespaces
- Delete persistent volumes
- Modify cluster-level resources
- Grant cluster-admin permissions
- Disable monitoring/observability
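One way to picture the four tiers is as a lookup from action to risk level, with autonomy granted only at the lowest tier. This is an illustrative sketch only: the action identifiers, the `Risk` enum, and the choice to treat unknown actions as high risk are assumptions, not documented RubixKube behavior.

```python
from enum import Enum


class Risk(Enum):
    LOW = "auto-approved"
    MEDIUM = "suggest + review"
    HIGH = "require explicit approval"
    FORBIDDEN = "always blocked"


# Hypothetical action names mapped to the tiers listed above.
RISK_BY_ACTION = {
    "restart_pod": Risk.LOW,
    "retry_job": Risk.LOW,
    "modify_network_policy": Risk.MEDIUM,
    "delete_resource": Risk.HIGH,
    "modify_rbac": Risk.HIGH,
    "delete_namespace": Risk.FORBIDDEN,
    "delete_persistent_volume": Risk.FORBIDDEN,
}


def classify(action: str) -> Risk:
    """Look up an action's tier; unknown actions default to HIGH (an assumed fail-closed choice)."""
    return RISK_BY_ACTION.get(action, Risk.HIGH)


def may_run_autonomously(action: str) -> bool:
    """Only low-risk actions run without a human in the loop."""
    return classify(action) is Risk.LOW
```

For example, `may_run_autonomously("restart_pod")` is true, while `classify("delete_namespace")` returns `Risk.FORBIDDEN` and can never be auto-approved.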
3. Dry-Run Mode
Principle: Test before you touch. Every proposed action runs in simulation first.
4. Rate Limiting
Principle: Prevent rapid-fire mistakes. Guardrails control how many and how fast actions can occur:
- Max 1 auto-fix per pod per 5 minutes
- Max 3 auto-fixes per namespace per hour
- Max 10 auto-fixes per cluster per hour
- Circuit breaker: If 2 actions fail, pause auto-fix for 1 hour
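The per-pod, per-namespace, and per-cluster limits above behave like sliding-window counters. The sketch below shows one possible way to enforce them, using the numbers from the list; the class name, method signature, and the decision to record an action only when all three limits pass are assumptions. The circuit breaker is left out here.

```python
from collections import defaultdict, deque

# (max actions, window in seconds) per scope, taken from the limits listed above.
LIMITS = {"pod": (1, 5 * 60), "namespace": (3, 60 * 60), "cluster": (10, 60 * 60)}


class AutoFixRateLimiter:
    """Sliding-window limiter; class and method names are hypothetical."""

    def __init__(self):
        # (scope, name) -> timestamps of past auto-fixes, oldest first
        self._history = defaultdict(deque)

    def allow(self, pod: str, namespace: str, cluster: str, now: float) -> bool:
        keys = [("pod", pod), ("namespace", namespace), ("cluster", cluster)]
        # Drop timestamps that have aged out, then check every scope's limit.
        for scope, name in keys:
            limit, window = LIMITS[scope]
            timestamps = self._history[(scope, name)]
            while timestamps and now - timestamps[0] >= window:
                timestamps.popleft()
            if len(timestamps) >= limit:
                return False
        # Record the action only if all three limits allow it.
        for key in keys:
            self._history[key].append(now)
        return True
```

With this sketch, a second fix for the same pod one minute after the first is rejected, while a fix for a different pod in the same namespace is still allowed until the namespace reaches three fixes in the hour.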
Example Protection:
5. Change Approval Workflows
Principle: Different environments, different rules.
- Development
- Staging
- Production
6. Rollback Capability
Principle: Every change must be reversible. Before any action executes, RubixKube:
1. Captures current state (manifest, config, resource versions)
2. Generates rollback plan (exact steps to undo)
3. Tests rollback plan (dry-run validation)
4. Stores rollback trigger (one-click revert)
If something goes wrong:
Example: Failed Auto-Fix with Rollback
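As a rough sketch of the capture-then-revert idea (not RubixKube's actual implementation), the function below snapshots the state, applies a change to a copy, and falls back to the snapshot when a health check fails. The function name, the dictionary-based state, and the `is_healthy` callback are all hypothetical stand-ins.

```python
import copy
from typing import Callable, Tuple


def apply_with_rollback(
    state: dict, change: dict, is_healthy: Callable[[dict], bool]
) -> Tuple[dict, bool]:
    """Apply `change` to a copy of `state`; return the snapshot if the health check fails.

    Returns the resulting state and whether the change was kept.
    `is_healthy` stands in for post-action verification.
    """
    snapshot = copy.deepcopy(state)   # capture current state before touching anything
    candidate = {**state, **change}   # apply the change to a copy, not in place
    if is_healthy(candidate):
        return candidate, True
    return snapshot, False            # revert to the captured state


state = {"replicas": 3, "memory_limit": "512Mi"}
# A change that fails verification leaves the original state in effect.
result, kept = apply_with_rollback(
    state, {"memory_limit": "64Mi"}, lambda s: s["memory_limit"] != "64Mi"
)
print(result, kept)
```

Taking the snapshot before the change is applied is what makes the revert unconditional: the rollback path never depends on the failed change having left the system in a readable state.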
7. Audit Logging
Principle: Complete transparency and accountability. Every action is logged with full context. This lets you:
- Understand exactly what changed and why
- Reproduce or debug actions later
- Prove safety for security reviews
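To make "full context" concrete, an audit entry could be modeled as an immutable record serialized to one JSON line. The field names and sample values below are assumptions for illustration; the actual RubixKube log schema is not shown in this document.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One audit record; every field name here is an assumption for illustration."""
    timestamp: str
    action: str
    target: str
    reason: str
    risk: str
    approved_by: str
    outcome: str


def to_log_line(entry: AuditEntry) -> str:
    """Serialize an entry as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)


# Placeholder values only; not taken from a real incident.
entry = AuditEntry(
    timestamp=datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc).isoformat(),
    action="restart_pod",
    target="production/checkout-7f9d",
    reason="CrashLoopBackOff",
    risk="low",
    approved_by="auto",
    outcome="success",
)
print(to_log_line(entry))
```

Recording who approved an action and why, alongside what changed, is what allows an entry to serve both debugging and security review.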
Customizing Guardrails
Configuring Safety Levels
You control how aggressively or conservatively RubixKube acts:
- Conservative (Default)
- Balanced
- Aggressive
Per-Resource Policies
Fine-grained control for specific resources: critical services get extra protection, and non-critical services get more autonomy.
Guardrail Enforcement
What Happens When Guardrails Trigger
1. Action Proposed
Remediation Agent wants to fix an issue
2. Guardrails Evaluate
Guardian Agent checks all safety rules:
- Scope within limits?
- Risk classification? Medium
- Approval policy? Medium = require approval
- STOP: Cannot proceed without human
3. Human Notified
Alert sent to appropriate channel:
4. Human Decides
- Approve → Action executes with full logging
- Deny → Action cancelled, incident escalated
- Modify → Adjust parameters, resubmit
- Timeout → Depends on policy (abort or notify)
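The four possible human decisions map naturally onto a small dispatch function. This is a hedged sketch: the `Decision` enum, the returned step names, and the `timeout_policy` parameter are hypothetical, chosen only to mirror the outcomes listed above.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    DENY = "deny"
    MODIFY = "modify"
    TIMEOUT = "timeout"


def next_step(decision: Decision, timeout_policy: str = "abort") -> str:
    """Map a reviewer's decision to the next step (step names are hypothetical)."""
    if decision is Decision.APPROVE:
        return "execute_with_logging"
    if decision is Decision.DENY:
        return "cancel_and_escalate"
    if decision is Decision.MODIFY:
        return "resubmit_with_new_parameters"
    # Timeout: the outcome depends on policy (abort or notify).
    if timeout_policy not in ("abort", "notify"):
        raise ValueError(f"unknown timeout policy: {timeout_policy}")
    return "abort" if timeout_policy == "abort" else "notify_again"
```

Defaulting the timeout policy to `abort` in this sketch reflects a conservative reading: if nobody responds, nothing changes.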
Safety Mechanisms in Detail
1. Blast Radius Calculation
How RubixKube determines impact:
Example:
2. Policy-Based Controls
Define what’s allowed, what’s not:
Time-Based Policies
Example: Deployment Windows
Environment-Based Policies
Different rules for different environments:
Resource-Based Policies
Protect critical resources:
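The three policy dimensions above (time, environment, resource) can be combined into a single check that passes only when all of them pass. The real policy format is not shown in this document, so everything below is an assumption: the 09:00 to 17:00 window, the set of environments, the `high-risk` label as the protection marker, and the function name.

```python
from datetime import datetime, time

# Hypothetical policy values; the real policy format is not shown in this document.
CHANGE_WINDOW = (time(9, 0), time(17, 0))          # allowed hours, local time
AUTO_FIX_ENVIRONMENTS = {"development", "staging"}  # environments where auto-fix may run
PROTECTED_LABEL = "high-risk"                       # label marking protected resources


def auto_fix_allowed(when: datetime, environment: str, labels: set) -> bool:
    """Allow auto-fix only if the time, environment, and resource policies all pass."""
    start, end = CHANGE_WINDOW
    in_window = start <= when.time() < end
    env_ok = environment in AUTO_FIX_ENVIRONMENTS
    resource_ok = PROTECTED_LABEL not in labels
    return in_window and env_ok and resource_ok
```

Requiring every dimension to pass means a single restrictive policy is enough to block an action, which matches the conservative intent of the guardrails.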
3. Approval Workflows
Human-in-the-loop when it matters:

Workflow Steps:
1. Action Proposed → Notification sent (Slack, email, dashboard)
2. Context Provided → Full RCA, evidence, risk assessment
3. Human Reviews → Examine proposed changes
4. Decision Made → Approve, deny, or modify
5. Action Logged → Who approved, when, why
Approval Methods:
- Click approve in UI
- Slack - React with emoji or click button
- CLI - rubixkube approve <action-id>
- API - Programmatic approval for custom workflows
4. Circuit Breakers
Principle: Stop automatically if things go wrong.
Circuit Breaker States:
- Closed (Normal)
- Open (Paused)
- Half-Open (Testing)
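The three states form a small state machine: failures open the breaker, a cooldown moves it to half-open, and the next result either closes it or re-opens it. The sketch below uses the thresholds given under Rate Limiting (2 failures, 1 hour pause) as defaults. The class and method names are hypothetical, and counting consecutive failures (reset on success) is an assumption about how "2 actions fail" is interpreted.

```python
class CircuitBreaker:
    """Closed -> Open after repeated failures -> Half-Open after a cooldown.

    The class and method names are hypothetical; failures are counted consecutively.
    """

    def __init__(self, failure_threshold: int = 2, cooldown_seconds: float = 3600):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None  # time the breaker tripped, or None while closed

    def state(self, now: float) -> str:
        if self.opened_at is None:
            return "closed"
        if now - self.opened_at >= self.cooldown_seconds:
            return "half-open"
        return "open"

    def allow(self, now: float) -> bool:
        """Actions are blocked only while the breaker is open."""
        return self.state(now) != "open"

    def record(self, success: bool, now: float) -> None:
        if success:
            # Any success, including a half-open trial, closes the breaker.
            self.failures = 0
            self.opened_at = None
            return
        self.failures += 1
        if self.state(now) == "half-open" or self.failures >= self.failure_threshold:
            self.opened_at = now  # trip, or re-open after a failed trial
```

Passing `now` explicitly instead of reading the clock inside the class keeps the sketch deterministic and easy to test.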
5. Resource Limits
Principle: Prevent resource exhaustion. Guardrails ensure RubixKube itself doesn’t consume excessive resources.
6. Least Privilege RBAC
Principle: Minimum necessary permissions. The RubixKube Observer Agent runs with restricted permissions. Even if compromised, the Observer can’t modify your cluster.
7. Verification & Monitoring
Principle: Trust, but verify. After every action, guardrails verify success:
1. Pre-Action Check
- Dry-run passed?
- Approvals obtained?
- Resources available?
2. Action Execution
- Apply change
- Monitor in real-time
- Watch for errors
3. Post-Action Verification
- Health check: Pod running?
- Metrics check: Error rate normal?
- Dependency check: Downstream services OK?
- User impact: Latency acceptable?
4. Outcome Decision
- Success: Log and continue
- Failure: Trigger rollback
- Uncertain: Alert human, freeze actions
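The outcome decision can be expressed as a function over the post-action checks, where each check either passed, failed, or has no clear signal. This is an illustrative sketch; the check names, the use of `None` for "uncertain", and the returned step names are assumptions.

```python
from typing import Dict, Optional


def decide_outcome(checks: Dict[str, Optional[bool]]) -> str:
    """Combine post-action checks into one outcome.

    Each check is True (passed), False (failed), or None (no clear signal).
    Check names and return values are hypothetical.
    """
    results = list(checks.values())
    if any(result is False for result in results):
        return "rollback"           # any failed check triggers rollback
    if any(result is None for result in results):
        return "alert_and_freeze"   # uncertain: alert a human, freeze actions
    return "log_and_continue"       # all checks passed
```

For example, `decide_outcome({"health": True, "metrics": None})` returns `"alert_and_freeze"`. In this sketch a definite failure takes precedence over uncertainty, so a failed health check triggers rollback even if other signals are still missing.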
Guardrail Scenarios
Scenario 1: Guardrails Prevent Disaster
What Happened:
Guardrails saved you from a self-inflicted outage.
Scenario 2: Human Override
When to override guardrails:
Configuring Guardrails
Access Guardrail Settings
Navigate to Settings → Security → Guardrails (coming soon in Beta). Currently, configure via the API or the support team.
Guardrails Best Practices
Start Conservative
Begin with observe-only mode. Let RubixKube watch for 1-2 weeks before enabling auto-fix.
Enable Gradually
Turn on auto-fix for low-risk actions first (pod restarts). Add medium-risk after trust builds.
Test in Staging
Enable aggressive auto-fix in staging first. Learn guardrail behavior before production.
Monitor Audit Logs
Review weekly: What actions occurred? Any blocked? Any failures? Adjust policies accordingly.
Set Clear Policies
Explicitly define what’s allowed and when. Ambiguity creates risk.
Practice Rollbacks
Test rollback procedures monthly. Ensure they work when you need them.
Frequently Asked Questions
Can I disable guardrails entirely?
No. Some guardrails cannot be disabled, for safety:
- Forbidden actions list (e.g., delete namespaces)
- Audit logging
- Rollback capability
- Resource limits on Observer Agent
What happens if I need to act faster than guardrails allow?
Two options:
1. Manual Override - Bypass guardrails with justification (logged)
2. Adjust Policies - Lower safety threshold for specific scenarios
Both require admin permissions and create audit trails.
Do guardrails slow down incident response?
For low-risk actions: No. Guardrail evaluation takes less than 100 ms.
For high-risk actions: Yes, intentionally. The 30-60 seconds for human approval is worth it to prevent making incidents worse.
Most incidents (80%+) are low/medium risk → Fast autonomous response
Critical incidents (20%) → Human judgment essential anyway
Can different teams have different guardrail policies?
Yes! Policies can be scoped by:
- Namespace - Team A’s namespace has different rules than Team B’s
- Label - Resources labeled high-risk get extra protection
- User Role - Admins can override, operators cannot
- Time - Different rules during business hours vs off-hours
The flexible policy engine supports complex organizational needs.
Guardrails + Memory + Agent Mesh = Safe SRI
The three work together. This is Site Reliability Intelligence.
Related Concepts
What is SRI?
The foundation of intelligent reliability
Agent Mesh
Agents that guardrails protect
Memory Engine
Knowledge that informs safe decisions