Safety Guardrails: Trust Through Control

Guardrails are RubixKube’s multi-layered safety system that ensures autonomous operations remain safe, controlled, and reversible. They’re the reason you can trust AI agents to act on your infrastructure without constant supervision.
The Core Principle: Autonomous doesn’t mean uncontrolled. Every action happens within strict safety boundaries, with human oversight available when needed.

Coming in Future Release

Full guardrail system is under development for the production release. RubixKube Beta currently operates in observe-only mode with basic safety mechanisms.

Active NOW (Beta):
  • Observe-Only Mode - No automated changes to your cluster
  • Read-Only RBAC - Observer agent has no write permissions
  • Audit Logging - All detections and suggestions logged
  • Manual Approval Required - You control all actions

Coming Q1 2026 (Production):
  • Full 7-layer guardrail system
  • Automated remediation with safety checks
  • Blast radius calculation and enforcement
  • Circuit breakers and rate limiting
  • Customizable policies per environment

This page describes the complete guardrail architecture being built. For Beta, RubixKube provides intelligent insights and suggestions - you remain in full control of all changes.

The Trust Problem

Why Guardrails Matter

Autonomous systems without guardrails are dangerous:

Cascading Failures

One wrong decision can trigger a chain reaction across your infrastructure

Blast Radius Explosion

A cluster-wide change gone wrong affects all services, all customers

Irreversible Damage

Some actions (delete data, scale to zero) can’t be undone

Compliance Violations

Unauthorized changes can breach regulatory requirements

Guardrails prevent all of these.


The Seven Layers of Guardrails

RubixKube implements defense in depth with multiple safety layers:

1. Scope Limiting

Principle: Actions are confined to the smallest possible scope.
What It Does

Blast Radius Containment:

  • Changes affect one pod → not the deployment
  • Changes affect one deployment → not the namespace
  • Changes affect one namespace → not the cluster
  • Changes affect one cluster → not all clusters

Example:

```yaml
# Auto-fix will do this:
action: restart_pod
scope: production/checkout-7f9d
blast_radius: single_pod
---
# Auto-fix will NEVER do this (requires approval):
action: restart_all_pods
scope: production/*
blast_radius: entire_namespace
```

Configuration

Set scope limits in Settings:

```yaml
guardrails:
  auto_fix_scope:
    - pod_level: true          # Can restart individual pods
    - deployment_level: false  # Cannot modify deployments without approval
    - namespace_level: false   # Cannot touch namespace configs
    - cluster_level: false     # Never cluster-wide changes
```

2. Action Classification

Principle: Risk-based action categorization determines autonomy.

Actions that rarely cause harm:

  • Restart a single failed pod
  • Scale deployment within existing HPA bounds
  • Update resource requests (within limits)
  • Retry failed jobs
Guardrail: Auto-executed, logged for audit

Actions that need human judgment:

  • Modify deployment replicas beyond HPA range
  • Change resource limits (may cause pod restarts)
  • Update service configurations
  • Modify network policies
Guardrail: Proposed to humans, requires click to approve

Actions that could cause outages:

  • Delete resources
  • Modify RBAC permissions
  • Change namespace quotas
  • Update critical system pods
  • Multi-service changes
Guardrail: Require explicit approval + 2FA (optional)

Actions never allowed autonomously:

  • Delete namespaces
  • Delete persistent volumes
  • Modify cluster-level resources
  • Grant cluster-admin permissions
  • Disable monitoring/observability
Guardrail: Blocked entirely, manual only
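
The four groups above amount to a lookup from risk class to guardrail decision. A minimal sketch of that idea follows. The names `RiskTier`, `ACTION_TIERS`, `DECISIONS`, and `decide`, and the action identifiers, are hypothetical and not part of RubixKube's API. Sending unknown actions to the high-risk tier is also an assumption of the sketch.

```python
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    FORBIDDEN = "forbidden"


# Hypothetical action names mapped to the groups listed above.
ACTION_TIERS = {
    "restart_pod": RiskTier.LOW,
    "retry_job": RiskTier.LOW,
    "change_resource_limits": RiskTier.MEDIUM,
    "update_network_policy": RiskTier.MEDIUM,
    "delete_resource": RiskTier.HIGH,
    "modify_rbac": RiskTier.HIGH,
    "delete_namespace": RiskTier.FORBIDDEN,
    "delete_persistent_volume": RiskTier.FORBIDDEN,
}

# Guardrail applied to each tier, as described in the text.
DECISIONS = {
    RiskTier.LOW: "auto_execute",
    RiskTier.MEDIUM: "require_approval",
    RiskTier.HIGH: "require_explicit_approval",
    RiskTier.FORBIDDEN: "blocked",
}


def decide(action: str) -> str:
    """Return the guardrail decision for an action name.

    Unknown actions fall back to the HIGH tier (assumption: fail closed).
    """
    tier = ACTION_TIERS.get(action, RiskTier.HIGH)
    return DECISIONS[tier]
```

For example, `decide("restart_pod")` returns `"auto_execute"`, while `decide("delete_namespace")` returns `"blocked"`.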

3. Dry-Run Mode

Principle: Test before you touch. Every proposed action runs in simulation first:
Proposed Action: Increase memory limit
  from: 512Mi
  to: 1Gi

Dry-Run Results:
   Manifest validation: PASS
   Resource quota check: PASS (within namespace limits)
   Pod disruption budget: PASS (1/3 pods, within budget)
   Dependency impact: None detected
   Rollback plan: Generated
  
Predicted Outcome:
  - Pod will restart (estimated 8-12 seconds downtime)
  - Memory usage will normalize
  - No cascading effects expected
  
Confidence: 96%
If dry-run fails, the action is blocked. No exceptions.
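
The "no exceptions" rule can be sketched as a gate that lets an action through only when every dry-run check passed. This is an illustrative sketch: `CheckResult` and `dry_run_gate` are hypothetical names, and treating an empty result list as a failure is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


def dry_run_gate(results: list[CheckResult]) -> tuple[bool, list[str]]:
    """Allow the action only if every dry-run check passed.

    Returns (allowed, names_of_failed_checks). An empty result list is
    treated as not allowed, because nothing was actually validated.
    """
    failed = [r.name for r in results if not r.passed]
    allowed = bool(results) and not failed
    return allowed, failed
```

A single failed check, such as a resource quota violation, is enough to block the action, and the caller gets back the names of the checks that failed.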

4. Rate Limiting

Principle: Prevent rapid-fire mistakes. Guardrails control how many and how fast actions can occur:
  • Max 1 auto-fix per pod per 5 minutes
  • Max 3 auto-fixes per namespace per hour
  • Max 10 auto-fixes per cluster per hour
  • Circuit breaker: If 2 actions fail, pause auto-fix for 1 hour

Example Protection:

12:00: Auto-fix applied to pod A
12:01: Pod A crashes again
12:05: Auto-fix applied to pod A (attempt #2, after the 5-minute per-pod window)
12:06: Pod A crashes again
12:06: BLOCKED - Max retries reached (2 failed actions)
      → Alert SRE team
      → Disable auto-fix for this pod
      → Manual intervention required
Prevents: Infinite loops, thrashing, runaway automation
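
The per-pod, per-namespace, and per-cluster limits above could be enforced with a sliding-window counter per scope. The sketch below is one possible approach, not RubixKube's implementation. `AutoFixRateLimiter`, `LIMITS`, and the method names are hypothetical, and time is passed in as plain seconds to keep the example deterministic.

```python
from collections import defaultdict, deque

# (max actions, window in seconds) per scope, taken from the limits above.
LIMITS = {
    "pod": (1, 5 * 60),
    "namespace": (3, 60 * 60),
    "cluster": (10, 60 * 60),
}


class AutoFixRateLimiter:
    def __init__(self, limits=LIMITS):
        self.limits = limits
        # (scope, key) -> timestamps of recent auto-fixes
        self.history = defaultdict(deque)

    def _keys(self, cluster, namespace, pod):
        return {
            "pod": (cluster, namespace, pod),
            "namespace": (cluster, namespace),
            "cluster": (cluster,),
        }

    def allow(self, cluster, namespace, pod, now):
        """Record and allow an auto-fix at `now` if every scope is under its limit."""
        keys = self._keys(cluster, namespace, pod)
        for scope, key in keys.items():
            max_actions, window = self.limits[scope]
            events = self.history[(scope, key)]
            # Drop timestamps that have aged out of the window.
            while events and now - events[0] >= window:
                events.popleft()
            if len(events) >= max_actions:
                return False
        # Only record the action once all scopes have accepted it.
        for scope, key in keys.items():
            self.history[(scope, key)].append(now)
        return True
```

The action is recorded only after all three scopes accept it, so a request rejected at namespace level does not use up the pod-level budget.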

5. Change Approval Workflows

Principle: Different environments, different rules.
  • Development
  • Staging
  • Production

Minimal Oversight:

environment: development
guardrails:
  auto_fix: enabled
  approval_required: false
  max_blast_radius: namespace
Fast iteration, low risk environment.

6. Rollback Capability

Principle: Every change must be reversible. Before any action executes, RubixKube:
  1. Captures current state (manifest, config, resource versions)
  2. Generates rollback plan (exact steps to undo)
  3. Tests rollback plan (dry-run validation)
  4. Stores rollback trigger (one-click revert)

If something goes wrong:

# Automatic rollback triggers if:
- Health check fails after change
- Error rate increases >50%
- Latency spikes >200%
- Pod crash rate increases

# Manual rollback available:
Click "Rollback" button in dashboard
OR: Run `rubixkube rollback <action-id>`

Example:

14:30:00 - Auto-fix applied: Scale deployment 3 → 5 replicas
14:30:15 - Health check: 2 new pods failing to start
14:30:20 - Error rate: 15% → 40% (relative increase above the 50% threshold)
14:30:25 - Automatic rollback triggered
14:30:30 - Reverted to 3 replicas
14:30:35 - Service stabilized
14:30:40 - Alert sent: "Auto-fix rolled back - manual review needed"

Total impact: 40 seconds
Alternative (no rollback): Ongoing outage until manual intervention
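
The automatic rollback triggers can be sketched as a comparison between a snapshot taken before the change and one taken after it. The sketch assumes the percentage thresholds are relative increases over the pre-change baseline, which the text does not state explicitly. `Snapshot`, `relative_increase`, and `rollback_reasons` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    healthy: bool        # result of the health check
    error_rate: float    # fraction of failed requests, 0.0-1.0
    latency_ms: float    # e.g. p95 latency
    crash_count: int     # pod crashes in the observation window


def relative_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after` (inf if the baseline is zero and the value grew)."""
    if before == 0:
        return float("inf") if after > 0 else 0.0
    return (after - before) / before


def rollback_reasons(before: Snapshot, after: Snapshot) -> list[str]:
    """Return the triggers that fired; an empty list means no rollback is needed."""
    reasons = []
    if not after.healthy:
        reasons.append("health_check_failed")
    if relative_increase(before.error_rate, after.error_rate) > 0.50:
        reasons.append("error_rate_increase")
    if relative_increase(before.latency_ms, after.latency_ms) > 2.00:
        reasons.append("latency_spike")
    if after.crash_count > before.crash_count:
        reasons.append("crash_rate_increase")
    return reasons
```

Under this reading, an error rate going from 15% to 40% is a relative increase of about 167%, well past the 50% trigger.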

7. Audit Logging

Principle: Complete transparency and accountability. Every action is logged with full context:
{
  "action_id": "fix-2025-10-03-abc123",
  "timestamp": "2025-10-03T14:30:00Z",
  "agent": "remediation-agent-v2",
  "trigger": "automated",
  "approved_by": "system (auto-approval policy)",
  
  "target": {
    "cluster": "prod-us-east",
    "namespace": "checkout",
    "resource": "deployment/checkout-service"
  },
  
  "action": {
    "type": "update_resource_limits",
    "changes": {
      "memory.limits": "512Mi → 1Gi"
    }
  },
  
  "context": {
    "incident_id": "inc-2025-10-03-xyz789",
    "root_cause": "OOMKilled",
    "confidence": 0.96,
    "blast_radius": "single_deployment"
  },
  
  "outcome": {
    "status": "success",
    "duration": "8.2s",
    "verification": "pod_healthy",
    "rollback_available": true
  }
}
Benefits:
  • Complete audit trail for compliance
  • Understand exactly what changed and why
  • Reproduce or debug actions later
  • Prove safety for security reviews

Customizing Guardrails

Configuring Safety Levels

You control how aggressive or conservative RubixKube acts:
  • Conservative (Default)
  • Balanced
  • Aggressive

Best for: Production, risk-averse teams

guardrails_profile: conservative

behavior:
  - observe_only: true
  - auto_fix: false
  - require_approval: all_changes
  - max_blast_radius: pod
  - timeout: abort_on_timeout
Result: Maximum safety, slower response

Per-Resource Policies

Fine-grained control for specific resources:
guardrails:
  policies:
    - name: "Critical Services"
      resources:
        - "production/payment-*"
        - "production/auth-*"
      rules:
        auto_fix: false
        require_approval: always
        require_2fa: true
    
    - name: "Background Jobs"
      resources:
        - "production/batch-*"
        - "production/cron-*"
      rules:
        auto_fix: true
        max_blast_radius: deployment
        approval: none

Critical services get extra protection. Non-critical get more autonomy.
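
Resource patterns such as `production/payment-*` can be resolved with shell-style glob matching. The sketch below uses Python's `fnmatch.fnmatchcase` and picks the first matching policy. `POLICIES`, `DEFAULT_RULES`, and `rules_for` are hypothetical names. First-match-wins ordering and the fail-closed default are assumptions, not documented RubixKube behavior.

```python
from fnmatch import fnmatchcase

# The two policies from the example above, expressed as plain dictionaries.
POLICIES = [
    {
        "name": "Critical Services",
        "resources": ["production/payment-*", "production/auth-*"],
        "rules": {"auto_fix": False, "require_approval": "always", "require_2fa": True},
    },
    {
        "name": "Background Jobs",
        "resources": ["production/batch-*", "production/cron-*"],
        "rules": {"auto_fix": True, "max_blast_radius": "deployment", "approval": "none"},
    },
]

# Fallback when no policy matches (assumption: fail closed).
DEFAULT_RULES = {"auto_fix": False, "require_approval": "always"}


def rules_for(resource: str, policies=POLICIES) -> dict:
    """Return the rules of the first policy whose pattern matches `namespace/name`."""
    for policy in policies:
        if any(fnmatchcase(resource, pattern) for pattern in policy["resources"]):
            return policy["rules"]
    return DEFAULT_RULES
```

With these definitions, `rules_for("production/payment-api")` resolves to the "Critical Services" rules, and a resource outside `production` falls back to `DEFAULT_RULES`.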


Guardrail Enforcement

What Happens When Guardrails Trigger

1

Action Proposed

Remediation Agent wants to fix an issue
2

Guardrails Evaluate

Guardian Agent checks all safety rules:
  • Scope within limits?
  • Risk classification? Medium
  • Approval policy? Medium = require approval
  • STOP: Cannot proceed without human
3

Human Notified

Alert sent to appropriate channel:
 Approval Required

Action: Scale deployment 'checkout-service'
Change: replicas 3 → 5
Reason: High CPU usage (92%)
Risk: Medium (pod restarts)
Estimated impact: 10-15s rolling update

[Approve] [Deny] [View Details]
4

Human Decides

  • Approve → Action executes with full logging
  • Deny → Action cancelled, incident escalated
  • Modify → Adjust parameters, resubmit
  • Timeout → Depends on policy (abort or notify)

Safety Mechanisms in Detail

1. Blast Radius Calculation

How RubixKube determines impact:

def calculate_blast_radius(action):
    # get_dependencies, count_*, estimate_* and calculate_* are placeholder
    # helpers; they are not defined in this snippet.
    scope = action.target_scope
    dependencies = get_dependencies(scope)

    radius = {
        "direct_impact": count_affected_resources(scope),
        "dependency_impact": count_dependent_resources(dependencies),
        "user_impact": estimate_affected_users(scope),
        "revenue_impact": calculate_potential_revenue_loss(scope)
    }

    # radius is a dict, so its fields are read by key, not by attribute
    if radius["direct_impact"] > 10 or radius["user_impact"] > 1000:
        return "HIGH_RISK - Require approval"
    elif radius["direct_impact"] > 3 or radius["user_impact"] > 100:
        return "MEDIUM_RISK - Suggest with review"
    else:
        return "LOW_RISK - Auto-approve possible"

Example:

Action: Restart pod 'api-gateway-7f9d'

Blast Radius Analysis:
  Direct Impact: 1 pod (out of 5 replicas)
  Dependency Impact: 0 (no dependent services fail)
  User Impact: 20% of traffic (during rolling update)
  Revenue Impact: $15/minute × 0.5min = $7.50
  
  Classification: LOW RISK
  Decision: Auto-approved

2. Policy-Based Controls

Define what’s allowed, what’s not:

Example: Deployment Windows

policies:
  - name: "No Auto-Fix During Business Hours"
    conditions:
      time: "09:00-17:00 Mon-Fri"
      timezone: "America/New_York"
    rules:
      auto_fix: disabled
      reason: "Manual oversight required during peak hours"

  - name: "Aggressive Off-Hours"
    conditions:
      time: "00:00-06:00 daily"
    rules:
      auto_fix: enabled
      reason: "Low traffic window for safe changes"
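
A time-window condition like "No Auto-Fix During Business Hours" comes down to converting the current time into the policy's timezone and checking weekday and hour. The sketch below covers only that first policy, using the standard-library `zoneinfo` module. `auto_fix_allowed`, `BUSINESS_START`, and `BUSINESS_END` are hypothetical names, and treating 17:00 as already outside the window is an assumption.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

BUSINESS_START = time(9, 0)
BUSINESS_END = time(17, 0)


def auto_fix_allowed(now_utc: datetime, tz_name: str = "America/New_York") -> bool:
    """Return False during 09:00-17:00 Mon-Fri in the policy's timezone.

    `now_utc` must be timezone-aware.
    """
    local = now_utc.astimezone(ZoneInfo(tz_name))
    is_weekday = local.weekday() < 5  # Monday=0 ... Friday=4
    in_hours = BUSINESS_START <= local.time() < BUSINESS_END
    return not (is_weekday and in_hours)


# Friday 2025-10-03 14:30 UTC is 10:30 in New York, inside business hours.
friday_morning = datetime(2025, 10, 3, 14, 30, tzinfo=timezone.utc)
# Saturday 2025-10-04 14:30 UTC falls on a weekend.
saturday = datetime(2025, 10, 4, 14, 30, tzinfo=timezone.utc)
```

Doing the comparison in the policy's own timezone, rather than in UTC, keeps the window correct across daylight-saving changes.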

Different rules for different environments:

policies:
  - environment: production
    require_approval: medium_and_high_risk
    require_2fa: high_risk
    audit_retention: 2_years
  
  - environment: staging
    require_approval: high_risk_only
    require_2fa: false
    audit_retention: 90_days
  
  - environment: development
    require_approval: false
    require_2fa: false
    audit_retention: 30_days

Protect critical resources:

policies:
  - name: "Critical Services Protected"
    resources:
      - label_selector: "tier=critical"
      - namespaces: ["production", "payments"]
    rules:
      auto_fix: false
      min_approvers: 2
      require_incident_ticket: true

3. Approval Workflows

Human-in-the-loop when it matters:

[Image: RubixKube Dashboard showing approval workflow]

Workflow Steps:

  1. Action Proposed → Notification sent (Slack, email, dashboard)
  2. Context Provided → Full RCA, evidence, risk assessment
  3. Human Reviews → Examine proposed changes
  4. Decision Made → Approve, deny, or modify
  5. Action Logged → Who approved, when, why

Approval Methods:
  • Dashboard - Click approve in UI
  • Slack - React with emoji or click button
  • CLI - `rubixkube approve <action-id>`
  • API - Programmatic approval for custom workflows

4. Circuit Breakers

Principle: Stop automatically if things go wrong.

Circuit Breaker States:

  • Closed (Normal)
  • Open (Paused)
  • Half-Open (Testing)
Status: Operating Normally
- Auto-fixes executing as configured
- Success rate: >90%
- No recent failures

Action: Continue autonomous operations
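
The three states can be modeled as a small state machine. The sketch below uses the values given under Rate Limiting (2 failures, 1-hour pause) as defaults. `CircuitBreaker` and its methods are hypothetical names. Counting consecutive failures and reopening after a failed half-open trial are assumptions about behavior the text does not spell out.

```python
class CircuitBreaker:
    """Closed -> Open after `threshold` consecutive failures;
    Open -> Half-Open once `cooldown` seconds have passed."""

    def __init__(self, threshold: int = 2, cooldown: float = 3600.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # time the breaker last opened, or None

    def state(self, now: float) -> str:
        if self.opened_at is None:
            return "closed"
        if now - self.opened_at >= self.cooldown:
            return "half_open"
        return "open"

    def allow(self, now: float) -> bool:
        """Auto-fix may run when closed, or as a trial when half-open."""
        return self.state(now) != "open"

    def record(self, success: bool, now: float) -> None:
        if success:
            # Any success, including a half-open trial, closes the breaker.
            self.failures = 0
            self.opened_at = None
            return
        self.failures += 1
        if self.state(now) == "half_open" or self.failures >= self.threshold:
            # Failed trial or too many failures: (re)open and restart the cooldown.
            self.opened_at = now
```

After two failures the breaker stays open for an hour. The first action after that hour is a trial: success closes the breaker and failure reopens it.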

5. Resource Limits

Principle: Prevent resource exhaustion. Guardrails ensure RubixKube itself doesn’t consume excessive resources:
observer_agent:
  resource_limits:
    cpu: 500m              # max 0.5 CPU cores
    memory: 512Mi          # max 512 MiB
    api_calls: 100/minute  # to Kubernetes API

  if_limits_exceeded:
    - throttle_monitoring_frequency
    - skip_non_critical_checks
    - alert_if_persistent
Benefit: RubixKube never becomes the problem it’s solving.

6. Least Privilege RBAC

Principle: Minimum necessary permissions. RubixKube Observer Agent runs with restricted permissions:
# What Observer CAN do (read-only):
apiGroups: ["", "apps", "batch"]
resources: ["pods", "deployments", "jobs", "services"]
verbs: ["get", "list", "watch"]

# What Observer CANNOT do (Kubernetes RBAC is allow-only, so these verbs are simply never granted):
#   verbs: ["create", "update", "delete", "patch"]
# Remediation happens through controlled API, not direct cluster access

Even if compromised, Observer can’t modify your cluster.

7. Verification & Monitoring

Principle: Trust, but verify. After every action, guardrails verify success:
1

Pre-Action Check

  • Dry-run passed?
  • Approvals obtained?
  • Resources available?
2

Action Execution

  • Apply change
  • Monitor in real-time
  • Watch for errors
3

Post-Action Verification

  • Health check: Pod running?
  • Metrics check: Error rate normal?
  • Dependency check: Downstream services OK?
  • User impact: Latency acceptable?
4

Outcome Decision

  • Success: Log and continue
  • Failure: Trigger rollback
  • Uncertain: Alert human, freeze actions
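
The outcome decision can be sketched as a three-way mapping over the post-action checks. In this illustrative sketch, `outcome_decision` and the returned labels are hypothetical. A check value of `None` stands for "could not be determined", and an empty check set is treated as uncertain.

```python
def outcome_decision(checks: dict) -> str:
    """Map post-action checks to an outcome.

    Each value is True (passed), False (failed), or None (undetermined).
    """
    results = list(checks.values())
    if any(r is False for r in results):
        return "rollback"
    if not results or any(r is None for r in results):
        return "alert_and_freeze"
    return "log_and_continue"
```

A definite failure takes priority over an undetermined check, so a change that is known to be bad is rolled back even when other signals are still missing.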

Guardrail Scenarios

Scenario 1: Guardrails Prevent Disaster

What Happened:

Incident: API service memory leak
Remediation Agent Proposal: "Restart all pods simultaneously for quick fix"

Guardian Agent Analysis:
   Blast radius: ENTIRE SERVICE (5 pods)
   Downtime: 100% of capacity during restart
   User impact: ~5000 users affected
   Risk: HIGH

Decision: BLOCKED

Alternative Proposal: "Rolling restart (1 pod at a time)"
Guardian Agent Analysis:
   Blast radius: 20% of capacity per step
   Downtime: Zero (4 pods serve during restart)
   User impact: Minimal (slight latency)
   Risk: LOW

Decision: APPROVED

Guardrails saved you from a self-inflicted outage.

Scenario 2: Human Override

When to override guardrails:

Situation: Production database pod stuck, auto-fix blocked (high risk)
SRE Assessment: "We need to restart NOW, outage ongoing"

Manual Override:
  1. SRE clicks "Override Guardrails"
  2. System requires: 
     - Incident ticket number
     - Override justification
     - 2FA confirmation
  3. Action executes with OVERRIDE flag in audit log
  4. Post-incident review required

Result: Fast response when humans decide risk is acceptable
Override is logged and triggers post-incident review. Use sparingly.

Configuring Guardrails

Access Guardrail Settings

Navigate to Settings → Security → Guardrails (coming soon in Beta). Currently, configure via API or support team:
# Example API configuration
curl -X POST https://api.rubixkube.ai/v2/guardrails/config \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "environment": "production",
    "auto_fix_enabled": true,
    "auto_fix_scope": "pod_level_only",
    "require_approval": ["medium_risk", "high_risk"],
    "circuit_breaker_threshold": 3,
    "circuit_breaker_window": "30m"
  }'

Guardrails Best Practices

Start Conservative

Begin with observe-only mode. Let RubixKube watch for 1-2 weeks before enabling auto-fix.

Enable Gradually

Turn on auto-fix for low-risk actions first (pod restarts). Add medium-risk after trust builds.

Test in Staging

Enable aggressive auto-fix in staging first. Learn guardrail behavior before production.

Monitor Audit Logs

Review weekly: What actions occurred? Any blocked? Any failures? Adjust policies accordingly.

Set Clear Policies

Explicitly define what is allowed and when. Ambiguity creates risk.

Practice Rollbacks

Test rollback procedures monthly. Ensure they work when you need them.

Frequently Asked Questions

No. Some guardrails are mandatory for safety:
  • Forbidden actions list (e.g., delete namespaces)
  • Audit logging
  • Rollback capability
  • Resource limits on Observer Agent
You can adjust thresholds and approval requirements, but core safety mechanisms can’t be disabled.

Two options:

  1. Manual Override - Bypass guardrails with justification (logged)
  2. Adjust Policies - Lower safety threshold for specific scenarios

Both require admin permissions and create audit trails.

For low-risk actions: NO. Guardrail evaluation takes less than 100ms.

For high-risk actions: YES, intentionally. The 30-60 seconds for human approval is worth it to prevent making incidents worse.

  • Most incidents (80%+) are low/medium risk → Fast autonomous response
  • Critical incidents (20%) → Human judgment essential anyway

Yes! Policies can be scoped by:
  • Namespace - Team A’s namespace has different rules than Team B’s
  • Label - Resources labeled high-risk get extra protection
  • User Role - Admins can override, operators cannot
  • Time - Different rules during business hours vs off-hours

Flexible policy engine supports complex organizational needs.

Guardrails + Memory + Agent Mesh = Safe SRI

The three work together:
Memory Engine:
  "This fix worked 94% of the time for this issue"
  
Agent Mesh:
  "I propose we apply that fix now"
  
Guardrails:
  "Checking safety... Scope OK, Risk LOW, Policy allows"
  "APPROVED - Execute with logging and rollback ready"
  
Outcome: Fast, safe, autonomous remediation

This is Site Reliability Intelligence.


