Troubleshooting Guide

This guide helps you quickly diagnose and resolve common issues across the RubixKube platform.

Quick Checklist

  • Verify connectivity: Ensure the Console dashboard loads and you are not seeing global 401/403 errors.
  • Confirm agent health: Check that the Observer, RCA Pipeline, and SRI agents are Active in the Agents view.
  • Check cluster readiness: Ensure your kube-apiserver is reachable and nodes are Ready.
  • Review Insights errors: Open the latest incident and read the Evidence and RCA report.
  • Check integrations: Ensure your MCP servers and OAuth integrations are properly authenticated.

Authentication & Access

401 Unauthorized in Console

  • Cause: Your session has expired, or the tenant context is missing.
  • Fix:
    1. Click Refresh in the header or log out and log back in.
    2. If persistent, clear your browser cookies for the console.rubixkube.ai domain.
    3. Verify your Tenant ID exists in Settings → Organization.

Cannot Log In

  • Cause: SSO is blocked, the system clock is skewed, or the invitation has expired.
  • Fix:
    • Ensure your system time is correct.
    • Try an alternate provider (Google/GitHub) if your primary SSO fails.
    • If using an invitation link, verify it hasn’t expired.
    • Contact support with the timestamp and your email address.
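
To put a number on clock skew, one option is to compare local time with the `Date` header returned by the Console host. This is a rough sketch, assuming curl and GNU `date` (for `date -d`) are available; the function names are illustrative, and the result includes network round-trip time, so treat it as approximate.

```shell
# Print the absolute difference in seconds between two Unix timestamps.
skew_seconds() {
  diff=$(( $1 - $2 ))
  echo "${diff#-}"
}

# Compare local time with the Date header sent by console.rubixkube.ai.
console_clock_skew() {
  remote="$(curl -sI https://console.rubixkube.ai | tr -d '\r' \
    | awk 'tolower($1) == "date:" { sub(/^[^ ]+ /, ""); print }')"
  # Stop if no Date header came back (for example, the host was unreachable).
  [ -n "$remote" ] || { echo "no Date header received" >&2; return 1; }
  skew_seconds "$(date -u +%s)" "$(date -u -d "$remote" +%s)"
}
```

Running `console_clock_skew` prints the skew in seconds. A value measured in minutes rather than seconds suggests the system clock should be resynchronized before retrying the login.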

Dashboard & Real-Time Updates

Real-Time Updates (SSE) Not Working

  • Cause: A network proxy or firewall is blocking Server-Sent Events (SSE).
  • Fix:
    • Ensure your network allows long-lived HTTP connections to console.rubixkube.ai.
    • Check the browser console for connection drops. The Console will automatically attempt to reconnect.
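
One way to tell whether the network holds a streaming connection open is to probe it with curl. The sketch below is an assumption-heavy illustration: the SSE URL is not documented here, so copy it from the event-stream request shown in your browser's Network tab, and note that the request may also need your session credentials. The function name `probe_stream` is hypothetical.

```shell
# Hold a streaming request open and report how it ended.
# $1: SSE URL copied from the browser's Network tab (placeholder, not a documented endpoint)
# $2: seconds to keep the connection open (default 30)
probe_stream() {
  rc=0
  curl -sS -f -N -o /dev/null -H 'Accept: text/event-stream' \
    --max-time "${2:-30}" "$1" || rc=$?
  # curl exit code 28 means the --max-time limit was reached,
  # i.e. nothing closed the connection before the timer expired.
  if [ "$rc" -eq 28 ]; then
    echo "held open for ${2:-30}s"
  else
    echo "closed early (curl exit code $rc)"
  fi
}
```

If the probe reports an early close with exit code 22, the server returned an HTTP error (for example 401), which points to authentication rather than the network. Other early closes are consistent with a proxy or firewall cutting long-lived connections.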

Multi-Cluster Data Missing

  • Cause: The Cluster Selector is set to the wrong cluster, or your plan doesn’t support multi-cluster.
  • Fix:
    • Use the Cluster Selector in the header to ensure you’re viewing the correct environment.
    • Verify your subscription tier (Pro/Enterprise) in Settings.

Agents & RCA

Agents Show “Inactive”

  • Cause: The Observer pods are not running in the rubixkube-system namespace.
  • Fix:
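    # List the pods in the RubixKube namespace and check their status.
    kubectl get pods -n rubixkube-system
    # Read the Observer logs for startup or connection errors.
    kubectl logs deploy/rubixkube-observer -n rubixkube-system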
    
    • Verify image pull secrets and service account permissions.
    • Ensure the cluster has outbound connectivity to nats.rubixkube.ai:4222.
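
The connectivity and image-pull checks above can be sketched as two small helpers. This assumes bash and the coreutils `timeout` command are available; the function names are illustrative. The TCP check runs from the machine where you execute it, so run it from a cluster node or a debug pod for a result that reflects the cluster's own egress rules.

```shell
# Test a TCP connection using bash's built-in /dev/tcp redirection.
# $1: host, $2: port. Prints "open" or "blocked".
tcp_check() {
  if timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "open"
  else
    echo "blocked"
  fi
}

# Show recent events in the namespace, oldest first.
# Image pull failures and scheduling problems appear here.
observer_events() {
  kubectl get events -n rubixkube-system --sort-by=.lastTimestamp
}
```

For example, `tcp_check nats.rubixkube.ai 4222` should print `open` when outbound traffic is allowed. Events such as `ErrImagePull` or `ImagePullBackOff` in the `observer_events` output point to a missing or invalid image pull secret.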

RCA Not Generated

  • Cause: Missing evidence, low signal, or the Insight is stuck in QUEUED_FOR_RCA.
  • Fix:
    • Ensure logs and metrics sources are reachable by the Observer.
    • The RCA Reconciler automatically retries failed tasks. If the problem persists, check the Insight details for specific errors.

Chat Agent Unresponsive

  • Cause: The AI Agent platform (New Opel) is experiencing high latency, or the session has timed out.
  • Fix:
    • Refresh the chat window to establish a new session.
    • Check the System Status page for any ongoing incidents with the AI models (Vertex AI/Gemini).

Clusters & Infrastructure

No Clusters Found

  • Cause: The Observer agent has not been installed or registered.
  • Fix:
    # Replace YOUR_KEY with your API key before running.
    kubectl apply -f "https://api.rubixkube.ai/install/observer.yaml?apiKey=YOUR_KEY"
    
    • Wait 1–2 minutes; verify nodes and pods are Ready.
    • Check the Orchestrator logs if you are self-hosting.
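
Instead of polling by hand after the install, `kubectl wait` can block until the pods are Ready. This is a minimal sketch; the wrapper name `wait_for_observer` and the 120-second default (taken from the 1–2 minute guidance above) are assumptions.

```shell
# Block until every pod in rubixkube-system is Ready, or fail after the timeout.
# $1: timeout passed to kubectl wait (default 120s)
wait_for_observer() {
  kubectl wait --for=condition=Ready pods --all \
    -n rubixkube-system --timeout="${1:-120s}"
}
```

`wait_for_observer` returns a non-zero exit code if the timeout expires or if the namespace contains no pods yet. In either case, fall back to `kubectl get pods -n rubixkube-system` to see what state the pods are in.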

Knowledge Graph Not Rendering

  • Cause: WebGL or canvas rendering is blocked by the browser, or the topology graph is too large to render.
  • Fix:
    • Try a different layout (Force/Hierarchical).
    • Disable browser extensions that block canvas rendering.
    • Use the Search or Neighborhood queries to narrow down the scope.

Settings & Integrations

Slack/PagerDuty Not Receiving Alerts

  • Cause: The integration is not configured, or the OAuth token has expired.
  • Fix:
    • Reconnect the integration in Settings → Integrations.
    • Trigger a test event from the Insights dashboard.

Custom MCP Server Fails to Connect

  • Cause: Invalid configuration, unreachable endpoint, or missing authentication.
  • Fix:
    • Verify the MCP server URL is publicly accessible from the RubixKube control plane.
    • Check the Integration Registry for any connection errors.
    • Ensure the correct auth_header_prefix and credentials are provided.
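
A quick way to test reachability and credentials together is to call the MCP server URL with curl. The sketch below is an illustration only: it assumes the credential is sent in an `Authorization` header built as prefix, space, token, which may not match how your server or RubixKube combines `auth_header_prefix` with the credential. The function names are hypothetical.

```shell
# Build a header line from a prefix and a token,
# e.g. prefix "Bearer" and token "abc" give "Authorization: Bearer abc".
# The header name "Authorization" is an assumption.
auth_header() {
  printf 'Authorization: %s %s\n' "$1" "$2"
}

# Print the HTTP status code returned by the MCP server.
# $1: MCP server URL, $2: auth header prefix, $3: token
mcp_status() {
  curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 \
    -H "$(auth_header "$2" "$3")" "$1"
}
```

Run `mcp_status` from a machine outside your private network to approximate what the control plane sees. A status of `000` means the connection itself failed, while `401` or `403` suggests the prefix or credential is wrong.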

Collecting Diagnostics

Provide the following when contacting support:
  • Tenant ID (found in Settings → Organization)
  • Cluster ID (if applicable)
  • Timestamp and timezone of the issue
  • Browser version and OS
  • Reproduction steps and screenshots
  • Relevant kubectl outputs from the rubixkube-system namespace
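
The kubectl outputs in the list above can be gathered in one pass. This is a minimal sketch, assuming kubectl is configured for the affected cluster; the function name, file names, and default directory are illustrative, and the set of commands is a suggestion rather than an official support bundle format.

```shell
# Collect kubectl output from the rubixkube-system namespace into one directory.
# $1: output directory (default: rubixkube-diagnostics)
collect_diagnostics() {
  out="${1:-rubixkube-diagnostics}"
  mkdir -p "$out"
  # Record when the data was collected, in UTC.
  date -u '+%Y-%m-%dT%H:%M:%SZ' > "$out/collected-at.txt"
  kubectl version > "$out/version.txt" 2>&1
  kubectl get nodes -o wide > "$out/nodes.txt" 2>&1
  kubectl get pods -n rubixkube-system -o wide > "$out/pods.txt" 2>&1
  kubectl describe pods -n rubixkube-system > "$out/describe-pods.txt" 2>&1
  kubectl get events -n rubixkube-system --sort-by=.lastTimestamp > "$out/events.txt" 2>&1
  kubectl logs deploy/rubixkube-observer -n rubixkube-system --tail=500 > "$out/observer.log" 2>&1
  # Print the directory so the caller can archive or attach it.
  echo "$out"
}
```

After running `collect_diagnostics`, review the files for secrets or other sensitive values before attaching them to a support request.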