Advanced Chat: Personas & Workflows

You’ve learned the basics and troubleshooting. Now let’s explore how different team members use Chat for their specific workflows, plus advanced features and power user tips.
Role-based approach: See how SREs, DevOps Engineers, Platform Engineers, and Developers each use Chat differently to maximize productivity.

Different Personas, Different Workflows

🚨 SRE: “Everything is On Fire”

Goal: Triage and resolve production incidents FAST
Morning routine:
"Good morning! Any HIGH severity incidents?"
"Show me production pod health"
"What changed overnight?"
During incident:
"URGENT: What's down in production?"

"Which services are affected?"

"Root cause for checkout-service failure?"

"How do I rollback?"

"Is it fixed?"
Time: 3-5 minutes (vs. 20+ minutes)
Post-mortem:
"Summarize today's incidents"
"What was the root cause?"
"How long was each service down?"
"Export this conversation for the post-mortem doc"
SRE Pro Tip: Use “URGENT” or “production” in your query - the agent understands priority and responds accordingly.
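
The post-mortem question "How long was each service down?" comes down to interval arithmetic. Here is a minimal sketch, assuming you already have start and end times per incident. The record shape, timestamps, and function name are hypothetical and do not describe a RubixKube export format.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def downtime_by_service(incidents):
    """Sum outage durations per service.

    `incidents` is a list of (service, start, end) tuples with ISO 8601
    timestamps. This shape is an assumption made for illustration.
    """
    totals = defaultdict(timedelta)
    for service, start, end in incidents:
        totals[service] += datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return dict(totals)


# Hypothetical incident records for one day.
incidents = [
    ("checkout-service", "2025-01-15T03:02:00", "2025-01-15T03:19:00"),
    ("checkout-service", "2025-01-15T14:40:00", "2025-01-15T14:45:00"),
    ("api-gateway", "2025-01-15T03:05:00", "2025-01-15T03:12:00"),
]

for service, total in downtime_by_service(incidents).items():
    print(f"{service}: {total.total_seconds() / 60:.0f} min")
```

Overlapping incidents for the same service would be counted twice by this sketch, so merge overlapping intervals first if that can happen in your data.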

👨‍💻 DevOps Engineer: “Deploy Safely”

Goal: Validate and deploy without breaking things
Pre-deployment checklist:
1. "Status of api-gateway deployment?"
2. "Any recent issues with api-gateway?"
3. [Upload new-deployment.yaml via 📎]
4. "Validate this deployment"
5. "What's the blast radius if this fails?"
6. "Looks good - deploying now"
Post-deployment verification:
"How's the new api-gateway version?"
"Any errors in the logs?"
"Resource usage vs. previous version?"
During rollout:
"Are new pods coming up?"
"Any errors during rollout?"
"Should I continue or rollback?"
Chat becomes your deployment co-pilot - validates changes, monitors rollouts, suggests rollbacks if needed.
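
To make the rollout questions concrete, here is a minimal sketch of the kind of check behind "Are new pods coming up?". It reads the standard Deployment fields `spec.replicas`, `status.updatedReplicas`, `status.replicas`, and `status.availableReplicas`. The function name and sample data are hypothetical, and this is not a description of how the agent works internally.

```python
def rollout_progress(deployment):
    """Classify a Deployment rollout from its spec and status fields."""
    desired = deployment["spec"].get("replicas", 1)
    status = deployment.get("status", {})
    updated = status.get("updatedReplicas", 0)
    total = status.get("replicas", 0)
    available = status.get("availableReplicas", 0)

    if updated < desired:
        return f"in progress: {updated}/{desired} pods updated"
    if total > updated:
        return f"in progress: {total - updated} old pod(s) still terminating"
    if available < updated:
        return f"in progress: {available}/{updated} updated pods available"
    return "complete"


# Hypothetical snapshot of a Deployment part-way through a rollout.
snapshot = {
    "spec": {"replicas": 4},
    "status": {"replicas": 5, "updatedReplicas": 3, "availableReplicas": 4},
}
print(rollout_progress(snapshot))
```

The sketch ignores `status.observedGeneration`, so it can report on a stale status if the controller has not yet seen the latest spec.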

🏗️ Platform Engineer: “Optimize Everything”

Goal: Resource efficiency and capacity planning
Resource optimization:
"Show me pods with requests set but no limits"
"What pods are over-provisioned?"
"Calculate actual vs. requested resources"
"Which deployments need autoscaling?"
"Resource waste by namespace"
Capacity planning:
"Cluster utilization percentage?"
"How many more pods can we run?"
"Memory headroom per node?"
"Project resource needs for 2x traffic"
Cost optimization:
"What pods consume the most resources?"
"Show me idle resources"
"Which nodes are underutilized?"
Platform Engineer Tip: Ask for comparisons! “Compare dev vs. prod resource usage” helps identify over-provisioning in lower environments.
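
As an illustration of what "Calculate actual vs. requested resources" involves, here is a minimal sketch that compares usage against requests and flags likely over-provisioning. The input shape, the 50% threshold, and the sample numbers are assumptions, and the quantity parsers handle only a small subset of Kubernetes quantity syntax (the `m` CPU suffix and the `Ki`, `Mi`, `Gi` memory suffixes).

```python
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity):
    """Convert a CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as "256Mi" to bytes (binary suffixes only)."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def over_provisioned(pods, threshold=0.5):
    """Return (name, cpu_ratio, mem_ratio) for pods using less than
    `threshold` of their CPU request or of their memory request."""
    flagged = []
    for pod in pods:
        cpu_ratio = parse_cpu(pod["cpu_usage"]) / parse_cpu(pod["cpu_request"])
        mem_ratio = parse_memory(pod["mem_usage"]) / parse_memory(pod["mem_request"])
        if min(cpu_ratio, mem_ratio) < threshold:
            flagged.append((pod["name"], round(cpu_ratio, 2), round(mem_ratio, 2)))
    return flagged


# Hypothetical usage figures; real ones would come from your metrics source.
pods = [
    {"name": "api-gateway", "cpu_usage": "100m", "cpu_request": "1",
     "mem_usage": "200Mi", "mem_request": "1Gi"},
    {"name": "checkout", "cpu_usage": "400m", "cpu_request": "500m",
     "mem_usage": "300Mi", "mem_request": "512Mi"},
]
print(over_provisioned(pods))
```

A pod with a zero request would raise a division error here, so filter those out before calling `over_provisioned`.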

🎓 Junior Developer: “Teach Me”

Goal: Learn Kubernetes while working
Learning queries:
"What is a pod?"
"Explain CrashLoopBackOff in simple terms"
"Why does Kubernetes kill OOM pods?"
"What's the difference between Deployment and Pod?"
"How do resource limits work?"
Exploration:
"What applications are deployed here?"
"How does payment-service connect to the database?"
"What technologies are we using?"
"Show me an example of a healthy pod"
Safe experimentation:
"If I delete this pod, what happens?"
"What would breaking this service impact?"
"Is it safe to restart api-gateway?"
The agent becomes a patient teacher - explains concepts with examples from YOUR cluster, not generic docs.

Sample Workflows by Time of Day

☀️ Morning (9 AM): Health Check

"Good morning! Cluster status?"
→ Get: Health %, pod counts, active incidents

"Any issues overnight?"
→ Get: Events from last 8 hours

"All clear for deploys today?"
→ Get: Risk assessment
Time: 30 seconds
Result: Confident to start the day

🌆 Afternoon (2 PM): Pre-Deployment

"Status of payment-service?"
→ Current state, recent issues

[Upload new-deployment.yaml]
"Validate this"
→ Security, resource, config checks

"What's the risk?"
→ Blast radius analysis

"Deploying now - monitor it"
→ Agent watches for issues
Time: 2 minutes
Result: Safe deployment

🌙 Evening (8 PM): Post-Deploy Check

"How's the new payment-service?"
→ Pod status, errors, metrics

"Resource usage vs. old version?"
→ Before/after comparison

"Any regressions?"
→ Error rate, latency check
Time: 1 minute
Result: Sleep well, knowing it’s stable

🚨 3 AM: Incident Response

"URGENT: What's down?"
→ Failing pods, services affected

"Root cause?"
→ Fast RCA

"How do I fix it?"
→ Step-by-step remediation
Time: 3 minutes
Result: Back to sleep

Advanced Features

📎 File Upload & YAML Validation

Click 📎 to upload Kubernetes YAML files.
Use cases:
  • Pre-deployment validation
  • Security audits
  • Best practice checks
  • Configuration review
Example workflow:
  1. Click 📎
  2. Select deployment.yaml
  3. Agent analyzes automatically
Sample response:
Analyzed your deployment:

✅ Valid YAML syntax
✅ Resource limits defined
⚠️  Warning: No liveness probe
⚠️  Warning: Running as root
❌ Error: Missing imagePullSecrets

Recommendations:
1. Add livenessProbe for health checks
2. Set runAsNonRoot: true
3. Create an image pull secret and reference it under imagePullSecrets

Would you like a fixed version?
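
The checks in the sample response can be approximated locally. Below is a minimal sketch, assuming the manifest has already been loaded into a Python dict. The function name, finding messages, and sample manifest are hypothetical and not RubixKube's actual validation logic. It uses only standard Deployment fields: `resources.limits`, `livenessProbe`, `securityContext.runAsNonRoot`, and `imagePullSecrets`.

```python
def check_deployment(manifest):
    """Return a list of findings for a Deployment manifest given as a dict."""
    findings = []
    pod_spec = manifest["spec"]["template"]["spec"]
    pod_non_root = pod_spec.get("securityContext", {}).get("runAsNonRoot", False)

    for container in pod_spec.get("containers", []):
        name = container["name"]
        if "limits" not in container.get("resources", {}):
            findings.append(f"{name}: no resource limits")
        if "livenessProbe" not in container:
            findings.append(f"{name}: no livenessProbe")
        container_non_root = container.get("securityContext", {}).get("runAsNonRoot")
        # A container-level setting overrides the pod-level one.
        non_root = pod_non_root if container_non_root is None else container_non_root
        if not non_root:
            findings.append(f"{name}: runAsNonRoot is not set to true")

    if not pod_spec.get("imagePullSecrets"):
        findings.append("pod: no imagePullSecrets (only needed for private registries)")
    return findings


# Hypothetical manifest with limits set but no probe, security context, or pull secret.
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "api-gateway"},
    "spec": {"template": {"spec": {"containers": [{
        "name": "api-gateway",
        "image": "registry.example.com/api-gateway:1.3",
        "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
    }]}}},
}
for finding in check_deployment(manifest):
    print(finding)
```

To run this against a file on disk, one option is to load it with PyYAML's `yaml.safe_load` and pass the resulting dict to `check_deployment`.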

🔗 Integration with RubixKube Features

  • Chat + Dashboard
  • Chat + Insights
  • Chat + Memory Engine
Workflow:
  1. See incident spike in Dashboard
  2. Click “Provide to Chat Context”
  3. Chat auto-loads incident data
  4. Ask follow-up questions
Example: Dashboard shows OOMKilled → Chat explains why

Power User Tips

If you see an incident in Insights:
"Analyze incident OOMKilled-20231004"
"Tell me about the crash-loop incident"
Agent pulls full RCA data immediately.

Ask for comparisons:
"Compare memory usage: dev vs. prod"
"Which uses more CPU: api-gateway or checkout?"
"Show me the diff between v1.2 and v1.3"
Great for before/after analysis.

Build on previous responses:
"Show me failing pods" → Get list
"Focus on the HIGH severity one" → Drill down
"Show me its logs" → Get evidence
"Explain this error" → Understand it
"How do I fix it?" → Get solution

Ask for summaries and documentation:
"Summarize this RCA"
"Generate post-mortem for today"
"Export the fix we applied"
"Create runbook for this issue"
Then click Export conversation → Save as Markdown

Query Best Practices Expanded

✅ DO This

Include Namespace

Good: “Show failing pods in prod”
Why: Faster, more accurate

Specify Resource Type

Good: “Why is pod api-gateway failing?”
Why: Clearer than just “api-gateway”

Ask Follow-Ups

Good: “Tell me more” or “What about logs?”
Why: Leverages context

Use Urgency Keywords

Good: “URGENT” or “production down”
Why: Agent prioritizes

❌ DON’T Do This

Don't Repeat Context

Bad: Asking the full question again mid-conversation
Why: The agent remembers

Don't Be Too Vague

Bad: “Fix it” (without context)
Why: The agent needs to know WHAT

Don't Use kubectl Syntax

Bad: “kubectl get pods -A”
Why: Just ask! “Show me all pods”

Don't Assume Omniscience

Bad: “Why is it slow?” (which “it”?)
Why: Be specific: “Why is checkout-service slow?”

Beta Limitations & Workarounds

Current Limitations:
Chat CANNOT (yet):
  • Execute kubectl commands for you
  • Apply configuration changes
  • Restart pods/deployments
  • Create/modify resources
Workarounds:
  • Agent provides kubectl commands → You run them
  • Agent suggests YAML changes → You apply them
  • Agent explains steps → You execute them
Coming Q1 2026: Automated execution with approval workflow
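
Until automated execution arrives, the "agent suggests, you run" workaround can be made a little safer with a confirmation step. The following is a minimal sketch of a local helper, not a RubixKube feature. The function name and prompt text are hypothetical, and the `approve` parameter exists only so the prompt can be replaced in scripts or tests.

```python
import shlex
import subprocess


def run_with_approval(command, approve=input):
    """Show a suggested command and run it only after an explicit "yes"."""
    print(f"Suggested command:\n  {command}")
    answer = approve("Run this command? Type 'yes' to continue: ")
    if answer.strip().lower() != "yes":
        print("Skipped.")
        return None
    # shlex.split avoids shell=True, so the text is never interpreted by a shell.
    return subprocess.run(
        shlex.split(command), capture_output=True, text=True, check=False
    )
```

For example, after the agent suggests a restart you might call `run_with_approval("kubectl rollout restart deployment/api-gateway -n prod")`, read the command as printed, and type `yes` only once it matches what you expect.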

Export & Share

Click Export conversation to:

Markdown Export

Save as .md for documentation
Use for: Post-mortems, runbooks

JSON Export

Save as .json for analysis
Use for: Audit trails, automation

Share with Team

Copy link (coming soon)
Use for: Collaboration

Email Thread

Send conversation (coming soon)
Use for: Stakeholder updates

Common Questions

How accurate are the responses?
Very high.
  • Responses based on REAL cluster data (not hallucinated)
  • Function calls to actual Kubernetes API
  • Evidence-based RCA
  • Validated against best practices
But: In Beta, always verify critical changes before applying

Can Chat execute commands for me?
Not in Beta. Chat provides the kubectl command and you run it.
Q1 2026: Automated execution with approval gates

What if the agent can't answer my question?
The agent will:
  1. Try multiple approaches (visible in Function Calls)
  2. Ask clarifying questions
  3. Explain what it checked
  4. Suggest alternative queries
Example: If the pods are not in the default namespace, it asks which namespace to check

Is my data secure?
Yes.
  • Encrypted in transit and at rest
  • Workspace-isolated
  • Exportable/deletable anytime
  • SOC 2 compliant

How long are conversations kept?
Indefinitely (in Beta, there are no retention limits). You can delete conversations manually.

Can I customize the agent?
Not yet. Coming: Custom prompts, preferred response styles, domain-specific training

Building Your Chat Habits

Week 1: Daily Health Checks
Start each day with: "Cluster health?"
Goal: Get comfortable with Chat

Week 2: Troubleshooting
Use Chat for EVERY pod issue
Goal: Build troubleshooting muscle memory

Week 3: Learning
Ask 1 “why” question per day
Goal: Deepen Kubernetes knowledge

Week 4: Advanced
Try file uploads, historical queries, comparisons
Goal: Become a power user

Real-World Success Patterns

Pattern 1: Morning Standup

Every day at 9 AM:
"Cluster health?"
"Any new incidents?"
"Team can deploy today?"
Result: 30-second standup prep

Pattern 2: Pre-Deploy Validation

Before EVERY deploy:
[Upload deployment.yaml]
"Validate this"
"What's the risk?"
Result: 80% fewer bad deploys

Pattern 3: Incident Response Template

When paged:
"What's down?"
"Impact?"
"Root cause?"
"Fix?"
Result: Structured triage in 3 minutes

Pattern 4: Learning Hour

Friday afternoons:
"Explain [concept] with examples from my cluster"
Result: Learn by doing with real infrastructure

Keyboard Power User Mode

Master these shortcuts:
Shortcut | Use Case | Time Saved
⌘K | Jump to Chat from anywhere | 2-3 seconds
Enter | Send query | Instant
Shift+Enter | Multi-line query | For complex questions
 | Edit/retry last query | Fix typos quickly
Esc | Close Chat | Clean workspace
Pro workflow:
  1. Press ⌘K (wherever you are)
  2. Type “chat”
  3. Type query
  4. Press Enter
  5. Get answer
Total: 5 seconds from thought to answer

What Makes RubixKube Chat Unique

Cluster-Aware

Uses YOUR data, not generic knowledge

Memory-Powered

Recalls past incidents automatically

RCA Integration

Explains detected incidents

Multi-Agent

Coordinates Observer, RCA, Memory agents

Context Retention

True conversation thread

Transparent

See the agent think
It’s not just a chatbot - it’s your intelligent infrastructure co-pilot. 🚀

Comparing to Other Tools

Feature | Generic ChatGPT | RubixKube Chat
Knows your cluster | ❌ No | ✅ Yes - live data
Executes queries | ❌ No | ✅ Yes - real Kubernetes API
Historical context | ❌ No | ✅ Yes - Memory Engine
RCA integration | ❌ No | ✅ Yes - incident correlation
Evidence-based | ❌ Can hallucinate | ✅ Shows actual logs/events
Transparent reasoning | ❌ Black box | ✅ Shows function calls
RubixKube Chat = ChatGPT + Live Cluster Data + RCA + Memory Engine

Pro Tips for Mastery

Save your favorite queries:
Daily health:    "Cluster status + incidents?"
Pre-deploy:      "Validate [service] for deploy"
Post-deploy:     "How's [service] after deploy?"
Triage:          "Show HIGH severity issues"
Paste and run daily.
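
If you keep these saved queries in a script or snippet tool, the bracketed placeholders can be filled in automatically. A minimal sketch, assuming the `[service]` placeholders are rewritten as Python format fields. The dictionary keys and function name are hypothetical.

```python
# Saved queries from the list above, with [service] written as {service}.
SAVED_QUERIES = {
    "daily_health": "Cluster status + incidents?",
    "pre_deploy": "Validate {service} for deploy",
    "post_deploy": "How's {service} after deploy?",
    "triage": "Show HIGH severity issues",
}


def build_query(name, **params):
    """Fill a saved query template; raises KeyError if a placeholder is missing."""
    return SAVED_QUERIES[name].format(**params)


print(build_query("pre_deploy", service="payment-service"))
```

The resulting text is what you paste into Chat.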
Build investigation flow:
1. "What's failing?" → Overview
2. "Focus on HIGH" → Prioritize
3. "Root cause?" → Understand
4. "Show evidence" → Verify
5. "Fix steps?" → Remediate
New team member onboarding:
1. "What applications run here?"
2. "Show me service connections"
3. "Explain our monitoring"
4. "What does payment-service do?"
Result: Self-serve onboarding
Best workflow:
  • Dashboard: Visual overview
  • Chat: Deep dive investigation
See spike → Click “Discuss in Chat” → Get answers
For every major incident:
  1. Investigate via Chat
  2. Click Export conversation
  3. Save as Markdown
  4. Add to post-mortem
Result: Documentation writes itself

What You Learned

4 Personas

How SREs, DevOps Engineers, Platform Engineers, and Junior Developers use Chat differently

Time-Based Workflows

Morning, afternoon, evening, 3 AM response patterns

File Upload

YAML validation and analysis

Integrations

Chat + Dashboard + Insights + Memory Engine

Power User Shortcuts

Keyboard shortcuts for efficiency

Real Patterns

Actual workflows from production users

Summary

The Chat interface transforms how you work with infrastructure:
  • Natural language replaces kubectl commands for everyday queries
  • Context awareness maintains the conversation thread
  • Multi-persona support for different workflows
  • Time savings of 84% on average
  • Learning mode teaches Kubernetes concepts
  • RCA integration explains detected incidents
  • Transparent reasoning shows how the agent thinks
You’re now equipped to use Chat like a pro across all scenarios! 🎯

Quick Reference Card

Print or bookmark this:
Scenario | Query | Expected Response
Daily health | “Cluster health?” | Health %, incidents, pod counts
Find failures | “What’s failing?” | List of unhealthy resources
Investigate | “Why did [pod] fail?” | RCA with root cause
Get logs | “Show logs for [pod]” | Filtered log output
Get fix | “How do I fix [pod]?” | kubectl commands
Verify | “Is [pod] healthy?” | Current status
Learn | “Explain [concept]” | Educational response
Validate | Upload YAML + “Validate” | Security & config checks

You’ve mastered Chat! Start experimenting and discover what works best for your workflow. 🚀