
Detecting Container Image Issues

One of the most common Kubernetes failures is ImagePullBackOff - when a pod can’t pull its container image. Let’s see how RubixKube detects and helps you resolve this issue.
What you’ll learn:
  • How RubixKube detects image pull failures
  • Reading the incident details
  • Understanding AI-generated suggestions
  • Fixing the issue manually
  • Verifying the resolution

The Scenario

We’ll intentionally create a pod with an invalid container image to trigger an ImagePullBackOff error, then watch RubixKube detect it.

Create the Failing Pod

Save the following manifest as broken-image-pod.yaml. It deploys a pod with a non-existent image:
apiVersion: v1
kind: Pod
metadata:
  name: broken-image-demo
  namespace: rubixkube-tutorials
spec:
  containers:
  - name: app
    image: nonexistent-registry.io/invalid-image:v1.0
    imagePullPolicy: Always
Create the namespace, then apply the manifest:
kubectl create namespace rubixkube-tutorials
kubectl apply -f broken-image-pod.yaml

Watch Kubernetes Fail

Check the pod status:
kubectl get pods -n rubixkube-tutorials
Expected output:
NAME                READY   STATUS             RESTARTS   AGE
broken-image-demo   0/1     ImagePullBackOff   0          2m
What’s happening:
  1. Kubernetes tries to pull nonexistent-registry.io/invalid-image:v1.0
  2. Registry doesn’t exist → Pull fails
  3. Kubernetes retries with exponential backoff
  4. Status cycles: Pending → ErrImagePull → ImagePullBackOff
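If you want to watch this cycle as it happens, standard kubectl can follow the pod's status in real time:
# Follow the STATUS column live (press Ctrl+C to stop)
kubectl get pod broken-image-demo -n rubixkube-tutorials -w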
For the full error, inspect the pod's events:
kubectl describe pod broken-image-demo -n rubixkube-tutorials
Key events:
Warning  Failed     2m (x4 over 3m)  kubelet
         Failed to pull image "nonexistent-registry.io/invalid-image:v1.0": 
         failed to resolve reference: failed to do request: 
         Head "https://nonexistent-registry.io/...": 
         dial tcp: lookup nonexistent-registry.io: no such host

Warning  Failed     2m (x4 over 3m)  kubelet
         Error: ErrImagePull

Warning  Failed     1m (x8 over 3m)  kubelet
         Error: ImagePullBackOff
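If you only need the failure reason rather than the full event log, a jsonpath query (standard kubectl, shown here as a quick sketch) pulls it directly:
# Prints ErrImagePull or ImagePullBackOff, depending on where the retry cycle is
kubectl get pod broken-image-demo -n rubixkube-tutorials \
  -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}'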

RubixKube Detection (1-2 Minutes)

Now open the RubixKube Dashboard. Within 1-2 minutes, you’ll see:
Dashboard showing detected ImagePullBackOff

What RubixKube Detected

Activity Feed shows:
  • New incident appeared
  • Type: Image pull failure
  • Severity: Medium (not critical, but needs fixing)
  • Status: Active
Active Insights increases from 0 to 1 (or higher if there are multiple issues)
Detection is automatic! You didn’t configure any alerts or rules. RubixKube’s Observer Agent continuously watches your cluster and reports anomalies.

Viewing the Incident Details

Click on Insights in the navigation to see detailed analysis:
Insights page showing incident analysis

Incident Information

You’ll see:
Field               Value                   Meaning
Type                ImagePullBackOff        Kubernetes can't pull the image
Severity            Medium                  Not critical but needs attention
Affected Resource   Pod/broken-image-demo   Which pod is impacted
Namespace           rubixkube-tutorials     Where the pod lives
Detected            2 minutes ago           When RubixKube first saw it
Confidence          90%+                    How certain RubixKube is about the diagnosis

AI-Generated Suggestions

RubixKube provides actionable recommendations:
1. Check the image name and tag
Why: Typos in image names are the #1 cause of ImagePullBackOff.
Check:
  • Is the registry URL correct?
  • Does the image exist in that registry?
  • Is the tag valid? (:latest, :v1.0, etc.)
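A quick way to confirm an image and tag actually exist, assuming you have Docker (or a similar tool such as skopeo) on your workstation:
# Succeeds only if the registry, image, and tag all resolve
docker manifest inspect nginx:latest

# The broken image fails here the same way it failed on the kubelet
docker manifest inspect nonexistent-registry.io/invalid-image:v1.0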
2. Verify registry access and authentication
Why: Private registries require authentication.
Verify:
  • Can your cluster reach the registry?
  • Are imagePullSecrets configured?
  • Does the service account have access?
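If the registry is private, a pull secret is the usual fix. A minimal sketch with placeholder values (registry.example.com and the credentials are assumptions; substitute your own):
# Create a registry credential secret
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=<your-username> \
  --docker-password=<your-password> \
  -n rubixkube-tutorials

# Either reference it from the pod spec via imagePullSecrets, or attach it to the
# namespace's default service account so all pods in the namespace can use it
kubectl patch serviceaccount default -n rubixkube-tutorials \
  -p '{"imagePullSecrets": [{"name": "regcred"}]}'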
3. Review the imagePullPolicy
Why: Policy affects when Kubernetes pulls images.
Options:
  • Always - Pull on every pod start (good for :latest)
  • IfNotPresent - Use cached image if available
  • Never - Only use local images
4. Test registry connectivity
Why: Network policies might prevent outbound connections.
Test:
kubectl run test-curl --rm -i --image=curlimages/curl -- \
  curl -I https://registry-1.docker.io
If this fails, network policies may be blocking registry access.
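It can also help to check whether any NetworkPolicies exist in the cluster at all; if none do, they are not the cause:
kubectl get networkpolicies --all-namespaces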

Fixing the Issue

For our scenario, the image simply doesn’t exist. Let’s fix it:

Option 1: Use a Valid Image

kubectl delete pod broken-image-demo -n rubixkube-tutorials
Then save the following as fixed-pod.yaml to deploy a pod with a real image:
apiVersion: v1
kind: Pod
metadata:
  name: fixed-image-demo
  namespace: rubixkube-tutorials
spec:
  containers:
  - name: app
    image: nginx:latest  # Valid image
    imagePullPolicy: IfNotPresent
kubectl apply -f fixed-pod.yaml

Option 2: Fix In-Place (if image was a typo)

kubectl edit pod broken-image-demo -n rubixkube-tutorials
Change the image: field to a valid image and save; Kubernetes will pull the new image and restart the container.
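If you prefer a non-interactive change, kubectl set image does the same thing (app is the container name from our manifest):
# Point the existing pod at a valid image without opening an editor
kubectl set image pod/broken-image-demo app=nginx:latest -n rubixkube-tutorials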

Verify the Fix

Check the pod status:
kubectl get pods -n rubixkube-tutorials
Success looks like:
NAME               READY   STATUS    RESTARTS   AGE
fixed-image-demo   1/1     Running   0          30s
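You can also confirm the successful pull from the pod's events:
# Look for Created and Started events and no image pull warnings
kubectl get events -n rubixkube-tutorials \
  --field-selector involvedObject.name=fixed-image-demo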
In the RubixKube Dashboard:
  • Active Insights decreases (issue resolved)
  • Activity Feed shows resolution event
  • System Health improves
RubixKube tracks resolution! Once you fix the issue, RubixKube detects the recovery and updates the incident status to “Resolved”.

What RubixKube Learned

After you fix this issue, the Memory Engine stores:

The Problem

ImagePullBackOff on pod broken-image-demo due to non-existent registry

The Fix

Replaced with valid image nginx:latest → Pod started successfully

Time to Resolve

Detection: 2 minutes, Manual fix: 30 seconds, Total: 2.5 minutes

Pattern Stored

If a similar error occurs again, RubixKube will recognize the pattern faster.
This knowledge helps with future incidents!

Common ImagePullBackOff Causes

Based on RubixKube’s Memory Engine and common patterns:
Cause                     Frequency   Fix Time   Prevention
Typo in image name        40%         1 min      Automation, validation
Missing image tag         25%         2 min      Always specify tags
Private registry auth     20%         5 min      Configure imagePullSecrets
Network policy blocking   10%         10 min     Allow egress to registries
Registry rate limiting    5%          15 min     Use image caching
Pro Tip: Always use specific image tags (:v1.2.3) instead of :latest. This prevents unexpected changes and makes rollbacks easier.
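If you want to go one step further than pinned tags, you can look up the exact digest a running pod pulled and pin image@sha256:... in your manifests (a sketch using the demo pod from earlier):
# Prints the fully resolved image reference, including its sha256 digest
kubectl get pod fixed-image-demo -n rubixkube-tutorials \
  -o jsonpath='{.status.containerStatuses[0].imageID}'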

Real-World Impact

Before RubixKube:
  • Engineer notices pod not starting (manual check)
  • Runs kubectl describe pod to find error
  • Googles “ImagePullBackOff kubernetes”
  • Tries various fixes
  • Time: 10-15 minutes (if experienced)
With RubixKube:
  • Automatic detection within 2 minutes
  • Incident appears in dashboard automatically
  • AI suggests likely causes
  • Relevant documentation linked
  • Time: 2-3 minutes (with clear guidance)
Time saved: 70-80% even for simple issues!

Next Tutorial

Now that you’ve seen image pull detection, let’s look at a more complex failure:

Next: Detecting OOMKilled Pods

Learn how RubixKube analyzes memory issues and suggests right-sizing

Summary

You learned:
  • How to intentionally create an ImagePullBackOff error
  • How RubixKube automatically detects the failure
  • How to read incident details in the Insights page
  • What AI-generated suggestions look like
  • How to fix the issue manually
  • How RubixKube verifies resolution
  • How the Memory Engine learns from incidents
Next: Try detecting memory issues with OOMKilled pods!