Pre-Installation Issues

Issue

Agent installation fails or Kubernetes resources (Deployments, Services, ConfigMaps) are not created.

Cause

  • The installation user does not have sufficient Kubernetes RBAC permissions.
  • Access restrictions on the cluster or the target namespace prevent the required objects from being created.

Solution

  1. Verify that you can create each required resource type. The command prints yes or no:
    kubectl auth can-i create deployment --namespace=opsramp-agent
    Repeat for the other resource types, such as services and configmaps.
  2. If permissions are missing, request the necessary role bindings from your Kubernetes administrator.
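
The check in step 1 can be repeated for every resource type with a small loop. The following is a minimal sketch, not part of the OpsRamp tooling: the function name check_agent_permissions is hypothetical, and the resource list is assumed from the object types named in this guide. It relies on kubectl auth can-i exiting with a non-zero status when the action is not allowed.

```shell
#!/bin/sh
# Hypothetical helper: report which resource types the current user may create.
# The resource list is an assumption based on the objects named in this guide.
check_agent_permissions() {
  namespace="${1:-opsramp-agent}"
  missing=0
  for resource in deployments daemonsets statefulsets services configmaps; do
    # "kubectl auth can-i" exits 0 when allowed and non-zero when denied.
    if kubectl auth can-i create "$resource" --namespace="$namespace" >/dev/null 2>&1; then
      echo "ok      create $resource"
    else
      echo "MISSING create $resource"
      missing=1
    fi
  done
  # Non-zero return if at least one permission is missing.
  return "$missing"
}
```

Calling check_agent_permissions with no argument checks the opsramp-agent namespace. Any line marked MISSING identifies a permission to request from your Kubernetes administrator.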

Installation Issues

Issue

The Helm chart for the OpsRamp Agent fails to install.

Cause

  • Incorrect Helm install command or namespace.
  • Networking restrictions preventing Helm from pulling the chart.

Solution

  1. Use the correct installation command:
    helm install opsramp oci://us-docker.pkg.dev/opsramp-registry/agent-helm-charts/agent \
    -n opsramp-agent --create-namespace --values opsramp-agent-values.yaml
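  2. Ensure that the machine running Helm has outbound internet access and can reach the GCP Artifact Registry (us-docker.pkg.dev) to pull the chart, and that the cluster nodes can reach it to pull images. No special authentication is required.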
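
Before retrying the installation, it can help to confirm that the local Helm client handles OCI chart references and that the chart is reachable. The sketch below is illustrative only: the function names helm_supports_oci and chart_reachable are hypothetical, and it assumes that OCI support is available without an experimental flag from Helm 3.8 onward. The chart reference is the one used in step 1.

```shell
#!/bin/sh
# Chart reference taken from the install command in this guide.
CHART_REF="oci://us-docker.pkg.dev/opsramp-registry/agent-helm-charts/agent"

# Hypothetical helper: succeeds if the Helm client is version 3.8 or newer.
# Returns 2 if the version string cannot be parsed.
helm_supports_oci() {
  # "helm version --short" prints a string such as v3.14.0+g3fc9f4b
  version=$(helm version --short 2>/dev/null | sed 's/^v//; s/[+-].*$//')
  major=${version%%.*}
  rest=${version#*.}
  minor=${rest%%.*}
  case "$major" in ''|*[!0-9]*) return 2 ;; esac
  case "$minor" in ''|*[!0-9]*) return 2 ;; esac
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 8 ]; }
}

# Hypothetical helper: succeeds if the chart metadata can be fetched,
# which exercises the same network path as "helm install".
chart_reachable() {
  helm show chart "$CHART_REF" >/dev/null
}
```

If helm_supports_oci fails, upgrade the Helm client. If chart_reachable fails, the cause is more likely a proxy or firewall between the machine running Helm and the registry.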

Post-Installation Issues

Issue

Pods are not running and are stuck in a state such as ImagePullBackOff, CrashLoopBackOff, or OOMKilled.

Cause

  • Pods failed to fetch images due to network/firewall restrictions.
  • Containers restarted repeatedly because their CPU or memory limits are too low. OOMKilled specifically means that a container exceeded its memory limit.
  • Configuration errors in Deployments, DaemonSets, or StatefulSets.

Solution

  1. Verify installation:
    helm list -n opsramp-agent
    kubectl get pods -n opsramp-agent
  2. If pods are failing, describe the pod for details:
    kubectl describe pod <pod-name> -n opsramp-agent
    • For CrashLoopBackOff / OOMKilled: Increase the CPU and memory limits of the affected workload (Deployment, DaemonSet, or StatefulSet).
    • For ImagePullBackOff: Ensure that the cluster nodes have internet access and that network policies and firewalls allow pulling images from GCP Artifact Registry.
  3. Check the health of core components:
    kubectl get deployments -n opsramp-agent
    kubectl get daemonsets -n opsramp-agent
    kubectl get statefulsets -n opsramp-agent
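
To narrow step 2 to only the pods that need attention, the pod listing can be filtered on its READY and STATUS columns. This is a minimal sketch with a hypothetical function name, list_unhealthy_pods. It assumes the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/bin/sh
# Hypothetical helper: print NAME, READY and STATUS for pods that are not healthy.
# Assumes the default "kubectl get pods" columns: NAME READY STATUS RESTARTS AGE.
list_unhealthy_pods() {
  namespace="${1:-opsramp-agent}"
  kubectl get pods -n "$namespace" --no-headers |
    awk '{
      split($2, ready, "/")            # READY is printed as "ready/total"
      if ($3 == "Completed") next      # finished pods are not a problem
      if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
    }'
}
```

For each pod it lists, run kubectl describe pod as shown in step 2. For a pod in CrashLoopBackOff, kubectl logs <pod-name> -n opsramp-agent --previous shows the output of the container that last crashed.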

Expected healthy state:

  • Deployments (master, prometheus): 1/1 Ready
  • DaemonSet (worker): READY equals DESIRED (one Ready pod on each scheduled node)
  • StatefulSet (redis-node): all replicas Ready
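
The comparison against the expected healthy state can be scripted. The sketch below uses a hypothetical function name, report_not_ready, and assumes the default kubectl column layouts: Deployments and StatefulSets print READY as ready/desired in column 2, while DaemonSets print DESIRED in column 2 and READY in column 4.

```shell
#!/bin/sh
# Hypothetical helper: print every workload whose ready count differs from
# its desired count. No output means all workloads look healthy.
report_not_ready() {
  namespace="${1:-opsramp-agent}"

  # Deployments and StatefulSets: column 2 is "ready/desired".
  for kind in deployments statefulsets; do
    kubectl get "$kind" -n "$namespace" --no-headers |
      awk -v kind="$kind" 'NF {
        split($2, r, "/")
        if (r[1] != r[2]) print kind, $1, $2
      }'
  done

  # DaemonSets: column 2 is DESIRED, column 4 is READY.
  kubectl get daemonsets -n "$namespace" --no-headers |
    awk 'NF && $2 != $4 { print "daemonsets", $1, $4 "/" $2 }'
}
```

Each output line names the resource kind, the workload, and its ready/desired count, which indicates where to continue with kubectl describe.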