Applications

This guide covers configuring the pgt-application Helm chart for deploying applications to Kubernetes. The chart supports two deployment strategies:

  1. Rolling Deployments - Standard Kubernetes deployments (default)

  2. Canary/Blue-Green Deployments - Progressive delivery using Argo Rollouts (Premium)

Prerequisites

Add the chart as a dependency in your Chart.yaml:

apiVersion: v2
name: demo-app
version: 0.0.1
dependencies:
  - name: pgt-application
    version: 0.0.4
    repository: oci://public.ecr.aws/w9m9e0e9/pgt-helm-charts

After adding the dependency, run:

helm dependency update

Common Configuration

These configuration options apply to both deployment strategies. All values are nested under pgt-application: in your values.yaml.

Required Fields

⚠️ Memory Requests and Limits

Always set memory requests equal to memory limits. This ensures your pod receives a Guaranteed Quality of Service (QoS) class, which provides:

  • Predictable scheduling: The scheduler knows exactly how much memory your pod needs

  • OOM kill priority: Guaranteed pods are the last to be killed when nodes run low on memory

  • No throttling surprises: Your pod won't be killed for exceeding its request while under its limit

When requests differ from limits, your pod gets a Burstable QoS class. Burstable pods can be OOM-killed even when using memory between their request and limit, making failures unpredictable and harder to debug.
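For example, a sketch assuming the chart exposes the standard Kubernetes resources block under container.resources (the key name is an assumption, not a confirmed chart value):

pgt-application:
  container:
    resources:
      requests:
        memory: 512Mi   # equal request and limit gives Guaranteed QoS for memory
        cpu: 250m
      limits:
        memory: 512Mi

CPU limits are intentionally omitted here; unlike memory, CPU is compressible, so a CPU limit is optional and omitting it avoids throttling.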

Development Environment & Cost Optimization

When deploying to non-production environments, consider these settings to reduce costs:

💡 Spot Instances

Setting preferSpotInstances: true tells the scheduler to prefer spot instances when available, which can significantly reduce compute costs. Spot capacity can be reclaimed at short notice, so if you're interested in enabling spot instances for your environment, please reach out to the platform team to discuss your requirements and confirm your applications tolerate interruption.
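Using the value documented in the values reference:

pgt-application:
  affinity:
    nodeAffinity:
      preferSpotInstances: true   # prefer (not require) scheduling on spot nodes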

💰 Office Hours Scaling

Setting scaleDownOutsideOfficeHours: true automatically scales your application to zero replicas outside of business hours (evenings and weekends). This is ideal for:

  • Development and staging environments

  • Internal tools not needed 24/7

  • Non-critical services in non-production clusters

⚠️ Warning: Do not enable this for production workloads or services that require high availability.
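Using the value documented in the values reference:

pgt-application:
  autoscaling:
    scaleDownOutsideOfficeHours: true   # scale to zero replicas evenings and weekends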

🕐 Custom Scaling Schedules

If scaleDownOutsideOfficeHours doesn't fit your needs, you can define custom scaling schedules using KEDA's cron trigger:
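A sketch of what such a schedule could look like. The trigger metadata fields (timezone, start, end, desiredReplicas) are KEDA's cron scaler fields; the autoscaling.triggers key used to pass them through is an assumption about the chart, not a confirmed value:

pgt-application:
  autoscaling:
    triggers:
      - type: cron
        metadata:
          timezone: Europe/London
          start: 0 7 * * 1-5        # scale up at 07:00, Monday to Friday
          end: 0 19 * * 1-5         # scale down at 19:00
          desiredReplicas: "2"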

This gives you full control over when your application scales up or down based on your specific requirements.

Ports and Service

Define ports to expose your application. If no ports are defined, no Service is created:
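A minimal sketch; the ports key and entry shape are assumptions based on common chart conventions:

pgt-application:
  ports:
    - name: http
      containerPort: 8080   # main application traffic
    - name: health
      containerPort: 8081   # dedicated health check port (see Health Probes below)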

Environment Variables

Configure environment variables directly or from ConfigMaps/Secrets:
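A sketch assuming the chart accepts a standard Kubernetes env list; the env key, Secret name, and key are illustrative assumptions:

pgt-application:
  env:
    - name: LOG_LEVEL
      value: info
    - name: DATABASE_PASSWORD
      valueFrom:
        secretKeyRef:
          name: demo-app-secrets      # hypothetical Secret
          key: database-password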

Health Probes

Configure liveness and readiness probes. We recommend running health checks on a dedicated port separate from your application traffic port:

💡 Dedicated Health Check Port

Running health checks on a separate port (e.g., 8081) from your main traffic port (e.g., 8080) provides several benefits:

  • Isolation from traffic load: Health checks won't compete with application requests for connections

  • Independent scaling: Health endpoints can remain responsive even when the main port is under heavy load

  • Security: Health endpoints can be excluded from external ingress while remaining accessible to the kubelet

  • Simpler debugging: Separating concerns makes it easier to diagnose issues with either traffic or health
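A sketch of probes on a dedicated port. The probe fields are standard Kubernetes; the container.livenessProbe and container.readinessProbe key locations are assumptions about the chart:

pgt-application:
  container:
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8081          # dedicated health port, separate from traffic on 8080
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8081
      periodSeconds: 5
      failureThreshold: 3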

Autoscaling with KEDA

The chart uses KEDA ScaledObject for autoscaling. When enabled, the replicas field is ignored:
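Using the values documented in the values reference:

pgt-application:
  autoscaling:
    enable: true        # default; creates a KEDA ScaledObject
    minReplicas: 1
    maxReplicas: 3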

Gateway API Ingress

Configure traffic routing using Gateway API (HTTPRoute, GRPCRoute, etc.). The cloud setting determines which cloud-specific resources are created alongside the standard Gateway API routes.

Cloud Provider Options

| Cloud | Description |
| --- | --- |
| aws | Creates TargetGroupConfiguration resources for ALB health checks and protocol settings |
| azure | Creates standard Gateway API routes only; health checks are configured via container probes |
| "" | No cloud-specific resources; uses standard Gateway API routes only |

AWS Configuration

When cloud: aws, the chart creates TargetGroupConfiguration resources that configure AWS Application Load Balancer target groups with custom health checks, protocol settings, and HTTP matchers:
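A sketch of what this could look like. Only gatewayIngress.cloud is confirmed in the values reference; the healthCheck keys below are assumptions about how the chart maps values onto ALB target-group settings:

pgt-application:
  gatewayIngress:
    cloud: aws              # also emits TargetGroupConfiguration resources
    healthCheck:            # assumed keys, mapped onto the ALB target group
      path: /healthz
      port: 8081
      protocol: HTTP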

Azure Configuration

When cloud: azure, the chart creates standard Gateway API routes. Azure's Gateway implementation uses the container's readiness probe for health checks, so no additional cloud-specific configuration is required:
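Only the cloud setting is needed on Azure; backend health comes from the container's readiness probe:

pgt-application:
  gatewayIngress:
    cloud: azure            # standard Gateway API routes only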

💡 Azure Health Checks

On Azure, ensure your readinessProbe is properly configured as the Gateway uses this to determine backend health. The probe path, port, and timing settings directly affect load balancer behaviour.

External Secrets

The pgt-application chart includes pgt-secrets as a subchart for fetching secrets from AWS Secrets Manager or Azure Key Vault. For full configuration options, see the PGT Secrets documentation.

Service Account

Configure a ServiceAccount to allow your application to access cloud resources securely:
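A sketch for AWS using IRSA. The eks.amazonaws.com/role-arn annotation is the standard AWS mechanism; the serviceAccount key names and the role ARN are illustrative assumptions:

pgt-application:
  serviceAccount:
    create: true
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/demo-app   # hypothetical role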

For detailed instructions on creating IAM roles and managed identities, see:

Prometheus ServiceMonitor

Enable metrics scraping:
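Using the value documented in the values reference:

pgt-application:
  serviceMonitor:
    enabled: true   # creates a Prometheus Operator ServiceMonitor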


Rolling Deployments (Default)

Rolling deployments are the default strategy. The chart creates a standard Kubernetes Deployment that gradually replaces pods during updates.

Basic Rolling Deployment
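A minimal example using only fields from the values reference; the image coordinates are placeholders:

pgt-application:
  name: demo-app
  organisationName: my-org
  container:
    image:
      registry: public.ecr.aws
      repository: my-org/demo-app
      tag: 1.0.0
  autoscaling:
    enable: false
  deployment:
    replicas: 2       # honoured only because autoscaling is disabled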

Rolling Update Strategy Options

| Parameter | Default | Description |
| --- | --- | --- |
| deployment.strategy.type | RollingUpdate | RollingUpdate or Recreate |
| deployment.strategy.rollingUpdate.maxSurge | 25% | Max pods above desired count during update |
| deployment.strategy.rollingUpdate.maxUnavailable | 25% | Max unavailable pods during update |

Zero-Downtime Rolling Update

For zero-downtime deployments, configure:
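Using the strategy fields from the table above:

pgt-application:
  deployment:
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 1
        maxUnavailable: 0   # never drop below the desired replica count

Pair this with a properly configured readinessProbe so new pods only receive traffic once they are ready.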

Complete Rolling Deployment Example
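A fuller example combining the common configuration, using fields from the values reference; image coordinates are placeholders:

pgt-application:
  name: demo-app
  organisationName: my-org
  container:
    image:
      registry: public.ecr.aws
      repository: my-org/demo-app
      tag: 1.0.0
  deployment:
    strategy:
      type: RollingUpdate
      rollingUpdate:
        maxSurge: 25%
        maxUnavailable: 0
  autoscaling:
    enable: true
    minReplicas: 2
    maxReplicas: 4
  serviceMonitor:
    enabled: true
  gatewayIngress:
    cloud: aws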


Canary/Blue-Green Deployments (Premium)

ℹ️ Premium Feature

Canary and Blue-Green deployments require Argo Rollouts to be installed in your cluster. This is a premium setup that enables progressive delivery strategies.

Enable Argo Rollouts by setting rollout.enabled: true. This creates a Rollout resource instead of a Deployment.

Canary Deployments

Canary deployments gradually shift traffic to the new version, allowing you to validate changes with a subset of users before full rollout.

Basic Canary Configuration
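Using the rollout values from the values reference:

pgt-application:
  rollout:
    enabled: true       # creates a Rollout instead of a Deployment
    strategy:
      type: canary      # the default rollout strategy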

Custom Canary Steps

Define custom rollout steps to control the progression:
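A sketch of staged progression. The setWeight and pause step syntax is Argo Rollouts'; the rollout.strategy.steps key location is an assumption about the chart:

pgt-application:
  rollout:
    enabled: true
    strategy:
      type: canary
      steps:
        - setWeight: 10
        - pause: { duration: 5m }   # hold at 10% traffic for five minutes
        - setWeight: 50
        - pause: {}                 # pause indefinitely until manually promoted
        - setWeight: 100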

Canary with Gateway API Traffic Routing

When using Gateway API routes, traffic routing is automatically configured:

Canary with Analysis (Automated Rollback)

Enable automated promotion/rollback based on metrics analysis:
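A sketch assuming the chart passes analysis settings through to Argo Rollouts; the templates/templateName/startingStep fields are Argo Rollouts', the AnalysisTemplate name is hypothetical, and the rollout.strategy.analysis key location is an assumption:

pgt-application:
  rollout:
    enabled: true
    strategy:
      type: canary
      analysis:
        templates:
          - templateName: success-rate   # hypothetical AnalysisTemplate, created separately
        startingStep: 2                  # begin analysis from the second canary step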

📝 AnalysisTemplate Required

You must create AnalysisTemplate resources separately. These define the metrics queries and success criteria for automated promotion decisions.

Blue-Green Deployments

Blue-Green deployments run two identical environments (blue = current, green = new) and switch traffic instantly after validation.

Basic Blue-Green Configuration
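Using the rollout values from the values reference:

pgt-application:
  rollout:
    enabled: true
    strategy:
      type: blueGreen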

Blue-Green with Auto-Promotion

Automatically promote after a specified delay:
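A sketch; autoPromotionEnabled and autoPromotionSeconds are Argo Rollouts blueGreen fields, while their nesting under rollout.strategy is an assumption about the chart:

pgt-application:
  rollout:
    enabled: true
    strategy:
      type: blueGreen
      autoPromotionEnabled: true
      autoPromotionSeconds: 300   # promote five minutes after the green side is ready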

Blue-Green with Preview Replicas

Run a different number of replicas for the preview environment:
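A sketch; previewReplicaCount is an Argo Rollouts blueGreen field, with its nesting under rollout.strategy assumed:

pgt-application:
  rollout:
    enabled: true
    strategy:
      type: blueGreen
      previewReplicaCount: 1   # run a single preview pod regardless of stable count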

Blue-Green with Pre-Promotion Analysis

Run analysis before promoting the new version:
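A sketch; prePromotionAnalysis and its templates list are Argo Rollouts blueGreen fields, the AnalysisTemplate name is hypothetical, and the nesting under rollout.strategy is assumed:

pgt-application:
  rollout:
    enabled: true
    strategy:
      type: blueGreen
      autoPromotionEnabled: false
      prePromotionAnalysis:
        templates:
          - templateName: smoke-tests   # hypothetical AnalysisTemplate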

Complete Canary Deployment Example
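A fuller canary example. Top-level keys come from the values reference; the steps key and AnalysisTemplate-free step syntax follow Argo Rollouts and are assumptions about the chart's pass-through; image coordinates are placeholders:

pgt-application:
  name: demo-app
  organisationName: my-org
  container:
    image:
      registry: public.ecr.aws
      repository: my-org/demo-app
      tag: 2.0.0
  rollout:
    enabled: true
    strategy:
      type: canary
      steps:
        - setWeight: 20
        - pause: { duration: 10m }
        - setWeight: 50
        - pause: {}              # wait for manual promotion
  autoscaling:
    enable: true
    minReplicas: 2
    maxReplicas: 4
  gatewayIngress:
    cloud: aws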

Complete Blue-Green Deployment Example
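A fuller blue-green example. Top-level keys come from the values reference; autoPromotionEnabled and previewReplicaCount are Argo Rollouts fields with their nesting assumed; image coordinates are placeholders:

pgt-application:
  name: demo-app
  organisationName: my-org
  container:
    image:
      registry: public.ecr.aws
      repository: my-org/demo-app
      tag: 2.0.0
  rollout:
    enabled: true
    strategy:
      type: blueGreen
      autoPromotionEnabled: false   # switch traffic only after manual promotion
      previewReplicaCount: 1
  serviceMonitor:
    enabled: true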


Managing Rollouts with Argo CD

The Argo Rollouts extension is installed in Argo CD, providing a rich UI for managing progressive deployments directly from the Argo CD interface.

Viewing Rollout Status

  1. Navigate to your application in the Argo CD UI

  2. Click on the Rollout resource in the application tree

  3. The extension displays a visual rollout panel showing:

    • Current rollout phase and status

    • Step progression with visual indicators for completed, active, and pending steps

    • Traffic weight distribution between stable and canary/preview versions

    • ReplicaSet revisions with pod counts

    • Analysis run status and results (if analysis is enabled)

Promoting a Paused Rollout

When a rollout is paused (waiting for manual promotion):

  1. Open the application in Argo CD

  2. Click on the Rollout resource to open the rollout panel

  3. Click the Promote button in the rollout panel header

  4. The rollout will proceed to the next step

For full promotion (skip all remaining steps):

  1. Click the dropdown arrow next to Promote

  2. Select Promote Full to immediately complete the rollout to 100%

Aborting a Rollout (Rollback)

To abort a rollout and revert to the previous stable version:

  1. Open the application in Argo CD

  2. Click on the Rollout resource

  3. Click the Abort button in the rollout panel header

  4. The rollout will scale down the canary/preview ReplicaSet and restore full traffic to the stable version

Retrying a Failed Rollout

If a rollout fails (e.g., due to failed analysis or degraded status):

  1. Open the application in Argo CD

  2. Click on the Rollout resource

  3. Click the Retry button to restart the rollout from the beginning

Restarting a Rollout

To trigger a new rollout without changing the image (e.g., to pick up ConfigMap changes):

  1. Open the application in Argo CD

  2. Click on the Rollout resource

  3. Click the Restart button to initiate a new rollout with the current configuration

Debugging Issues

Check Rollout Status Panel:

The rollout extension panel shows real-time status:

  1. Click on the Rollout resource in Argo CD

  2. Review the step progression - failed steps are highlighted in red

  3. Check the revision history to compare stable vs canary ReplicaSets

  4. Hover over status indicators for detailed messages

Check Rollout Events:

  1. Click on the Rollout resource in Argo CD

  2. Select the Events tab to view Kubernetes events

  3. Look for warnings or errors related to the rollout progression

Check Pod Logs:

  1. In the application tree, expand the Rollout to see its ReplicaSets and Pods

  2. Click on a Pod resource

  3. Select the Logs tab to view container logs

  4. Use the container dropdown to switch between main and sidecar containers

Check Analysis Runs:

If using automated analysis:

  1. AnalysisRun resources appear in the application tree under the Rollout

  2. Click on the AnalysisRun to view the analysis panel

  3. Review individual metric results - failed metrics are highlighted

  4. Check metric queries and their returned values to understand failures

Sync Status:

If the application shows as OutOfSync:

  1. Check the Diff tab to see what resources differ from Git

  2. Click Sync to reconcile the desired state

  3. Review sync options if specific resources need to be excluded


Additional Configuration

Init Containers

Run initialization tasks before the main container:
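A sketch assuming init container entries use the standard Kubernetes container spec; the initContainers key, image, and command are illustrative assumptions:

pgt-application:
  initContainers:
    - name: run-migrations
      image: public.ecr.aws/my-org/demo-app-migrations:1.0.0   # placeholder image
      command: ["./migrate", "up"]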

Worker Containers (Sidecars)

Add sidecar containers that run alongside the main container:
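A sketch using the workerContainers key named in the heading; the entry shape (standard Kubernetes container spec) and image are assumptions:

pgt-application:
  workerContainers:
    - name: log-forwarder
      image: public.ecr.aws/my-org/log-forwarder:1.0.0   # placeholder image
      resources:
        requests:
          memory: 128Mi   # requests equal limits, per the QoS guidance above
        limits:
          memory: 128Mi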

Volume Mounts

Mount ConfigMaps or Secrets as volumes:
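A sketch using standard Kubernetes volume and volumeMount fields; the volumes and container.volumeMounts key locations and the ConfigMap name are assumptions:

pgt-application:
  volumes:
    - name: app-config
      configMap:
        name: demo-app-config      # hypothetical ConfigMap
  container:
    volumeMounts:
      - name: app-config
        mountPath: /etc/demo-app
        readOnly: true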

Pod Scheduling

Control pod placement with affinity, tolerations, and node selectors:
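A sketch; preferSpotInstances comes from the values reference, while the tolerations and nodeSelector keys (standard Kubernetes fields) are assumed pass-throughs:

pgt-application:
  affinity:
    nodeAffinity:
      preferSpotInstances: true    # chart-provided shortcut
  tolerations:
    - key: workload-type
      operator: Equal
      value: batch
      effect: NoSchedule
  nodeSelector:
    kubernetes.io/arch: arm64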


Values Reference

For a complete list of all available values, see the chart's default values:

| Value | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | nil | Required. Application name |
| organisationName | string | nil | Required. Organisation name |
| deployment.replicas | int | 1 | Replica count (ignored if autoscaling enabled) |
| deployment.strategy.type | string | RollingUpdate | RollingUpdate or Recreate |
| rollout.enabled | bool | false | Enable Argo Rollouts |
| rollout.strategy.type | string | canary | canary or blueGreen |
| container.image.registry | string | nil | Required. Container registry |
| container.image.repository | string | nil | Required. Image repository |
| container.image.tag | string | nil | Required. Image tag |
| autoscaling.enable | bool | true | Enable KEDA autoscaling |
| autoscaling.minReplicas | int | 1 | Minimum replicas |
| autoscaling.maxReplicas | int | 3 | Maximum replicas |
| autoscaling.scaleDownOutsideOfficeHours | bool | false | Scale to zero outside business hours |
| affinity.nodeAffinity.preferSpotInstances | bool | false | Prefer scheduling on spot instances |
| gatewayIngress.cloud | string | "" | Cloud provider (aws for TargetGroupConfig) |
| serviceMonitor.enabled | bool | false | Enable Prometheus ServiceMonitor |
| pgt-secrets.enabled | bool | true | Enable external secrets |
