Scaled Jobs
This guide covers configuring the pgt-scaledjob Helm chart for deploying event-driven jobs to Kubernetes. The chart creates KEDA ScaledJob resources that automatically scale job executions based on external event sources such as message queues, databases, or custom metrics.
When to Use ScaledJobs
ScaledJobs are ideal for:
Queue processing - Process messages from SQS, RabbitMQ, Kafka, Azure Service Bus, etc.
Event-driven workloads - Scale based on events from external systems
Batch processing - Process items in batches with automatic scaling
Background tasks - Execute tasks triggered by external events
For time-based scheduling, use CronJobs instead.
Prerequisites
Add the chart as a dependency in your Chart.yaml:
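apiVersion: v2
name: my-scaledjob
version: 0.0.1
dependencies:
  - name: pgt-scaledjob
    version: 0.0.3
    repository: oci://public.ecr.aws/w9m9e0e9/pgt-helm-charts
After adding the dependency, run helm dependency update (the standard Helm command for fetching chart dependencies) from your chart directory.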
ℹ️ KEDA Required
ScaledJobs require KEDA to be installed in your cluster. KEDA monitors your event sources and automatically creates Jobs when events are detected.
Basic Configuration
Required Fields
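As a rough sketch, a minimal values.yaml could look like the following. The keys are the ones marked Required in the Values Reference; nesting them under pgt-scaledjob (the dependency name) follows Helm's subchart convention and is an assumption here, and every name, image, and URL is a placeholder.

```yaml
# Sketch only: all values are placeholders.
pgt-scaledjob:                 # assumed: subchart values nest under the dependency name
  name: my-scaledjob
  organisationName: my-org
  container:
    image:
      registry: registry.example.com
      repository: my-team/my-worker
      tag: "1.0.0"
  serviceAccount:
    name: my-scaledjob
  scaledjob:
    triggers:                  # assumed to be passed through to the KEDA ScaledJob spec
      - type: aws-sqs-queue
        metadata:
          queueURL: https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue
          queueLength: "5"
          awsRegion: eu-west-1
```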
⚠️ Memory Requests and Limits
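Always set memory requests equal to memory limits. This prevents memory overcommit for your pod, giving it predictable scheduling and making it less likely to be evicted or OOM-killed when the node is under memory pressure. Note that Kubernetes assigns the Guaranteed Quality of Service (QoS) class only when both CPU and memory requests equal their limits for every container in the pod; with matching memory values alone, the pod is classed as Burstable.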
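For illustration, this is the standard Kubernetes container resources block with the memory request equal to the memory limit. Where this block sits in the chart's values is not listed in the Values Reference, so the fragment is shown on its own; the sizes are placeholders.

```yaml
# Standard Kubernetes container resources block (placement in the chart's values is not shown here).
resources:
  requests:
    cpu: 250m        # placeholder
    memory: 512Mi    # request equals limit
  limits:
    memory: 512Mi
```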
Development Environment & Cost Optimization
When deploying ScaledJobs to non-production environments, consider these settings to reduce costs:
💡 Spot Instances
The preferSpotInstances: true setting prefers scheduling your workloads on spot instances when available, which can significantly reduce compute costs. If you're interested in enabling spot instances for your environment, please reach out to the platform team to discuss your requirements and ensure your applications are suitable for spot instance usage.
💰 Cost-Saving Tips for ScaledJobs
Scale to zero: Set minReplicaCount: 0 to ensure no jobs run when there are no events to process
Limit max replicas: Use lower maxReplicaCount in non-production to prevent runaway scaling
Increase polling interval: A longer pollingInterval reduces API calls to your event source
Right-size resources: Development jobs often need fewer resources than production
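The cost-saving settings above could be combined roughly as follows for a non-production environment. The keys come from the Values Reference; the nesting under pgt-scaledjob and the specific numbers are assumptions chosen for illustration.

```yaml
# Sketch of non-production overrides; numbers are illustrative.
pgt-scaledjob:                 # assumed: subchart values nest under the dependency name
  affinity:
    nodeAffinity:
      preferSpotInstances: true
  scaledjob:
    minReplicaCount: 0         # no jobs while the event source is empty
    maxReplicaCount: 5         # well below the default of 100
    pollingInterval: 60        # seconds; default is 30
```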
KEDA Triggers
Triggers define what events cause KEDA to create new Jobs. KEDA supports many trigger types including message queues, databases, HTTP endpoints, and custom metrics.
AWS SQS Queue
Scale based on messages in an AWS SQS queue:
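A sketch of an SQS trigger, assuming the chart passes scaledjob.triggers entries through to the KEDA ScaledJob spec unchanged and that values nest under pgt-scaledjob. The metadata fields are those of KEDA's aws-sqs-queue scaler; the queue URL and region are placeholders.

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    triggers:
      - type: aws-sqs-queue
        metadata:
          queueURL: https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue  # placeholder
          queueLength: "5"     # target messages per job
          awsRegion: eu-west-1
```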
Azure Service Bus Queue
Scale based on messages in an Azure Service Bus queue:
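A sketch of a Service Bus queue trigger under the same assumptions (triggers passed through to KEDA, values nested under pgt-scaledjob). The metadata fields belong to KEDA's azure-servicebus scaler; the queue and namespace names are placeholders.

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    triggers:
      - type: azure-servicebus
        metadata:
          queueName: my-queue          # placeholder
          namespace: my-servicebus-ns  # placeholder; needed when authenticating with pod identity
          messageCount: "5"            # target messages per job
```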
RabbitMQ Queue
Scale based on messages in a RabbitMQ queue:
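A sketch of a RabbitMQ trigger, again assuming triggers are passed through to KEDA and values nest under pgt-scaledjob. The fields are from KEDA's rabbitmq scaler; the queue name and the RABBITMQ_HOST variable are placeholders.

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    triggers:
      - type: rabbitmq
        metadata:
          protocol: amqp
          queueName: my-queue          # placeholder
          mode: QueueLength            # scale on number of queued messages
          value: "10"                  # target messages per job
          hostFromEnv: RABBITMQ_HOST   # placeholder env var holding the amqp:// connection string
```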
Kafka Topic
Scale based on consumer lag in a Kafka topic:
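A sketch of a Kafka trigger under the same assumptions. The fields are from KEDA's kafka scaler; broker address, consumer group, and topic are placeholders.

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    triggers:
      - type: kafka
        metadata:
          bootstrapServers: kafka.example.svc:9092  # placeholder
          consumerGroup: my-consumer-group          # placeholder
          topic: my-topic                           # placeholder
          lagThreshold: "50"                        # target lag per job
```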
PostgreSQL Query
Scale based on a PostgreSQL query result:
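A sketch of a PostgreSQL trigger under the same assumptions. The fields are from KEDA's postgresql scaler; the table, query, and PG_CONNECTION_STRING variable are hypothetical.

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    triggers:
      - type: postgresql
        metadata:
          connectionFromEnv: PG_CONNECTION_STRING   # hypothetical env var on the job container
          query: "SELECT COUNT(*) FROM tasks WHERE status = 'pending'"  # hypothetical table
          targetQueryValue: "10"                    # target pending rows per job
```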
💡 More Triggers
KEDA supports 50+ trigger types. See the KEDA Scalers documentation for the full list and configuration options.
Trigger Authentication
When triggers need to authenticate with external services, use TriggerAuthentication to provide credentials securely.
💡 Creating Cloud Identities
Before configuring trigger authentication, you need to create IAM roles or managed identities. See:
AWS IAM Roles (IRSA) - Configure IAM roles for EKS workloads
Azure Workload Identity - Configure managed identities for AKS workloads
AWS with Pod Identity (IRSA)
For AWS services using IAM Roles for Service Accounts:
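A sketch of IRSA-based authentication using the triggerAuthentication and serviceAccount keys from the Values Reference. It assumes triggerAuthentication.podIdentity maps onto the KEDA TriggerAuthentication podIdentity field and that values nest under pgt-scaledjob; the role ARN is a placeholder.

```yaml
pgt-scaledjob:                 # assumed nesting
  serviceAccount:
    name: my-scaledjob
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-scaledjob  # placeholder
  triggerAuthentication:
    enabled: true
    podIdentity:
      provider: aws            # KEDA pod identity provider for IRSA
```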
Azure with Workload Identity
For Azure services using Workload Identity:
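A sketch of Workload Identity authentication under the same assumptions (podIdentity mapped onto the KEDA TriggerAuthentication, values nested under pgt-scaledjob). The client ID is a placeholder.

```yaml
pgt-scaledjob:                 # assumed nesting
  serviceAccount:
    name: my-scaledjob
    annotations:
      azure.workload.identity/client-id: 00000000-0000-0000-0000-000000000000  # placeholder
  triggerAuthentication:
    enabled: true
    podIdentity:
      provider: azure-workload # KEDA pod identity provider for Azure Workload Identity
```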
Credentials from Secrets
For services requiring username/password or connection strings:
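A sketch of secret-based authentication, assuming triggerAuthentication.secretTargetRef uses the same entry shape as KEDA's TriggerAuthentication (parameter, name, key) and that values nest under pgt-scaledjob. The Secret name and key are hypothetical.

```yaml
pgt-scaledjob:                 # assumed nesting
  triggerAuthentication:
    enabled: true
    secretTargetRef:
      - parameter: host              # trigger parameter to populate (here: RabbitMQ host)
        name: rabbitmq-credentials   # hypothetical Kubernetes Secret
        key: connection-string       # hypothetical key within that Secret
```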
Scaling Configuration
Polling Interval
How often KEDA checks the trigger source for events:
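For example, to check the trigger source every 15 seconds instead of the default 30 (nesting under pgt-scaledjob is assumed):

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    pollingInterval: 15        # seconds
```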
Replica Limits
Control the minimum and maximum number of concurrent jobs:
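For example, to allow scale-to-zero and cap concurrency at 20 jobs (nesting under pgt-scaledjob is assumed; the numbers are illustrative):

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    minReplicaCount: 0
    maxReplicaCount: 20
```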
Scaling Strategy
Control how KEDA creates jobs based on events:
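default - Creates jobs for the detected events minus the jobs already running, up to maxReplicaCount
accurate - Creates jobs matching the events still waiting in the queue; suited to sources whose reported queue length excludes in-flight (locked) messages
eager - Fills every free slot up to maxReplicaCount as soon as events are detected, so waiting events are consumed as quickly as possible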
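A strategy is selected through scaledjob.scalingStrategy, which the Values Reference lists as a string. A minimal sketch, with nesting under pgt-scaledjob assumed:

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    scalingStrategy: accurate  # one of: default, accurate, eager
```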
Job Configuration
Job Execution Settings
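The job execution keys from the Values Reference could be set as in this sketch. The nesting under pgt-scaledjob is assumed and the values are illustrative, not recommendations.

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    parallelism: 1             # pods run in parallel per job
    completions: 1             # successful completions required
    backoffLimit: 3            # retries before the job is marked failed
    activeDeadlineSeconds: 600 # maximum job duration in seconds
    restartPolicy: Never       # OnFailure or Never
```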
Job History
Configure how many completed jobs to retain:
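For example, keeping more failed jobs around for debugging (nesting under pgt-scaledjob is assumed; the numbers are illustrative):

```yaml
pgt-scaledjob:                 # assumed nesting
  scaledjob:
    successfulJobsHistoryLimit: 3
    failedJobsHistoryLimit: 5
```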
Container Configuration
Command and Arguments
Override the container's default entrypoint and arguments:
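A sketch using the container.command and container.args lists from the Values Reference. The script name and flag are hypothetical, and nesting under pgt-scaledjob is assumed.

```yaml
pgt-scaledjob:                 # assumed nesting
  container:
    command: ["python", "worker.py"]   # hypothetical entrypoint
    args: ["--batch-size", "10"]       # hypothetical arguments
```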
Environment Variables
Direct Environment Variables
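The Values Reference lists environmentVariables as a list but does not show the entry shape. The sketch below assumes entries follow the Kubernetes container env format (name and value) and that values nest under pgt-scaledjob; the variables themselves are placeholders.

```yaml
pgt-scaledjob:                 # assumed nesting
  environmentVariables:        # assumed: Kubernetes env entry format
    - name: LOG_LEVEL
      value: "info"
    - name: QUEUE_NAME
      value: "my-queue"
```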
Load from ConfigMap or Secret
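Similarly, environmentVariablesFrom is listed as a list without a documented entry shape. This sketch assumes it follows the Kubernetes envFrom format; the ConfigMap and Secret names are hypothetical.

```yaml
pgt-scaledjob:                 # assumed nesting
  environmentVariablesFrom:    # assumed: Kubernetes envFrom entry format
    - configMapRef:
        name: my-worker-config     # hypothetical ConfigMap
    - secretRef:
        name: my-worker-secrets    # hypothetical Secret
```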
External Secrets
The pgt-scaledjob chart includes pgt-secrets as a subchart for fetching secrets from AWS Secrets Manager or Azure Key Vault. For full configuration options, see the PGT Secrets documentation.
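Only the pgt-secrets.enabled flag appears in the Values Reference, so the sketch below shows just that switch. The nesting under pgt-scaledjob is assumed; the secret definitions themselves are covered by the PGT Secrets documentation and are not reproduced here.

```yaml
pgt-scaledjob:                 # assumed nesting
  pgt-secrets:
    enabled: true              # further pgt-secrets settings omitted
```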
Volume Mounts
Mount ConfigMaps or Secrets as files:
Pod Configuration
Labels and Annotations
Tolerations
For scheduling on specific nodes:
Prometheus PodMonitor
Enable metrics scraping for ScaledJobs that expose metrics during execution:
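The Values Reference exposes a single podMonitor.enabled flag. A minimal sketch, with nesting under pgt-scaledjob assumed:

```yaml
pgt-scaledjob:                 # assumed nesting
  podMonitor:
    enabled: true
```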
Complete Examples
AWS SQS Queue Processor
Process messages from an SQS queue with IRSA authentication:
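Putting the pieces together, an SQS processor might be configured roughly as follows. This is a sketch under the assumptions that values nest under pgt-scaledjob and that triggers and podIdentity are passed through to KEDA; all names, the image, the role ARN, and the queue URL are placeholders. How the trigger references the generated TriggerAuthentication depends on the chart and is not shown.

```yaml
pgt-scaledjob:                 # assumed nesting
  name: sqs-processor
  organisationName: my-org
  container:
    image:
      registry: registry.example.com
      repository: my-team/sqs-processor
      tag: "1.0.0"
  serviceAccount:
    name: sqs-processor
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/sqs-processor
  triggerAuthentication:
    enabled: true
    podIdentity:
      provider: aws
  scaledjob:
    pollingInterval: 30
    maxReplicaCount: 10
    triggers:
      - type: aws-sqs-queue
        metadata:
          queueURL: https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue
          queueLength: "5"
          awsRegion: eu-west-1
```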
Azure Service Bus Queue Processor
Process messages from an Azure Service Bus queue with Workload Identity:
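A comparable sketch for Azure Service Bus, under the same assumptions (values nested under pgt-scaledjob, triggers and podIdentity passed through to KEDA). All names, the image, and the client ID are placeholders, and the wiring between trigger and TriggerAuthentication is left to the chart.

```yaml
pgt-scaledjob:                 # assumed nesting
  name: servicebus-processor
  organisationName: my-org
  container:
    image:
      registry: registry.example.com
      repository: my-team/servicebus-processor
      tag: "1.0.0"
  serviceAccount:
    name: servicebus-processor
    annotations:
      azure.workload.identity/client-id: 00000000-0000-0000-0000-000000000000
  triggerAuthentication:
    enabled: true
    podIdentity:
      provider: azure-workload
  scaledjob:
    pollingInterval: 30
    maxReplicaCount: 10
    triggers:
      - type: azure-servicebus
        metadata:
          queueName: my-queue
          namespace: my-servicebus-ns
          messageCount: "5"
```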
RabbitMQ Queue Processor
Process messages from a RabbitMQ queue:
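A comparable sketch for RabbitMQ with credentials read from a Kubernetes Secret. It assumes values nest under pgt-scaledjob and that secretTargetRef uses KEDA's entry shape; the Secret, its key, the image, and the queue name are hypothetical.

```yaml
pgt-scaledjob:                 # assumed nesting
  name: rabbitmq-processor
  organisationName: my-org
  container:
    image:
      registry: registry.example.com
      repository: my-team/rabbitmq-processor
      tag: "1.0.0"
  serviceAccount:
    name: rabbitmq-processor
  triggerAuthentication:
    enabled: true
    secretTargetRef:
      - parameter: host              # supplies the RabbitMQ connection string to the trigger
        name: rabbitmq-credentials   # hypothetical Secret
        key: connection-string       # hypothetical key
  scaledjob:
    maxReplicaCount: 10
    triggers:
      - type: rabbitmq
        metadata:
          protocol: amqp
          queueName: my-queue
          mode: QueueLength
          value: "10"
```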
Troubleshooting
Use Argo CD to investigate issues with ScaledJobs.
Viewing ScaledJob Status
Navigate to your application in the Argo CD UI
Locate the ScaledJob resource in the application tree
Click on the ScaledJob to view its details including:
Current replica count
Trigger status
Last scale time
Viewing Job Executions
In the Argo CD application tree, look for Job resources created by the ScaledJob
Click on a Job to see its status and completion time
Expand the Job to see its Pods
Checking Pod Logs
In the application tree, find the Pod created by a Job
Click on the Pod resource
Select the Logs tab to view container output
Checking TriggerAuthentication
Locate the TriggerAuthentication resource in the application tree
Verify the authentication configuration matches your trigger requirements
Common Issues
Jobs not being created:
Verify KEDA is installed and running in the cluster
Check if the trigger source has events/messages
Verify TriggerAuthentication credentials are correct
Check KEDA operator logs for trigger errors
Jobs failing repeatedly:
Check Pod logs for error messages
Verify secrets and ConfigMaps are correctly configured
Ensure the ServiceAccount has required permissions
Check if backoffLimit is too low
Authentication errors:
Verify ServiceAccount annotations for IRSA/Workload Identity
Check TriggerAuthentication references match the trigger configuration
Ensure IAM roles/managed identities have correct permissions
Scaling issues:
Check maxReplicaCount isn't limiting scaling
Verify pollingInterval is appropriate for your use case
Review scalingStrategy for your workload pattern
Values Reference
| Key | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | nil | Required. ScaledJob name |
| organisationName | string | nil | Required. Organisation name |
| scaledjob.pollingInterval | int | 30 | How often KEDA checks triggers (seconds) |
| scaledjob.successfulJobsHistoryLimit | int | 3 | Successful jobs to retain |
| scaledjob.failedJobsHistoryLimit | int | 1 | Failed jobs to retain |
| scaledjob.maxReplicaCount | int | 100 | Maximum concurrent jobs |
| scaledjob.minReplicaCount | int | 0 | Minimum jobs to maintain |
| scaledjob.scalingStrategy | string | default | default, accurate, or eager |
| scaledjob.parallelism | int | 1 | Parallel pods per job |
| scaledjob.completions | int | 1 | Required successful completions |
| scaledjob.backoffLimit | int | 3 | Retries before job failure |
| scaledjob.activeDeadlineSeconds | int | nil | Maximum job duration |
| scaledjob.restartPolicy | string | OnFailure | OnFailure or Never |
| scaledjob.triggers | list | [] | Required. KEDA trigger configurations |
| affinity.nodeAffinity.preferSpotInstances | bool | false | Prefer scheduling on spot instances |
| container.image.registry | string | nil | Required. Container registry |
| container.image.repository | string | nil | Required. Image repository |
| container.image.tag | string | nil | Required. Image tag |
| container.command | list | [] | Container entrypoint override |
| container.args | list | [] | Container arguments |
| serviceAccount.name | string | nil | Required. ServiceAccount name |
| serviceAccount.annotations | object | {} | ServiceAccount annotations |
| triggerAuthentication.enabled | bool | false | Enable TriggerAuthentication |
| triggerAuthentication.secretTargetRef | list | [] | Secrets for authentication |
| triggerAuthentication.podIdentity | object | {} | Pod identity configuration |
| environmentVariables | list | [] | Environment variables |
| environmentVariablesFrom | list | [] | Load env vars from ConfigMap/Secret |
| volumes | list | [] | Volume mounts |
| podMonitor.enabled | bool | false | Enable PodMonitor |
| pgt-secrets.enabled | bool | false | Enable external secrets |
