CloudFront invalidation queue
When a CloudFront distribution serves a lot of cached content, invalidating paths can become expensive.
When content changes, its exposed path needs to be invalidated. CloudFront invalidations are billed per requested path, so a lot of path invalidations cost a lot of money.
A more elegant solution may be required than simply invalidating /* at every content change.
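To get a feeling for the numbers, here is a rough sketch of the bill. The price per path and the monthly free allowance are assumptions based on the public pricing at the time of writing, and the traffic figures are purely hypothetical, so check the current CloudFront pricing page before relying on them.

```javascript
// Assumed pricing figures: verify them against the current CloudFront pricing page.
const FREE_PATHS_PER_MONTH = 1000
const PRICE_PER_PATH_USD = 0.005

// Estimates the monthly invalidation cost for a number of requested paths.
function estimateMonthlyCost(requestedPaths) {
  const billablePaths = Math.max(0, requestedPaths - FREE_PATHS_PER_MONTH)
  return billablePaths * PRICE_PER_PATH_USD
}

// Hypothetical site: 200 content changes a day, 5 paths each, 30 days.
const unbatched = estimateMonthlyCost(200 * 5 * 30)

// Same site with one batched invalidation per hour of 5 unique paths.
const batched = estimateMonthlyCost(24 * 30 * 5)

console.log({ unbatched, batched })
```

The point is only the order of magnitude: the fewer distinct paths are requested, the lower the bill.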
SQS queue
Since this is a queue of operations, the first service that comes to mind is SQS.
With an SQS queue a message can be hidden for a certain amount of time; sadly, per-message delays are only available on standard queues and can last at most 15 minutes. This might even be enough; the problem arises when you try to limit the number of parallel executions of the Lambda function.
With standard queues it is possible to limit the scaling of the Lambda function but, as described in the AWS blog article, the minimum value that can be set is 2. With two Lambda functions running at the same time, the invalidation queue is split in half and two invalidations are always created, possibly with duplicated paths, because each execution only sees half of the list and cannot clean up the complete one.
For content that changes very often, with possible duplications, it is not the right solution; otherwise it may already be good enough.
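The effect of the split can be illustrated with a small simulation. This is only a sketch: the round-robin distribution is a simplification (SQS does not guarantee how messages are spread across invocations) and the function names are made up for the example.

```javascript
// Simulates two concurrent Lambda invocations, each receiving only
// its own share of the queued messages (round-robin is an assumption).
function splitBetweenConsumers(messages, consumers) {
  const shares = Array.from({ length: consumers }, () => [])
  messages.forEach((message, index) => shares[index % consumers].push(message))
  return shares
}

// Each invocation can only deduplicate the paths it actually received.
function pathsPerInvalidation(shares) {
  return shares.map(share => [...new Set(share)])
}

const queued = ['/index.html', '/index.html', '/about.html', '/about.html', '/blog/*']
const invalidations = pathsPerInvalidation(splitBetweenConsumers(queued, 2))

// Paths actually sent to CloudFront versus paths that really changed.
const requestedPaths = invalidations.flat().length
const uniquePaths = new Set(queued).size

console.log({ invalidations, requestedPaths, uniquePaths })
```

Here both invocations end up invalidating /index.html and /about.html, so more paths are requested than have actually changed.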
Scheduled Lambda
Another solution could be to create a custom queue implementation, with a DynamoDB table to store the queue and a scheduled Lambda function that executes queued invalidations.
For content that doesn’t change that often it could result in wasted executions; otherwise it is a very simple solution that already solves some problems.
StepFunction wait step
When an invalidation is requested, it is assigned to the current “time batch”. The batch is calculated by dividing the Unix timestamp (in seconds) by the requested batch duration and rounding down, so every time the code is executed within the same batch period it returns the same “sequence token”:
const sequence = Math.floor(currentTimestamp / BATCH_WINDOW_SECONDS)
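A short worked example may make the windowing clearer. The 300-second window and the `getBatch` helper are assumptions made for illustration; the `wait` value is one possible way to compute how many seconds are left before the batch closes.

```javascript
const BATCH_WINDOW_SECONDS = 300 // assumed 5-minute batch window

// Returns the sequence token of the batch the timestamp falls into
// and the number of seconds remaining until that batch ends.
function getBatch(currentTimestamp) {
  const sequence = Math.floor(currentTimestamp / BATCH_WINDOW_SECONDS)
  const batchEnd = (sequence + 1) * BATCH_WINDOW_SECONDS
  return { sequence, wait: batchEnd - currentTimestamp }
}

console.log(getBatch(1700000010)) // { sequence: 5666666, wait: 90 }
console.log(getBatch(1700000099)) // { sequence: 5666666, wait: 1 }
console.log(getBatch(1700000100)) // { sequence: 5666667, wait: 300 }
```

The first two requests fall in the same window and share a sequence token, while the third one opens a new batch. In real code the timestamp would come from something like `Math.floor(Date.now() / 1000)`.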
Invalidation requests are stored in a DynamoDB table using the sequence and the invalidation path as the item key. This prevents duplicated paths inside the same batch:
await dynamo.transactWrite({
  TransactItems: paths.map(path => ({
    Put: {
      TableName: TABLE_QUEUE_NAME,
      Item: {
        'sequence': sequence,
        'path': path,
        'ttl': currentTimestamp + BATCH_TTL_SECONDS
      }
    }
  }))
}).promise()
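One thing to keep in mind is that a DynamoDB transaction accepts a limited number of actions and rejects two actions on the same item. The following sketch deduplicates the requested paths and splits them into chunks; the `toTransactChunks` helper is hypothetical and the limit of 100 is an assumption based on the documented quota at the time of writing.

```javascript
// Assumed maximum number of actions per transaction: check the DynamoDB quotas.
const MAX_TRANSACT_ITEMS = 100

// Removes duplicated paths and splits the rest into transaction-sized chunks.
function toTransactChunks(paths, size = MAX_TRANSACT_ITEMS) {
  const unique = [...new Set(paths)]
  const chunks = []
  for (let i = 0; i < unique.length; i += size) {
    chunks.push(unique.slice(i, i + size))
  }
  return chunks
}

const chunks = toTransactChunks(['/a', '/b', '/a', '/c'], 2)
console.log(chunks) // [ [ '/a', '/b' ], [ '/c' ] ]
```

Each chunk could then be written with its own `transactWrite` call.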
A TTL attribute is also added to keep the table clean, without an overly old invalidation history. The expiration time must be longer than the retry time, so that the batch can still be retrieved when the invalidation is retried, without its records disappearing.
A StepFunction is executed using the time sequence token as execution name, in order to prevent a temporal batch from being executed twice.
Comment: Invalidate CloudFront distribution
StartAt: Wait
States:
  Wait:
    Type: Wait
    SecondsPath: $.wait
    Next: Execute Consumer
  Execute Consumer:
    Type: Task
    Resource: "arn:aws:states:::lambda:invoke"
    Parameters:
      FunctionName: "${ConsumerFunctionName}"
      Payload:
        sequence.$: $.sequence
    OutputPath: $.Payload
    Retry:
      - ErrorEquals:
          - States.TaskFailed
        IntervalSeconds: 60
        MaxAttempts: 3
        BackoffRate: 2.0
    End: true
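Starting the state machine with the sequence token as execution name might look like the sketch below. The `startBatchExecution` helper is hypothetical; it assumes a client shaped like the AWS SDK v2 `StepFunctions` class, whose `startExecution(...).promise()` fails with the `ExecutionAlreadyExists` error code when the name is already taken, and a `wait` value holding the seconds left in the current batch.

```javascript
// Starts one execution per batch. Resolves to true when the request is
// accepted and to false when an execution with the same name already exists.
async function startBatchExecution(stepFunctions, stateMachineArn, sequence, wait) {
  try {
    await stepFunctions.startExecution({
      stateMachineArn,
      name: String(sequence), // execution names must be unique per state machine
      input: JSON.stringify({ sequence, wait })
    }).promise()
    return true
  } catch (error) {
    // Another request of the same batch already started the execution
    if (error.code === 'ExecutionAlreadyExists') return false
    throw error
  }
}
```

With the real SDK the first argument would be something like `new AWS.StepFunctions()`. Any other error is rethrown so that the caller can decide how to handle it.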
The StepFunction uses a Wait step to pause execution until the temporal batch ends; during this period the requested invalidations are collected in the DynamoDB table.
At the end of the temporal batch, a Lambda function is executed which creates an invalidation on CloudFront using the paths collected in the DynamoDB table up to that moment.
I find this solution much better suited both to content that changes often and to content that almost never changes, and it makes it possible to manage the entire invalidation list, filtering out any duplicates.
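As a rough idea of what the consumer does with the collected records, the sketch below builds the parameters for CloudFront's `createInvalidation` call. The `buildInvalidationParams` helper is hypothetical, and reusing the sequence token as `CallerReference` is an assumption, meant to keep a retried batch from creating a second invalidation.

```javascript
// Builds the createInvalidation parameters from the items of one batch.
// `items` are the table records of the batch, each with a `path` attribute.
function buildInvalidationParams(distributionId, sequence, items) {
  // The table key already prevents duplicates; the Set is only defensive.
  const paths = [...new Set(items.map(item => item.path))]
  return {
    DistributionId: distributionId,
    InvalidationBatch: {
      CallerReference: String(sequence), // assumption: one reference per batch
      Paths: { Quantity: paths.length, Items: paths }
    }
  }
}

const params = buildInvalidationParams('EXAMPLEDISTID', 5666666, [
  { sequence: 5666666, path: '/index.html' },
  { sequence: 5666666, path: '/blog/*' }
])
console.log(JSON.stringify(params, null, 2))
```

The resulting object could be passed to `cloudfront.createInvalidation(params).promise()` in the AWS SDK v2; the distribution ID above is a placeholder.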
Repository: bitbull-serverless/cloudfront-invalidation-queue
Credits: Cloudcraft