When you run an RKE cluster on AWS, pulling images from ECR is to be expected. It is not the easiest thing to do, however, since the credentials produced by aws ecr get-login expire after 12 hours, so you need something to refresh them. In the Kubernetes world this means we should use a CronJob.
This post is a small improvement on other work here and here. The biggest difference is that it uses the Bitnami Docker images for aws-cli and kubectl to achieve the same result.
So we need a CronJob that will schedule a pod to run every hour and refresh the credentials. This job will:
- run in a specific namespace and refresh the credentials there
- have the ability to delete and re-create the credentials
- do its best not to leak them
- use "well known" images outside the ECR in question
To this end, our Pod needs an initContainer that runs aws ecr get-login and stores the resulting password in an ephemeral volume (emptyDir). The main container then picks up the password generated by the initContainer and completes the credential refresh.
Are we done yet? No, because by default the service account the Pod runs as does not have permission to delete and create secrets. So we need to create an appropriate Role and RoleBinding for it.
All of the above is reproduced in the YAML below. If you do not wish to hardcode the AWS region and ECR URL, you can of course make them environment variables.
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: dns
  name: ecr-secret-role
rules:
- apiGroups: [""]
  resources:
  - secrets
  - serviceaccounts
  - serviceaccounts/token
  verbs:
  - 'create'
  - 'delete'
  - 'get'
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: dns
  name: ecr-secret-rolebinding
subjects:
- kind: ServiceAccount
  name: default
  namespace: dns
roleRef:
  kind: Role
  name: ecr-secret-role
  apiGroup: ""
---
# On Kubernetes 1.21 and later use batch/v1; batch/v1beta1 was removed in 1.25.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: ecr-update-login
  namespace: dns
spec:
  # Run once an hour, at minute 38.
  schedule: "38 */1 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          initContainers:
          - name: awscli
            image: bitnami/aws-cli
            # aws ecr get-login exists only in AWS CLI v1. With v2, use:
            #   aws ecr get-login-password --region REGION_HERE > /ecr/ecr.token
            command:
            - /bin/bash
            - -c
            - |-
              aws ecr get-login --region REGION_HERE | cut -d' ' -f 6 > /ecr/ecr.token
            volumeMounts:
            - mountPath: /ecr
              name: ecr-volume
          containers:
          - name: kubectl
            image: bitnami/kubectl
            command:
            - /bin/bash
            - -c
            - |-
              kubectl -n dns delete secret --ignore-not-found ecr-registry
              kubectl -n dns create secret docker-registry ecr-registry --docker-username=AWS --docker-password="$(cat /ecr/ecr.token)" --docker-server=ECR_URL_HERE
            volumeMounts:
            - mountPath: /ecr
              name: ecr-volume
          volumes:
          # Shared scratch space; it disappears together with the Pod.
          - name: ecr-volume
            emptyDir: {}
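As for the earlier remark about not hardcoding the region and ECR URL, one possible way to do it is sketched below. This is only a fragment of the CronJob's Pod spec, not a complete manifest, and the variable names AWS_REGION and ECR_URL are illustrative choices rather than anything the images require.

```yaml
# Fragment of the Pod spec inside the CronJob's jobTemplate (a sketch).
# AWS_REGION and ECR_URL are hypothetical names chosen for this example.
initContainers:
- name: awscli
  image: bitnami/aws-cli
  env:
  - name: AWS_REGION
    value: REGION_HERE
  command:
  - /bin/bash
  - -c
  - |-
    aws ecr get-login --region "$AWS_REGION" | cut -d' ' -f 6 > /ecr/ecr.token
  volumeMounts:
  - mountPath: /ecr
    name: ecr-volume
containers:
- name: kubectl
  image: bitnami/kubectl
  env:
  - name: ECR_URL
    value: ECR_URL_HERE
  command:
  - /bin/bash
  - -c
  - |-
    kubectl -n dns delete secret --ignore-not-found ecr-registry
    kubectl -n dns create secret docker-registry ecr-registry --docker-username=AWS --docker-password="$(cat /ecr/ecr.token)" --docker-server="$ECR_URL"
  volumeMounts:
  - mountPath: /ecr
    name: ecr-volume
```

The values still live in the manifest, but they sit in one obvious place and could just as well come from a ConfigMap.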
You can now use the ECR secret to pull images by adding the following to your Pod spec:
imagePullSecrets:
- name: ecr-registry
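For context, a complete Pod using the secret might look roughly like the sketch below. The Pod name, container name, and image reference are placeholders made up for this example; the one real constraint is that the Pod must live in the same namespace as the secret (dns here).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ecr-pull-test        # hypothetical name
  namespace: dns             # must match the namespace of the ecr-registry secret
spec:
  imagePullSecrets:
  - name: ecr-registry
  containers:
  - name: app                # hypothetical name
    # Placeholder image reference: the registry host (without a scheme),
    # then repository and tag.
    image: REGISTRY_HOST_HERE/IMAGE_HERE:TAG_HERE
```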
And yes, it is possible to solve the same issue with instance profiles, by allowing the following ECR actions:
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:GetRepositoryPolicy",
"ecr:DescribeRepositories",
"ecr:ListImages",
"ecr:BatchGetImage"
This is a solution for when, for whatever reason, you cannot. Plus, it may give you ideas for other helpers using aws-cli and kubectl, or even for other clouds and registries.