At Runway, we've been using Kubernetes to run our production workloads since day one. Over nearly five years, we've operated, (re)created, and upgraded our K8s clusters at least a dozen times. Our last upgrade was the hardest, as we had to modernize some of our oldest resources and third-party software deployed in the cluster, without causing application downtime. This post presents a few tips and strategies we learned during the process.
Feel free to jump around... (Hint: the Helm sections are really good).
...Or review all of our learnings in order.
We hope this one goes without saying, but you should perform your upgrades in a staging cluster first. If you don't have a safe pre-production cluster to prepare and test your upgrade in, you run a greater risk of breaking something in production.
The approach we've taken at Runway is to maintain one staging cluster for each production cluster (e.g. stage-aws1, prod-aws1, stage-aws2, etc.). The production and staging cluster environments are intended to be as identical as reasonably and economically possible. Our staging clusters serve two purposes:
There is no shame if you've found yourself in a scenario where you don't have a dummy/non-production cluster to try an upgrade on first, but we do strongly suggest you do something to change that. We've found that managing our clusters via a Terraform module simplifies the process of creating new clusters and keeping existing ones in sync over time.
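To make that concrete, here's a rough sketch of what the Terraform workflow can look like. The directory layout and naming are hypothetical (not our exact setup); the idea is that each cluster directory is a thin wrapper around the same shared module, so creating or syncing a cluster is a plan/apply away.

# Hypothetical layout: clusters/<name>/main.tf all reference one shared cluster module
cd clusters/stage-aws1
terraform init
terraform plan    # review what the shared module would create or change
terraform apply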
This is perhaps the most important learning in this post. Having a detailed document with checklists that outline what needs to be done, how to do it, and what to do if things go wrong is critical to ensuring the rollout goes smoothly. Updating the checklist as you go can also help keep your team informed in the process.
Here's an example of what the checklist might look like.
Pre Upgrade
- Run kubent to verify that no removed K8s APIs are used in the cluster.
- Confirm all nodes are on the expected current version (kubectl get nodes | grep v1.21).

Upgrade
- Phase I - Control plane upgrade
- Phase II - Node upgrades
  - cordon and drain nodes whose workloads are preventing scale down (see the drain sketch after this checklist)

Post Upgrade
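For the cordon/drain step in Phase II, the commands look roughly like this. The node name is a placeholder, and on older kubectl versions the --delete-emptydir-data flag is spelled --delete-local-data.

# Stop new pods from landing on the node, then evict its existing workloads
kubectl cordon ip-10-0-1-23.ec2.internal
kubectl drain ip-10-0-1-23.ec2.internal --ignore-daemonsets --delete-emptydir-data
# Once drained, the node can be scaled down or replaced with an upgraded one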
There are lots of opinions about the differences between a "runbook" and a "checklist". Our thinking is: who cares, so long as you produce something practical and useful. You want something simple enough that people will actually use and update it, but detailed enough that it isn't missing any crucial details.
We've found that most Kubernetes upgrades are fairly routine and unremarkable. You upgrade the control plane and nodes in staging, do some QA, and if all looks good, you do the same in production. Aside from a few transient errors during the control plane upgrade, the whole process can be pretty smooth.
That is, unless you're upgrading to a version which removes deprecated APIs that are still in use in your cluster. If that's the case, your struggle may be in the preparation rather than the upgrade.
This was the situation we found ourselves in when upgrading to v1.22. Many third-party resources we deployed were as old as our cluster, with some receiving no upgrades in four years. In fact, our NGINX ingress controller, Prometheus operator, and cert-manager deployments were all well pre-1.0.
The K8s Deprecated API Migration Guide outlines every breaking API change in a way that humans can actually understand. Before upgrading, you must stop using any APIs that are removed in the version you're targeting, and the migration guide can be a great first predictor of "how hard is this going to be?" We'll cover how to locate and remove the use of these APIs in the next section, but the takeaway here is that if there isn't much listed in the guide for your target version, you can expect a simpler upgrade.
The Kube-no-trouble project is the easiest way to detect what deprecated APIs are currently in use in your cluster before they cause a problem after an upgrade.
# Install in one line
# (source: https://github.com/doitintl/kube-no-trouble#install)
sh -c "$(curl -sSL https://git.io/install-kubent)"
# Uses your current Kubernetes context to detect and communicate
# with the cluster
kubent
6:25PM INF >>> Kube No Trouble `kubent` <<<
6:25PM INF Initializing collectors and retrieving data
6:25PM INF Retrieved 103 resources from collector name=Cluster
6:25PM INF Retrieved 0 resources from collector name="Helm v3"
6:25PM INF Loaded ruleset name=deprecated-1-16.rego
6:25PM INF Loaded ruleset name=deprecated-1-20.rego
__________________________________________________________________________________________
>>> 1.16 Deprecated APIs <<<
------------------------------------------------------------------------------------------
KIND NAMESPACE NAME API_VERSION
Deployment default nginx-deployment-old apps/v1beta1
Deployment kube-system event-exporter-v0.2.5 apps/v1beta1
Deployment kube-system k8s-snapshots extensions/v1beta1
Deployment kube-system kube-dns extensions/v1beta1
__________________________________________________________________________________________
>>> 1.20 Deprecated APIs <<<
------------------------------------------------------------------------------------------
KIND NAMESPACE NAME API_VERSION
Ingress default test-ingress extensions/v1beta1
Knowing is half the battle. Removing the use of the APIs is the other half.
Kubernetes supports multiple versions of an API at once. It is common for the "preferred" version of an API to be the latest stable (or at least non-deprecated) version, but your use of it via CI or through a deployed operator can often reference an older, deprecated version.

You can run kubectl api-resources to view the preferredVersion of all APIs currently supported by your cluster. If the version of a resource in source control differs from the preferred version, you'll need to upgrade it in your code. The good news is that the K8s API has probably already done the conversion for you: if you view a deployed resource, you may find it reflected back to you using the latest version and schema.
kubectl get -n $NAMESPACE $RESOURCE_TYPE $RESOURCE_NAME -o yaml
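For example, using the test-ingress flagged by kubent above: even though it was applied as extensions/v1beta1, the API server will typically hand it back at the newer API version. The commented output is what we'd expect to see, not a capture from a real cluster.

kubectl get -n default ingress test-ingress -o yaml | grep '^apiVersion:'
# apiVersion: networking.k8s.io/v1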
Kubernetes also maintains a tool called kubectl-convert, which you can use to convert manifests between different API versions; see the Kubernetes documentation for details.
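Here's a rough sketch of how you might use it on a manifest that's pinned to an old API version (the file names are just examples):

# Rewrite an old manifest to the replacement API version
kubectl convert -f old-ingress.yaml --output-version networking.k8s.io/v1 > new-ingress.yaml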
Kubernetes actively maintains the latest three releases and guarantees one year of patch support for every release. If you use a managed Kubernetes service like EKS, your cloud provider may maintain a different set of supported releases and release windows.
Beyond knowing which versions of K8s are supported, understanding how much version skew each cluster component tolerates helps you distinguish what should be updated from what must be updated when planning your upgrade.
The policy states that:
- No cluster component may be newer than the API server (kube-apiserver). This is why we always upgrade the control plane first.
- kubelet can be up to two minor versions behind kube-apiserver (this saved us in a pinch once when a regression on an AMI forced us to hold kubelet back).
- kube-controller-manager, kube-scheduler, and cloud-controller-manager may be up to one minor version behind kube-apiserver.
- kubectl can be within one minor version (newer or older) of kube-apiserver (but we've gotten away with larger drifts).
- It's possible for kube-proxy to be tolerant to one version ahead of or behind kubelet, although this is not recommended.
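A quick way to see where you currently stand against the skew policy: kubectl prints both the client and kube-apiserver versions, and the VERSION column on nodes is the kubelet version.

# Client (kubectl) and server (kube-apiserver) versions
kubectl version
# kubelet version per node (VERSION column)
kubectl get nodes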
Many third-party tools and infrastructure packages you can deploy into K8s will offer you two options for doing so:
1. A Helm chart
2. A kubectl command which deploys some static YAML files (sometimes called a "default static install")

Always pick the Helm chart. We used to toy with deploying static manifest files because they were easier to reason about (no magic YAML generation) and we could give them a simple eye test to understand everything that would be deployed in our cluster. But we paid the price for this decision in the long run.
The reason being: Upgrades. If you use a Helm chart, your upgrade path will be much easier and well-defined. The day will undoubtedly come when you must upgrade a third-party component deployed in your cluster. When it does, if it was deployed with a Helm chart, other people will have already solved your problems for you (that's what chart maintainers do). If you deployed with static manifest files, you'll be the one having to find some meaningful mapping between the arbitrary collection of manifests thrown into your cluster and the ones associated with the release you want to upgrade to. Spoiler: this is terrible, and a correct mapping/solution may not even exist.
There is no magic to Helm, even if it feels that way. It takes the YAML templates published by chart maintainers (the chart's templates/ directory), renders them based on the default values.yaml file included in the published chart plus any settings you override, and then generates plain YAML files which get applied to your cluster.

Chart maintainers release their charts with sane default values, and you get to narrowly change them in select ways. Helm remembers which values you override, so you don't have to worry about re-specifying them the next time you upgrade (although it's still worth double-checking them when you do).
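If you want to see that rendering pipeline for yourself, a couple of commands make it tangible. We're using ingress-nginx as the example again; controller.replicaCount is just an illustrative value key.

# Dump a chart's default values.yaml without installing anything
helm show values ingress-nginx/ingress-nginx --version 4.3.0
# Render the manifests locally with an override, without touching the cluster
helm template my-ingress ingress-nginx/ingress-nginx --version 4.3.0 \
  --set controller.replicaCount=2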
Here are a few commands we lean on heavily when deploying or upgrading a Helm chart release.
# Our normal upgrade process ----------------------------------------------------
REPO=ingress-nginx
CHART=ingress-nginx
NAMESPACE=nginx
# Get your currently deployed helm chart version and release name
helm list -n $NAMESPACE
RELEASE=ingress-nginx
# List all remote versions of the chart that are available for install.
helm search repo $REPO/$CHART -l
DESIRED_CHART_VERSION=4.3.0
# Preview the differences between your deployed chart and one you're upgrading to
# Requires the helm `diff` plugin to be installed:
# helm plugin install https://github.com/databus23/helm-diff
helm diff upgrade $RELEASE $REPO/$CHART -n $NAMESPACE --version $DESIRED_CHART_VERSION --reuse-values
# Don't like something you see there? Review the source.
helm pull $REPO/$CHART --version $DESIRED_CHART_VERSION
DOWNLOADED_CHART="$CHART-$DESIRED_CHART_VERSION.tgz"
tar -xvzf $DOWNLOADED_CHART && rm $DOWNLOADED_CHART
open $CHART
# Deploy the upgrade. I'd suggest actually doing this with something like helmfile instead
# so that version and settings are source controlled. More on that later.
helm upgrade $RELEASE $REPO/$CHART -n $NAMESPACE --version $DESIRED_CHART_VERSION --reuse-values
# Rollback to the previous version if something went wrong
helm rollback $RELEASE -n $NAMESPACE
# A few additional helpers -----------------------------------------------------
# View all deployed YAML manifests associated with a release
helm get manifest $RELEASE -n $NAMESPACE
# View all YAML manifests a chart will apply **before** running it
helm upgrade --install $RELEASE $REPO/$CHART -n $NAMESPACE --dry-run
# See what configurable values a release is deployed with
helm get values $RELEASE -n $NAMESPACE # Only the values you overwrote
helm get values $RELEASE -n $NAMESPACE --all # All of the values (defaults + yours)
Now imagine trying to do all of that with some static YAML files tossed into your cluster at some point in the past.
Another bonus is that when you upgrade a release, Helm automatically removes manifests that are no longer part of the new chart version. In other words, it cleans up after itself.
# No need to remember to remove orphaned resources with...
kubectl delete serviceaccount permissive-serviceaccount-used-by-deleted-deployment --namespace you-totally-remember-to-do-this-every-time-you-delete-something-right
When upgrading a Helm chart there are two versions you need to be aware of: the Helm chart version and one or more application versions. This is a concept you don't see every day in the software engineering world.

We like to think of the chart version as a higher-level meta-version: it applies a semantic versioning scheme to the infrastructure manifests needed to deploy a well-behaved version of an application (which has its own version).

A Helm chart version upgrade may, but does not have to, change the application code version. Similarly, an application code version bump may require no infrastructure changes and result in no new chart version (although chart maintainers may consider it best practice to do a minor chart version bump which changes only the application code version in the values.yaml file).
+----------------+ +----------------+ +----------------+ +----------------+
| Helm Chart v1 |----| Helm Chart v2 |----| Helm Chart v2 |----| Helm Chart v3 |
+--------+-------+ +--------+-------+ +--------+-------+ +--------+-------+
| | | |
| | | |
v v v v
+----------------+ +----------------+ +----------------+ +----------------+
| App Code v1.0 |----| App Code v1.0 |----| App Code v1.1 |----| App Code v1.1 |
+----------------+ +----------------+ +----------------+ +----------------+
When trying to decide which helm chart version to upgrade to, we ask ourselves: "Can we tolerate the application code version used by the latest helm chart version?"
# You can use a command like this to list all available chart
# versions and their corresponding application code versions
$ helm search repo ingress-nginx/ingress-nginx -l
NAME CHART VERSION APP VERSION DESCRIPTION
ingress-nginx/ingress-nginx 4.6.0 1.7.0 ...
ingress-nginx/ingress-nginx 4.5.2 1.6.4 ...
ingress-nginx/ingress-nginx 4.5.0 1.6.3 ...
ingress-nginx/ingress-nginx 4.4.2 1.5.1 ...
ingress-nginx/ingress-nginx 4.4.1 1.5.2 ...
ingress-nginx/ingress-nginx 4.4.0 1.5.1 ...
ingress-nginx/ingress-nginx 4.3.0 1.4.0 ...
ingress-nginx/ingress-nginx 4.2.5 1.3.1 ...
Avoid scenarios where the only representation of what's deployed in your cluster is the deployed manifests and Helm releases themselves. We highly recommend checking all of the manifests you deploy into your cluster into a Git repo. Tools like kustomize (now built into kubectl) and helmfile can help here. Better yet, you can automatically deploy manifests and Helm charts from a Git repository using tools like Argo CD and Flux, ensuring your Git repo is the source of truth for what's deployed inside K8s.
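As a sketch of what that can look like with helmfile (the values file path here is illustrative), the chart version and overrides live in Git rather than only in the cluster:

# helmfile.yaml, checked into Git, describing the desired release
cat > helmfile.yaml <<'EOF'
repositories:
  - name: ingress-nginx
    url: https://kubernetes.github.io/ingress-nginx

releases:
  - name: ingress-nginx
    namespace: nginx
    chart: ingress-nginx/ingress-nginx
    version: 4.3.0              # chart version pinned in source control
    values:
      - values/ingress-nginx.yaml
EOF

# Preview the change (uses the helm-diff plugin), then apply it
helmfile diff
helmfile apply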
Good Infrastructure-as-Code (IaC) and GitOps practices can help you learn what needs to be upgraded, enable peer review for your suggested changes, and maintain a good history of past changes that can inform future ones.
This sounds like a joke, but it's not. Each new component you deploy in your cluster is baggage that must be brought along for the long haul. Exercising mindfulness when deciding to add components and regularly removing outdated or frivolous deployments can leave you with fewer blockers the next time you prepare to upgrade Kubernetes.
If you found this post interesting, we invite you to explore career opportunities at Runway. We're always on the lookout for talented individuals who share our passion for cloud infrastructure. We run big, complex things in far-off places.
May your Kubernetes upgrades be smooth and your downtime nonexistent. Until next time, happy upgrading!