Over the past three years, I've navigated the sometimes turbulent waters of managing Kubernetes clusters. This journey, filled with challenges and discoveries, has given me a deep understanding of this cutting-edge technology and its many facets. In this article, I wish to share with you the ten most valuable lessons I've learned as a Kubernetes cluster manager.

These lessons span a range of topics, from managing the underlying infrastructure to optimizing deployment processes, and include best practices for ensuring the scalability and security of your clusters. Whether you're new to the world of Kubernetes or a seasoned expert, these insights will provide you with an enriching perspective on how to effectively manage your Kubernetes clusters.

Let's dive into these lessons, the fruit of three years of experience, successes, and challenges overcome.

Lesson 1: Run Kubernetes in the cloud.

Unless you are under extreme constraints, managing the underlying Kubernetes infrastructure yourself is rarely worth it. You'll spend time debugging issues that add no value to your business. Being an expert in kube-apiserver, kubelet, etcd, kube-proxy, etc., is good, but maintaining them day to day adds no value either, and you don't need that expertise to manage a cluster effectively. Delegate low-level tasks to cloud service providers (AWS, Azure, GCP, OVH...), who handle them better. At @HK-TECH, we chose AWS and EKS (ECS isn't Kubernetes!).

Lesson 2: Deploy your entire Kubernetes-related infrastructure with code.

Not a single piece of your cluster should be set up manually via the console, not even a simple tag. No 'I fixed it quickly in the console, I'll update the code later.' You never will.

Lesson 3: Avoid overusing Helm charts that you don't fully control or understand.

Yes, they get you running fast and save you from writing your own YAML, until an update breaks everything. If you're short on time, at least make the effort to understand each variable in the values.yaml file instead of accepting the defaults blindly. At HK-TECH, the rule is no Helm charts; at worst, we retrieve the templates from the chart and own the resulting YAML.

Lesson 4: Kubernetes doesn’t like lift and shift. 

You'll need to redesign your old apps to make them cloud-compatible. It's not Kubernetes' job to adapt to your app; it's the application's job to adapt. If you can't rewrite your apps, maybe stick to your good old VMs.

Lesson 5: To mesh or not to mesh?

Don't install a service mesh if you don't need one. How do you know? Ask two questions: Do the applications in my cluster communicate with each other? Do the exchanges between them need to be secured? If the answer to both is yes, then installing a service mesh might be useful. I don't have a specific recommendation; in my experience, they're all broadly comparable.

Lesson 6: Avoid multiplying tools. 

There are numerous auxiliary tools promising efficiency in cluster management: argocd, lens, k9s, keda, krew, kubectx, kubens, kail... Avoid piling them up; good old kubectl covers 90% of needs. Personally, I limit myself to kubectx, kubens, and k9s, which significantly ease cluster administration.

Lesson 7: You must always define resource limits (memory and CPU) allocated to your pods. 

This prevents badly coded or misconfigured apps from hogging all your cluster resources and knocking down your other apps one after another. It is also why you should be cautious with Helm charts and always check the source behind the nice packaging: a chart may ship without any limits set.
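One way to enforce this is a small check that runs before anything is applied. The following is a minimal sketch, not a definitive tool: it assumes the manifest has already been parsed into a Python dict (for example from a rendered chart), the function name containers_missing_limits is made up for illustration, and it only inspects the main containers of a Deployment-style object.

```python
from typing import Any

# The two limits this article asks for on every container.
REQUIRED_LIMITS = ("cpu", "memory")


def containers_missing_limits(manifest: dict[str, Any]) -> list[str]:
    """Return the names of containers lacking a CPU or a memory limit.

    Assumes a Deployment-style layout: spec.template.spec.containers.
    """
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        if any(key not in limits for key in REQUIRED_LIMITS):
            missing.append(container["name"])
    return missing


# Hypothetical manifest: "api" is bounded, "sidecar" has no limits at all.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "api",
         "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
        {"name": "sidecar"},
    ]}}},
}

print(containers_missing_limits(deployment))  # ['sidecar']
```

Failing the pipeline whenever this list is non-empty keeps an unbounded container from ever reaching the cluster.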

Lesson 8: Think stateless.

Ideally, avoid persisting data in your pods. If it is unavoidable, favor NAS mounts over disks. Otherwise, you'll be surprised to find that some pods in your deployment lack access to the persisted data. A disk can generally be mounted on only one node at a time. So, if your pods are distributed across nodes, pods on the node holding the disk can share the data, but pods on other nodes cannot reach it. NAS mounts like EFS sidestep this issue.
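To make the reasoning concrete, here is a toy model of the two relevant Kubernetes volume access modes, ReadWriteOnce (one node at a time, typical of disks) and ReadWriteMany (many nodes, typical of NAS). It is only a sketch: the function, pod names, and node names are invented for illustration, and it does not talk to a real cluster.

```python
def pods_with_access(access_mode: str, attached_node: str,
                     pod_nodes: dict[str, str]) -> list[str]:
    """Return the pods able to mount the volume, sorted by name.

    pod_nodes maps each pod name to the node it is scheduled on.
    attached_node is the node a ReadWriteOnce volume is attached to.
    """
    if access_mode == "ReadWriteMany":
        # Network file systems can be mounted from every node.
        return sorted(pod_nodes)
    if access_mode == "ReadWriteOnce":
        # A disk is attached to one node; only pods there can use it.
        return sorted(p for p, n in pod_nodes.items() if n == attached_node)
    raise ValueError(f"unsupported access mode: {access_mode}")


# Hypothetical deployment with three replicas spread over two nodes.
pods = {"web-a": "node-1", "web-b": "node-1", "web-c": "node-2"}

print(pods_with_access("ReadWriteOnce", "node-1", pods))  # ['web-a', 'web-b']
print(pods_with_access("ReadWriteMany", "node-1", pods))  # ['web-a', 'web-b', 'web-c']
```

With the disk-style mode, web-c is left out simply because it was scheduled on another node, which is exactly the surprise described above.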

Lesson 9: Configure HPA (Horizontal Pod Autoscaler). 

If you want to leverage Kubernetes' ability to scale your workloads automatically based on demand, configure an HPA for all your application projects. (This is another limitation of Helm charts: an HPA is often missing.)
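To get a feel for what an HPA will do before you configure one, it helps to run the numbers. The Kubernetes documentation gives the core rule as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with no scaling when the ratio is within a tolerance (0.1 by default). The sketch below applies that rule and clamps the result to the min/max bounds; it is a simplification that ignores stabilization windows, missing metrics, and pod readiness, and the function name is my own.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Simplified version of the documented HPA scaling rule."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        wanted = current_replicas
    else:
        wanted = math.ceil(current_replicas * ratio)
    # The HPA never goes outside its configured min/max bounds.
    return max(min_replicas, min(max_replicas, wanted))


# 4 pods at 90% average CPU with a 60% target: scale out.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))  # 6
# 3 pods at 62% with a 60% target: within tolerance, no change.
print(desired_replicas(3, 62, 60, min_replicas=2, max_replicas=10))  # 3
# 2 pods at 200% with a 50% target: wants 8, capped at max_replicas.
print(desired_replicas(2, 200, 50, min_replicas=1, max_replicas=5))  # 5
```

The third case shows why max_replicas deserves as much thought as the target: it is the only thing that bounds a runaway scale-out.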

Lesson 10: Don’t be afraid of change. 

On average, expect 3 cluster version upgrades per year, about one every 4 months. Some upgrades are seamless, but often there are breaking changes. To prepare for them, I recommend thoroughly reading the release notes and the feedback from those who upgraded before you. What I recommend, and what we've implemented at HK-TECH, is to always stay one release behind the latest version (unless the latest release contains security fixes).
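The "one release behind" rule is easy to automate. Here is a minimal sketch, assuming you already have the list of versions your provider offers; the function names and version numbers are purely illustrative. Note that versions are compared numerically, because sorting them as plain strings would place 1.9 after 1.10.

```python
def parse_minor(version: str) -> tuple[int, int]:
    """Turn '1.28' (or '1.28.3') into (1, 28) for numeric comparison."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def upgrade_target(available: list[str], security_override: bool = False) -> str:
    """Pick the release just behind the latest one.

    security_override jumps straight to the latest release, for the case
    where it carries security fixes you cannot wait for.
    """
    if not available:
        raise ValueError("no versions available")
    ordered = sorted(set(available), key=parse_minor)
    if security_override or len(ordered) == 1:
        return ordered[-1]
    return ordered[-2]


# Hypothetical list of versions offered by the provider.
versions = ["1.27", "1.29", "1.28"]

print(upgrade_target(versions))                          # 1.28
print(upgrade_target(versions, security_override=True))  # 1.29
```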

There you go. Happy Kubernetes!


Published on: Sunday, November 12, 2023
