
Kubernetes on Docker: How Kubernetes Fixes Containers

Published on March 10, 2021

If you’ve been even on the fringes of DevOps, software engineering, or just IT in general at any point in the last five years, you’ve probably heard of Kubernetes. It’s supposed to be a game changer, enabling faster, more reliable software deployments, drastically improved uptime, and incredibly flexible app scaling.

But what exactly is it under the hood? What problems does it solve? And what’s up with the funny name? Class is in session, welcome to Kubernetes 101.

A Brief History

Kubernetes was created in 2014 by a team of engineers at Google. Its predecessor within Google was a system called Borg, which allowed engineers to deploy workloads to clusters made up of tens of thousands of hosts without having to worry about scheduling their workloads to specific hosts, resource management, or infrastructure failures taking down their workloads. Any time you use a user-facing application, like Google Docs, Gmail, or even Google search, you are probably interacting with code running on Borg.

For workloads to run on a system like Borg or Kubernetes, they must be written with a microservice architecture in mind. To define microservices, first think about any old-school server application. A web app might have an Apache or IIS web server front end and a MySQL or Microsoft SQL Server database. Everything else, however (the UI, the data processing, the data access layer, the code for caching, payments, logging, and whatever else) is all one big self-contained application, commonly called a monolith.

There’s little flexibility in updating or maintaining individual pieces of a monolithic application without dealing with the application as a whole. If we rewrite that application from the ground up so that each piece is self-contained code running independently of the others, we have a microservice architecture. Need to fix a bug in the payments system? With a monolith, that requires an update to the entire app. With a microservice application, however, just the payment service’s code is changed and just that one service is updated in production. Development, deployments, and operations are all greatly streamlined and simplified.

What are Containers?

These microservices typically run in containers. A container holds all the code, configuration, and dependencies required for a service to run. Nothing needs to be installed on the host machine to run the container (except the container engine itself). If a service requires a specific version of Node, a specific Python package, or anything else, that dependency is baked into the container. The dependencies exist only within the container when it runs, eliminating the classic problem of “it worked on my machine, it should work in production.”

There are many container engines available to run your services, but far and away the most popular is Docker. Docker has become fairly ubiquitous in the container world as an open source solution running on Linux, macOS, and Windows. Microsoft even partnered with Docker to bring native container support to Windows 10 and Windows Server 2016, including Hyper-V isolated containers.

Containers: The Cons

Containers seem to fix a lot of problems beyond code portability. Is the web traffic to your app starting to grow? Launch additional front end containers and add them to your load balancer. Voila, instant app scaling! Got an indexing service container that crashes regularly? Set up some alerting so you know it went down, then log into the server and restart it. A physical host bites the dust, taking all the containers it was running offline? Redeploy all those containers onto a new host and say hello to kinda sorta slightly improved uptime!

Wait a minute, all these scenarios sound awful! All this manual work: scaling containers up and down, killing and restarting containers, and entire hosts taking part or all of your app offline! Containers sound awful, don't they? No better than monolithic applications, at least to the Ops team tasked with keeping your code running. If only there were a better way…

A Solution to Containers

Kubernetes is a container orchestrator. That means it’s a layer that sits on top of Docker (or your choice of other container engine) and manages the details of running, or in Kubernetes vocabulary, scheduling, your containers.

Let’s look back at our list of container problems. Need to scale up? Kubernetes uses replicas, which are simply more containers of the same service. Need more or fewer replicas? It’s as simple as changing the desired replica count and watching as Kubernetes starts new containers or kills off the unneeded ones.
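As a sketch of how simple this is in practice, here’s a minimal Deployment manifest where the replica count is a single field (the service name and image path here are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service        # hypothetical service name
spec:
  replicas: 3                  # change this number and Kubernetes adds or removes containers
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
    spec:
      containers:
        - name: payment-service
          image: registry.example.com/payment-service:1.0   # hypothetical image
```

Changing `replicas: 3` to `replicas: 5` and reapplying the manifest is the entire scaling workflow.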

Replicas can be monitored for health, so if a container becomes unhealthy, Kubernetes kills the container and starts up a fresh one, maintaining your specified number of healthy replicas.
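Health monitoring is configured per container with probes. A minimal sketch, assuming a hypothetical HTTP service that exposes a `/healthz` endpoint on port 8080:

```yaml
# Fragment of a Deployment's Pod template; the endpoint and port are hypothetical.
containers:
  - name: payment-service
    image: registry.example.com/payment-service:1.0
    livenessProbe:
      httpGet:
        path: /healthz          # hypothetical health-check endpoint
        port: 8080
      initialDelaySeconds: 10   # give the service time to start before probing
      periodSeconds: 5          # probe every 5 seconds; repeated failures trigger a restart
```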

Finally, the dreaded KIA host server. Kubernetes isn’t intended to run on a single host, but on multiple hosts known as a cluster. All the nodes or individual machines in your cluster talk to each other to keep up with who is healthy, who isn’t, and who is running what containers. If the cluster decides a node isn’t healthy, it’s removed from operation and any containers it was previously running are rescheduled onto the other healthy nodes in the cluster.

This brings up another point: automatic container balancing. If you have a three-node cluster and start three replicas of a container, guess what? Kubernetes will automatically schedule each container onto a separate node. That helps reliability: if a single node dies, replicas of the container are still running on the other nodes, which helps your uptime metrics tremendously.
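The scheduler spreads replicas across nodes by default, but you can also state that intent explicitly. A sketch using a topology spread constraint (the app label is hypothetical):

```yaml
# Fragment of a Pod template spec: ask the scheduler to spread replicas across nodes.
topologySpreadConstraints:
  - maxSkew: 1                              # replica counts per node may differ by at most 1
    topologyKey: kubernetes.io/hostname     # treat each node as a separate domain
    whenUnsatisfiable: ScheduleAnyway       # prefer spreading, but don't block scheduling
    labelSelector:
      matchLabels:
        app: payment-service                # hypothetical label
```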

Kubernetes for Old School Admins

This clustering capability should sound an awful lot like clustering within virtualization platforms like Hyper-V or VMware’s ESX, and the comparison is a fair one. If that’s the world you’re coming from, the more you dig into Kubernetes the more similarities like this you will see, which will help immensely in forming your own mental models of how Kubernetes works.

For example, Hyper-V and ESX have built-in virtual network capabilities which allow you to connect many virtual machines to the LAN via the physical host’s networking adapter. There’s also flexibility to only allow communication between other virtual machines, restricting access from outside the virtual network. The same functionalities all exist with virtual networking in Kubernetes.
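The rough Kubernetes analog of a restricted virtual network is a NetworkPolicy. As a sketch, this hypothetical policy lets the selected pods receive traffic only from other pods in the same namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: internal-only          # hypothetical policy name
spec:
  podSelector:
    matchLabels:
      app: payment-service     # hypothetical label for the pods being protected
  ingress:
    - from:
        - podSelector: {}      # allow traffic only from pods in this namespace
```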

Also consider security. A virtual machine has a few basic security considerations; for starters, unused ports must be closed on the OS firewall to limit network access. Kubernetes addresses this by requiring you to explicitly define which ports allow access, whether from outside the cluster or from within it.
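In practice, those explicitly defined ports show up in a Service manifest. A minimal sketch (the name and ports are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: payment-service   # hypothetical service name
spec:
  selector:
    app: payment-service
  ports:
    - port: 80            # port exposed inside the cluster
      targetPort: 8080    # port the container actually listens on
  type: ClusterIP         # reachable only inside the cluster; a LoadBalancer type exposes it externally
```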

Another huge consideration is that access to the host from within a VM must be explicitly denied. Imagine a multi-tenant VM host where a hacker could move from one compromised VM to the host, then on to any other customer VM running on that same host. Kubernetes, together with Docker, works to prevent a poorly secured container from being used to breach the host in the same way.
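On the Kubernetes side, much of that hardening lives in a container’s securityContext. A sketch of common settings:

```yaml
# Fragment of a container spec: common hardening settings.
securityContext:
  runAsNonRoot: true                # refuse to start if the image would run as root
  allowPrivilegeEscalation: false   # block setuid-style privilege escalation
  readOnlyRootFilesystem: true      # the container can't modify its own filesystem
```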

Kubernetes Deployments and Updates

There’s one final set of benefits Kubernetes brings to your applications, and they revolve around deploying code. Kubernetes deploys containers, so it needs somewhere to pull those containers from. Whether you use prebuilt containers or build your own, they all come from a container registry, be it Docker Hub, Amazon’s Elastic Container Registry, Quay, or some form of self-hosted registry.

When you deploy your application Kubernetes checks if it has a copy of the containers already cached. If not, it downloads the container from the appropriate registry and deploys it however you specified (we’ll get more into how you actually deploy containers in a future article).
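The image reference and the caching behavior are both part of the container spec. A sketch (the registry path is hypothetical):

```yaml
# Fragment of a container spec; the registry path is hypothetical.
containers:
  - name: web
    image: quay.io/example/web:1.4   # hypothetical registry/image reference
    imagePullPolicy: IfNotPresent    # use the node's cached copy if it already has this tag
```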

How about updates? Say you update your code and build a new version of your container. Do you have to go through some painful process of killing off all the replicas running the old version, starting up a bunch of new ones on the new version, and hoping no one notices the downtime in between? Nope. It’s as simple as telling Kubernetes, “Hey, this container has a new version, 2.0. Go do a rolling update please.” Kubernetes then downloads container v2.0, kills one old container, starts a new one, confirms it’s online and healthy, and repeats the process until all the replicas are updated. Because the load balancer monitors the health of every container, traffic shifts to whichever containers are live at the moment. No downtime for users and no manual process for you.
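A rolling update like that is configured on the Deployment itself. A sketch of the strategy settings:

```yaml
# Fragment of a Deployment spec: rolling-update settings.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1   # replace at most one replica at a time
    maxSurge: 1         # allow one extra replica to run during the rollout
```

With this in place, changing the image tag in the manifest (say, from 1.0 to 2.0) and reapplying it kicks off the rolling update automatically.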

This points out an interesting philosophy behind container deployments: containers are immutable. A server you build and deploy, then over its lifecycle you might upgrade software, patch the OS, or make networking or user changes. A container, however, is deployed with all its code and settings baked in; nothing should change on the container once it’s alive. If something needs to be patched or reconfigured, you patch your code or tweak your config, build a new container, and update like we described earlier. This way containers can come and go with minimal configuration, and deployments are safer, more reliable, and more repeatable.

What's Next? Kubernetes 201

In our next post on Kubernetes, we’ll dive into the components that make up a Kubernetes cluster. Then we’ll explore the many methods of deploying a Kubernetes cluster, from manually configuring your own nodes to fully managed clusters on cloud providers like AWS and Google Cloud.

We’ll also look at Minikube, a local single node cluster for development and testing that is super easy to set up. Finally, we’ll install and connect to Minikube and then create your first deployment, so stay tuned until next time.


© 2024 CBT Nuggets. All rights reserved.