WWCode Talks Tech #25: Intro to Kubernetes & GKE
Google has been developing and using containers to manage our applications for over 12 years. That's a long time in technology terms, given how fast the field moves. Many people first became familiar with containers in 2014, when Docker took off. An important thing to know is that containers existed well before Docker, in a few different forms: there were Solaris Zones, FreeBSD Jails, and the Linux Containers (LXC) project, an open-source effort to create containers in the Linux operating system. Docker came along and made containers easy to use. They took off, and everybody started using them. A lot of what runs at Google runs in containers. In Google Cloud Platform, even the virtual machines run in containers. You usually hear about containers running in virtual machines; here it's the other way around. We launch a lot of containers. We run containers at scale. If this is your first time getting familiar with Kubernetes or hearing about the Cloud Native landscape, I just want to give you a sense of what you're jumping into. As you get into Cloud Native, most of the projects on this landscape are built on Kubernetes. They're meant to extend what Kubernetes can do and add additional functionality. It all comes from containers and Kubernetes. If a technology is known as Cloud Native, that doesn't mean it only works in the cloud. It just means it's designed to work best in the cloud. The vast majority of projects you see on this landscape can run in all environments, not just the cloud.
Kubernetes is Greek for helmsman and is the root of the words governor and cybernetic. It's for managing container clusters: managing the underlying infrastructure that you are going to run containers on, that is, the actual machines and virtual machines that your applications will run on. Kubernetes is inspired by Google's experiences and internal systems for running containers at scale. Google Borg is the internal system that Google came up with to run containers at scale some time ago. At some point, a few Google engineers saw that and thought, I bet a lot of other people would love to run containers like this. They came up with a concept for Kubernetes and open-sourced it. June 6, 2014 was the first commit. Kubernetes supports multiple clouds and bare metal environments. It's a very flexible technology. Kubernetes supports multiple container runtimes. If you've heard of containers before, you've likely heard of Docker. There are other container runtimes, and Docker is not a container runtime per se, at least from Kubernetes' perspective. Docker uses the container runtime containerd to do what it does. containerd is on the Cloud Native landscape. CRI-O is another popular one that people look at for running containers in Kubernetes. There are a variety of ways that you can run containers in Kubernetes. It's all open source. It's written in Go. Kubernetes aims to let you manage applications while abstracting the machines away so that you don't have to worry about them as much.
I explain containers as cookies. I use this analogy for a variety of reasons. The main thing I'm trying to get at with this analogy is explaining container benefits. One reason I like to use the chocolate chip cookie as an analogy for containers is that your goal with a container is to run some application. That's like the chocolate chips in the cookie. It's the best part, the reason you're there. Containers are all about applications. Applications don't always run smoothly by themselves. They often need a lot of other things: dependencies, binaries, and libraries. A container gives you a mechanism for taking all those extra ingredients and baking them up with your application into a portable package, like a cookie. One thing people like about containers is that they can be run on various operating systems and different machines. If containers are like cookies, then Kubernetes is a container orchestrator, which is all about running containers at scale. In this analogy, the Kubernetes container orchestrator would be like a cookie factory, not because it's where the cookies are made. Cookies could be made somewhere else. This cookie factory manages wide-scale cookie distribution. Pods are really how Kubernetes thinks about containers. From Kubernetes' perspective, it hardly even runs containers. It doesn't think about them. It thinks about pods instead. When you want to run a container in Kubernetes, you will end up packaging it in a pod.
We have a pod spec, a YAML file that tells Kubernetes what to run. One of the great things about Kubernetes is your ability to declare what you want to have running. You can define it in a file and tell Kubernetes to make it happen. This is a big benefit of Kubernetes. Our pod spec is the package of cookies in this analogy. A pod can run not just one container but multiple. However, the most common use case is running just one container per pod. Why would you ever run multiple containers in a pod? The most common uses are things like proxies. A lot of service meshes will be put in what we call a sidecar container, which is placed in the pod next to the container that's running your app. These sidecar containers can do things like acting as a proxy. As traffic comes into the pod intended for your application, the proxy can intercept that traffic and make sure that it's allowed to talk to the pod. That's one thing that a lot of service meshes do. They also add some logging and monitoring capabilities in the sidecar. They'll grab logs from your existing application and put those into your dashboard. With a sidecar container, you can get extra observability, logging, monitoring, or proxying without adding any extra code to the application. Two containers in a pod share storage space and network space, and they generally should be two things that are very tightly coupled.
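As a sketch of what that looks like, here's a minimal pod spec with an app container plus a sidecar. The names, images, and ports are illustrative, not from the talk:

```yaml
# A pod spec declaring two tightly coupled containers. Both containers
# share the pod's network and storage. Names and images are made up.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app                  # the container running your application
      image: example.com/my-app:1.0
      ports:
        - containerPort: 8080
    - name: proxy-sidecar        # sidecar that can intercept traffic and add
      image: example.com/proxy:1.0   # logging/monitoring without app changes
      ports:
        - containerPort: 15001
```

You'd hand this file to Kubernetes with something like `kubectl apply -f pod.yaml`, and Kubernetes makes it happen.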
Replication controllers are a key part of Kubernetes. They provide tools to run your application in a highly available way. In this analogy, this would be like multiple packs of these cookies at the store. If you picked up any of these packs of cookies at the store, you would expect it to be pretty much the same. There's an element of quality control here. Kubernetes thinks about containers in terms of pods, but with all of these highly available capabilities that Kubernetes has, like creating replicas, it has another object called a deployment. There are a few different types of objects that you can use this way. A deployment owns the pods for a specific application and can take care of them, doing things like creating replicas and performing rolling upgrades. If you make a change, the deployment has some smarts built into it to roll out that upgrade instead of just killing all of your containers and bringing them back up in some disruptive manner. Deployments are useful for any application that you want to have consistently running. They're generally best for stateless applications, things that don't need to keep state. If they disappear off the face of the earth and their storage goes with them, that should be okay. A deployment is a type of replication controller, and the purpose of replication controllers is quality control.
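A minimal sketch of a deployment, with illustrative names and image, might look like this:

```yaml
# A deployment that keeps three identical replicas of a pod running
# and rolls changes out gradually rather than all at once.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3                # quality control: three matching "packs of cookies"
  selector:
    matchLabels:
      app: my-app            # the deployment owns pods carrying this label
  strategy:
    type: RollingUpdate      # replace pods one batch at a time on upgrade
    rollingUpdate:
      maxUnavailable: 1      # keep at least two replicas serving during a rollout
  template:                  # the pod spec the deployment stamps out
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: example.com/my-app:1.0
```

If you change the image tag in this file and re-apply it, the deployment rolls the new version out replica by replica instead of disruptively killing everything.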
Technically, all of these containers are still running on bare metal or virtual machines, some kind of infrastructure somewhere. Kubernetes has to decide where those pods will run and on what hardware. That is Kubernetes scheduling. You can put anything you want as labels on a Kubernetes object, and labels can be used for very useful things in Kubernetes, like the nodeSelector section of a pod spec. The infrastructure is real, whether it's a bare metal server or a virtual machine, but it's also abstracted in Kubernetes and represented as an object. If you run kubectl get nodes, Kubernetes will output information about all the node objects it knows about. Since those are objects in Kubernetes, you can also add labels to them, which can be useful. When you create a container in Kubernetes, how are you going to access the application that's running in that container? Services are the answer to that question.
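As a sketch, you might label a node and then use a nodeSelector in a pod spec to influence scheduling. The label key, value, and node name here are made up for illustration:

```yaml
# First, label a node, e.g.: kubectl label nodes node-1 disktype=ssd
# Then a pod can ask the scheduler for nodes carrying that label.
apiVersion: v1
kind: Pod
metadata:
  name: fast-storage-app
spec:
  nodeSelector:
    disktype: ssd            # only schedule onto nodes labeled disktype=ssd
  containers:
    - name: app
      image: example.com/my-app:1.0
```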
There are a few different types of services. Type ClusterIP means that the application will only be accessible to other applications within the Kubernetes cluster. There's also NodePort, which means I want this application to be available on a port on the actual node it's running on. Since Kubernetes is abstracting away all of the machines underneath it, sometimes that can be a little strange, because you have to figure out which actual node your pod is running on. You can access it through the port on that node. There's also LoadBalancer, which is a very cloud-native element. Load balancers make so much sense in the cloud because, generally, if you're using something like a managed Kubernetes service, it will be pre-configured to understand that you're running in Google Cloud. It can spin up a load balancer resource in Google Cloud for your application. That makes it so that your application can be easily available on the internet. Load balancers make things so easy in the cloud, but they can be a little complicated if you're running things on-premises, because then you have to use a NodePort to connect your pod to an existing load balancer.
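A minimal sketch of a service, with illustrative names and ports, showing where the type goes:

```yaml
# A service exposing the app's pods. Changing `type` switches between
# ClusterIP (in-cluster only), NodePort (a port on each node), and
# LoadBalancer (a cloud load balancer resource).
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer         # on a cloud like GKE, this provisions a load balancer
  selector:
    app: my-app              # route traffic to pods carrying this label
  ports:
    - port: 80               # port the service exposes
      targetPort: 8080       # port the container actually listens on
```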
Persistent storage for stateful applications in Kubernetes comes from persistent volumes and persistent volume claims. There are also namespaces. Namespaces and role-based access control are really important to understand with Kubernetes. A common use case for namespaces would be a production namespace, a test namespace, or namespaces for different teams. You can use role-based access controls to control which users can deploy into which namespaces. You can split up a cluster across teams or use cases. Running open-source Kubernetes from scratch is pretty difficult. If you're trying to run Kubernetes for your business or trying to run it in production, there are many things you have to think about in the control plane. You have to worry about all of the worker nodes: provisioning and managing them, securing them, making sure you keep them patched and upgraded, making sure you keep Kubernetes itself patched and upgraded, plus all of the monitoring and management of the cluster and all sorts of other things.
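As a sketch, here's a team namespace with a persistent volume claim inside it. The names and size are illustrative:

```yaml
# A namespace for one team, plus a persistent volume claim requesting
# storage that outlives any individual pod.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: team-a          # the claim lives inside the team's namespace
spec:
  accessModes:
    - ReadWriteOnce          # mountable read-write by a single node at a time
  resources:
    requests:
      storage: 10Gi          # how much persistent storage the app needs
```

Role-based access controls could then grant team-a's users permission to deploy into this namespace but not into others.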
There are certain businesses where it's really important to do things this way and run it themselves. But for most businesses, that's more than they need to worry about. Cloud providers looked for a way to give Kubernetes to their customers in an easy way. The idea with any managed Kubernetes service you find in the cloud is that you can start a cluster with one click. You'll want to be able to view your clusters and workloads in some form of dashboard. Google Cloud does a good job of having a lot of different types of dashboards available and showing you what's happening in your Kubernetes clusters. In the case of Google Kubernetes Engine, you've got Google, which has been running containers for a long time. Since Kubernetes has been around, Google has been running a lot of Kubernetes clusters. You have us taking care of the control plane of your cluster. For most managed Kubernetes services, the cloud provider manages the control plane of Kubernetes, takes care of it, and makes sure that it stays running. That is the case with Google Kubernetes Engine.
You can also connect your Kubernetes cluster to other cloud services. It reduces what you have to worry about. There are two modes of operation of Google Kubernetes Engine. There is standard mode, which is how GKE has been running for almost six years, and there is Autopilot mode. Doing it yourself, there's more that you have to worry about when you're running an open-source Kubernetes cluster. With managed Kubernetes, like GKE standard mode, there's quite a bit less. Google is taking care of the control plane. We're giving you useful tools for understanding what's going on in your cluster, with the dashboards and so on. We take care of a lot of the challenges around upgrading the cluster. There are still some things you have to worry about, like patching and upgrading the compute nodes you'll be running your containers on, ensuring those nodes are secure, and worker node provisioning and management. In a GKE standard mode cluster, you still have to pick which virtual machine instance types in Google Cloud you'll be running.
Autopilot mode takes this managed Kubernetes concept a step further. Instead of having you figure out which nodes to provision, Google takes that on too. You wouldn't see your nodes in your cloud console. We're going to pick those for you based on your pod definitions and your workload definitions. You can use resource requests and limits to declare that an application will need X amount of CPU or memory. You can also place limits on your application. Autopilot looks at those types of things and tries to pick nodes for you. The goal here is to provide a more hands-off production experience, as if a Kubernetes expert were running things for you. Our goal is to make it a lot easier to get going with Kubernetes, since you won't have to worry about picking those virtual machines and taking care of them as much. That gives you a bit of a stronger security posture when you hit that button to create a cluster, because now Google is in charge of taking care of those nodes.
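A sketch of what those resource requests and limits look like in a pod spec. The values and names are illustrative:

```yaml
# Resource requests and limits on a container. In Autopilot mode these
# declarations are what drives node selection (and billing).
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: app
      image: example.com/my-app:1.0
      resources:
        requests:            # capacity the scheduler must find for this pod
          cpu: 500m          # half a CPU core
          memory: 256Mi
        limits:              # ceiling the container is not allowed to exceed
          cpu: "1"
          memory: 512Mi
```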
We have a hardening guide in the Google Cloud documentation with recommendations for how you should secure a node. And now, since we're taking care of those nodes, we take on much of that responsibility for you. Google is your site reliability engineer. Site reliability engineering is a discipline that Google invented. The concept, since Google runs big websites, is taking care of the reliability of those sites. Site reliability engineers are the people who keep things going. Google has a lot of experience running Kubernetes clusters and is taking care of your Kubernetes cluster if you're using Autopilot mode of the managed Kubernetes service GKE. If you've ever used open-source Kubernetes anywhere else, it will have the same commands, objects, and resources.
If you're familiar with using Google Kubernetes Engine in standard mode, there's a lot about GKE Autopilot that's the same. GKE standard mode is a pay-per-node model. If you pick a virtual machine type, you will pay different amounts based on the machine type, which is all in the documentation. In the case of Autopilot mode, we're taking care of those machines for you and picking which ones you're using. Autopilot mode is pay per pod, or pay per resource. A pod-level service level agreement guarantees 99.9% uptime for Autopilot pods deployed in multiple zones. You still have to think about deploying your pods in a highly available fashion, but when you do, we have an SLA for the pods themselves. Autopilot mode is a little bit different from serverless, even though the payment model of paying according to the resources you're using sounds a bit similar. It's more like node-less Kubernetes: you're just not worrying about the nodes, but it's still Kubernetes rather than the full serverless model.
I'm going to go into auto-scaling a little bit here. The top two here are autoscalers that are available in Kubernetes itself. If you're running a Kubernetes cluster anywhere, you should have access to these. The horizontal pod autoscaler will watch what's happening with your pods and then create more pods or remove pods depending on what's happening. The vertical pod autoscaler sets those requests and limits for you. The cluster autoscaler adds or removes nodes. If your horizontal pod autoscaler creates a whole bunch of pods and now all of your nodes are full, the cluster autoscaler can detect that and spin up more nodes for you. Node auto-provisioning is a GKE-specific thing, a Google Cloud specific thing. It selects the sizes of those nodes for you, which can be pretty challenging to do yourself. Autopilot mode is just running those bottom two things for you. We're taking care of everything that's auto-scaling your nodes for you. In Autopilot mode, all you would think about would be the horizontal pod autoscaler and the vertical pod autoscaler.
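As a sketch, a horizontal pod autoscaler targeting a deployment might look like this. The names and numbers are illustrative:

```yaml
# A horizontal pod autoscaler that watches CPU usage across a
# deployment's pods and scales the replica count up or down.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app             # the deployment whose pods get scaled
  minReplicas: 2             # never scale below two pods
  maxReplicas: 10            # never scale above ten pods
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU exceeds 70%
```

If this autoscaler fills up the existing nodes, the cluster autoscaler (and, on GKE, node auto-provisioning) is what would then add more nodes underneath.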