I'm very curious to hear field reports from people who switched to using Kubernetes in production in the last year or so. Why'd you do it? What got better and what got worse? And are you happy with the change?
At GitLab we're making the application ready for Kubernetes; we have not switched yet. This has required us to untangle many parts. For example, we used to store uploads on disk before moving them to object storage; now they go directly to object storage. There were many interactions between our applications (https://docs.gitlab.com/ee/development/architecture.html#com...) over disk or sockets that we needed to clean up.
When it's done, we expect to be able to scale the different parts of our application independently. This also makes it easier to detect problems (why did Gitaly suddenly autoscale to twice the number of containers?).
At GitLab, our installation method for the product was also tightly coupled to chef-solo and Chef Omnibus. At first we tried to continue using these technologies inside our containers, but this required the containers to run as root.
So a lot of our effort in moving to Kubernetes has gone into duplicating all of our previous installation work in ways that don't require root access while the containers are running.
To help, we've chosen to use Helm to manage versioning/upgrading our application in the cluster: https://helm.sh/
I recently presented at Cloud Expo Europe, describing how and why GitLab has been working to produce a cloud-native distribution. I've outlined some of the blockers we've faced, and what we've been working on to resolve them going forward. Those changes benefit both our traditional deployment methods and our efforts to deploy on Kubernetes.
Our biggest blockers have been separating the filesystem dependencies of our various components. For Git content, we've implemented Gitaly (https://gitlab.com/gitlab-org/gitaly/). For various other parts, we've been implementing support for object storage across the board. Doing this while keeping up with the rapid growth of Kubernetes and Helm has been a challenge, but entirely worth the effort.
We are also bringing all of these changes to GitLab CE, as a part of our Stewardship directive (https://about.gitlab.com/stewardship/, https://about.gitlab.com/2016/01/11/being-a-good-open-source...). We don't feel that our efforts to allow for greater scalability, resiliency, and platform agnosticism belong only in the Enterprise offering, so we're actively ensuring that everyone can benefit, just as they can contribute.
If you create a trial account on GCP, you can then use GitLab to make a new GKE cluster and it will be automatically linked. You can then click a single button to deploy the Helm Tiller, a GitLab Runner, and then run your CI/CD jobs on your brand new cluster!
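As a rough sketch of what a CI/CD job on that cluster can look like once the runner is connected (job names, images, and the deployment name below are placeholders, not GitLab's actual configuration):

    # .gitlab-ci.yml -- illustrative sketch only
    stages:
      - build
      - deploy

    build-image:
      stage: build
      image: docker:latest
      services:
        - docker:dind
      script:
        # build and push the application image to the project's registry
        - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
        - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
        - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"

    deploy:
      stage: deploy
      image: bitnami/kubectl:latest   # any image with kubectl available works
      script:
        # assumes the cluster integration hands the job kubectl credentials
        - kubectl set image deployment/my-app my-app="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"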
The GCP trial is pretty nice, you get $300 in credits and you won't be automatically billed at the end: https://cloud.google.com/free/
As twk3 has said, we have examined multiple other options.
One big reason we've chosen to implement with Kubernetes is community adoption. You can now run Kubernetes on AWS, GCP, Azure, IBM Cloud, and a slew of other platforms, both on-premises and as a PaaS. With the expanding ecosystem, our decision has only been further confirmed, as the list of available solutions and providers continues to grow rapidly (https://kubernetes.io/docs/setup/pick-right-solution/).
Another big reason is far simpler: ease of use and configuration. A permanent fixture in our work on Cloud Native GitLab has been that this method must be as easy as, if not easier than, our existing ways of providing a scalable, resilient, highly available deployment of GitLab.
We can’t say, “This is our new suggested method. By the way, it’s harder.”
What we have found is that many other solutions require a much larger initial investment of time to understand and configure GitLab as a whole solution, compared to the combination of Kubernetes with Helm. Helm provides us with templating, default value handling, component enable/disable logic, and many other extremely useful features. These allow us to provide our users with a practical, streamlined method of installation and configuration, without the need to spend countless hours reading documentation, deciding on the architecture, and making edits to YAML.
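To give a sense of the templating and enable/disable logic Helm gives us, here's a simplified, hypothetical chart snippet (not taken from our actual charts):

    # values.yaml -- defaults the user can override
    registry:
      enabled: true
      port: 5000

    # templates/registry-service.yaml -- rendered only when the component is enabled
    {{- if .Values.registry.enabled }}
    apiVersion: v1
    kind: Service
    metadata:
      name: {{ .Release.Name }}-registry
    spec:
      selector:
        app: {{ .Release.Name }}-registry
      ports:
        - port: {{ .Values.registry.port | default 5000 }}
    {{- end }}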
At GitLab, we did evaluate Mesosphere DC/OS. What turned our focus to Kubernetes as our primary cluster install target was the speed at which it was developing, and after watching the space and talking to partners/customers we formed the opinion that Kubernetes was going to lead the pack.
We've been looking at these technologies for two years now, with our focus being Kubernetes for the last one.
GitLab has been investing significantly in Kubernetes, both because we believe in the platform and because we see significant demand from our customers. Its ability to run on-premises, as well as its availability in a wide variety of managed cloud flavors, is a huge benefit and likely a driver of that demand.
We also try to use the same deployment tools for GitLab.com that we provide to customers, and this lets us offer a scalable production-grade deployment method that can run nearly anywhere.
We switched last year and achieved 45% cost savings. This is mostly because we can now easily move all CPU-heavy activity onto preemptible nodes when they are available. It is also much simpler to gracefully scale down the number of nodes/containers outside peak hours.
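The preemptible part mostly comes down to letting the CPU-heavy workloads prefer the preemptible node pool when it exists; roughly something like this (the workload name and image are made up):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: batch-worker          # hypothetical CPU-heavy workload
    spec:
      replicas: 10
      selector:
        matchLabels:
          app: batch-worker
      template:
        metadata:
          labels:
            app: batch-worker
        spec:
          affinity:
            nodeAffinity:
              # prefer preemptible nodes when they exist, fall back otherwise
              preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 100
                  preference:
                    matchExpressions:
                      - key: cloud.google.com/gke-preemptible
                        operator: In
                        values: ["true"]
          containers:
            - name: worker
              image: example/batch-worker:latest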
We also no longer have to manage servers.
All of this was possible without Kubernetes, but it is so much easier with it. Although admittedly much of the ease of use is due to not having to manage Kubernetes itself – we use Google Kubernetes Engine. I would not want to install or manage a Kubernetes cluster/master myself.
I can recommend managed Kubernetes to anyone that runs many different apps/services.
We run all of our production backend on it at Monzo (we're a bank.) [1]
We first deployed v1.2 nearly 2 years ago, and I can say Kubernetes has made some amazing improvements in that time – in terms of functionality, usability, scalability, and stability. This release continues that trend.
We've invested a lot in it too, with things like making sure our etcd clusters backing Kubernetes are really solid [2], and we've even added some of our own features to Kubernetes, like making CPU throttling more configurable to get more predictable >p99 latency from our applications.
We've been through our share of production issues with it (some of which we've posted publicly about in the hope that others can learn more about operating it too [3]), but I don't think there's any way in which we could run an infrastructure as large and complex as ours with so few people without Kubernetes. It's amazing.
One data point: I've wanted to but so far have not made much progress. I'd say my biggest impediment has been documentation: I can get it installed, but making it work seems to be beyond the scope of the documentation. I got closest once I found out about "kubespray" to install the cluster rather than using the official Kubernetes installation docs process.
I spent a couple weeks not quite full time going through tutorials, reading the documentation, reading blog posts and searching for solutions to the problems I was having. My biggest problem was with exposing the services "to the outside world". I got a cluster up quickly and could deploy example services to it, but unless I SSH port forwarded to the cluster members I couldn't access the services. I spent a lot of time trying to get various ingress configurations working but really couldn't find anything beyond an introductory level documentation to the various options.
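(For anyone stuck in the same place: the simplest off-cloud way to expose something is a NodePort service, which opens a port on every node's IP; anything nicer, like the Ingress resources I was fighting with, additionally needs an ingress controller deployed first. A rough sketch with placeholder names:)

    apiVersion: v1
    kind: Service
    metadata:
      name: example-web
    spec:
      type: NodePort
      selector:
        app: example-web
      ports:
        - port: 80          # cluster-internal port
          targetPort: 8080  # container port
          nodePort: 30080   # reachable on every node's IP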
Kubespray and one blog post I stumbled across got me most of the way there, but at that point I had well run out of time for the proof of concept and had to get back to other work.
My impression was that Kubernetes is targeted to the large enterprise where you're going to go all in with containers and can dedicate a month or two to coming up to speed. Many of the discussions I saw talked about or gave the impression of dozens of nodes and months of setup.
If you're looking for a better "out of the box" experience, I'd recommend having a look at OpenShift.
You can use either their free tier in the cloud or the open-source OpenShift Origin for trials (there's also Minishift, which is similar to Minikube).
From my look at it, OpenShift comes with some of the parts that base Kubernetes leaves to plugins, so things like ingress, networking, etc. are installed as part of the base.
Are you trying to play around, or set up a working cluster? If you just want to play around, I'd suggest just using minikube to get things going.
Anecdotally, I got an HA cluster running across 3 boxes in the space of about a month, with maybe 2-3 hours a day spent on it. The key for me was iterating, and probably that I have good experience with infrastructure in general. I started out with a single, insecure machine, added workers, then upgraded the workers to masters in an HA configuration.
I don't think it is really that hard to get a cluster going if you have some infrastructure and networking experience, especially if you start with low expectations and just tackle one thing at a time incrementally.
Full Disclosure: I work for Red Hat in the Container and PaaS Practice in Consulting.
At Red Hat, we define an HA OpenShift/Kubernetes cluster as 3x3xN (3 masters, 3 infra nodes, 3 or more app nodes) [0] which means the API, etcd, the hosted local Container Registry, the Routers, and the App Nodes all provide (N-1)/2 fault tolerance.
Not to brag, since we're well practiced at this, but I can get a 3x3x3 cluster up in a few hours; I've led customers through a basic 3x3x3 install (no hands on keyboard) in less than 2 days; and our consultants are able to install a cluster in 3-5 working days about 90% of the time, even with impediments like corporate proxies, wonky DNS or AD/LDAP, not-so-Enterprise load balancers, and disconnected installs. Making a cluster ready for production is about right-sizing and doing good testing.
Worth mentioning that my "got a cluster working in a month" time frame includes starting with zero Kubernetes experience, and no etcd ops experience. Using kops, pretty much anybody can get a full HA cluster running in about 15 minutes. On top of that, it's maybe 5 more minutes to deploy all the addons you'd expect for running production apps on a cloud-backed cluster.
The great thing about automation is that once you have these basic tools (Prom/Graf monitoring/alerting, ELK, node pool autoscaling, CI/CD) implemented as declarative manifests, they're deployable anywhere in minutes.
It would be good if the "Enterprise Load Balancer" could just be another set of servers (with HAProxy + keepalived or something else; I love the "single IP" failover).
Edit: especially load balancing the master servers (that's actually the hard part of k8s, not even setting it up with/without OpenShift/Ansible or whatever).
Load balancing services on k8s itself is basically just running the Calico network and using one or two HAProxy deployments of size 1 with an IP annotation, or just using https://github.com/kubernetes/contrib/tree/master/keepalived...
I'm trying to set up a cluster in our development environment to play around with in preparation for rolling it to staging and production. So, minikube I have ruled out because it doesn't prove out the most critical parts of what we will need to run it in production.
I do have a lot of infrastructure and networking experience, it was mostly a matter of the ingress setup having many moving parts which were poorly documented. I could see that it had set up bridges and iptables rules and NAT and virtual interfaces, but I was never able to get a picture of how the setup was supposed to work to be able to see what parts of that picture were right or wrong.
There was no clear road-map of setting up a cluster. Most people talking about Kubernetes were doing "toy" deployments, which only had limited application to what I was doing. I only found kubespray because of a passing mention, for example.
I'd say you're about right with a month. Had I given it another week or two, I probably would have gotten it going. I had really only expected it to take a couple of days to have a proof-of-concept cluster, so at 2 weeks I was way beyond what I had slotted to spend on it.
Looking over the Getting Started Guide it looks very simple to get a test cluster set up. Which maybe set my expectations unreasonably high.
I guess that's what I'm trying to say: With the current state of documentation, it's probably a calendar month investment to get going.
Docker Swarm might be worth looking at regardless:
(1) it's like 15 minutes to learn how it works and then maybe a morning or so playing around with it. Very low investment.
(2) if you're already using docker-compose, it's pretty much an in-place switch. You might need to deal with a few restrictions (mandatory overlay network, no custom container names, no parameterized docker-compose), in exchange you'll get zero-downtime upgrades, automated rollbacks, and of course the ability to add more machines to your swarm.
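For example, a minimal compose file that already works with "docker stack deploy" might look like this (service and image names are illustrative):

    version: "3.4"
    services:
      web:
        image: example/web:1.2.0      # placeholder image
        ports:
          - "80:8080"
        deploy:
          replicas: 3
          update_config:
            parallelism: 1            # roll one task at a time
            order: start-first        # start the new task before stopping the old
            failure_action: rollback  # automated rollback if the update fails
          restart_policy:
            condition: on-failure

Deploying it to a swarm is then a one-liner: docker stack deploy -c docker-compose.yml mystack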
Dokku is great if you want to deploy on a single machine. I've been happy with it so far and haven't been lucky enough to have a problem of scaling yet.
We're running Kubernetes for minor amounts of traffic in production, but we're still not in a great place due to a few limitations in running Kubernetes on your own gear.
I know that people like to talk about cost savings and stuff like that, but I'd like to see if it actually lowered your app latency and increased sales/conversions/whatnot. Things that matter a lot to a growth business.
I ask because the various overlay/iptables/NAT/complicated networking setups in Kubernetes lend themselves to adding more overhead and being much slower than running on "bare metal" and talking directly to a "native" IP. I really, really wish that Kubernetes had full, built-in IPv6 support. It would remove a lot of this crud.
Our solution works around this by assigning IPs with Romana and advertising those into a full layer 3 network with bird. The pods register services into CoreDNS, and an "old fashioned" load balancer resolves service names into A records. Requests are then round-robined across the various IPs directly. There's no overlay network. There's no central ingress controller to deal with. There's no NAT. It's direct from LB to pod.
The nginx ingress controller is not a long term solution. It's a stop-gap measure. Someone really needs to build a proper, lightweight, programmable, and cheap, software-based load balancer that I can anycast to across several servers. That or Facebook just needs to open-source theirs.
Regarding networking, did you consider Flannel? Its "host-gw" backend doesn't have any overhead, as it's only setting up routing tables, which is fine for small (<100s of nodes) clusters that have L2 connectivity between nodes.
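(Switching flannel to host-gw is roughly a one-line change in its net-conf; an excerpt of the standard kube-flannel ConfigMap, for illustration:)

    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: kube-flannel-cfg
      namespace: kube-system
    data:
      net-conf.json: |
        {
          "Network": "10.244.0.0/16",
          "Backend": {
            "Type": "host-gw"
          }
        }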
Our network is full layer 3 Clos Leaf/Spine. We'd much prefer something with network advertisements (OSPF/BGP) or SDN. Layer 2 stuff is OK for labs, but I don't know anyone building out layer 2 networks any more.
We use Kubernetes in a very unusual way. We built a tool that allows you to take a snapshot of a Kubernetes cluster (along with all apps deployed inside) and save it as a single .tar.gz tarball.
This tarball can be used to re-create exact replicas of the original cluster, even on server farms not connected to the Internet, i.e. it includes container images, binaries for everything, etc.
But if the replica clusters do have internet access, they can dial back home and fetch updates from the "master". People use this to run complex SaaS applications inside their customers' data centers, on private clouds. These clusters run for months without any supervision, until the next update comes out.
Thanks to Kubernetes, you have complete, 100% introspection into a complex piece of cloud software, which allows for a use case like this. Basically, if you're on Kubernetes you no longer have to be tied to your one or two AWS regions, and you can start selling your complex SaaS as downloadable run-it-yourself software. [1]
Is it really necessary to copy the containers themselves? Do they contain long-lasting state? (And if they do, how does this state get properly synchronized between master and slaves?)
If the offline server farms hosted a private Docker registry (which is very simple to set up), couldn't you then simply push the container images to the registry, copy the relevant YAML files, and instantiate an identical cluster that way?
That's pretty clever, it looks like you've found a sweet spot somewhere in the middle between SaaS and good-old on premises that would serve a lot of use cases.
My company is running it in production. We started with the monitoring stack and then our application, which is a SaaS product. The product itself is a stack deployed on a per-customer basis. With rough configuration management using Chef and Terraform, a deployment took an ops person hours to complete a year ago. Today it takes a minute and is self-service for engineers.
Efficiency is increased a great deal. Previously, I'd have idle instances costing money when they were unused. Now my resource consumption is amortized across my entire infrastructure.
It also drives an architecture that lends itself to better availability. A pod in k8s may be rescheduled at any moment. If it's a stateless service, one may just increase the number of replicas to prevent a service interruption. If it's a stateful service, one is forced to think about how to persist data and gracefully resume operations.
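As a rough illustration of the stateful case (all names made up), a StatefulSet makes you declare up front how the data persists across rescheduling:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: example-db              # hypothetical stateful service
    spec:
      serviceName: example-db
      replicas: 3
      selector:
        matchLabels:
          app: example-db
      template:
        metadata:
          labels:
            app: example-db
        spec:
          containers:
            - name: db
              image: example/db:1.0
              volumeMounts:
                - name: data
                  mountPath: /var/lib/data
      volumeClaimTemplates:
        # each replica gets its own volume that survives rescheduling
        - metadata:
            name: data
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi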
It took a while for everyone on my team to get familiar with it, but once we started to really grok it, the sense of safety went way up. I feel pretty good about the fact that if someone accidentally blew away a cluster with thousands of pods in it, that I could easily replicate the entire thing and get it back up and running in tens of minutes without a lot of hassle.
Your story with K8S sounds like a complete parallel of ours: we deploy a stack per customer and migrated from Chef. Would love to compare notes on how you're managing all the customer instances (we built in-house tooling layered on top of helm). Feel free to shoot me an email (in my profile).
How do you handle (if at all) network isolation between customers? I'm trying to find if there's a way to run one large cluster across multiple AWS VPCs.
The company was moving to microservices and containers anyway, so Kubernetes is one of the few sane options to run them.
I can't say that anything got worse. Once you (collectively) get past the initial learning curve and put automation in place, it's way better than most other deployment scenarios. Kubernetes worker nodes fail from time to time, AWS kills them, and we barely notice. Pods get rescheduled automatically and mostly just work.
Frankly, the remaining headache-inducing things are mostly related to the software which is not running on Kubernetes (mostly stateful infrastructure, and mostly for non-technical reasons). Managing VMs is a pain.
There is one thing you need to be careful about. If you are moving from a VM per service to Kubernetes, your service is suddenly sharing a machine with other services, which wasn't the case before. So I'd suggest setting proper resource limits so that the scheduler can do its job properly.
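Concretely, that means something like this on every container, so the scheduler knows what it's packing onto each node (the numbers are only illustrative):

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-service
    spec:
      containers:
        - name: app
          image: example/app:1.0
          resources:
            requests:           # what the scheduler reserves on the node
              cpu: "250m"
              memory: 256Mi
            limits:             # hard ceiling; CPU gets throttled, memory gets OOM-killed
              cpu: "500m"
              memory: 512Mi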
One thing that can get worse is network debugging. There are way more moving parts, so it is not as easy to just fire up wireshark.
We use it in production. Its declarative nature is excellent. We tell it how many of which application we want to run and where our nodes are, and it does the legwork of deciding where to run the applications.
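The "how many of which application" part really is just a manifest like this (simplified, with made-up names):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-api           # made-up name
    spec:
      replicas: 4                 # the desired state; Kubernetes decides where they run
      selector:
        matchLabels:
          app: example-api
      template:
        metadata:
          labels:
            app: example-api
        spec:
          containers:
            - name: api
              image: example/api:2.3.1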
We have been migrating to Kubernetes for the past year. Overall it has been great and the extensibility of the platform is amazing once you combine Custom Resource Definitions (CRDs) into your cluster.
We're moving a lot of our build infrastructure to it. The effort has essentially been going from alpha- to beta-grade.
The knowledge/documentation base just flat-out sucks. It feels like the documentation expects you to already know what you are doing and to be reading it only for the smaller feature switches that change how things work. That, or it expects you to be running on GKE.
That said, it's the best at doing what we need which is a scheduler and manager for running multi-container workloads.
I expect it to be more fulfilling/difficult when we move to more long-lasting pods, but we'll ultimately still use Kubernetes.
At Reddit, we're beginning to move our stateless services to it. We're still in the early stages and want to have a story that is at least better than our current infrastructure. A lot of that means heavily utilizing Kubernetes abstractions and sticking close to the community and its tooling, so we can provide more functionality than we could before with just a small team of infra folks. What I mean by this specifically is the benefit of things like having an API for deployments, being able to provide different rollout strategies, offering devs access to more infrastructure safely, etc. Another thing we're hoping Kubernetes helps us with is becoming multi-cloud tenants. The benefits of this to us would be cost savings and, hopefully, more reliability.
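(To illustrate the rollout-strategies point, this is essentially the per-Deployment knob Kubernetes exposes; the values below are just an example, not our configuration:)

    # fragment of a Deployment spec -- controls how a new version rolls out
    spec:
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 0   # never drop below the desired replica count
          maxSurge: 25%       # allow up to 25% extra pods during the rollout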
P.S: if you’re interested in working on Kubernetes at reddit, send me a message at saurabh.sharma [at] reddit.com
At my last meetup (Detroit Kubernetes) we heard the story of a company that was able to rebuild their entire production environment from scratch in 1 hour. They run on Kubernetes; of course it's not the only part of the story, and I'm sure something like Terraform/CloudFormation is also involved.
I've been toying around with Kubernetes (AKS on Azure) for the last few weeks and have to say I'm rather impressed. Still, being able to start from scratch and be up and running in one hour is really impressive.
Clusters come up faster. They're easier to upgrade. You don't have to delete clusters to change them (just provision a new node pool). Persistent storage is less weird. No RBAC. Weirdness around the kube-system namespace, i.e. if I create a registry there, it disappears with no logs or events suggesting why.
Our app was originally a single Node/Meteor app running on an Nginx server in AWS. We used shell scripts to deploy; "meteor build" was run on each dev's machine when we wanted to deploy. We experimented with a small microservice running on a Docker instance on EC2, then eventually moved to a Kops-provisioned EC2 cluster for everything.
We don't really have tremendous auto-scaling needs. But there are a few reasons why we did it.
We've since scaled to a small handful of services (frontend web app, backend node api, legacy stuff, custom deployments for high paying customers). We knew that standardizing on Docker was a good idea just to simplify the build process. What were previously service-specific, hard to maintain, hard to read, just weird shell scripts became a simple Dockerfile for each service, often only 5 or 6 lines long. Once you have that, setting up CI is ridiculously easy; AWS CodeBuild into ECS ECR took all of a day to implement. If we were on GCP it would have been even easier.
Comparatively, we'd spent longer than that actually maintaining the old scripts. The new ones require zero maintenance; they "just work", 100% of the time.
So we knew we wanted Docker. And that was a very good choice. No regrets. We start there. But once we had docker, our eyes turned to orchestration.
It's worth saying that there are a lot of non-obvious answers to questions in the "low/medium-scale devops" world. One big one we ran into early is "where do we store configuration?" Etcd or Consul are fine, but we're a pretty small shop; we didn't want to have to manage it ourselves. We could go with Compose or something. But how do we get that configuration to the apps? We were in the process of writing new services and we wanted to follow 12-Factor and all that, so env vars make sense. But getting configuration from a remote source would break that, so we'd need some sort of supervisor for the app to fetch that...
Additionally, how do you deploy? Let's say we go with a basic EC2 instance running Docker. We'd have to patch it together with shell scripts, ssh in, pull new images, GC old images (this was a huge problem with our "interim" step in the first paragraph, our EBS volumes filled up all the time, lots of manual work), restart, etc. Can you do that with zero downtime? Probably. More scripts...
Load balancing and HA? Yeah of course we can wire that up. More scripts. More CloudFormation templates or whatever...
Eventually you arrive at an inescapable conclusion: At a surprisingly low level of scale, you are reinventing Kube and Kube-like systems. So why not just use Kube? You get config management built in. You get a lot of deployment power in kubectl. You get auto-provisioning load balancers on AWS. You get everything.
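To make that concrete, the built-in config management and the auto-provisioned load balancers are just more manifests; a simplified, made-up example:

    # configuration lives in the cluster and is injected as 12-Factor env vars
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: example-api-config
    data:
      DATABASE_HOST: db.internal.example.com
      FEATURE_FLAG_X: "true"
    ---
    # on AWS, a Service of type LoadBalancer provisions an ELB automatically
    apiVersion: v1
    kind: Service
    metadata:
      name: example-api
    spec:
      type: LoadBalancer
      selector:
        app: example-api
      ports:
        - port: 443
          targetPort: 8080

The pod spec then pulls the whole ConfigMap in with an envFrom/configMapRef stanza, so the app itself just reads ordinary environment variables.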
And I could not be more serious when I say: It "just works". We've had two instances of random internal networking issues between our ingress controller and a service, which were resolved by restarting the ingress controller. That's... it, in a full year.
Like Docker before it, I think Kube is a foundational piece of technology that is only going to get more and more popular. It's incredible. But I do think there is room in the market to make it easier to adopt. Best practices around internal configuration are hard to come by; even something as simple as "I want a staging env, do I use two clusters or namespaces in one cluster?" doesn't have a clear answer. Getting a local development environment up and running is a pain in the ass. Monitoring and alerting are still pretty DIY; GCP solves some of this, but there's no turn-key alerting piece that I'm aware of. Logging is a nightmare if you are self-hosting, including on Kops; we ended up installing the Stackdriver agent and we use GCP for it, even though literally everything else in our stack is on AWS.
I literally don't believe there's a level of "production scale" your organization could be at where you wouldn't benefit from Kube. It is far far easier to set up than a bare deployment of anything if you do GKE. Connect Github to CI to Kube... good to go.