Cert-Manager is the standard way to manage Let's Encrypt certificates in Kubernetes. A few months ago, Cert-Manager stopped working for me due to a DNS bug in Kubernetes. Although it wasn't the fault of Cert-Manager, no solution was provided for several months and I had to learn how to manually renew my certificates to keep my sites up.
There were several things I found annoying about Cert-Manager even before this bug presented itself:
- The installation process involves applying a 26,000-line YAML file
- It involves the creation of several Custom Resource Definitions (CRDs)
- When it fails, you have to trace through a series of CRD objects
And since I had just learned how to manually create certificates in Kubernetes, I decided to build my own Let's Encrypt certificate manager!
Note: I've been using KCert in my own cluster for a few months, but the code is definitely not production ready. Feel free to try it out, but expect bugs, outdated docs and breaking changes until I officially release it.
There were several goals/requirements I wanted to achieve with this project:
- Build a .NET Cloud Native Application:
  - I like Go for small projects but I still prefer C# for more complex code
  - I want to see more healthy competition between .NET and Go in the space of Kubernetes applications
- A Visual Interface: I love command-line tools, but sometimes you just want to visually browse and click to take action.
- Easy to Deploy: Cert-Manager requires Helm or 26,000 lines of YAML to deploy. I think a cert manager can be much simpler.
- Build an ACME client from scratch: I've never implemented anything based on an RFC. Doing so with the ACME protocol would be great experience.
- Simple to Debug: Certificate generation/renewal is a multi-step process. I want to see the entire process in one place and easily find exactly where something goes wrong.
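To make the "Simple to Debug" goal concrete, here is a minimal Python sketch of the idea, not KCert's actual code (KCert is written in C#). It runs a sequence of named steps, records the outcome of each one in a single log, and stops at the first failure. The step names and the `run_steps` helper are hypothetical stand-ins that only loosely follow the ACME flow.

```python
def run_steps(steps):
    """Run (name, callable) pairs in order and stop at the first failure.

    Returns a list of (name, status, detail) tuples, so the whole flow
    can be inspected in one place.
    """
    log = []
    for name, action in steps:
        try:
            detail = action()
        except Exception as exc:
            # Record where the process broke, then stop.
            log.append((name, "failed", str(exc)))
            break
        log.append((name, "ok", detail))
    return log


def fail_challenge():
    # Hypothetical failure, used only to demonstrate the log.
    raise RuntimeError("challenge validation timed out")


# Hypothetical, simplified step names for illustration.
steps = [
    ("new-order", lambda: "order created"),
    ("challenge", fail_challenge),
    ("finalize", lambda: "certificate issued"),
]

for entry in run_steps(steps):
    print(entry)
```

The log ends at the step that failed, which is exactly the place to start debugging.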
Simple Web Interface
I decided to build KCert with a web interface from the start. In fact, KCert does not currently do anything in the background besides running the HTTP service. There is no automatic renewal of certificates yet, although I will be adding it at some point.
This has made developing and debugging KCert quite simple. I can run it locally and click through every page of the UI to make sure things are working as expected. When the code gets more stable, I plan to add a background job for automatic renewals. But getting the manual scenario working first will make that a fairly simple step.
Kubernetes applications traditionally get their configuration from environment variables. KCert has many such settings. However, I also opted to add a configuration page in the UI for some of the options. This allows the user to make some configuration changes without restarting the service. To decide which options should be controlled in the UI, I asked:
- Can the value be known before deploying?
- Would it be useful to be able to change the value without restarting the service?
- Can I provide a sensible default that will probably never need to be changed?
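One way to picture how these questions translate into code is a layered lookup: a value set in the UI wins, then an environment variable, then a built-in default. The Python sketch below is only an illustration under that assumption, not KCert's implementation. The option names (`NAMESPACE`, `RENEWAL_THRESHOLD_DAYS`) and the `Settings` class are hypothetical.

```python
import os

# Hypothetical option names and defaults, for illustration only.
DEFAULTS = {
    "NAMESPACE": "default",
    "RENEWAL_THRESHOLD_DAYS": "30",
}


class Settings:
    """Layered lookup: UI override, then environment variable, then default."""

    def __init__(self, defaults, env=None):
        self._defaults = dict(defaults)
        self._env = os.environ if env is None else env
        self._overrides = {}  # values changed at runtime through the UI

    def get(self, key):
        if key in self._overrides:
            return self._overrides[key]
        if key in self._env:
            return self._env[key]
        return self._defaults[key]

    def set_from_ui(self, key, value):
        # Only known options can be changed without a restart.
        if key not in self._defaults:
            raise KeyError(f"unknown option: {key}")
        self._overrides[key] = value


# Pass a plain dict instead of os.environ to keep the example reproducible.
settings = Settings(DEFAULTS, env={"RENEWAL_THRESHOLD_DAYS": "20"})
print(settings.get("NAMESPACE"))               # falls back to the default
print(settings.get("RENEWAL_THRESHOLD_DAYS"))  # taken from the environment
settings.set_from_ui("RENEWAL_THRESHOLD_DAYS", "10")
print(settings.get("RENEWAL_THRESHOLD_DAYS"))  # UI override wins
```

In this sketch the overrides live only in memory, so a real service would also need to persist them somewhere to survive a restart.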
Single Instance Deployment
I decided to design the certificate manager assuming there is only one instance running. Kubernetes easily allows you to deploy multiple parallel instances, but does a certificate manager really need that type of scalability? I would argue no.
Let's Encrypt certificates are valid for 90 days and only need to be renewed once every 60 days. This means you have a 30-day window to renew a certificate before it expires. So does it really matter if your certificate manager is down for a couple of minutes? How many domains/certs would be too many for a single instance to handle?
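The arithmetic behind that window is easy to check. The short Python sketch below uses the 90-day lifetime and 60-day renewal point from the paragraph above. The function name `renewal_status` is made up for this example and is not part of KCert.

```python
from datetime import datetime, timedelta, timezone

CERT_LIFETIME = timedelta(days=90)  # validity period of the certificate
RENEW_AFTER = timedelta(days=60)    # renew once the certificate is this old


def renewal_status(issued_at, now):
    """Return (is_due, time_left_before_expiry) for a certificate."""
    age = now - issued_at
    expires_at = issued_at + CERT_LIFETIME
    return age >= RENEW_AFTER, expires_at - now


issued = datetime(2021, 1, 1, tzinfo=timezone.utc)
due, left = renewal_status(issued, issued + timedelta(days=60))
print(due, left.days)  # True 30
```

On the day a certificate becomes due for renewal there are still 30 days left, so a short outage of the manager costs nothing.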
Limiting the deployment to one instance makes debugging the service much easier. There's only one pod and you can easily connect to it and watch the logs.
Progress so Far
I've been running this tool in my own cluster since December. I love the simplicity and have really enjoyed figuring out how to do all this without the complexity of Cert-Manager. There are, however, several things that I'd still like to do:
- Properly handle certs with multiple hosts
- Add support for automatic cert renewals
- Eliminate the current namespaces config setting in favor of annotations on the ingress
I'm still experimenting with different ideas, so the code and design are likely to change a lot. For this reason, I don't recommend that anyone try to use it unless they'd like to contribute to the code. I do, however, expect to officially release this tool at some point. Maybe in a few more months?