Serverless: dissecting the jargon (part 2)

Pedro Freitas
4 min read · Jun 21, 2020

In this article, we will look at Knative and how its opinionated approach can help you as a developer focus on delivering scalable solutions. Read the first part of this article if you are looking for an introduction to the topic. This part covers the Serving component and targets Knative 0.15.

Installing Knative in Kubernetes

If you are running Kubernetes, you can install your Knative stack by following this guide. Currently, the installation consists of installing the Serving and Eventing components. You also have to pick the networking layer your solution will operate on; at the moment, the following ones are supported: [1]

  • Ambassador — Open-source Kubernetes-native API gateway for microservices built on the Envoy proxy. It has a commercial edition with more features and support.
  • Contour — A Kubernetes ingress controller using Envoy proxy.
  • Gloo — A Kubernetes-native ingress controller and API gateway based on the Envoy proxy. It has many features, and there is a commercial edition with even more features and support.
  • Istio — A service mesh solution to connect, secure, control and observe services. One option you can consider is placing an API gateway in front of Istio to provide extra functionality; Gloo demonstrates an example of such a setup in this article. Refer to this article to understand the difference between a service mesh and an API gateway.
  • Kong — The Kong Gateway is an open-source API gateway built on top of Nginx. It has a commercial edition with more features and support.
  • Knative Kourier — A simple ingress implementation built on just the Envoy proxy.

For my local minikube cluster, I’ve opted for Knative Kourier, since it is simple to set up and very lightweight when compared to a service mesh solution like Istio.
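Once Serving and Kourier are installed per the guide, Knative Serving must be told to route through Kourier. A minimal sketch of the relevant config-network ConfigMap entry (assuming the default knative-serving namespace) looks like this:

# Sketch: point Knative Serving at Kourier as its networking layer.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-network
  namespace: knative-serving
data:
  # Ingress class implemented by the Kourier controller.
  ingress.class: "kourier.ingress.networking.knative.dev"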

Knative Offerings

There are many vendors offering Knative solutions that work out of the box. For commercial solutions, please refer to this section in the Knative documentation.

Knative components

Knative consists of the Serving and Eventing components: [2]

  • Eventing — Management and delivery of events. [2]
    It is designed to address a common need for cloud native development and provides composable primitives to enable late-binding event sources and event consumers. [3]
  • Serving — Request-driven compute that can scale to zero.
    Knative Serving builds on Kubernetes and Istio to support deploying and serving of serverless applications and functions. Serving is easy to get started with and scales to support advanced scenarios. [4]

Knative Serving component

The Knative Serving project provides middleware primitives that enable: [4]

  • Rapid deployment of serverless containers [4]
  • Automatic scaling up and down to zero [4]
  • Routing and network programming for Istio components [4]
  • Point-in-time snapshots of deployed code and configurations [4]

Every time you deploy a Service, Knative manages all the details for you by creating a matching Route and Configuration based on the Service. Every change to the Configuration produces a new Revision. By itself, this flow creates an abstraction layer for the developer, saving the time otherwise spent specifying and implementing these operational requirements. The following image details the process:

Knative Serving Service structure [4]
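To make this concrete, here is a minimal sketch of a Knative Service manifest (name and image are hypothetical); applying it is enough for Knative to create the matching Route, Configuration and first Revision:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                              # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: example.com/hello:latest  # hypothetical container image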

The video below shows an example of an application scaling to zero based on incoming requests. I am using a simple Java application based on Quarkus, which is a Kubernetes Native Java stack tailored for OpenJDK HotSpot and GraalVM, crafted from the best of breed Java libraries and standards [5].
This project includes a Skaffold + Knative template if you are looking to quickstart your project.
For the sake of performance, I am using a GraalVM-based native image to decrease warm-up time and improve performance. The template includes both JVM and GraalVM Dockerfiles for your development and production needs.
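For reference, a skaffold.yaml for such a template could look roughly like the sketch below. The image name, manifest path and schema version are assumptions, and the Dockerfile location follows the usual Quarkus convention; adjust them to your project:

# Sketch of a Skaffold config building the GraalVM image and deploying the manifest.
apiVersion: skaffold/v2beta5   # check the schema version of your Skaffold release
kind: Config
build:
  artifacts:
    - image: example.com/quarkus-hello                 # hypothetical image name
      docker:
        dockerfile: src/main/docker/Dockerfile.native  # Quarkus native Dockerfile convention
deploy:
  kubectl:
    manifests:
      - k8s/service.yaml                               # hypothetical Knative Service manifest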

Zero scaling demo of a Quarkus service

The following snippet shows how to set up auto-scaling for a Service resource by adding these annotations:

# Knative concurrency-based autoscaling (default).
autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
autoscaling.knative.dev/metric: concurrency
# Target 10 requests in-flight per pod.
autoscaling.knative.dev/target: "10"
# Keep scale to zero enabled with a minScale of 0.
autoscaling.knative.dev/minScale: "0"
# Limit scaling to 20 pods.
autoscaling.knative.dev/maxScale: "20"
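These annotations go on the revision template of the Service. A sketch of a complete manifest (hypothetical name and image) shows where they live:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
        autoscaling.knative.dev/metric: concurrency
        autoscaling.knative.dev/target: "10"
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "20"
    spec:
      containers:
        - image: example.com/hello:latest  # hypothetical image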

autoscaling.knative.dev/metric supports multiple values, such as concurrency, rps and cpu (CPU-based scaling requires the HPA autoscaler class instead of the default KPA). Refer to this section of the documentation for more details about auto-scaling Service resources.

The video below uses the very same application shown earlier, with the addition of a Thread.sleep in the GET handler to simulate the delay seen in a real-world transaction. To observe how Knative Serving auto-scales, I am using rakyll/hey to keep 50 concurrent requests running for one minute.

Other deployment strategies are possible by controlling routing and managing traffic. This section of the documentation shows how to do a Blue-Green deployment. With some tweaks, you can use a similar implementation to do a Canary release, where you route a small percentage of traffic to a new version of your service, as sketched below.
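As an illustration of such a Canary setup (revision names and images are hypothetical), the traffic block of a Service pins percentages to named revisions:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello
spec:
  template:
    metadata:
      name: hello-v2                   # naming the template pins the new revision's name
    spec:
      containers:
        - image: example.com/hello:v2  # hypothetical new image
  traffic:
    # Send 90% of traffic to the stable revision and 10% to the canary.
    - revisionName: hello-v1
      percent: 90
    - revisionName: hello-v2
      percent: 10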

Conclusions and further research

I am surprised at how easy it has been to try Knative. The documentation is very straightforward, and the project seems to be maturing over time. The Knative philosophy frees developers from worrying about the operational overhead of deploying and managing applications, much like the other solutions mentioned in part 1.

In part 3 of this article, I will be focusing on implementing Knative Eventing in my demo project.

References

[1] Installing Knative | Knative
[2] Welcome to Knative | Knative
[3] Knative Eventing | Knative
[4] Knative Serving | Knative
[5] Quarkus — Supersonic Subatomic Java
