In this way, we can convey information from one proxy to another, and we can set up a release in which 90% of requests go to the previous system and 10% go to the new one. This mechanism is called a canary deploy, and it makes it much easier to gradually shift traffic between different versions of the same system.
A canary deploy is a lifesaver in many situations because it minimizes the surprises that a new release can bring. It can be applied to traffic percentages or to more complex rules, such as the user agent: we can route requests from iOS devices to the new system while Android devices continue to call the old one.
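As a rough sketch, both rules might be expressed with an Istio VirtualService, assuming Istio as the service mesh; the service name and subsets below are hypothetical:

```yaml
# A minimal sketch of a canary split, assuming Istio as the service
# mesh; the host and subset names are hypothetical.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: orders
spec:
  hosts:
    - orders
  http:
    # Requests from iOS devices go straight to the new version.
    - match:
        - headers:
            user-agent:
              regex: ".*(iPhone|iPad).*"
      route:
        - destination:
            host: orders
            subset: v2
    # Everyone else is split 90/10 between old and new.
    - route:
        - destination:
            host: orders
            subset: v1
          weight: 90
        - destination:
            host: orders
            subset: v2
          weight: 10
```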
Now we have our app, and it’s distributed. How do we make sure everything is working well?
We recommend always exposing some control routes, such as health and readiness endpoints. The readiness route tells Kubernetes when the pod is ready to receive traffic: until it is, Kubernetes routes no traffic to that pod. The health route, on the other hand, reports whether the pod is healthy and functioning properly.
In fact, a pod may be up and running but not working properly (unable to queue messages, for example). In this case, Kubernetes restarts it on our behalf and everything starts up again, including the readiness check.
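In Kubernetes terms, these two routes map to a readiness probe and a liveness probe in the pod spec. A minimal sketch, with hypothetical paths and image name:

```yaml
# A sketch of the two control routes wired into a Deployment;
# the /ready and /healthz paths are hypothetical names.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: registry.example.com/orders:2.0.0
          ports:
            - containerPort: 3000
          # Until this succeeds, Kubernetes sends no traffic to the pod.
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          # If this keeps failing, Kubernetes restarts the container.
          livenessProbe:
            httpGet:
              path: /healthz
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
```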
In addition to asking whether the service is alive, we may also want to monitor what exactly it is doing in production.
We can measure some relevant information at the business level: how many messages am I queuing? How many payments have I made? How many active users do I have right now?
These metrics can be stored in a database on which we can build monitoring dashboards and set alerts. For example, we could set an alert that warns us when there are too many messages in the queue and a slowdown may be on its way.
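As a sketch, business-level metrics like these could be exposed from a Node.js service with the prom-client library; the metric names and the Express app below are hypothetical:

```typescript
// A minimal sketch of business-level metrics, assuming Node.js with
// Express and prom-client; the metric names are hypothetical.
import express from "express";
import { Counter, Gauge, register } from "prom-client";

const queuedMessages = new Gauge({
  name: "queued_messages",
  help: "Messages currently waiting in the queue",
});

const paymentsTotal = new Counter({
  name: "payments_total",
  help: "Total number of payments processed",
});

// Elsewhere in the service we update them as events happen:
//   queuedMessages.inc();  queuedMessages.dec();  paymentsTotal.inc();

const app = express();

// The monitoring system scrapes this endpoint to collect the metrics.
app.get("/metrics", async (_req, res) => {
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});

app.listen(3000);
```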
When the system crashes, it is (relatively) easy to locate and solve the problem; it is much harder to intervene when the system slows down without giving clear signals. This is why it's important to act in advance.
One tool to do this could be Prometheus.
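For instance, the queue alert mentioned above might look like this as a Prometheus alerting rule; the metric name and threshold are hypothetical:

```yaml
# A sketch of a Prometheus alerting rule for the queue example;
# the metric name and the threshold are hypothetical.
groups:
  - name: business-alerts
    rules:
      - alert: QueueBacklog
        expr: queued_messages > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Too many messages in the queue, a slowdown may be coming"
```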
Logs are another important aspect to monitor constantly. For each worker node, we can collect the logs of all the pods, store them in a database, view them on a dashboard, and set alerts on them.
Within the logs we can include a request ID: a correlation identifier generated at our gateway and propagated with every HTTP request across all services, which lets us trace each call as it travels through the system. There are several ways to observe this communication among services; the simplest is to record the request ID in the logs on both sides of each call and forward it to every downstream service as an extra HTTP header, as sketched below.
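A minimal sketch of this propagation, assuming Node.js with Express; the x-request-id header name is a common convention, not a requirement:

```typescript
// A sketch of request ID propagation, assuming Node.js with Express;
// the x-request-id header name is a common convention.
import express from "express";
import { randomUUID } from "crypto";

const app = express();

// Reuse the incoming request ID or generate one at the edge, then
// log it and echo it back so every hop can correlate its logs.
app.use((req, res, next) => {
  const requestId =
    (req.headers["x-request-id"] as string) ?? randomUUID();
  res.setHeader("x-request-id", requestId);
  console.log(JSON.stringify({ requestId, path: req.path }));

  // When calling downstream services, forward the same header, e.g.:
  //   fetch("http://inventory/stock", {
  //     headers: { "x-request-id": requestId },
  //   });
  next();
});

app.listen(3000);
```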
Finally, let's look at a slightly more complex architecture that, in a way, brings all of these pieces together.
Starting from our legacy management systems, which most companies have, we can create an ecosystem of microservices, each with its own clear, well-defined responsibility: the product catalog, the stock status, product tracking, and whatever else we need.
Each one performs an action that would be useless on its own but, when connected to the others, implements a piece of business logic. Once these logics are exposed, they become the services of our application.
A further step could be adopting Fast Data and machine learning within our system.
If the legacy systems have problems, then by adopting Fast Data and observing all the connections among the microservices with machine learning, we can identify recurring problems in both operations and business.
With this data we could devise strategies to encourage our end users to make the most of the product features they already use.
What we have just described is an architecture within Kubernetes that can be built over time in an incremental and evolutionary way: an application architecture that evolves with the business.