Compute Workload Abstraction - Part 5, or When Milli and Micro Meet

Posted by Glenn Augustus on Wednesday, August 19, 2015

This is part five in a series exploring the topic of compute workload abstraction; in earlier posts we discovered how containers fit into the IT ecosystem. In this final part we take a bit of a tangent and look at performance, and at how adding the next level of abstraction both inherits and obscures performance characteristics.

OK, who invited I/O to the party?

Sharing CPU and memory resources is low-hanging fruit from a virtualization perspective, mostly due to their predictable performance characteristics and, to some degree, the closed nature of their interface to the operating system, which brings further stability. Adding I/O sharing and prioritization has always been a challenge and can be a bit complicated, so let's look at it all through the analogy of a restaurant. Bear with me, as you may get hungry during the explanation.

Is my table ready?

The front-of-house activity at a restaurant is generally quite serene in its execution. Prospective diners book a table; a small amount of overbooking takes place to cater for the diners that cancel before arriving, but only so many bookings are taken. When the diners arrive they are greeted by the maître d' and are either escorted to their table or asked to wait a short time while their table is made ready. The diners are not aware whether their table is currently occupied, is being cleared for their use, or, given the size of the dining party, whether tables are being shuffled to seat the diners together. In a good restaurant the maître d' gives some idea of how long the wait may be, which allows the diners to leave the waiting area and be notified when their table is nearly ready, perhaps get a drink (but not eat), or just wait. When the table is ready the diners are seated and the meal can begin.

Kitchen chaos.

How do you think a kitchen would cope if every table could request any conceivable starter, entrée or dessert in any order, with the only restriction being that it must fit on a standard-size plate? Pretty chaotic in the kitchen, huh? In that scenario, think of the CPU and memory scheduling in the hypervisor as the front of house, and the I/O subsystem as the kitchen.

Continuing the analogy, things do get better. It turns out that lots of diners like to eat the same things, they have habits about when they like to eat, they like what is on the plates of the diners sitting next to them, and many patrons of popular restaurants are fashion junkies who like to try the new items on the menu. This allows the kitchen to predict that it will need certain ingredients and keep them in the larder, so it can construct the recipes quickly. There is of course only so much space in the larder, and some ingredients perish quickly, and nobody wants expired ingredients in their meal. Add to that the fact that fashions change quickly, and managing the larder becomes a challenge all on its own.

The kitchen is very accommodating in terms of what recipes it will serve, but it is not great at letting the front of house know how long the customer may need to wait: even though a lot of the ingredients are in the larder, all of the ingredients need to be in the kitchen to prepare the meal, and some take longer to fetch than others. I think I have wrung that analogy dry; hopefully you can see that the performance penalty shows up in the time to provide the meal, the amount of ingredients that can be stored in the larder, and, to some degree, the size of the delivery truck.
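As a rough illustration, those three penalties (the time to fetch an ingredient, the capacity of the larder, and the size of the delivery truck) can be put into a toy model. Every constant and block name below is a made-up number for illustration, not a measurement of any real system:

```python
# Toy model of the analogy's three penalties: a fixed fetch latency,
# a small LRU "larder" cache, and a transfer rate (the truck).
# All numbers are illustrative assumptions.

LATENCY_MS = 5.0            # fixed cost for any request missing the cache
BANDWIDTH_MB_PER_MS = 0.1   # transfer rate paid on a cache miss
CACHE_SIZE = 3              # how many blocks the "larder" can hold

def service_time_ms(block: str, size_mb: float, cache: list) -> float:
    """Return the time to service one I/O, updating the LRU cache."""
    if block in cache:
        cache.remove(block)
        cache.append(block)     # refresh LRU position
        return 0.1              # cache hit: near-instant
    if len(cache) >= CACHE_SIZE:
        cache.pop(0)            # evict the least recently used block
    cache.append(block)
    return LATENCY_MS + size_mb / BANDWIDTH_MB_PER_MS

cache: list = []
workload = ["a", "b", "a", "c", "d", "a"]  # repeats benefit from the larder
times = [service_time_ms(b, 1.0, cache) for b in workload]
print([round(t, 1) for t in times])  # misses cost 15.0 ms, hits 0.1 ms
```

Even this crude sketch shows why repeated, predictable requests are cheap to serve while novel ones pay the full latency-plus-transfer cost.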

That is what I ordered, but not what I want now.

These penalties align to the three pillars of computing I/O performance: latency, caching and bandwidth. They affect containers just as much as they affect traditional systems, with the added complexity at the kernel level that the container is unaware of the competition for resources that must be arbitrated. So today, not only is there something at the end of the I/O pipe operating in a serial and slow manner (in most cases still a spinning disk), there is also a priority problem. In the time it takes to issue and receive a response to a standard I/O, it is possible, and indeed probable, that the I/O that was issued was not the I/O that should have been issued and serviced, had the I/O subsystem had a better understanding of the upcoming demand from the multitude of applications it is servicing. So there needs to be some management of the priority and performance provided across the container estate. Maybe Google has the answer?
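One way that arbitration is commonly approached is to give each container a weight and dispatch its queued I/Os in proportion, broadly in the spirit of the Linux blkio controller's per-cgroup weights. The sketch below is a deliberately simplified weighted round-robin; the container names, queue contents and weights are all made up for illustration:

```python
# Hypothetical sketch of weight-based I/O arbitration between two
# containers. Each round, a queue may dispatch up to `weight`
# requests before the next queue gets a turn.

def schedule(queues: dict, weights: dict, rounds: int) -> list:
    """Weighted round-robin over named request queues."""
    order = []
    for _ in range(rounds):
        for name, queue in queues.items():
            for _ in range(weights[name]):
                if queue:
                    order.append(queue.pop(0))
    return order

queues = {
    "web":   ["w1", "w2", "w3", "w4"],
    "batch": ["b1", "b2", "b3", "b4"],
}
weights = {"web": 3, "batch": 1}  # web gets 3x the dispatch share
print(schedule(queues, weights, rounds=3))
```

The latency-sensitive "web" container drains its queue quickly while "batch" still makes steady progress, which is exactly the kind of estate-wide prioritization the kernel has to provide on the containers' behalf.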

Next Time…

It's time for a new series, so keep reading. It has been great writing this series, and hopefully you have enjoyed it too. Don't forget to comment, get involved and get in touch using the boxes below, or reach out directly @glennaugustus
