Welcome back! This blog post is one of a series in advance of our upcoming release. For reference, you can find part one below.
Getting right to it:
In this industry when somebody says ‘boot storms!’ – most of us will respond with:
Boot storms are a well-documented, boring problem with many solutions available from vendors and hypervisors alike. Most solutions today rely on a ‘shared memory’ storage area to cache ‘on boot’, in theory caching only one startup or one pattern and then serving it back to the subsequent desktops as they boot.
But why are boot storms an issue? While working on ThinIO we had the unique ability to really dive into the Windows boot process and analyse why boot storms cause the damage they do. In this post we thought we’d share our findings to better document the issue.
Taking a typical Windows 7 boot to the login screen and idling until all services have started, the data travelling from disk to VM is relatively small. In our testing we found that an average of just 500-600MB of data is read during this process, and write data barely registers at between 20 and 30MB.
But hey, what gives? With such low data throughput, why is boot such a contentious issue? Have I been misled by marketing and vendor nonsense?
The IO chestnut:
Sadly no, it’s the way Windows requests this data, but don’t take my word for it… Behold, the incredible mess that is the Windows boot process!
Yep, that’s right, in the time Windows requested roughly 600MB of data, it sent down an astounding 70 thousand IOs in the space of 2-3 minutes!
Now if you were to take these figures as they stand, you would divide 560MB by 70,000 IOs and probably end up with an average of about 8k of data requested per IO… You’d be wrong.
As my good buddy Conor Scolard would say, ‘when you Assume, you make an ass out of you and me’.
To better understand the boundaries: Windows issues IOs ranging from a minimum of 512 bytes all the way up the spectrum to 128k and above later in the boot process. But it requests these blocks sparsely, on demand, and not just once per sector; the same blocks are frequently accessed.
The net result causes absolute havoc on the storage:
The crux of the issue is that, for each one of these IOs, the storage provider needs to work out which block data was requested, seek the data out, then return it.
But 70,000 of these IO operations for a meagre 600MB of data is madness, and you can now see exactly why boot storms earned their name among those early adopters who had their hands burned on this fact-finding mission.
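To make the averaging trap concrete, here is a small sketch. The 560MB and 70,000 IO figures come from the numbers above; the request-size mix is entirely hypothetical (it is not measured data) and was picked only so that it lands on the same totals. It shows how an "8k average" can hide a boot made up almost entirely of tiny requests.

```python
# Figures quoted in this post.
TOTAL_MB = 560
TOTAL_IOS = 70_000


def average_io_kb(total_mb: float, total_ios: int) -> float:
    """Naive mean request size in KB (1 MB = 1024 KB)."""
    return total_mb * 1024 / total_ios


# HYPOTHETICAL request mix (size in KB -> number of IOs), invented for
# illustration only. It is NOT a measured Windows boot trace.
hypothetical_mix = {0.5: 40_160, 4: 26_340, 128: 3_500}

mix_ios = sum(hypothetical_mix.values())
mix_kb = sum(size * count for size, count in hypothetical_mix.items())

# Share of requests that are smaller than the "average" request.
small_share = sum(c for s, c in hypothetical_mix.items() if s < 8) / mix_ios
# Share of the data that is carried by the large 128 KB requests.
large_data_share = 128 * hypothetical_mix[128] / mix_kb

print(f"naive average:      {average_io_kb(TOTAL_MB, TOTAL_IOS):.2f} KB per IO")
print(f"mix average:        {mix_kb / mix_ios:.2f} KB per IO")
print(f"IOs under 8 KB:     {small_share:.0%}")
print(f"data in 128 KB IOs: {large_data_share:.0%}")
```

In this made-up mix, the mean is still about 8k, yet the vast majority of requests are far smaller than that, and the storage has to service every one of them individually.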
I’ll mitigate this issue by just booting my VMs at night!
I’m sure you will! I would also love to see your face if a number of users happen to restart their desktops during the day, each one sending 70,000 IOs to the storage in a two-minute window!
Bootstorming IS an issue.
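For a rough sense of scale, the arithmetic can be sketched as below. The 70,000 IOs and the 2-3 minute window are the figures from this post; the desktop count of 50 is a hypothetical example, and the function assumes IOs are spread evenly over the boot window, which flatters the storage because real boots are bursty.

```python
def boot_storm_iops(desktops: int,
                    ios_per_boot: int = 70_000,
                    boot_seconds: float = 150.0) -> float:
    """Average IOPS hitting storage when `desktops` machines boot together.

    Assumes an even spread of IOs across the boot window (a simplification).
    """
    return desktops * ios_per_boot / boot_seconds


# One desktop, across the 2-3 minute range quoted above.
for seconds in (120, 180):
    print(f"1 desktop over {seconds}s: {boot_storm_iops(1, boot_seconds=seconds):.0f} IOPS")

# HYPOTHETICAL: 50 users restarting at roughly the same time.
for seconds in (120, 180):
    print(f"50 desktops over {seconds}s: {boot_storm_iops(50, boot_seconds=seconds):.0f} IOPS")
```

Even under the even-spread assumption, a single boot averages several hundred IOPS, and the load scales linearly with every extra desktop that restarts in the same window.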
Now, knowing all this, it makes sense why storage and hypervisors alike are using a RAM cache.
But how does ThinIO fit in here? With Read Ahead of course!
Knowing the Windows boot process as intimately as only a technology like ThinIO can, there are many, many optimisations we can make to this process.
We can both speed the boot process up and massively reduce the storage requirement while in the VM, without any fancy caching mechanism!
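This post does not describe how ThinIO's read ahead works internally, so the following is only a generic illustration of the read-ahead idea, not ThinIO's implementation. The class name, the 128k window size, and the trace are all hypothetical. The sketch keeps a single buffer: on a miss it fetches one aligned window from the backend, then serves any following small reads that fall inside that window without touching storage again.

```python
from typing import Optional


class ReadAheadReader:
    """Toy single-buffer read-ahead layer (illustrative only).

    On a miss, one aligned window is fetched from the backend; later
    reads inside that same window cost no further backend IO.
    """

    def __init__(self, window: int = 128 * 1024) -> None:
        self.window = window
        self.buffered: Optional[int] = None  # index of the window held in the buffer
        self.backend_ios = 0                 # IOs actually sent to storage

    def read(self, offset: int, length: int) -> None:
        first = offset // self.window
        last = (offset + length - 1) // self.window
        for index in range(first, last + 1):
            if index != self.buffered:
                # Miss: fetch the whole window in one backend IO.
                self.backend_ios += 1
                self.buffered = index


# HYPOTHETICAL trace: 256 sequential 512-byte reads (128 KB in total).
reader = ReadAheadReader()
guest_ios = 0
for i in range(256):
    reader.read(offset=i * 512, length=512)
    guest_ios += 1

print(f"guest IOs:   {guest_ios}")
print(f"backend IOs: {reader.backend_ios}")
```

For a purely sequential trace like this one, hundreds of small guest reads collapse into a single backend IO. Sparse or random access benefits far less from a naive window like this, which is why knowing the actual boot pattern matters.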
With ThinIO’s read ahead technology, we can deliver just shy of an 80% boot IO reduction with nothing other than having our technology in the virtual machine:
Taking a ThinIO averaged test and overlaying it on a baseline averaged test, it’s clear just how much impact this technology can have:
So there you have it. With ThinIO, a simple, in-VM solution, not only can you seriously reduce your IO footprint, boost user performance and achieve greater storage density per virtual machine, you can also massively reduce the impact a booting VM has on your storage.