ThinIO facts and figures, Part 4: Storage design and dangerous assumptions.
Welcome back to this blog series discussing our new product ThinIO. The three earlier articles in this series are:
- ThinIO facts and figures, Part 1: VDI and Ram caching.
- ThinIO facts and figures, Part 2: The Bootstorm chestnut.
- ThinIO facts and figures, Part 3: RDS and Ram caching.
In this final blog post in the series, we’re going to discuss storage design and a frequent problem faced when sizing storage. Let’s get right into it:
“Designing for average is designing for failure”
Peak IOPS: 1,015, Average IOPS: 78
A frequent mistake I see customers and consultants alike make is taking the average of a sizing requirement and using that as the baseline for the environment.
Looking at the figures produced by our internal load tests, we saw an average of just 78 write IOPS required to provide storage for this XenApp server running vanilla, without ThinIO.
Now frequently, people will take a figure like that, throw in 25% for growth and, bob’s your uncle, ‘order the hardware, Mr SAN man’. When I’ve questioned that assumption, the response is often: “oh, it will be a bit slow if there’s contention, but it’ll even itself out”.
Things don’t go slow when they are oversubscribed; they stop.
Don’t take my word for it! Let’s do some simple theoretical math:
If you take a storage device and allocate 100 IOPS to this machine, what’s going to happen when a peak like 1000 IOPS is requested? A lot of queuing.
In theory, keeping to the 100 IOPS figure, that 1-second burst of IO now takes over 10 seconds to satisfy (1,000 IOs / 100 IOPS = 10 seconds).
But it gets worse: all subsequent IO requested after that spike occurred is also going to be halted, waiting for the backlog to clear.
Assuming you’re now mid-spike and the request finishes 10 seconds later… taking your average figure, you now have another 10 seconds’ worth of IO, at up to 100 IOs per second, potentially queued up behind it…
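The arithmetic above can be sketched in a few lines. This is a simplified model (constant arrival rate, fixed capacity; `drain_time` is an illustrative helper, not anything from ThinIO), but it shows how a burst interacts with the steady background load from the figures in this post:

```python
def drain_time(burst_ios, capacity_iops, background_iops):
    """Seconds until a burst backlog clears, given a steady background load.

    The backlog only shrinks by the headroom left over after serving
    the background IO each second.
    """
    headroom = capacity_iops - background_iops
    if headroom <= 0:
        return float("inf")  # arrivals match or exceed capacity: it never clears
    return burst_ios / headroom

# Naive view: the 1,000-IO burst served in isolation at 100 IOPS
print(drain_time(1000, 100, 0))    # 10.0 seconds just to absorb the spike

# Realistic view: the 78 IOPS average keeps arriving while you drain,
# so the queue takes far longer to empty (~45 seconds)
print(drain_time(1000, 100, 78))
```

Note what happens if the background load reaches the allocated 100 IOPS: the headroom is zero and the backlog never clears at all, which is exactly the “STOP” scenario described below.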
Lo and behold, another login occurs, and? STOP. Storage timeouts, twirly whirlies, application crashes, hourglasses and the good old “logins are slow”.
OK, so how do I size to accommodate this?
Well, you’re between a rock and a hard place, aren’t you? You can’t tell users when to log in, and the price tag of a SAN sized for peak activity plus 20% is going to cost you more than your entire desktop estate. And as you’ve seen, it’s never safe to assume it will just run slow.
Buying into shared storage is a tricky business
Storage is expensive. Very expensive. It always annoys me when you hear vendors in this space referring to themselves as ‘reassuringly expensive’. To me this directly translates to ‘We can charge what we want, so we will, and you can be reassured we feel the price tag is worth it.’
Shared storage was never written with desktop workloads in mind. It was written for ‘steady state’ server workloads and was in the process of going the way of the dodo, right up until the first release of vMotion required shared storage, which some say was the saving of the market.
Many vendors are going with software- or hardware-based intelligent tiering. This is a great feature, but the real question to ask is how frequently data is moved from the hot tier to the lower tier. Press your vendor on this, as they more than likely won’t know! Microsoft Storage Spaces is a prime example, with a really poor optimisation process that runs just once a day!
Then ask yourself what happens when a base image update changes the disk layout of the golden image. Further, stateless technologies from the bigger vendors delete the differencing disk on restart; can you be sure the new disk will end up in the smaller, faster SSD or RAM tier? Or is the data up there already in contention?
RAM is far less tricky
RAM is a commodity, available in abundance, and throughout every virtual desktop project I’ve architected and deployed, you run out of CPU resources in a ‘fully loaded’ host well before you run out of RAM. RAM has no running cost; it is an upfront CAPEX purchase that requires little to no maintenance.
The beauty of what ThinIO does with the little resource you assign it is that it turns that desktop workload into a healthier, happier server-like workload: minimal burst IO and a low steady-state IO requirement.
Note the peak of just 40.5 IOPS and average IOPS of less than 2.
With as little as a 200 MB cache for each of the 10 users logging in within an aggressive 3-minute window, we reduced the peak from 1,000 IOPS to around 40. That’s a 96% reduction in burst IO.
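For the sceptics, the sums behind that claim are straightforward. The figures below are the ones quoted in this post; the variable names are just for illustration:

```python
# Figures quoted in this post
peak_before_iops = 1000
peak_with_thinio_iops = 40.5
users = 10
cache_per_user_mb = 200

# Total RAM committed to caching across the host
total_cache_mb = users * cache_per_user_mb

# Percentage reduction in the burst IO peak
reduction_pct = (peak_before_iops - peak_with_thinio_iops) / peak_before_iops * 100

print(f"Total cache: {total_cache_mb} MB")          # 2000 MB
print(f"Burst IO reduction: {reduction_pct:.0f}%")  # 96%
```

In other words, roughly 2 GB of host RAM in total absorbed a burst that would otherwise have demanded a four-figure IOPS allocation from the SAN.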
With ThinIO, you:
- Reduce your exposure to massive IO spikes.
- Improve user logon times.
- Significantly reduce your daily IOPS run rate.
- Increase user productivity, with less time spent waiting on storage.
- Commit to nothing up front: test it and see how well it works. If you are happy, then buy in.
Lots of intelligence baked in:
ThinIO is acutely aware of the key operating system events that cause these kinds of spikes, and reacts to reduce the IOPS they create. ThinIO constantly watches the behaviour and IO pattern of the storage and tunes itself accordingly.
Unlike other technologies, ThinIO is a true caching and performance solution. We do not shuffle useful data in and out of the cache on demand when cache space is contended. We track the pattern and frequency of block access and respond accordingly, delivering all the benefits we have mentioned, even with the tiniest cache, without EVER reducing the capability of the storage when it is overwhelmed.
And on the opposite side of the scale, when underworked, we leverage our cache to deliver deeper read savings as above.
ThinIO also has a powerful API and PowerShell interface to allow you to report and interact with the cache on demand.
And with the end of the series looming, allow me to finish on some easy points:
ThinIO allows you to:
- Size your SAN outside of the Lamborghini category & price tag for your desktop estate.
- Rapidly achieve far deeper density on your current hardware when you are feeling the pinch.
- Guarantee a level of performance by assigning cache per VM, preventing other users from stealing or hampering caching resources.
- Improve user experience and login times immediately.
- Reduce the impact of boot storms and similar IO storm scenarios.
No other vendor can offer as quick a turnaround time with their product. ThinIO installs in seconds and offers a huge range of compatibility.
One more thing:
In case you missed ThinIO’s launch day at E2EVC Barcelona: ThinIO is now GA, available from our website and production ready! More marketing to follow, but grab your copy now and get playing!