Rule of Thumb: Sizing the Virtual I/O Server

Great posting on the IBM developerWorks blog…
https://www.ibm.com/developerworks/mydeveloperworks/blogs/aixpert/entry/rule_of_thumb_sizing_the_virtual_i_o_server78?lang=en

Rule of Thumb: Sizing the Virtual I/O Server

I often get asked: how large should I make a pair of Virtual I/O Servers (VIOS)? 
The classic consultant answer, “it depends on what you are doing with disk and network I/O”, is not very useful to the practical person who has to size a machine including the VIOS, nor to the person defining the VIOS partition in order to install it!

Observations:
The VIOS unfairly gets a bad press, but note:

    • Physical adapters are now in the VIOS, so device driver CPU cycles (normally hidden and roughly half of the OS CPU system time) move to the VIOS – these are not new CPU cycles.

    • Extra CPU work is needed to function-ship the request from the client to the VIOS and back, but this is just a function call to the Hypervisor = small beer.

    • Data shipping is very efficient, as the Hypervisor uses virtual memory references rather than moving the raw data.

    • Aggregating the adapters in one place means that all client virtual machines have access to much larger, redundant data channels at reduced cost, so it is a win-win situation.


Who knows the I/O details, down to rates and packet sizes?

  • Answer: no one (in my experience) knows the disk and network mixture of block or packet sizes, the read and write rates for each size, or the periods of time that will cause the peak workload. For new workloads it is all guesswork anyway – let’s be generous and call that plus or minus 25%.

  • If you do know, IBM can do some maths to estimate the CPU cycles at the peak period.

  • But most of the time that peak sizing would be total overkill.


So here is my Rule of Thumb (ROT), a starter for ten, with caveats:

  • Trick 1 – “Use the PowerVM, Luke!”

  • Use PowerVM to re-use unused VIOS CPU cycles in the client Virtual Machines

  • VIOS = micro-partition with shared CPU, uncapped, a high weight factor, and a virtual processor count of the minimum needed plus 1 or 2 for headroom (virtual processor would be better called a spreading factor)

  • This allows for peaks but doesn’t waste CPU resources – see the sketch below
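
To make Trick 1 concrete, here is a minimal Python sketch of the idea. This is not HMC or VIOS command syntax; the field names, the +2 headroom default and the example weight of 200 are illustrative assumptions (PowerVM uncapped weights do run from 0 to 255):

```python
import math

def vios_cpu_profile(entitled_capacity, headroom_vps=2, uncap_weight=200):
    """Illustrative shared-CPU settings for a VIOS micro-partition (Trick 1).

    entitled_capacity -- guaranteed processing units (see Trick 4 for a figure)
    headroom_vps      -- extra virtual processors above the entitlement, so the
                         uncapped partition can spread onto more cores at peak
    uncap_weight      -- keep the VIOS weight high so it wins contended cycles
    """
    virtual_processors = math.ceil(entitled_capacity) + headroom_vps
    return {
        "proc_mode": "shared",        # micro-partition, not dedicated cores
        "sharing_mode": "uncapped",   # may use spare cycles beyond entitlement
        "entitled_capacity": entitled_capacity,
        "virtual_processors": virtual_processors,
        "uncap_weight": uncap_weight,
    }

print(vios_cpu_profile(1.0))  # 1.0 entitled units, 3 virtual processors, uncapped
```

The point is simply that the virtual processor count, not the entitlement, sets the ceiling an uncapped partition can spread to at peak, so a little headroom there costs nothing when the VIOS is idle.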


  • Trick 2 – Don’t worry about the tea bags!

  • No one calculates the number of teabags they need per year

  • In my house, we just have some in reserve and monitor the use of tea bags and then purchase more when needed

  • Likewise, start with sensible VIOS resources and monitor the situation


  • Trick 3 – Go dual VIOS

  • Use a pair of VIOS to allow VIOS upgrades

  • In practice, we have very low failure rates in the VIOS – mostly because systems administrators are strongly recommended NOT to fiddle!


  • Trick 4 – the actual Rule of Thumb

  • Each VIOS: for every 16 CPUs in the machine – 1.0 shared CPU and 2 GB of memory

  • This assumes your virtual machines are roughly 1 to 2 CPUs each and not extremely I/O intensive (i.e. more CPU-limited than I/O-limited) – see the sizing sketch below
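
As a worked example of Trick 4, here is a small Python sketch of the starting-point arithmetic. Scaling the rule linearly beyond 16 CPUs, and holding small machines at the 2 GB floor, are my reading of it; the numbers are only the starting point described above:

```python
import math

def vios_base_sizing(machine_cores):
    """Starting point per VIOS: for every 16 CPUs in the machine,
    1.0 shared CPU and 2 GB of memory (Trick 4)."""
    blocks_of_16 = machine_cores / 16
    return {
        "cpu_entitlement": round(blocks_of_16 * 1.0, 2),   # shared processing units
        "memory_gb": max(2, math.ceil(blocks_of_16 * 2)),  # never below 2 GB
    }

for cores in (16, 32, 64):
    print(cores, "cores ->", vios_base_sizing(cores))
# 16 cores -> 1.0 CPU / 2 GB, 32 -> 2.0 / 4 GB, 64 -> 4.0 / 8 GB per VIOS
```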


  • Trick 5 – Check VIOS performance regularly

  • As workloads are added to a machine in the first few months, monitor VIOS CPU and memory use & tune as necessary

  • See other AIXpert blog entries for monitoring tools covering the whole machine, including the VIOS – a minimal threshold check is sketched below
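
For Trick 5, a minimal sketch of the kind of check you could script around your existing numbers. The 70% and 80% thresholds, and the idea of comparing consumed CPU against entitlement, are assumptions on my part; the figures themselves would come from whatever you already run (lparstat, topas, nmon, and so on):

```python
def check_vios(consumed_cpu, entitlement, mem_used_gb, mem_total_gb,
               cpu_threshold=0.7, mem_threshold=0.8):
    """Flag a VIOS whose measured use is closing in on its sizing (Trick 5)."""
    warnings = []
    if consumed_cpu > cpu_threshold * entitlement:
        warnings.append(f"CPU: {consumed_cpu:.2f} of {entitlement:.2f} entitled units "
                        "consumed - consider more entitlement or virtual processors")
    if mem_used_gb > mem_threshold * mem_total_gb:
        warnings.append(f"Memory: {mem_used_gb:.1f} of {mem_total_gb:.1f} GB in use "
                        "- consider adding memory")
    return warnings

print(check_vios(consumed_cpu=0.85, entitlement=1.0, mem_used_gb=1.8, mem_total_gb=2.0))
```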


  • Trick 6 – Driving system utilisation beyond, say, 70%

  • As you drive system utilisation up by adding more workloads, you need more pro-active monitoring

  • Implement some tools for automatic alerting of VIOS stress, for example along the lines of the sketch below
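
A minimal sketch of such automatic alerting, assuming a hypothetical sample_consumed_cpu() callable that you would wire up to your own data collection. The sampling interval, the consecutive-sample rule and the alert() hook are all illustrative choices, not a specific product:

```python
import time

def alert(message):
    # Replace with whatever you already use: email, SNMP trap, ticket, etc.
    print("VIOS ALERT:", message)

def watch_vios(sample_consumed_cpu, entitlement, interval_s=300, consecutive=3):
    """Alert when CPU consumption stays above entitlement for several samples
    in a row (Trick 6). sample_consumed_cpu is a placeholder callable that
    should return the physical CPU the VIOS is consuming right now."""
    busy = 0
    while True:
        consumed = sample_consumed_cpu()
        busy = busy + 1 if consumed > entitlement else 0
        if busy >= consecutive:
            alert(f"consumed {consumed:.2f} > entitlement {entitlement:.2f} "
                  f"for {busy} samples in a row")
            busy = 0
        time.sleep(interval_s)
```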

 
Caveats:

  • If using high-speed adapters like 10 Gbps Ethernet or 8 Gbps SAN, then VIOS buffer space is needed, so double the RAM to 4 GB.

  • Ignore this if you have these adapters but are only likely to use a fraction of the bandwidth, say 1 Gbps.

  • If you know your applications are going to hammer the I/O (i.e. stress the high-speed adapters), then go to 6 GB or 8 GB.

  • If you are using extreme numbers of tiny virtual machines (lots of virtual connections), also go to 6 GB or 8 GB – these memory adjustments are rolled into the sketch after this list.

  • On large machines, say 32 processors (cores) or more, many customers use one pair of VIOS for production and a further pair for other workloads.
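
The memory caveats above can be folded into the earlier starting-point sketch; for example, a small Python sketch that treats the conditions as simple flags and picks the top of the “6 GB or 8 GB” range, both of which are my simplifications:

```python
def vios_memory_gb(high_speed_adapters=False, light_use_only=False,
                   io_hammering=False, many_tiny_vms=False):
    """Memory per VIOS, following the caveats above (figures from the text)."""
    if io_hammering or many_tiny_vms:
        return 8   # top of the "6 GB or 8 GB" range
    if high_speed_adapters and not light_use_only:
        return 4   # buffer space for 10 Gbps Ethernet / 8 Gbps SAN
    return 2       # the base rule of thumb

print(vios_memory_gb())                          # 2
print(vios_memory_gb(high_speed_adapters=True))  # 4
print(vios_memory_gb(io_hammering=True))         # 8
```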


Remember this is a starting point – monitor the VIOS as workloads increase, because starving your VIOS is a very bad idea.


These are my opinions; I am sure others have different ideas too, and comments are welcome … thanks, Nigel Griffiths
