How To
Summary
Power Systems gain their massive performance from many technologies; this series details a number of them.
Objective
Originally written in 2012 for the developerWorks AIXpert Blog for POWER7, then updated in 2019 for POWER8 and POWER9.
Steps
This article is a follow-on to a blog post by Chris Gibson that highlighted a question and concern from one of his customers in Australia. They were comparing POWER6 and POWER7 based computers, and the utilisation numbers and graphs for the SMT logical processors looked different. I looked at some nmon data (what else!) and they did look different. I then ran a simple generated workload test, reproduced the graphs, and explain them below. Note: these are my personal observations rather than an official AIX developers' insider statement.
I ran a workload:
ncpu -p8 -z 25 -h1 -s 900
- This starts 8 processes, each sleeping 25% of the time and pausing for 1 second after each 1 second of CPU time, and stops them after 900 seconds.
- This gives us a bunch of programs running, starting and stopping fairly randomly. The 900-second limit also provides a safety net, so this artificial workload will stop even if I forget to kill the program!
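For readers without ncpu to hand, the shape of that workload can be approximated in Python. This is only a rough stand-in based on the description above, not the ncpu source: the function names, the 0.1-second busy slice, and the exact way the 25% sleep is interleaved are all assumptions made for illustration.

```python
import multiprocessing as mp
import time


def burn(seconds):
    """Spin on the CPU for roughly `seconds` of wall-clock time."""
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass


def worker(stop_after, snooze_pct, hibernate, slice_len=0.1):
    """One load process: busy slices, short snoozes, periodic longer pauses.

    Returns the approximate number of seconds spent spinning.
    snooze_pct must be below 100.
    """
    deadline = time.monotonic() + stop_after
    # Snooze long enough to make up snooze_pct of each busy+snooze cycle.
    snooze = slice_len * snooze_pct / (100.0 - snooze_pct)
    # Number of busy slices that add up to about 1 second of CPU time.
    slices_per_pause = max(1, round(1.0 / slice_len))
    slices_done = 0
    while time.monotonic() < deadline:
        burn(slice_len)
        slices_done += 1
        time.sleep(snooze)
        if slices_done % slices_per_pause == 0:
            time.sleep(hibernate)  # the longer pause after ~1s of CPU
    return slices_done * slice_len


def run_workload(processes=8, snooze_pct=25, hibernate=1.0, stop_after=900.0):
    """Start the workers, then wait for all of them to hit the time limit."""
    procs = [
        mp.Process(target=worker, args=(stop_after, snooze_pct, hibernate))
        for _ in range(processes)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```

Calling `run_workload()` with its defaults from under an `if __name__ == "__main__":` guard mirrors the 8 processes, 25% sleep, 1-second pause and 900-second limit used here. Because the limit is built into every worker, the load stops on its own.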
I then collected nmon data with:
nmon -f -s5 -c60
- This command means: collect a snapshot every 5 seconds for 60 snapshots (5 minutes worth).
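The capture window is just the interval multiplied by the snapshot count. A small Python sketch of that arithmetic follows; the helper name `capture_window` is made up for this example, and it only looks at the `-s` and `-c` values shown in the command above.

```python
import re


def capture_window(command):
    """Return (interval_seconds, snapshots, total_seconds) from -s and -c."""
    interval = re.search(r"-s\s*(\d+)", command)
    count = re.search(r"-c\s*(\d+)", command)
    if interval is None or count is None:
        raise ValueError("expected both -s <seconds> and -c <count>")
    seconds = int(interval.group(1))
    snapshots = int(count.group(1))
    return seconds, snapshots, seconds * snapshots


seconds, snapshots, total = capture_window("nmon -f -s5 -c60")
print(f"{snapshots} snapshots x {seconds}s = {total}s ({total / 60:.0f} minutes)")
```

The 300-second capture fits comfortably inside the 900-second workload run.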
On POWER6 Power 570
- Virtual machine called blue.
- Running AIX 6.1 TL6 on latest firmware.
- With Entitlement=0.4, uncapped, SMT=2 and virtual processors = 4.
On POWER7 Power 770
- Virtual machine called diamond5.
- Running AIX 7.1 TL1 on latest firmware.
- With Entitlement=0.4, uncapped, SMT=4 and virtual processors = 4.
The CPU_SUMM graph looks like this:
Comments:
- The POWER7 virtual machine has twice as many logical CPUs, as expected, because it is running SMT=4 instead of SMT=2.
- The POWER6 graphs show a fairly even split of work between logical CPUs CPU001 and CPU002 (these two combined make up the first POWER6 physical CPU-core). This is because it is in SMT=2 mode and there is no real favourite between the SMT threads. One thread is as good as the other.
- The POWER7 graph shows that for logical CPUs CPU001, CPU002, CPU003 and CPU004 (these four combined make up the first POWER7 physical CPU-core), the first logical CPU is very much favoured over the second, and the third and fourth logical CPUs are not used much at all.
- The POWER7 behaviour is Intelligent SMT Threading mode switching in action. It knows there are not enough processes running (low run queue) to use SMT=4, so it switches to SMT=2 and moves the processes to the first two logical CPUs. Then it notices that, for a fair chunk of the time, there are not even enough processes running to need SMT=2, so it switches to SMT=1 and moves the processes to the first logical CPU. This means the single running process gets the internal resources of the whole physical CPU-core with no contention from other threads, and so gets a speed boost.
- Both POWER6 and POWER7 were using roughly 2.5 physical CPUs, but it is clear with POWER7 that we could remove a physical CPU-core, or even two, because you can see there are plenty of unused SMT logical CPUs to run work on.
- Once more for the record, for shared CPUs: you cannot average the logical CPU utilisation stats to work out how busy your physical CPU-cores are, because the logical CPUs are concurrently executing on the shared internal compute units of the physical CPU-core. You can't find the 2.5 in the graphs above.
The customer question was: Is there something wrong with POWER7?
The answer is: Nothing wrong, actually, there is something very right with POWER7!
Note: POWER8 and POWER9 based computers follow the POWER7 mode of operation, except that they also have a newer, higher SMT=8 mode.
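The pattern in the POWER7 graph, with the first logical CPU of each core favoured and the later ones left idle, can be mimicked with a toy placement model in Python. This is a deliberately simplified sketch and not how the AIX dispatcher or hypervisor is implemented: the function names and the "fill the first thread of every core, then the second" rule are assumptions chosen to reproduce the shape of the graph.

```python
def place_threads(runnable, cores, smt):
    """Toy placement: fill thread slot 0 on every core, then slot 1, and so on.

    Returns one list of busy flags per core, e.g. [True, False, False, False].
    """
    busy = [[False] * smt for _ in range(cores)]
    remaining = min(runnable, cores * smt)  # extra work would simply queue
    for slot in range(smt):
        for core in range(cores):
            if remaining == 0:
                return busy
            busy[core][slot] = True
            remaining -= 1
    return busy


def effective_smt_mode(core_flags):
    """Smallest of SMT=1, 2, 4, 8 that covers the busy thread slots.

    An idle core is reported as SMT=1 in this toy model.
    """
    used = sum(core_flags)
    mode = 1
    while mode < used:
        mode *= 2
    return mode


def naive_average_busy(busy):
    """Percentage of logical CPUs that are busy, averaged over all of them."""
    flags = [flag for core in busy for flag in core]
    return 100.0 * sum(flags) / len(flags)


def cores_with_work(busy):
    """Number of cores that have at least one busy thread slot."""
    return sum(1 for core in busy if any(core))


layout = place_threads(runnable=3, cores=4, smt=4)
for number, core in enumerate(layout, start=1):
    print(f"core {number}: SMT={effective_smt_mode(core)} busy slots={core}")
print(f"naive logical CPU average: {naive_average_busy(layout):.2f}%")
print(f"cores with work: {cores_with_work(layout)} of {len(layout)}")
```

With three runnable threads on four SMT=4 cores, the naive average over the 16 logical CPUs is 18.75%, yet three of the four cores have work on them. Neither figure is the physical CPU consumption that AIX reports, which is the point: averaging logical CPU numbers does not tell you how busy the cores are. Passing `smt=8` gives the POWER8 and POWER9 layout.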
Additional Information
Other places to find Nigel Griffiths IBM (retired)
Document Location
Worldwide
Document Information
Modified date:
13 June 2023
UID
ibm11126389
