SCOM – Custom Process % Processor Time Collection Rule

I am currently working on a performance management pack with a requirement to collect Process % Processor Time for all processes on all servers in the environment at a 30-second interval. First, this is a really bad idea from a performance perspective: it would result in an extremely large increase in data insertions to the OpsDB and DW, which could severely impact performance (of course, this depends on I/O capability, etc.). Second, why collect every process at each sample interval when we are only interested in processes that are consuming high CPU?

So…after much debate, we settled on a plan to test a collection rule (in the lab, of course) that would be disabled by default and enabled ONLY for a specific group, with a collection interval of 2 minutes. Servers that require deeper root cause analysis will be added to this group temporarily and removed once troubleshooting has been concluded. Additionally, I added filters in the script to collect data only when Process % Processor Time exceeds 20% and the process name is not “Idle” or “_Total”.
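
To make the filter concrete, here is a minimal sketch of the logic in Python. It is not the rule's actual script, just an illustration of the two conditions described above. The names ProcessSample, filter_samples, CPU_THRESHOLD and EXCLUDED_NAMES are hypothetical, and matching the excluded names case-insensitively is my own assumption.

```python
from typing import Iterable, List, NamedTuple


class ProcessSample(NamedTuple):
    """One sample of the Process % Processor Time counter (hypothetical type)."""
    name: str
    cpu_percent: float


# Threshold and exclusions as described in the post.
CPU_THRESHOLD = 20.0
# Assumption: compare names case-insensitively.
EXCLUDED_NAMES = {"idle", "_total"}


def filter_samples(samples: Iterable[ProcessSample],
                   threshold: float = CPU_THRESHOLD) -> List[ProcessSample]:
    """Keep only samples worth collecting: above the threshold and not excluded."""
    return [
        s for s in samples
        if s.name.lower() not in EXCLUDED_NAMES and s.cpu_percent > threshold
    ]


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    demo = [
        ProcessSample("Idle", 95.0),
        ProcessSample("_Total", 100.0),
        ProcessSample("sqlservr", 42.5),
        ProcessSample("notepad", 0.3),
    ]
    for sample in filter_samples(demo):
        print(sample.name, sample.cpu_percent)
```

The point of filtering inside the script is that anything dropped here never becomes a performance data item, so it never reaches the OpsDB or DW.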

Ok…time for some scripting!