SCOM – Reset manually closed monitors with PowerShell

Let me start by saying there are several blogs out there with different scripts and methods describing how to do this.  I decided to incorporate a few, add some of my own code and checks, and provide you with a good, in-depth post on how to utilize this invaluable clean up automation.

In most environments where I’ve consulted, one of the first questions I ask when I see a fairly clean active alert view in the console is “have you been closing alerts manually?”, directly followed by the obligatory “do you know the difference between a rule and a monitor?”.  We’ve all heard that question 100 times, right?  If not, you are probably guilty 🙂  I won’t dig too deep into that topic in this post, but here is the high level:  Don’t manually close monitors!

If you do want to dig deeper into how to tell if an alert is generated by a rule or monitor and/or what the differences are, feel free to hit that post here: https://scomanswers.wordpress.com/2015/03/04/scom-rule-vs-monitor/

Back to the topic at hand.  In these environments, the engineer usually wants to know right away how to rectify the issue and how to address it moving forward.  Depending on how bad the issue is, resetting the environment using maintenance mode works well if done carefully and off-hours, but preferably we can use a script to reset manually closed monitors and automate it moving forward.  Let’s dig into option two!

My process:
1.  Run the first section of the script to get an idea of how many alerts we are dealing with. The code below will tell me how many monitor alerts have been closed by someone/something other than System.  If you are using a manager of managers (MOM) like Netcool, you may have to add one or more additional filters.  This test is only valid the first time you use this process: once the monitors are reset, the alerts will still show up in the query because the property values it filters on remain the same, although other properties such as TimeResolved update after reset.  This can be addressed by adding a date variable and further filtering your where statement using the $_.TimeRaised property. I will show you how to further filter for only current manually closed monitors in a moment.

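Import-Module OperationsManager
##Replace ManagementServer with the name of one of your management servers
New-SCOMManagementGroupConnection ManagementServer

##Monitor alerts that are closed (resolution state 255) and were last modified by someone other than System
$alerts=get-scomalert | ?{$_.IsMonitorAlert -eq $true -and $_.resolutionstate -eq 255 -and $_.LastModifiedBy -ne "System"}
$alerts.count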
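
The date filter mentioned in step 1 could look something like the sketch below. It wraps the same where conditions in a small function and adds a TimeRaised cutoff. The function name Select-ManuallyClosedMonitorAlert, the -Since parameter and the seven-day window are my own assumptions for illustration, not part of the OperationsManager module. SCOM generally reports alert times in UTC, so the cutoff is built in UTC as well; verify that in your own environment.

```powershell
# Hypothetical helper (not an OperationsManager cmdlet): keeps only monitor
# alerts that were closed by something other than System and were raised on
# or after the given cutoff.
function Select-ManuallyClosedMonitorAlert {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $Alert,
        # Assumption: the caller passes a UTC timestamp; the default keeps everything.
        [datetime] $Since = [datetime]::MinValue
    )
    process {
        if ($Alert.IsMonitorAlert -eq $true -and
            $Alert.ResolutionState -eq 255 -and
            $Alert.LastModifiedBy -ne 'System' -and
            $Alert.TimeRaised -ge $Since) {
            $Alert
        }
    }
}

# Usage against a live management group (requires the connection from step 1):
# $since  = (Get-Date).ToUniversalTime().AddDays(-7)   # 7 days is an arbitrary example
# $alerts = Get-SCOMAlert | Select-ManuallyClosedMonitorAlert -Since $since
# $alerts.Count
```

After a reset run, setting the cutoff to the time of that run should keep previously handled alerts out of the count.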

2.  After making sure there aren’t 1000 alerts to contend with, I run the rest of the script in “safe mode” to verify that the monitored objects behind the closed alerts in the $alerts variable meet the additional criteria of being in a Warning or Critical state (a HealthState of Warning or Error in the script).

Tip: During this phase I comment out the line where the monitor is actually reset and add in an array ($array) to store the $alert variable for each loop that meets the above criteria. The $array and $array.count output at the bottom will verify which monitors from the $alerts variable will actually be reset.

$array=@()

foreach ($alert in $alerts)
{
    $MonitorID = $alert.monitoringruleid
    $TargetClassID = $alert.monitoringclassid
    $ObjectID = $alert.monitoringobjectid

    ##Retrieve the monitor, target class and instance to define the $monitoringobject variable
    $monitor = Get-SCOMMonitor | where {$_.id -eq $MonitorID}
    $monitoringclass = Get-SCOMClass | where {$_.id -eq $TargetClassID}
    $monitoringobject = Get-SCOMMonitoringobject -class $monitoringclass | where {$_.id -eq $ObjectID}

    ##Only objects that are still unhealthy need their monitor reset
    If (($monitoringobject.HealthState -eq 'Error') -or ($monitoringobject.HealthState -eq 'Warning'))
    {
        ##Safe mode: remove the leading # from the next line to actually reset the monitor
        #$monitoringobject | foreach{$_.ResetMonitoringState($monitor)}
        $array+=$alert
    }
}

##Output the list of manually closed alerts for review
$array
$array.count
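
One thing to be aware of in the loop above: Get-SCOMMonitor and Get-SCOMClass with no filter pull back every monitor and class on each pass, which can be slow in a large management group. A possible way around that, sketched below, is to cache each lookup by ID so that repeated alerts from the same monitor reuse the earlier result. Get-CachedItem is a hypothetical helper of my own, and the commented usage assumes the -Id parameters of Get-SCOMMonitor and Get-SCOMClassInstance behave as documented in your version of the module (check Get-Help before relying on it).

```powershell
# Hypothetical helper: returns the cached value for a key, running the
# lookup script block only the first time that key is seen.
function Get-CachedItem {
    param(
        [Parameter(Mandatory)] [hashtable]   $Cache,
        [Parameter(Mandatory)]               $Key,
        [Parameter(Mandatory)] [scriptblock] $Lookup
    )
    $cacheKey = [string]$Key
    if (-not $Cache.ContainsKey($cacheKey)) {
        # The key is passed to the script block as its first argument.
        $Cache[$cacheKey] = & $Lookup $Key
    }
    $Cache[$cacheKey]
}

# Usage inside the foreach loop (assumes a live management group connection):
# $monitorCache = @{}
# $objectCache  = @{}
# foreach ($alert in $alerts) {
#     $monitor = Get-CachedItem -Cache $monitorCache -Key $alert.MonitoringRuleId `
#                    -Lookup { param($id) Get-SCOMMonitor -Id $id }
#     $monitoringobject = Get-CachedItem -Cache $objectCache -Key $alert.MonitoringObjectId `
#                    -Lookup { param($id) Get-SCOMClassInstance -Id $id }
# }
```

The health-state check and the reset call stay exactly as they are; only the way the monitor and object are retrieved changes.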

3.  If I am comfortable with the number of monitors that need to be reset, I will remove the comment from the “ResetMonitoringState” line of code and run the script.  If you watch the console during this process, you will notice that these alerts will gradually appear in the “Active Alerts” view.  Remember, the alerts that are output in the $array data are monitor-generated alerts that have been manually closed, which breaks the monitor: it stays in its unhealthy state and will not raise a new alert until the state changes.  All of the alerts that show up in this output represent alerts that SHOULD be in the active alerts console because the issue still exists.  If the issue no longer existed, the state would not be Warning or Critical, and the monitor would not need to be reset.

4.  After running the script, you can verify whether it was successful by re-commenting out the “ResetMonitoringState” line and running the script again.  You should see no output for the $array variable.
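
If you would rather not comment and uncomment the reset line between runs, one option is to wrap the reset in a function that supports PowerShell’s standard -WhatIf switch. The sketch below is only an illustration of that idea: Reset-ClosedMonitor is a hypothetical name of my own, and it assumes the object and monitor passed in are the same $monitoringobject and $monitor values the script already retrieves.

```powershell
# Hypothetical wrapper: resets the monitor only when the object is still
# unhealthy, and only reports what it would do when called with -WhatIf.
function Reset-ClosedMonitor {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)] $MonitoringObject,
        [Parameter(Mandatory)] $Monitor
    )
    # Same health-state check the script in this post uses.
    if (($MonitoringObject.HealthState -eq 'Error') -or
        ($MonitoringObject.HealthState -eq 'Warning')) {
        $action = "Reset monitor '$($Monitor.DisplayName)'"
        if ($PSCmdlet.ShouldProcess($MonitoringObject.DisplayName, $action)) {
            # Runs only when -WhatIf is not specified.
            $MonitoringObject.ResetMonitoringState($Monitor)
        }
    }
}

# Safe mode inside the foreach loop (nothing is reset):
# Reset-ClosedMonitor -MonitoringObject $monitoringobject -Monitor $monitor -WhatIf
# Live run:
# Reset-ClosedMonitor -MonitoringObject $monitoringobject -Monitor $monitor
```

With -WhatIf, PowerShell prints a “What if:” line for each monitor that would be reset, which gives roughly the same review as the $array output without having to edit the script between runs.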

Full script:

**THIS SCRIPT IS FOR SAMPLE PURPOSES ONLY. USE AT YOUR OWN RISK**
#################################################################################################
##ResetManuallyClosedMonitors.ps1
##Shawn Tierney
##SAMPLE ONLY
##The script is currently set up for testing
#################################################################################################
Import-Module OperationsManager;
New-SCOMManagementGroupConnection ManagementServer;

$alerts=get-scomalert | ?{$_.IsMonitorAlert -eq $true -and $_.resolutionstate -eq 255 -and $_.LastModifiedBy -ne "System"}

$alerts.count

$array=@()
foreach ($alert in $alerts)
{
    $MonitorID = $alert.monitoringruleid
    $TargetClassID = $alert.monitoringclassid
    $ObjectID = $alert.monitoringobjectid

    $monitor = Get-SCOMMonitor | where {$_.id -eq $MonitorID}
    $monitoringclass = Get-SCOMClass | where {$_.id -eq $TargetClassID}
    $monitoringobject = Get-SCOMMonitoringobject -class $monitoringclass | where {$_.id -eq $ObjectID}

    If (($monitoringobject.HealthState -eq 'Error') -or ($monitoringobject.HealthState -eq 'Warning'))
    {
        #$monitoringobject | foreach{$_.ResetMonitoringState($monitor)}
        $array+=$alert
    }
}
$array
$array.count

###############################################################################################

This posting is provided “AS IS” with no warranties.


2 thoughts on “SCOM – Reset manually closed monitors with PowerShell”

  1. Hi

    Thanks for your article. But I have a question.
    Your solution works if the closed alerts are still in the Operations database.
    But what if they have already been groomed?
    Let's imagine: I closed a monitor alert 14 days ago, and my Database Grooming setting is Records to Delete (Resolved Alerts) -> Older than (7 days).
    This alert now exists only in the DW DB.
    And get-scomalert | ?{$_.IsMonitorAlert -eq $true -and $_.resolutionstate -eq "255" } does not return this alert to me. So I never knew that I have a monitor in an error state.

    How do I resolve such a case?

    Alex


    • Hello Alex,

      We do not want to reach a point where we have to worry about querying the DW to find out whether an alert was closed manually more than 14 days ago. There are several ways to avoid ever having to deal with this issue, but the easiest resolution is to import Tao Yang’s Self-Maintenance management pack and enable its rule for manually closed alerts, which runs every 24 hours by default and triggers an alert when a monitor-based alert has been closed manually. Alternatively, you can configure your own alert using this script. Both methods are great candidates for automation using Orchestrator’s Monitor Alert activity or SMA to ensure that you never miss one of these issues.

      Thanks,

      Shawn Tierney

