
Is the Page Life Expectancy monitor bugging you? Go consecutive!

March 8, 2015

I think it was SQL management pack version 6.4.0.0 where the Page Life Expectancy monitor was introduced. The monitor was hard to miss because it generated quite a few alerts in the OpsMgr environments I build and manage.
To be clear, I am not saying the monitor is buggy, because it isn't.

** Update – In the latest SQL management pack, 6.6.0.0, this monitor is consecutive by default and this post has become obsolete. It is still useful as an example of how to make a consecutive monitor.

What is SQL Page Life Expectancy (PLE)?

Page Life Expectancy is the number of seconds a page will stay in the buffer pool without references. In simple words, if your pages stay longer in the buffer pool (the area of the memory cache), your PLE is higher. That leads to better performance, because every time a request comes in there is a chance it will find its data in the cache instead of having to read it from the hard drive.
(http://blog.sqlauthority.com/2010/12/13/sql-server-what-is-page-life-expectancy-ple-counter/)

An older recommendation for PLE by Microsoft was that it should never be under 300 seconds. This post is not about arguing or discussing this value, although I did find a good blog on this subject here: http://blogs.msdn.com/b/mcsukbi/archive/2013/04/12/sql-server-page-life-expectancy.aspx.

What does the PLE monitor do?

The monitor can be found in the Microsoft SQL Server 2008 Monitoring and the Microsoft SQL Server 2012 Monitoring management packs.
It is targeted against the SQL Server DB Engine classes and checks the Buffer Node:Page Life Expectancy performance counter every 5 minutes (300 sec).
If the value is below 300, it will raise a critical alert.

The configurable options for this monitor are limited:

SCOMPLE01

When and how should I tune this?

I have been asked this question a few times, but unfortunately there is no simple answer. I advise my clients to tune their SQL environment in such a way that the value will never be under 300 seconds.
I have clients where, on some servers at specific times, the PLE counter drops to ZERO and they can't do anything about it. So there we have it, the main reason why I spent time writing this post. (I know some of you OpsMgr admins have the same problem!!)

What are the options for tuning in such a scenario:

  • Disable the monitor for the specific SQL server. –> No way, you will be blind to PLE performance 24 hours a day;
  • Put the object in maintenance mode. –> Also not a good choice. The PLE monitor is targeted at the SQL Server DB Engine class. When this is put into maintenance mode, many more things will not be monitored during maintenance;
  • Make two scheduled tasks that run PowerShell scripts: one to disable the PLE monitor 5 minutes before the issue occurs and one to enable the monitor again. –> This is an option, but it can create quite some administrative overhead for the OpsMgr admins;
  • Create a consecutive sample monitor and apply it to the specific servers. –> Yep, this is the way I want to go!!

Creating a consecutive sample monitor for Page Life Expectancy in the Operations Manager 2012 console.

A consecutive sample monitor lets you measure a value and evaluate it over several samples. OpsMgr then generates an alert only when the value breaches the threshold for X samples in a row, so a single short dip no longer changes the state.
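The difference between the two kinds of monitor can be sketched in a few lines of Python. This is only an illustration of the evaluation logic, not how OpsMgr implements it: the function names, the sample values and the choice of 3 consecutive samples are assumptions made for the example. Only the 300 second threshold comes from this post.

```python
from typing import Iterable, List

THRESHOLD_SECONDS = 300  # PLE threshold used by the monitor in this post


def single_sample_states(samples: Iterable[float],
                         threshold: float = THRESHOLD_SECONDS) -> List[str]:
    """Every sample below the threshold turns the state critical on its own."""
    return ["critical" if value < threshold else "healthy" for value in samples]


def consecutive_sample_states(samples: Iterable[float],
                              threshold: float = THRESHOLD_SECONDS,
                              required: int = 3) -> List[str]:
    """State turns critical only after `required` breaches in a row.

    `required` is a hypothetical setting for this sketch; pick the number
    of samples that fits your own environment.
    """
    states = []
    breaches = 0
    for value in samples:
        # A healthy sample resets the streak, a breach extends it.
        breaches = breaches + 1 if value < threshold else 0
        states.append("critical" if breaches >= required else "healthy")
    return states


# Made-up PLE readings: one short drop to zero, later a sustained drop.
ple_samples = [4200, 0, 3900, 250, 120, 80, 5000]

single = single_sample_states(ple_samples)
consecutive = consecutive_sample_states(ple_samples, required=3)

for value, s, c in zip(ple_samples, single, consecutive):
    print(f"PLE={value:>5}  single={s:<8}  consecutive={c}")
```

With these made-up readings the single sample check goes critical on the short drop to zero, while the consecutive check only goes critical on the third low reading in a row.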

Creating a consecutive monitor can be done in the Operations Manager 2012 console. Remember that the SQL 2008 and the SQL 2012 monitoring MPs each contain a separate PLE monitor.
These are the steps for creating the new monitor:

Go to Authoring –> Monitors and choose to add a unit monitor:

SCOMPLE02

SCOMPLE03
I created an Addendum management pack. The override pack would probably also be a good place.

 

SCOMPLE04
I disable the monitor by default and override it later for the specific SQL servers.

 

SCOMPLE05

 

SCOMPLE06

 

SCOMPLE07

SCOMPLE08

All you need to do now is disable the original PLE monitor and enable this one for the troubled SQL servers.

Hopefully this helps you reduce the number of SQL alerts while keeping the monitoring intact.

Let’s make it manageable!
