Tuesday, March 22, 2011

Performance and Threshold Counters for Exchange Server 2010 - Mailbox Server

Good morning!
Knowing the most important performance counters and their thresholds is critical to establishing a performance baseline and a monitoring plan, so that you can proactively monitor your Exchange 2010 environment and troubleshoot and resolve issues when they arise.

Active Database Copy IO Latency Requirements

When these values are exceeded, the client experience will degrade (sluggish user experience, message delivery delays, etc.).

Each entry lists the counter, then its threshold and any troubleshooting notes.

MSExchange Database\I/O Database Reads (Attached) Average Latency
Indicates the average time (in milliseconds) to read from the database file.
Threshold: The average value should be below 20 ms. Spikes (maximum values) should not be higher than 100 ms.

MSExchange Database\I/O Database Writes (Attached) Average Latency
Indicates the average time (in milliseconds) to write to the database file.
Threshold: This counter is not a good indicator of client latency because database writes are asynchronous. In general, however, this latency should be less than the MSExchange Database\I/O Database Reads (Attached) Average Latency when battery-backed write caching is used.

Database\Database Page Fault Stalls/sec
Indicates the rate of page faults that cannot be serviced because there are no pages available for allocation from the database cache.
Threshold: This counter should be zero on production servers. If it is non-zero, it indicates that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high.



Active Log IO Latency Requirements

When these values are exceeded, the client experience will degrade (sluggish user experience, message delivery delays, etc.).


MSExchange Database\IO Log Writes Average Latency
Indicates the average time (in milliseconds) to write a log buffer to the active log file.
Threshold: The average value should be below 10 ms. Spikes (maximum values) should not be higher than 50 ms.

Database\Log Record Stalls/sec
Indicates the number of log records that cannot be added to the log buffers per second because the log buffers are full.
Threshold: The average value should be below 10 per second. Spikes (maximum values) should not be higher than 100 per second.

Database\Log Threads Waiting
Indicates the number of threads waiting to complete an update of the database by writing their data to the log.
Threshold: The average value should be less than 10 threads waiting.


Passive Database Copy IO Latency Requirements

When these values are exceeded, the database copy may fall behind because logs are not replayed into the passive database copy fast enough. Log replication performance may also be impacted.


MSExchange Database\I/O Database Reads (Recovery) Average Latency
Indicates the average time (in milliseconds) to read from the database file.
Threshold: The average value should be below 200 ms. Spikes (maximum values) should not be higher than 1000 ms.

MSExchange Database\I/O Database Writes (Recovery) Average Latency
Indicates the average time (in milliseconds) to write to the database file.
Threshold: In general, this latency should be less than the MSExchange Database\I/O Database Reads (Attached) Average Latency when battery-backed write caching is used.

Database\Database Page Fault Stalls/sec
Indicates the rate of page faults that cannot be serviced because there are no pages available for allocation from the database cache.
Threshold: This counter should be zero on production servers. If it is non-zero, it indicates that the MSExchange Database\I/O Database Writes (Attached) Average Latency is too high.

Replay Log IO Latency Requirements
When these values are exceeded, the database copy may fall behind because logs are not replayed into the passive database copy fast enough. Log replication performance may also be impacted.

MSExchange Database\IO Log Read Average Latency
Indicates the average time (in milliseconds) to read data from a log file. Specific to log replay and database recovery operations.
Threshold: The average value should be below 200 ms. Spikes (maximum values) should not be higher than 1000 ms.
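
The average and spike limits from the four I/O latency sections above can be checked with a few lines of Python. This is only a minimal sketch: it assumes the latency samples (in milliseconds) have already been collected by some other means, for example exported from Performance Monitor, and the names LATENCY_LIMITS_MS and check_latency are made up for this illustration.

```python
from statistics import mean

# (average limit, spike limit) in milliseconds, copied from the sections above.
LATENCY_LIMITS_MS = {
    r"MSExchange Database\I/O Database Reads (Attached) Average Latency": (20, 100),
    r"MSExchange Database\IO Log Writes Average Latency": (10, 50),
    r"MSExchange Database\I/O Database Reads (Recovery) Average Latency": (200, 1000),
    r"MSExchange Database\IO Log Read Average Latency": (200, 1000),
}


def check_latency(counter, samples_ms):
    """Return which limits ("average", "spike") the samples violate."""
    avg_limit, spike_limit = LATENCY_LIMITS_MS[counter]
    problems = []
    # "Should be below" is treated as a strict limit on the mean.
    if mean(samples_ms) >= avg_limit:
        problems.append("average")
    # "Should not be higher than" applies to the largest sample.
    if max(samples_ms) > spike_limit:
        problems.append("spike")
    return problems
```

An empty list means the samples stay within both limits for that counter.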


Information Store RPC Processing Counters
MSExchangeIS\RPC Requests
Indicates the overall RPC requests that are currently executing within the information store process.
Threshold: Should be below 70 at all times.
Troubleshooting: The information store can execute a maximum of 500 RPC requests at any one time before it starts rejecting new connections from clients.

MSExchangeIS\RPC Averaged Latency
Indicates the RPC latency, in milliseconds, averaged for all operations in the last 1,024 packets.
For information about how clients are affected when overall server RPC averaged latencies increase, see RPC Client Throttling.
Threshold: Should not be higher than 10 ms on average.
Troubleshooting: To determine if certain protocols are causing overall RPC latencies, monitor MSExchangeIS Client (*)\RPC Average Latency to separate latencies based on client protocol.

MSExchangeIS Mailbox\RPC Averaged Latency
Indicates the RPC latency, in milliseconds, averaged for all operations in the last 1,024 packets.
For information about how clients are affected when overall server RPC averaged latencies increase, see RPC Client Throttling.
Threshold: Should not be higher than 10 ms on average.

MSExchangeIS Client (*)\RPC Average Latency
Shows the server RPC latency, in milliseconds, averaged for the past 1,024 packets for a particular client protocol.
Threshold: Should be less than 10 ms on average.
Troubleshooting: Wide disparities between different client types, such as IMAP4, Outlook Anywhere, or Other Clients (MAPI), can help direct troubleshooting to appropriate subcomponents.
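
To make the "averaged for the last 1,024 packets" wording more concrete, here is a small Python sketch of a rolling average over a fixed window. It only illustrates the idea and is not how the information store computes the counter internally; the class name RollingLatency and its methods are hypothetical.

```python
from collections import deque


class RollingLatency:
    """Average of the most recent `window` latency samples (illustrative helper)."""

    def __init__(self, window=1024):
        # A deque with maxlen silently drops the oldest sample when full.
        self.samples = deque(maxlen=window)

    def add(self, latency_ms):
        self.samples.append(latency_ms)

    def average(self):
        # Report 0.0 until at least one sample has been recorded.
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def exceeds(self, limit_ms=10):
        # The 10 ms default is the threshold quoted in the entries above.
        return self.average() > limit_ms
```

Because only the newest samples are kept, a short burst of slow operations ages out of the average once enough later packets have been recorded.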


RPC Client Throttling Counters

MSExchangeIS\Client: RPCs Failed:Server Too Busy/sec
Shows the client-reported rate of failed RPCs (since the store was started) due to the Server Too Busy RPC error.
Threshold: Should be 0 at all times.
Troubleshooting: Higher values may indicate RPC threads are exhausted or client throttling is occurring for clients running versions of Outlook earlier than Microsoft Office Outlook 2007.

MSExchangeIS\Client: RPCs Failed:Server Too Busy
Shows the client-reported number of failed RPCs (since the store was started) due to the Server Too Busy RPC error.
Threshold: Should be 0 at all times.

Message Queuing Counters


Database Counters


MSExchange Database ==> Instances(*)\Log Generation Checkpoint Depth
Represents the amount of work, as a count of log files, that will need to be redone or undone to the database files if the process fails.
Threshold: Should be below 500 at all times for the Mailbox server role. A healthy server should indicate between 20 and 30 for each database instance.
Troubleshooting: If checkpoint depth increases continually for a sustained period, this is an indicator of either a long-running transaction (which will impact the version store) or of a bottleneck involving the database disks.

MSExchange Database(Information Store)\Database Page Fault Stalls/sec
Shows the rate at which database file page requests require the database cache manager to allocate a new page from the database cache.
Threshold: This should be 0 at all times.
Troubleshooting: If this value is non-zero, the database is not able to flush dirty pages to the database file fast enough to make pages free for new page allocations.

MSExchange Database(Information Store)\Log Record Stalls/sec
Shows the number of log records that cannot be added to the log buffers per second because the log buffers are full. If this counter is non-zero most of the time, the log buffer size may be a bottleneck.
Threshold: The average value should be below 10 per second. Spikes (maximum values) should not be higher than 100 per second.
Troubleshooting: If I/O log write latencies are high, check for RAID5 or sync replication on log devices.

MSExchange Database(Information Store)\Log Threads Waiting
Shows the number of threads waiting for their data to be written to the log to complete an update of the database. If this number is too high, the log may be a bottleneck.
Threshold: Should be less than 10 on average.
Troubleshooting: Regular spikes concurrent with log record stall spikes indicate that the transaction log disks are a bottleneck. If the value for log threads waiting is more than the spindles available for the logs, there is a bottleneck on the log disks.

MSExchange Database(Information Store)\Version buckets allocated
Shows the total number of version buckets allocated.
Threshold: Should be less than 12,000 at all times.
Troubleshooting: The default maximum number of version buckets is 16,384. If version buckets reach 70 percent of the maximum, the server is at risk of running out of version store.

MSExchange Database Instances(*)\I/O Database Reads Average Latency
Shows the average length of time, in milliseconds, per database read operation.
Threshold: Should be about 20 ms on average, with spikes of no more than 50 ms.

MSExchange Database Instances(*)\I/O Database Writes Average Latency
Shows the average length of time, in milliseconds, per database write operation.
Threshold: Should be 50 ms on average. Spikes of up to 100 ms are acceptable if not accompanied by database page fault stalls.

MSExchange Database(Information Store)\Database Cache Size (MB)
Shows the amount of system memory, in megabytes, used by the database cache manager to hold commonly used information from the database files to avoid file operations.
Threshold: Maximum value is RAM-2GB (RAM-3GB for servers with sync replication enabled). This and Database Cache % Hit are extremely useful counters for gauging whether a server's performance problems might be resolved by adding more physical memory.
Troubleshooting: Use this counter along with store private bytes to determine if there are store memory leaks. If the database cache size seems too small for optimal performance and there is little available memory on the system (check the value of Memory\Available Bytes), adding more memory to the system may increase performance. If there is ample memory on the system and the database cache size is not growing beyond a certain point, the database cache size may be capped at an artificially low limit. Increasing this limit may increase performance.

MSExchange Database(Information Store)\Database Cache % Hit
Shows the percentage of database file page requests that were fulfilled by the database cache without causing a file operation. If this percentage is too low, the database cache size may be too small.
Threshold: Should be over 90% for companies with majority online mode clients. Should be over 99% for companies with majority cached mode clients.
Troubleshooting: If the hit ratio is less than these numbers, the database cache may be insufficient.

MSExchange Database\Log Bytes Write/sec
Shows the rate at which bytes are written to the log.
Threshold: Should be less than 10,000,000 at all times.
Troubleshooting: With each log file being roughly 1,000,000 bytes in size, 10,000,000 bytes/sec would yield 10 logs/sec. This may indicate a large message being sent or a looping message.
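
Two of the entries above come down to simple arithmetic: how much of the version store is in use, and how many log files per second a given write rate produces. The Python sketch below works through both with the figures quoted above. The constant and function names are invented for illustration, and the log file size is the rounded 1,000,000 bytes used in the text.

```python
MAX_VERSION_BUCKETS = 16_384   # default maximum quoted above
LOG_FILE_BYTES = 1_000_000     # log file size as rounded in the text above


def version_store_usage(buckets_allocated):
    """Fraction of the default version store that is currently allocated."""
    return buckets_allocated / MAX_VERSION_BUCKETS


def logs_per_second(log_bytes_per_sec):
    """Approximate number of log files generated per second."""
    return log_bytes_per_sec / LOG_FILE_BYTES


def version_store_at_risk(buckets_allocated, risk_fraction=0.70):
    # 70 percent of the maximum is the risk level named in the entry above.
    return version_store_usage(buckets_allocated) >= risk_fraction
```

With these constants, 12,000 allocated buckets is about 73 percent of the default maximum, and 10,000,000 bytes/sec corresponds to 10 log files per second.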


Client-Related Search Counters
MSExchangeIS Mailbox(*)\Slow Findrow Rate
Shows the rate at which the slower FindRow needs to be used in the mailbox store.
Threshold: Should be no more than 10 for any specific mailbox store.
Troubleshooting: Higher values indicate applications are crawling or searching mailboxes, which is affecting server performance. These include desktop search engines, customer relationship management (CRM), or other third-party applications.

MSExchangeIS Mailbox(*)\Search Task Rate
Shows the number of search tasks created per second.
Threshold: Should be less than 10 at all times.

MSExchangeIS\Slow QP Threads
Shows the number of query processor threads currently running queries that are not optimized.
Threshold: Should be less than 10 at all times.

MSExchangeIS\Slow Search Threads
Shows the number of search threads currently running queries that are not optimized.
Threshold: Should be less than 10 at all times.


Content Indexing Counters


Process(Microsoft.Exchange.Search.ExSearch)\% Processor time
Shows the amount of processor time that is currently being consumed by the Exchange Search service.
Threshold: Should typically be less than 1% of overall CPU and not sustained above 5%. Should be less than 10% of what the store process uses during steady state.

Process(msftefd*)\%Processor Time
Shows the amount of processor time that is being consumed to update content indexing within the store process.
Threshold: Full crawls will increase overall processing time, but should never exceed overall store CPU capacity.
Troubleshooting: Check throttling counters to determine if throttling is occurring due to server performance bottlenecks.

MSExchange Search Indices(*)\Recent Average Latency of RPCs Used to Obtain Content
Shows the average latency, in milliseconds, of the most recent RPCs to the Microsoft Exchange Information Store service. These RPCs are used to get content for the filter daemon for the specified database.
Threshold: Should coincide with the latencies that Outlook clients are experiencing.

MSExchange Search Indices(*)\Average Document Indexing Time
Shows the average time, in milliseconds, that it takes to index documents.
Threshold: Should be less than 30 seconds at all times.

MSExchange Search Indices(*)\Full Crawl Mode Status
Used to determine if a full crawl is occurring for any specified database.
Threshold: Indicates whether this .mdb file is going through a full crawl (value=1) or not (value=0).
Troubleshooting: If CPU usage is high, it is possible content indexing is occurring for a database or set of databases.
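
The content indexing limits above mix relative and absolute values, and the indexing time is reported in milliseconds while its threshold is stated in seconds. The Python sketch below shows one way to apply them. It assumes averaged CPU percentages for the search and store processes are already available, and both function names are hypothetical.

```python
def exsearch_cpu_ok(exsearch_pct, store_pct):
    """Check averaged ExSearch CPU against the two numeric limits above."""
    # Not sustained above 5% of overall CPU, and below 10% of the store process.
    return exsearch_pct <= 5 and exsearch_pct < 0.10 * store_pct


def indexing_time_ok(avg_indexing_ms, limit_seconds=30):
    """Compare a millisecond counter value against the 30-second threshold."""
    return avg_indexing_ms / 1000 < limit_seconds
```

The "typically less than 1%" guideline is left out of the check because it describes normal behavior and is not a hard limit.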


Mailbox Assistant Counters


Process(MSExchangeMailboxAssistants)\%Processor Time
Shows the amount of processor time that is being consumed by mailbox assistants.
Threshold: Should be less than 5% of overall CPU capacity.

MSExchange Assistants(*)\Events in queue
Shows the number of events in the in-memory queue waiting to be processed by the assistants.
Threshold: Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Assistants(*)\Average Event Processing Time in Seconds
Shows the average processing time of the events chosen.
Threshold: Should be less than 2 at all times.


Resource Booking Counters


MSExchange Resource Booking\Average ResourceBooking Processing Time
Shows the average time to process an event in the Resource Booking Attendant.
Threshold: Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Resource Booking\Requests Failed
Shows the total number of failures that occurred while the Resource Booking Attendant was processing events.
Threshold: Should be 0 at all times.


Calendar Attendant Counters


MSExchange Calendar Attendant\Average Calendar Attendant Processing time
Shows the average time to process an event in the Calendar Attendant.
Threshold: Should be a low value at all times. High values may indicate a performance bottleneck.

MSExchange Calendar Attendant\Requests Failed
Shows the total number of failures that occurred while the Calendar Attendant was processing events.
Threshold: Should be 0 at all times.


Store Client Request Counters


MSExchange Store Interface(_Total)\RPC Latency average (msec)
Shows the average latency, in milliseconds, of RPC requests. The average is calculated over all RPCs since exrpc32 was loaded.
Threshold: Should be less than 100 ms at all times.

MSExchange Store Interface(_Total)\RPC Requests outstanding
Shows the current number of outstanding RPC requests.
Threshold: Should be 0 at all times.

MSExchange Store Interface(*)\RPC Requests failed (%)
Shows the percentage of failed requests in the total number of RPC requests. Here, failed means the sum of failed with error code plus failed with exception.
Threshold: Should be 0 at all times.

MSExchange Store Interface(*)\RPC Slow Requests (%)
Shows the percentage of slow RPC requests among all RPC requests. A slow RPC request is one that has taken more than 500 ms.
Threshold: Should be less than 1 at all times.

MSExchangeMailSubmission(*)\Hub Servers In Retry
Shows the number of Hub Transport servers in retry mode.
Threshold: Should be 0 at all times.

MSExchangeMailSubmission(*)\Failed Submissions Per Second
Threshold: Should be 0 at all times.

MSExchangeMailSubmission(*)\Temporary Submission Failures/sec
Shows the number of temporary submission failures per second.
Threshold: Should be 0 at all times.
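
The RPC Slow Requests (%) entry above defines a slow request as one that takes more than 500 ms. As a rough illustration of how such a percentage is derived, the Python sketch below computes it from a list of request durations. The function name and the idea of having individual durations at hand are assumptions made for this example; the real counter is computed by Exchange itself.

```python
SLOW_RPC_MS = 500  # a request slower than this counts as slow, per the entry above


def slow_request_percent(durations_ms):
    """Percentage of requests whose duration is above SLOW_RPC_MS."""
    if not durations_ms:
        return 0.0
    slow = sum(1 for d in durations_ms if d > SLOW_RPC_MS)
    return 100.0 * slow / len(durations_ms)
```

A result of 1.0 or more would be over the "less than 1" threshold listed for that counter.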

Local Continuous Replication, Cluster Continuous Replication, and Standby Continuous Replication Counters



-Dario
