Hi all,
I want to share these performance counters for an Exchange 2010 environment. The post has been split into 8 different posts; please check them and let me know if you have any suggestions.
Performance and Threshold Counters for Exchange Server 2010 - Common Counters
Performance and Threshold Counters for Exchange Server 2010 - Mailbox Server
Performance and Threshold Counters for Exchange Server 2010 - Transport Server
Performance and Threshold Counters for Exchange Server 2010 - CAS Server
Performance and Threshold Counters for Exchange Server 2010 - UM Server
Performance and Threshold Counters for Exchange Server 2010 - CAS Load Based
Performance and Threshold Counters for Exchange Server 2010 - Transport Load Based
Performance and Threshold Counters for Exchange Server 2010 - Mailbox Load Based
Enjoy
-Dario
Tuesday, March 22, 2011
Performance and Threshold Counters for Exchange Server 2010 - UM Server
Good morning!!
Here are the most important performance counters and their thresholds. Knowing them is critical to establishing a performance baseline and a monitoring plan, so that you can proactively monitor your Exchange 2010 environment and troubleshoot and resolve issues when they arise.
back to Performance and Threshold Counters for Exchange Server 2010 - All in One
MSExchangeUMAvailability\% of Failed Mailbox Connection Attempts Over the Last Hour | Should be less than 5% | |
Recent Failed Mailbox Connection Attempts % is the percentage of mailbox connection attempts that failed in the last hour. | ||
MSExchangeUMAvailability\% of Inbound Calls Rejected by UM Service Over the Last Hour | Should be less than 5% | |
% of Inbound Calls Rejected by the UM Service Over the Last Hour is the percentage of inbound calls that were rejected by the Microsoft Exchange Unified Messaging service over the last hour. | ||
MSExchangeUMAvailability\% of Inbound Calls Rejected by UM Worker Process over the Last Hour | Should be less than 5% | |
% of Inbound Calls Rejected by the UM Worker Process Over the Last Hour is the percentage of inbound calls that were rejected by the UM Worker process over the last hour. | ||
MSExchangeUMAvailability\% of Messages Successfully Processed Over the Last Hour | Should be greater than or equal to 95% | |
% of Messages Successfully Processed Over the Last Hour is the percentage of messages that were successfully processed by the Microsoft Exchange Unified Messaging service over the last hour. | ||
MSExchangeUMAvailability\% of Partner Voice Message Transcription Failures Over the Last Hour | Should be less than 5% | |
Recent Partner Voice Message Transcription Failures is the failure rate of partner voice mail transcription in the last hour. | ||
MSExchangeUMAvailability\Call Answer Queued Messages | Should be less than 50 at all times. | |
Shows the number of messages created and not yet submitted for delivery. | ||
MSExchangeUMAvailability\Direct Access Failures | Should be 0 at all times. | |
Shows the number of times that attempts to access Active Directory failed. | ||
MSExchangeUMAvailability\Unhandled Exceptions/sec | Should be 0 at all times. | |
| ||
MSExchangeUMAvailability\Queued OCS User Event Notifications | Should be 0 at all times. | |
Shows the number of notifications that have been created and not yet submitted for delivery. | ||
Represents the number of missed call notifications that have been generated in the Office Communications Server environment and have not been submitted for delivery. | ||
MSExchangeUMPerformance\Operations over Six Seconds | Should be 0 at all times. | |
Shows the number of all Unified Messaging operations that took more than six seconds to complete. This is the time during which a caller was waiting for Unified Messaging to respond. | ||
MSExchangeUMCallAnswer\Calls Disconnected by Callers During UM Audio Hourglass | Should be 0 at all times. | A non-zero value suggests excessive latency between a Unified Messaging server and targeted domain controller. |
Shows the number of calls during which the caller disconnected while Unified Messaging was playing the audio hourglass tones. |
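The thresholds in the table above can be encoded as data and checked against sampled values. The following is only a minimal sketch: the `UM_THRESHOLDS` table, the `violations` helper, and any sample values are hypothetical, and it assumes the counter values have already been collected by some other means (for example Performance Monitor).

```python
import operator

# Hypothetical encoding of a few thresholds from the table above:
# counter path -> (comparison that must hold, limit)
UM_THRESHOLDS = {
    r"MSExchangeUMAvailability\% of Failed Mailbox Connection Attempts Over the Last Hour": (operator.lt, 5),
    r"MSExchangeUMAvailability\% of Messages Successfully Processed Over the Last Hour": (operator.ge, 95),
    r"MSExchangeUMAvailability\Call Answer Queued Messages": (operator.lt, 50),
    r"MSExchangeUMAvailability\Direct Access Failures": (operator.eq, 0),
    r"MSExchangeUMPerformance\Operations over Six Seconds": (operator.eq, 0),
}


def violations(samples, thresholds=UM_THRESHOLDS):
    """Return the counters in `samples` whose value breaks its threshold."""
    broken = []
    for counter, value in samples.items():
        if counter not in thresholds:
            continue  # no threshold known for this counter
        holds, limit = thresholds[counter]
        if not holds(value, limit):
            broken.append(counter)
    return broken
```

Keeping the limits in a dictionary means a new counter can be added without touching the checking logic.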
-Dario
Performance and Threshold Counters for Exchange Server 2010 - CAS Server
Good morning!!
Here are the most important performance counters and their thresholds. Knowing them is critical to establishing a performance baseline and a monitoring plan, so that you can proactively monitor your Exchange 2010 environment and troubleshoot and resolve issues when they arise.
back to Performance and Threshold Counters for Exchange Server 2010 - All in One
OWA Counters
ASP.NET Counters
Availability Service Counters
ActiveSync Service Counters
RPC/HTTP Proxy Counters (Windows Server 2008 Only)
RPC Client Access Counters
Microsoft Exchange Address Book Service
Microsoft Exchange Control Panel
Client Access Server OAB Download Counters
Counters | Threshold | Troubleshooting |
OWA Counters
MSExchange OWA\Average Search Time | Should be less than 5,000 ms at all times. |
Shows the average time that elapsed while waiting for a search to complete. |
ASP.NET Counters
ASP.NET\Application Restarts | Should be 0 at all times. | |
Shows the number of times the application has been restarted during the Web server's lifetime. | ||
ASP.NET\Worker Process Restarts | Should be 0 at all times. | |
Shows the number of times a worker process has restarted on the computer. | ||
ASP.NET\Request Wait Time | Should be 0 at all times. | Values greater than this should be investigated. |
Shows the number of milliseconds the most recent request was waiting in the queue. | ||
ASP.NET Applications(*)\Requests In Application Queue | Should be 0 at all times. | Values greater than this should be investigated. |
Shows the number of requests in the application request queue. |
Availability Service Counters
MSExchange Availability Service\Average Time to Process a Free Busy Request | Should always be less than 5 seconds. |
Shows the average time to process a free/busy request in seconds. One request may contain multiple mailboxes. Free/busy responses do not have meeting suggestions. |
ActiveSync Service Counters
MSExchange ActiveSync\Ping Commands Pending | Ping Commands Pending and Sync Commands Pending are the number of hanging requests, which should be almost equal to the number of Direct Push and Hanging sync users | |
MSExchange ActiveSync\Sync Commands Pending | ||
Ping Commands Pending is the number of Ping commands that are currently pending on the server. | ||
Sync Commands Pending is the number of Sync commands that are currently pending on the server. | ||
MSExchange ActiveSync\Requests Queued | Average of 50-100 | |
Requests Queued is the number of HTTP requests that are waiting to be assigned to a thread. |
RPC/HTTP Proxy Counters (Windows Server 2008 Only)
RPC/HTTP Proxy\Number of Failed Back-End Connection attempts per Second | Should be 0 at all times. |
Shows the rate at which RPC proxy attempts occur but fail to establish a connection to a back-end server. |
RPC Client Access Counters
MSExchange RpcClientAccess\RPC Averaged Latency | Below 250ms |
RPC Averaged Latency is the latency, in milliseconds, averaged for the past 1024 packets. | |
MSExchange RpcClientAccess\RPC Operations/sec | |
RPC Operations/sec is the rate at which RPC operations occur, per second. | |
MSExchange RpcClientAccess\RPC Requests | Should not be over 40 |
RPC Requests is the number of client requests that are currently being processed by the RPC Client Access service. |
Microsoft Exchange Address Book Service
MSExchangeAB\NSPI RPC Browse Requests Average Latency | Below 1000ms |
NSPI RPC Browse Requests Average Latency is the average time, in milliseconds, that NSPI browse requests took to complete during the sampling period. | |
MSExchangeAB\NSPI RPC Requests Average Latency | Below 1000ms |
NSPI RPC Requests Average Latency is the average time, in milliseconds, that NSPI requests took to complete during the sampling period. | |
MSExchangeAB\Referral RPC Requests Average Latency | Below 1000ms |
Referral RPC Requests Average Latency is the average time, in milliseconds, that referral requests took to complete during the sampling period. |
Microsoft Exchange Control Panel
MSExchange Control Panel\Outbound Proxy Requests - Average Response Time | Average under 6000ms |
Outbound Proxy Requests - Average Response Time is the average time (in milliseconds) that requests sent to a secondary Client Access server took to complete during the sampling period. | |
MSExchange Control Panel\Requests - Average Response Time | Average under 6000ms |
Requests - Average Response Time is the average time (in milliseconds) the Exchange Control Panel took to respond to a request during the sampling period. |
Client Access Server OAB Download Counters
MSExchangeFDS:OAB(*)\Download Task Queued | Should be 0 at all times. | Values greater than 0 indicate a failure to copy OAB data files from Mailbox servers. |
Shows a value of 1 if the task is queued for execution, otherwise shows 0. | ||
MSExchangeFDS:OAB(*)\Download Tasks Completed | Should be less than or equal to 3 per day. | Values greater than 3 per day indicate the schedule for the Client Access server to download updated OAB files is not a default schedule. |
Shows the number of OAB download tasks completed since the File Distribution service started. The default value is every 480 minutes or 8 hours. |
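The limit of 3 downloads per day follows directly from the default 480-minute interval. A minimal sketch of that arithmetic, with hypothetical function names chosen only for illustration:

```python
MINUTES_PER_DAY = 24 * 60


def expected_downloads_per_day(poll_interval_minutes=480):
    """Number of OAB download tasks per day for a given polling interval."""
    return MINUTES_PER_DAY // poll_interval_minutes


def fits_schedule(completed_per_day, poll_interval_minutes=480):
    """True if the observed daily count is consistent with the interval."""
    return completed_per_day <= expected_downloads_per_day(poll_interval_minutes)
```

With the default interval, `expected_downloads_per_day()` gives 3, so a server that completes more than 3 tasks in a day is not on the default schedule.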
-Dario
Performance and Threshold Counters for Exchange Server 2010 - Common Counters
Good morning!!
Here are the most important performance counters and their thresholds. Knowing them is critical to establishing a performance baseline and a monitoring plan, so that you can proactively monitor your Exchange 2010 environment and troubleshoot and resolve issues when they arise.
back to Performance and Threshold Counters for Exchange Server 2010 - All in One
Processor and Process Counters
Memory Counters
Memory Paging Counters
Process Memory Consumption Counters
Process Working Set Counter
Process Handle Counter
.NET Framework Counters
Network Counters
Exchange Domain Controller Connectivity Counters
Counters | Threshold | Troubleshooting |
Processor and Process Counters
Processor(_Total)\% Processor Time | Should be less than 75% on average. | |
Shows the percentage of time that the processor is executing application or operating system processes. This is when the processor is not idle. | ||
Processor(_Total)\% User Time | Should remain below 75%. | |
Shows the percentage of processor time that is spent in user mode. | ||
User mode is a restricted processing mode designed for applications, environment subsystems, and integral subsystems. | ||
Processor(_Total)\% Privileged Time | Should remain below 75%. | |
Shows the percentage of processor time that is spent in privileged mode. Privileged mode is a processing mode designed for operating system components and hardware-manipulating drivers. It allows direct access to hardware and all memory. | ||
Process(*)\% Processor Time | If total processor time is high, use this counter to determine which process is causing high CPU. | |
Shows the percentage of elapsed processor time that all process threads used to execute instructions. An instruction is the basic unit of execution in a computer; a thread is the object that executes instructions; and a process is the object created when a program is run. Code executed to handle some hardware interrupts and trap conditions is included in this count. | ||
System\Processor Queue Length (all instances) | Should not be greater than 5 per processor | On a computer with a single processor, observations where the queue length is greater than 5 are a warning that there is frequently more work available than the processor can handle readily. When this number is greater than 10, it is a strong indicator that the processor is at capacity, particularly when coupled with high CPU utilization. On multiprocessor systems, divide the queue length by the number of physical processors. On a multiprocessor system configured with hard processor affinity (processes are assigned to specific CPU cores), large values for the queue length can indicate that the configuration is unbalanced. Although Processor Queue Length typically is not used for capacity planning, it can be used to identify whether systems within the environment are capable of running the loads or whether additional or faster processors should be purchased for future servers. |
Indicates the number of threads each processor is servicing. | ||
Processor Queue Length can be used to identify if processor contention or high CPU utilization is caused by the processor capacity being insufficient to handle the workloads assigned to it. Processor Queue Length shows the number of threads that are delayed in the Processor Ready Queue and are waiting to be scheduled for execution. The value listed is the last observed value at the time the measurement was taken. |
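The per-processor rule for Processor Queue Length described above can be sketched as follows. The function name and the three labels are hypothetical; only the limits of 5 and 10 and the division by processor count come from the guidance in the table.

```python
def classify_queue_length(queue_length, processors=1):
    """Classify a Processor Queue Length sample on a per-processor basis."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    per_processor = queue_length / processors
    if per_processor > 10:
        return "at capacity"  # strong indicator, especially with high CPU
    if per_processor > 5:
        return "warning"      # more work than the processor handles readily
    return "ok"
```

A queue length of 12 is a concern on a single processor, but on a 4-processor server it works out to 3 per processor, which is within the limit.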
Memory Counters
Memory\Available MBytes | Should remain above 100 MB at all times. | |
Shows the amount of physical memory, in megabytes (MB), immediately available for allocation to a process or for system use. It is equal to the sum of memory assigned to the standby (cached), free, and zero page lists. For a full explanation of the memory manager, refer to Microsoft Developer Network (MSDN) or "System Performance and Troubleshooting Guide" in the Windows Server 2003 Resource Kit. | ||
Memory\Pool Nonpaged Bytes | Not applicable. | Normally not looked at, unless connection counts are very high because each TCP connection consumes nonpaged pool memory. |
Consists of system virtual addresses that are guaranteed to be resident in physical memory at all times and can thus be accessed from any address space without incurring paging input/output (I/O). Like paged pool, nonpaged pool is created during system initialization and is used by kernel-mode components to allocate system memory. | ||
Memory\Pool Paged Bytes | Not applicable. | Monitor for increases in pool paged bytes indicating a possible memory leak. |
Shows the portion of shared system memory that can be paged to the disk paging file. Paged pool is created during system initialization and is used by kernel-mode components to allocate system memory. | ||
Memory\Cache Bytes | Not applicable. | Should remain steady after applications cache their memory usage. Check for large dips in this counter, which could be attributed to working set trimming and excessive paging. Used by the content index catalog and continuous replication log copying. |
Shows the current size, in bytes, of the file system cache. By default, the cache uses up to 50 percent of available physical memory. The counter value is the sum of Memory\System Cache Resident Bytes, Memory\System Driver Resident Bytes, Memory\System Code Resident Bytes, and Memory\Pool Paged Resident Bytes. | ||
Memory\Committed Bytes | Not applicable. | Determines the amount of committed bytes in use. |
Shows the amount of committed virtual memory, in bytes. Committed memory is the physical memory that has space reserved on the disk paging files. There can be one or more paging files on each physical drive. This counter displays the last observed value only; it is not an average. | ||
Memory\% Committed Bytes In Use | If this value is very high (more than 90 percent), you may begin to see commit failures. This is a clear indication that the system is under memory pressure. | |
Shows the ratio of Memory\Committed Bytes to the Memory\Commit Limit. Committed memory is the physical memory in use for which space has been reserved in the paging file should it need to be written to disk. The commit limit is determined by the size of the paging file. If the paging file is enlarged, the commit limit increases, and the ratio is reduced. This counter displays the current percentage value only; it is not an average. |
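The ratio behind % Committed Bytes In Use is simple to reproduce from the two underlying counters. This is a sketch with hypothetical function names; the 90 percent limit is the one quoted in the table.

```python
def percent_committed_in_use(committed_bytes, commit_limit):
    """Ratio of Committed Bytes to Commit Limit, as a percentage."""
    if commit_limit <= 0:
        raise ValueError("commit_limit must be positive")
    return 100.0 * committed_bytes / commit_limit


def under_memory_pressure(committed_bytes, commit_limit, limit_percent=90.0):
    """True if the commit ratio is above the limit from the table."""
    return percent_committed_in_use(committed_bytes, commit_limit) > limit_percent
```

Because the commit limit grows when the paging file is enlarged, the same number of committed bytes yields a lower percentage afterwards, which is why the ratio falls without the workload changing.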
Memory Paging Counters
Memory\Transition Pages RePurposed/sec | Should be less than 100 on average. Spikes should be less than 1,000. | |
Indicates system cache pressure. | ||
Memory\Page Reads/sec | Should be less than 100 on average. | |
Indicates data must be read from the disk instead of memory. Indicates there is not enough memory and paging is beginning. A value of more than 30 per second means the server is no longer keeping up with the load. | ||
Memory\Pages/sec | Should be below 1,000 on average. | The values that are returned by the Pages/sec counter may be more than you expect. These values may not be related to either paging file activity or cache activity. Instead, these values may be caused by an application that is sequentially reading a memory-mapped file. Use Memory\Pages Input/sec and Memory\Pages Output/sec to determine page file I/O. |
Shows the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays. It is the sum of Memory\Pages Input/sec and Memory\Pages Output/sec. It is counted in numbers of pages, so it can be compared to other counts of pages, such as Memory\Page Faults/sec, without conversion. It includes pages retrieved to satisfy faults in the file system cache (usually requested by applications) and non-cached mapped memory files. | ||
Memory\Pages Input/sec | Should be below 1,000 on average. | |
Shows the rate at which pages are read from disk to resolve hard page faults. Hard page faults occur when a process refers to a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. When a page is faulted, the system tries to read multiple contiguous pages into memory to maximize the benefit of the read operation. Compare the value of Memory\Pages Input/sec to the value of Memory\Page Reads/sec to determine the average number of pages read into memory during each read operation. | ||
Memory\Pages Output/sec | Should be below 1,000 on average. | |
Shows the rate at which pages are written to disk to free space in physical memory. Pages are written back to disk only if they are changed in physical memory, so they are likely to hold data, and not code. A high rate of pages output might indicate a memory shortage. Microsoft Windows writes more pages back to disk to free up space when physical memory is in short supply. This counter shows the number of pages, and can be compared to other counts of pages, without conversion. |
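Two relationships from the paging counters above lend themselves to a quick calculation: Pages/sec is the sum of Pages Input/sec and Pages Output/sec, and dividing Pages Input/sec by Page Reads/sec gives the average number of pages read per read operation. A minimal sketch with hypothetical function names:

```python
def pages_per_sec(pages_input_per_sec, pages_output_per_sec):
    """Pages/sec as the sum of its input and output components."""
    return pages_input_per_sec + pages_output_per_sec


def pages_per_read(pages_input_per_sec, page_reads_per_sec):
    """Average number of pages brought in by each read operation."""
    if page_reads_per_sec == 0:
        return 0.0  # no read operations in this sample
    return pages_input_per_sec / page_reads_per_sec
```

For example, 400 pages input per second over 100 page reads per second means each read brings in 4 pages on average.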
Process Memory Consumption Counters
Process(*)\Private Bytes | Not applicable. | This counter can be used for determining any memory leaks against processes. For the information store process, compare this counter value with database cache size to determine if there is a memory leak in the information store process. An increase in information store private bytes, together with the same increase in database cache, equals correct behavior (no memory leak). |
Shows the current number of bytes this process has allocated that cannot be shared with other processes. | ||
Process(*)\Virtual Bytes | Not applicable | Used to determine if processes are consuming a large amount of virtual memory. |
Represents (in bytes) how much virtual address space the process is currently consuming. |
Process Working Set Counter
Process(_Total)\Working Set | Not applicable. | Large increases or decreases in working sets cause paging. Ensure that the paging file is set to the recommended value of RAM + 10 MB. If working sets are being trimmed, add Process(*)\Working Set to see which processes are affected. This counter could indicate either system-wide or process-wide issues. Cross-reference this counter with Memory\System Cache Resident Bytes to see if system-wide working set trimming is occurring. |
Shows the current size, in bytes, of the working set of this process. The working set is the set of memory pages touched recently by the threads in the process. If free memory in the computer is above a threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a threshold, pages are trimmed from working sets. If they are needed, they will be soft-faulted back to the working set before leaving main memory. |
Process Handle Counter
Process(*)\Handle Count | Not applicable | An increase in handle counts for a particular process may be the symptom of a faulty process with handle leaks, which is causing performance issues on the server. This is not necessarily a problem, but is something to monitor over time to determine if a handle leak is occurring. |
Shows the total number of handles currently open by this process. This number is the sum of the handles currently open by each thread in this process. |
.NET Framework Counters
.NET CLR Memory(*)\% Time in GC | Should be below 10% on average. | If this counter increases to a high value, there might be some objects that are surviving Gen 1 garbage collections and being promoted to Gen 2. Gen 2 collections require a full garbage collection for cleanup. Add other .NET memory counters to determine if this is the case. |
Shows when garbage collection has occurred. When the counter exceeds the threshold, it indicates that CPU is cleaning up and is not being used efficiently for load. Adding memory to the server would improve this situation. | ||
.NET CLR Exceptions(*)\# of Exceps Thrown / sec | Should be less than 5% of total RPS (Web Server(_Total)\Connection Attempts/sec * .05). | Exceptions should only occur in rare situations and not in the normal control flow of the program. This counter was designed as an indicator of potential performance problems due to a large rate (more than 100 per second) of exceptions thrown. This counter is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval. |
Displays the number of exceptions thrown per second. These include both .NET exceptions and unmanaged exceptions that get converted into .NET exceptions. For example, the null pointer reference exception in unmanaged code would get thrown again in managed code as a .NET System.NullReferenceException; this counter includes both handled and unhandled exceptions. | ||
.NET CLR Memory(*)\# Bytes in all Heaps | Not applicable. | These regions of memory are of type MEM_COMMIT. (For details, see Platform SDK documentation for VirtualAlloc.) The value of this counter is always less than the value of Process\Private Bytes, which counts all MEM_COMMIT regions for the process. Private Bytes minus # Bytes in all Heaps is the number of bytes committed by unmanaged objects. Used to monitor possible memory leaks or excessive memory usage of managed or unmanaged objects. |
Shows the sum of four other counters: Gen 0 Heap Size, Gen 1 Heap Size, Gen 2 Heap Size, and the Large Object Heap Size. This counter indicates the current memory allocated in bytes on the GC Heaps. |
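Two of the .NET rules above are plain arithmetic on other counters: the exception limit is 5 percent of Connection Attempts/sec, and Private Bytes minus # Bytes in all Heaps gives the bytes committed by unmanaged objects. A minimal sketch, with hypothetical function names:

```python
def exception_rate_limit(connection_attempts_per_sec, fraction=0.05):
    """Highest acceptable exceptions/sec for the given request rate."""
    return connection_attempts_per_sec * fraction


def unmanaged_committed_bytes(private_bytes, bytes_in_all_heaps):
    """Bytes committed by unmanaged objects in a process."""
    return private_bytes - bytes_in_all_heaps
```

Tracking the second value over time helps separate a managed leak (heaps grow) from an unmanaged one (the difference grows).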
Network Counters
Network Interface(*)\Bytes Total/sec | For a 100-Mbps network adapter, should be below 6–7 MBps. | |
Indicates the rate at which the network adapter is processing data bytes. | For a 1000-Mbps network adapter, should be below 60–70 MBps. | |
This counter includes all application and file data, in addition to protocol information such as packet headers. | ||
Network Interface(*)\Packets Outbound Errors | Should be 0 at all times. | |
Indicates the number of outbound packets that could not be transmitted because of errors. | ||
TCPv4\Connections Established | Not applicable. | Determines current user load. |
TCPv6\Connections Established | ||
Shows the number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT. | ||
The number of TCP connections that can be established is constrained by the size of the nonpaged pool. When the nonpaged pool is depleted, no new connections can be established. | ||
TCPv4\Connection Failures | An increasing number of failures, or a consistently increasing rate of failures, can indicate a bandwidth shortage. | |
TCPv6\Connection Failures | ||
Shows the number of times TCP connections have made a direct transition to the CLOSED state from the SYN-SENT state or the SYN-RCVD state, plus the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state. | ||
TCPv4\Connections Reset | An increasing number of resets or a consistently increasing rate of resets can indicate a bandwidth shortage. | Some browsers send TCP reset (RST) packets, so be cautious when using this counter to determine reset rate. |
TCPv6\Connections Reset | ||
Shows the number of times TCP connections have made a direct transition to the CLOSED state from either the ESTABLISHED state or the CLOSE-WAIT state. |
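The Bytes Total/sec guidance earlier in this section works out to roughly half of an adapter's line rate when the adapter speed is read in megabits per second and the threshold in megabytes per second. The sketch below makes that conversion explicit. It assumes decimal units (1 megabit = 1,000,000 bits), and the function names are hypothetical.

```python
def line_rate_bytes_per_sec(link_mbps):
    """Theoretical line rate in bytes/sec for a link speed in megabits/sec."""
    return link_mbps * 1_000_000 / 8


def utilization(bytes_total_per_sec, link_mbps):
    """Fraction of the theoretical line rate currently in use."""
    return bytes_total_per_sec / line_rate_bytes_per_sec(link_mbps)
```

A 100-megabit link carries at most 12.5 megabytes per second, so 6 to 7 megabytes per second corresponds to about 48 to 56 percent utilization.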
Exchange Domain Controller Connectivity Counters
MSExchange ADAccess Caches(*)\LDAP Searches/Sec | Not applicable. | Used to determine current LDAP search rate. |
Shows the number of Lightweight Directory Access Protocol (LDAP) search requests issued per second. | ||
MSExchange ADAccess Domain Controllers(*)\LDAP Read Time | Should be below 50 ms on average. | |
Shows the time in milliseconds (ms) to send an LDAP read request to the specified domain controller and receive a response. | Spikes (maximum values) should not be higher than 100 ms. | |
MSExchange ADAccess Domain Controllers(*)\LDAP Search Time | Should be below 50 ms on average. | |
Shows the time (in ms) to send an LDAP search request and receive a response. | Spikes (maximum values) should not be higher than 100 ms. | |
MSExchange ADAccess Processes(*)\LDAP Read Time | Should be below 50 ms on average. | |
Shows the time (in ms) to send an LDAP read request to the specified domain controller and receive a response. | Spikes (maximum values) should not be higher than 100 ms. | |
MSExchange ADAccess Processes(*)\LDAP Search Time | Should be below 50 ms on average. | |
Shows the time (in ms) to send an LDAP search request and receive a response. | Spikes (maximum values) should not be higher than 100 ms. | |
MSExchange ADAccess Domain Controllers(*)\LDAP Searches timed out per minute | Should be below 10 at all times for all roles. | |
Shows the number of LDAP searches that returned LDAP_Timeout during the last minute. | Higher values may indicate issues with Active Directory resources. | |
MSExchange ADAccess Domain Controllers(*)\Long running LDAP operations/Min | Should be less than 50 at all times. | |
Shows the number of LDAP operations on this domain controller that took longer than the specified threshold per minute. (Default threshold is 15 seconds.) | Higher values may indicate issues with Active Directory resources. |
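The LDAP read and search time rows above pair an average limit (below 50 ms) with a spike limit (no higher than 100 ms). A minimal sketch of checking a series of samples against both, using a hypothetical helper name:

```python
from statistics import mean


def ldap_latency_ok(samples_ms, avg_limit_ms=50, spike_limit_ms=100):
    """True if the average and the worst sample both stay within limits."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    return mean(samples_ms) < avg_limit_ms and max(samples_ms) <= spike_limit_ms
```

Checking both values matters because a single slow response can hide behind a healthy average.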
-Dario