When there’s a problem with a server, Windows admins have to play detective to solve it. Fortunately, Windows includes several tools for finding and deciphering the clues behind server performance problems. One of them is Windows Reliability Monitor, which can be used for server monitoring. 

For IT teams, it’s not uncommon for a system to work yesterday but fail today. When this happens, the first troubleshooting step is to find out what changed in between. Windows Reliability Monitor is one of the most useful utilities for answering that question, helping IT teams address server issues effectively and efficiently. 

What is Reliability Monitor?

When a server becomes unstable or performs worse than usual, the IT team needs to determine when the problem started. This information is available from Event Viewer or Performance Monitor, but Windows Reliability Monitor presents it in an interface that is easier to understand. 

Reliability Monitor uses a graphical interface to display and assess the system's reliability history, including what changed and when, along with which application or operating system component crashed. It can also display information about software updates, installations, and uninstallations, and about any hardware problems that occurred. 

Where possible, Windows Reliability Monitor links each issue to one of the above categories to narrow down the scope of the problem. This helps identify the origin of server instability, so it’s easier for IT teams to find the source of the problem and resolve it in the most appropriate way. 

The graph is very easy to read. While the system is stable, the line stays flat. As soon as a failure occurs, the line drops. From the history, the IT team can see when the drop happened and which parts of the system were not running smoothly. The team can then check the problematic parts right away and return the server to normal in a short time. 

How Reliability Monitor Works

Think of the onboard computer in a vehicle, which constantly monitors fuel consumption, tire pressure, engine RPM, valve load, and so on. In the same way, Windows continuously monitors the status of the operating system, from startup until the system is shut down. 

Important system components (memory, data drives, fans, and CPU) are checked constantly and the results are collected as performance counters. Activities in the system and in applications (such as sending and receiving mail in Outlook or opening Word documents) are tracked one by one and stored as event trace data. 

How to Read Reliability Monitor Results?

Is it really that easy to read the results that Reliability Monitor displays? The answer is yes. To open the tool:

  • Open Control Panel
  • Select System and Security
  • Click Security and Maintenance
  • Expand Maintenance and click View reliability history

You can also open it from the Run dialog:

  • Press Win + R to open the Run dialog
  • Type perfmon /rel in the field provided and press Enter
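
The Run dialog steps above can also be scripted. The following is a minimal sketch, assuming Python is installed on the Windows machine; the function names are made up for illustration, and only the perfmon /rel command itself comes from the steps above. Depending on the account and UAC settings, Windows may still ask for confirmation.

```python
import subprocess
import sys


def reliability_monitor_command():
    """Return the same command line that is typed into the Run dialog."""
    return "perfmon /rel"


def open_reliability_monitor():
    """Launch Reliability Monitor. Returns False on non-Windows systems."""
    if sys.platform != "win32":
        return False
    # shell=True hands the command to cmd.exe, much like typing it by hand.
    # Popen returns immediately instead of waiting for the window to close.
    subprocess.Popen(reliability_monitor_command(), shell=True)
    return True
```

Calling open_reliability_monitor() on a Windows machine should bring up the same window as the manual steps; on other platforms it does nothing and returns False.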

Windows Reliability Monitor has a single interface that is dominated by one chart. The chart is very simple: the vertical axis is the reliability scale from 1 to 10, and the horizontal axis is the time range. By default, each column represents 1 day, but IT teams can switch to the Weeks view at the top left so that each column covers 1 week. 

The rows beneath the graph show event categories, including warnings and application failures. Each event appears in the column for the day or week in which it occurred, which can reveal patterns and indicate exactly when the problem started. 

To see the details of the events that occurred, click a column and look at the Reliability details list at the bottom. It usually shows a sentence describing each problem, while View technical details provides additional information and suggestions. The IT team can solve the problem using these suggestions, or use other methods they consider better. 

The reliability rating drops with each crash or issue, then slowly climbs back towards 10 once the server has stabilized. This information helps IT teams decide how to fix the issue, for example by updating or removing applications, rolling back drivers, or replacing damaged hardware. 
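
The same rating can be read programmatically, which makes it easier to spot the day it dropped. The sketch below assumes Windows PowerShell is available and that the Win32_ReliabilityStabilityMetrics WMI class (with its SystemStabilityIndex property) is populated on the machine. The helper names and the 0.5 threshold are illustrative choices, not part of Reliability Monitor, and find_drops expects the values in chronological order, which the query itself does not guarantee.

```python
import json
import subprocess

# Assumption: this PowerShell pipeline works on the target Windows machine.
PS_QUERY = (
    "Get-CimInstance -ClassName Win32_ReliabilityStabilityMetrics | "
    "Select-Object -ExpandProperty SystemStabilityIndex | ConvertTo-Json"
)


def parse_indexes(json_text):
    """Turn ConvertTo-Json output into a list of floats."""
    if not json_text.strip():
        return []  # no reliability data returned
    data = json.loads(json_text)
    # A single result is serialized as a bare number, not a list.
    if not isinstance(data, list):
        data = [data]
    return [float(value) for value in data]


def fetch_indexes():
    """Windows only: run the query and return the stability index values."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", PS_QUERY],
        capture_output=True, text=True, check=True,
    )
    return parse_indexes(result.stdout)


def find_drops(indexes, threshold=0.5):
    """Return positions where the index fell by more than `threshold`
    compared with the previous sample (oldest sample first)."""
    return [
        i for i in range(1, len(indexes))
        if indexes[i - 1] - indexes[i] > threshold
    ]
```

For example, find_drops([10.0, 10.0, 7.5, 7.8, 8.1]) returns [2], pointing at the sample where the rating fell before it started to recover.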

How to Save Reliability Results?

One practical use of Windows Reliability Monitor is saving the results. Click Save reliability history in the bottom left corner to export the data, for example for server auditing as part of the company’s service level agreement archive. 

IT teams can also combine and compare the reliability history of multiple servers to spot applications or updates that cause trouble on more than one system. This troubleshooting information helps determine whether a problem is isolated to a single system or is affecting multiple systems at once. 

Reliability Monitor saves the history as an XML file. IT teams can also configure Windows to send problem reports to Microsoft. In theory, Microsoft uses these reports to identify and address persistent problems with specific devices or hardware. Reliability Monitor provides a link to these reports in the bottom left corner: click View all problem reports to see the problem reports and incidents sent to Microsoft. 

Reliability Monitor is a native tool on supported Windows server and desktop systems, making it a great choice for troubleshooting issues on both clients and servers. Given how widely Windows is used in companies, this is very useful: there is no need to install additional features or activate certain services to use it. 
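
The cross-server comparison can be sketched in a few lines. The example below does not parse the saved history files; it assumes the failure events have already been extracted into simple records, and the server names, product names, and record fields are all hypothetical.

```python
from collections import defaultdict


def shared_failures(history_by_server, min_servers=2):
    """Map each failing product to the sorted list of servers that reported
    it, keeping only products seen on at least `min_servers` servers."""
    servers_by_product = defaultdict(set)
    for server, events in history_by_server.items():
        for event in events:
            servers_by_product[event["product"]].add(server)
    return {
        product: sorted(servers)
        for product, servers in servers_by_product.items()
        if len(servers) >= min_servers
    }


# Hypothetical, hand-written records for illustration only.
history = {
    "srv-01": [
        {"date": "2024-03-04", "product": "backup-agent.exe"},
        {"date": "2024-03-05", "product": "print-spooler"},
    ],
    "srv-02": [{"date": "2024-03-04", "product": "backup-agent.exe"}],
    "srv-03": [],
}

print(shared_failures(history))
# prints {'backup-agent.exe': ['srv-01', 'srv-02']}
```

A product that shows up on several servers points to a shared cause such as an update or a common application, while one that appears on a single server suggests an isolated problem.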

In short, Windows provides a built-in feature for server monitoring tasks called Reliability Monitor. It’s easy to use and easy to read, and it includes a panel that gives troubleshooting suggestions. For more comprehensive monitoring, however, companies need monitoring services they can rely on at all times, such as Netmonk with its product, Netmonk Prime. 

Netmonk Prime provides server monitoring, network monitoring, and web/API monitoring in a single application. Reports are presented in an easy-to-understand interface, and there are automation features to solve problems directly. It is already used by more than 15 companies in Indonesia. Visit the Netmonk website for more info!