One of the best things you can do for your network is to know what’s going on. You do so with network monitoring. But just having network monitoring is not the same thing as taking full advantage of network monitoring, which you do with network monitoring best practices.
We present to you 8 network monitoring best practices. But this is just part of the network monitoring journey. You can get the complete roadmap by downloading IT Infrastructure Monitoring for Dummies.
1. Replace Multiple Monitors with One Solution
Systems administrators and other IT pros are sometimes tasked with cutting costs by reducing or, better yet, eliminating monitoring software sprawl. If you have several network monitoring systems in place, you likely suffer from dashboard overload.
Having a single source of truth that's accessible to the entire team is crucial to the long-term success of your IT organization. If the network team uses one tool, system administrators another, and your application team yet another, there will be conflict and finger-pointing. With all your monitoring done in one solution, teams will demonstrate more accountability and teamwork. Plus, consolidating on a single solution can substantially reduce costs.
2. Where Should Network Monitoring Be Deployed?
Where can you, or should you, deploy your network monitoring solution? The general rule of thumb is that if the host running the monitoring software can reach the systems you want to monitor, it does not matter where it is deployed. You can deploy network monitoring inside Amazon Web Services (AWS), Microsoft Azure, or other cloud environments. IT pros at smaller shops can even run network monitoring on a system under their desk, or on a laptop on top of it.
The best advice is to deploy your monitoring solution in a place that makes it highly available while still reaching all the systems you want to monitor.
3. Know What to Monitor
It depends on the use case, but in general, you want to track both availability and performance for any system whose outage would stop people from working or cost you money.
A simple example is when your router goes down and no one can connect. Obviously, you lose money because you cannot process payments or perform other economic or productive functions.
There is no single answer for what to monitor. Each environment is different. But in general, you should monitor anything that could affect the availability or performance of a system. That opens a world of possibilities. So don't take a narrow view of the network. Clearly, you want to track switches, routers, and other dedicated network devices.
But the network is far more than that. The network encompasses the web, which you may well want to monitor. Your applications also run across the network and could be monitored effectively with your network monitoring solution. Again, think of what is most important to your business, most critical to your operations and money-making abilities, and find a way to monitor it.
4. Best Practices for Alerting – Less is More
One best practice for alerting may sound counterintuitive but is true: less is more. Here is a perfect example. We came across a customer whose monitoring tool repeated the same action every two minutes. When a system became unavailable, they'd get an email alert, even when it was only down for a minute. Every two minutes after that, the network monitoring tool kept emailing. People got so used to the emails that they started ignoring them. When there are too many alerts, people tune them out.
Our recommendation is to make sure emails only go out when someone has to log in and do something. If you are sending out an email from the monitoring system and no one has to log in and do something, you are spamming them and should reconfigure the system.
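As a rough illustration of "less is more," the sketch below sends at most one email per outage, and only after the system has stayed down past a grace period. The class name `DownAlertPolicy`, the method `observe`, and the five-minute grace period are all assumptions made for this example; they are not WhatsUp Gold settings or APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DownAlertPolicy:
    """Hypothetical policy: one email per outage, only after a grace period."""

    grace_seconds: float = 300.0  # assumed value; tune for your environment
    down_since: Optional[float] = None
    notified: bool = False

    def observe(self, is_up: bool, now: float) -> bool:
        """Record one poll result; return True if an email should go out now."""
        if is_up:
            # Recovery resets the state so the next outage can alert again.
            self.down_since = None
            self.notified = False
            return False
        if self.down_since is None:
            # First failed poll of this outage: start the clock, stay quiet.
            self.down_since = now
        if self.notified:
            # Already emailed for this outage; do not repeat.
            return False
        if now - self.down_since >= self.grace_seconds:
            self.notified = True
            return True
        return False
```

With a two-minute polling interval, a one-minute blip never produces an email, and a long outage produces exactly one rather than one every two minutes.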
5. Alerts Should Turn to Action
Make sure any alarm that comes out gives you something actionable. Take CPU utilization, especially in a virtual environment. A lot of people want an alert if CPU usage exceeds 90% for 30 minutes. That is a bad idea because that is normal for well-designed infrastructure. If you are alerted at the 90% threshold, the natural inclination is to log in and do something like stop a process to free up the CPU. However, in a virtual environment, you are supposed to size systems so they run at near capacity.
The recommendation instead is to alert only if a CPU runs at 99% or greater for 30 minutes. In that case, you absolutely want to send an email, and it should suggest an action for the recipient to take. You might log in and stop a process in that scenario.
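The sustained-threshold rule above can be sketched in a few lines. This is a minimal illustration, not how any particular monitoring product evaluates thresholds: the function name `cpu_alert_needed` and the `(timestamp_seconds, cpu_percent)` sample format are assumptions, and the defaults simply mirror the 99% and 30-minute figures from the text.

```python
from typing import Iterable, Optional, Tuple


def cpu_alert_needed(
    samples: Iterable[Tuple[float, float]],
    threshold: float = 99.0,
    window_seconds: float = 1800.0,
) -> bool:
    """Return True if CPU has stayed at or above `threshold` for at least
    `window_seconds`, measured up to the most recent sample.

    `samples` is a hypothetical list of (timestamp_seconds, cpu_percent)
    pairs, ordered oldest to newest.
    """
    sustained_start: Optional[float] = None
    last_ts: Optional[float] = None
    for ts, cpu in samples:
        if cpu >= threshold:
            if sustained_start is None:
                sustained_start = ts  # start of the current hot streak
        else:
            sustained_start = None  # any dip below threshold resets the streak
        last_ts = ts
    if sustained_start is None or last_ts is None:
        return False
    return last_ts - sustained_start >= window_seconds
```

A host that sits at 92% all day never triggers this check, while one pinned at 99% or more for a full half hour does.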
The key best practice when it comes to actions is making sure you only notify people when they can actually do something. Do not just notify for the sake of notification. Instead, notify to alert someone that they need to log in and get an issue fixed.
Even without alerts, key information is still being gathered, and you will still see it in your network monitoring dashboards and reports. The system simply stops sending urgent alerts that demand immediate action when none is needed.
6. Reporting Best Practices
As with the question of what to monitor, setting up reports is really on a case-by-case basis in terms of what customers need to see. There is no ‘one size fits all,’ which is why the flexibility of network monitoring reporting is so crucial. The good news is that network monitoring reporting is a blank slate, and you can do with it almost whatever you want.
First, you need to define what your requirements are, how often you need reports, and what you want them to cover.
One of the most common use cases is reporting on uptime for the previous month and sending that to management. In many cases, IT is measured based on the percentage of the availability of the systems they are responsible for.
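For readers who want to see the arithmetic behind a monthly uptime figure, here is a small sketch. The function name `availability_percent` and the outage format are assumptions for illustration; a monitoring tool would normally compute this for you from its own outage records.

```python
from typing import Iterable, Tuple


def availability_percent(
    period_seconds: float,
    outages: Iterable[Tuple[float, float]],
) -> float:
    """Percentage of the reporting period during which the system was up.

    `outages` is a hypothetical list of (start, end) offsets in seconds
    within the period, assumed not to overlap one another.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    downtime = sum(end - start for start, end in outages)
    return 100.0 * (period_seconds - downtime) / period_seconds


# A 30-day month is 2,592,000 seconds. A single outage of 2,592 seconds
# (43.2 minutes) leaves availability at roughly 99.9%.
monthly = availability_percent(30 * 24 * 3600, [(0, 2592)])
```

This puts a target such as "three nines" in concrete terms: about 43 minutes of total downtime in a 30-day month.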
Some people run daily reports about network bandwidth utilization, giving them a close view of their performance, which allows them to easily spot any traffic that is outside of expectations. Others want to see the same information every day about disk space usage. It all depends on the individual, team, or scenario.
7. Get to Know the Value of Scheduling
Scheduling reports is critical. You want to get the data frequently enough that you can track it and act on it, and over a long enough period that you can see trends. Report frequency depends on the criticality of the function and the history of events and issues. Schedule the reports you must see often.
Feel free to explore what the reports have to offer. So much data is at your fingertips. In WhatsUp Gold, you can even just search for a report name. You do not even have to know how to browse for it in the interface. There is a little magnifying glass which you click for more info. You could search CPU and it will show you the CPU reports, et cetera.
Reports can also have various levels of detail. A person whose job involves the network would get more detailed reports, while higher-level executives within the company would get less detailed reports that look at the overall health of the network.
Those reports, especially when you look back at historical data, help guide the future architecture of the network. You can see where your company is now, know how fast it is growing, and plan to ramp up capacity accordingly.
Bandwidth is obviously a key area to track. With bandwidth monitoring, you will spot slower connections or perhaps discover that an older server is slowing things down.
8. Making Network Monitoring Scale
There are various editions and deployment options available for WhatsUp Gold. For most companies, a single WhatsUp Gold server is all they need. For larger businesses or enterprises, we have the option of deploying multiple WhatsUp Gold servers in a distributed architecture. By leveraging what we call ‘scalability pollers,’ we have the capability of scaling up to 100,000 monitors.
Your Next Step
Our eBook, IT Infrastructure Monitoring for Dummies, will help you recognize best practices for monitoring and managing your organization's network, grasp network monitoring fundamentals, define alerts and actions, and much more.
In this special edition, you'll learn best practices and key concepts for network monitoring from WhatsUp Gold's Product Expert Mark Towler and author and editor Doug Barney. IT Infrastructure Monitoring for Dummies explores:
- Understanding your network
- How to monitor beyond the four walls
- How to increase IT visibility
- And much more