Friday, 14 February 2020

How to Diagnose Enterprise Network Problems and Fix Them


The methodology for solving corporate network problems: an introduction to best diagnostic practices

Diagnosing and troubleshooting a problem in a corporate network can be a daunting activity. With the potential for multiple branches, hundreds or even thousands of hosts, and dozens of routers, switches and servers, all with different vendors or firmware, plus good old-fashioned human error, knowing where to start is the key to implementing a quick fix. There is a well-established methodology for diagnosing a significant network problem, and following it will help administrators maintain an organized approach to troubleshooting.

Know where to start

Previous experience with the network in question can help administrators find and resolve the problem. If most of the network problems that a NOC as a Service team sees come from specific errors with a known fix, that fix becomes the quick first choice for resolving a problem. Even without familiarity with the network, it is possible to follow a procedure that will help keep everyone involved on the same page.

The first and most obvious step is to define the problem. If a user is unable to connect to a file server to access their work, that is the problem definition. This first step usually takes care of itself: it is rare to be called in for troubleshooting without a clear problem already occurring!

Then collect information from users or affected systems. In the example above of a user having problems connecting to a file server, some basic questions are worth asking. When was the last time the user was able to access the server? Has anything changed since then? Do other users have the same problem? If the problem is widespread, there is likely to be a problem upstream in the network. If it is isolated to a single host, there is probably no larger network problem to solve. Gathering information is one of the most important and most often overlooked steps in troubleshooting a large network. The data and user reports collected here can be used to guide administrators through the troubleshooting process.
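The "one host or many?" question above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for illustration, and it assumes every affected host sits in a /24 subnet, which will not be true on every network.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(affected_hosts, prefix_len=24):
    """Group affected host IPs by subnet (assumes a fixed prefix length)."""
    groups = defaultdict(list)
    for host in affected_hosts:
        # strict=False derives the containing network from a host address.
        net = ipaddress.ip_network(f"{host}/{prefix_len}", strict=False)
        groups[str(net)].append(host)
    return dict(groups)


def likely_scope(affected_hosts, prefix_len=24):
    """Return a rough first guess at where to start looking."""
    groups = group_by_subnet(affected_hosts, prefix_len)
    if len(affected_hosts) <= 1:
        return "single host: check the host itself first"
    if len(groups) == 1:
        return "one subnet: check the shared switch or gateway"
    return "multiple subnets: check upstream devices"
```

Calling likely_scope(["10.0.0.2", "10.0.1.7"]) points upstream, because the two reports come from different subnets. The result is a starting hint, not a diagnosis.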



Data collection with Ping and TraceRoute

Collecting data is an important part of this stage. The ping and traceroute tools provide much more information than their simple functions imply. A large amount of data can be collected for further analysis using only these two commands.

Using another example, let's assume that some users in one part of an office are unable to connect to the network. The ping command can be used to gather information and isolate the problem. This diagnostic tool works at the network layer, and using it first fits the divide-and-conquer approach to troubleshooting. It simply sends a packet from the host machine to the destination and waits for a reply. Keep in mind that some interfaces may have access controls, or that a hardware or software firewall may prevent pings from reaching a host, so this command can be of limited use, especially on incoming WAN interfaces.

Cisco recommends a specific four-step procedure when using ping to diagnose network-level IP errors:

  • Ping the loopback address. This is 127.0.0.1 and is used specifically for diagnostic purposes. A successful ping confirms that TCP/IP is working on the host.
  • Ping the host's own IP address. This is the internal IP address of the affected host, for example 10.0.0.2. If this ping is successful, the network card is configured and responding.
  • Ping the default gateway. If successful, the problem is likely to be upstream of the host computer.
  • Ping an external IP. If successful, but the host still cannot connect to the Internet or another network, there may be a DNS error, an incorrectly configured access control list, or a problem with a firewall.

Depending on the information gathered about the problem, some of these steps may be skipped. In the example above, if it is already known that hosts within the network can still communicate with each other, it makes sense to skip steps one and two.
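The four steps above can be sketched as a small Python routine. This is an illustration, not an official Cisco tool. The function names and the wording of each diagnosis are assumptions, and the default probe simply shells out to the system ping command and treats its exit code as a rough signal (Windows in particular can report success for some "unreachable" replies).

```python
import platform
import subprocess


def ping_once(address, timeout_s=3):
    """Send one echo request with the system ping; True if it reports success."""
    # Windows uses -n for the packet count, Unix-like systems use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def four_step_ping(host_ip, gateway_ip, external_ip, probe=ping_once, skip=()):
    """Walk the four steps in order; return (failed_step, hint)."""
    steps = [
        ("loopback", "127.0.0.1", "TCP/IP on the host is not working"),
        ("own address", host_ip, "the network card or its configuration is suspect"),
        ("default gateway", gateway_ip, "the problem is between the host and its gateway"),
        ("external IP", external_ip, "the problem is upstream of the gateway"),
    ]
    for name, address, hint in steps:
        if name in skip:
            continue  # already known to work, per the information gathered
        if not probe(address):
            return name, hint
    return None, "all pings succeeded: suspect DNS, an ACL, or a firewall"
```

The probe argument makes it possible to swap in a different reachability check, and skip=("loopback", "own address") mirrors the case where steps one and two are already known to pass.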

Another powerful command is traceroute (on Cisco IOS and Unix-like systems) or tracert (on the Windows command prompt). Traceroute sends packets toward the destination and reports each hop along the way. If a router on the path does not respond, that is reported to the user running the command. This can highlight where a potential problem is occurring and give administrators a good idea of where to start looking.
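As a rough illustration of reading a trace, the Python sketch below finds the last hop that answered and the first hop after it that stayed silent. It assumes Unix-style traceroute output, where a silent hop is printed as asterisks. The sample text and its addresses are made up, and Windows tracert output is laid out differently and would need its own parsing.

```python
import re

# Hypothetical sample in the style of Unix traceroute output.
SAMPLE = """\
traceroute to 203.0.113.9 (203.0.113.9), 30 hops max, 60 byte packets
 1  10.0.0.1 (10.0.0.1)  0.412 ms  0.398 ms  0.377 ms
 2  10.0.1.1 (10.0.1.1)  1.204 ms  1.187 ms  1.165 ms
 3  * * *
 4  * * *
"""

# A hop line starts with the hop number, followed by at least one field.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\S.*)$")


def summarize_trace(output):
    """Return ((hop, address) of the last reply, number of the first silent hop after it)."""
    last_reply = None
    first_silent = None
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header or blank line
        hop = int(match.group(1))
        fields = match.group(2).split()
        if all(field == "*" for field in fields):
            if first_silent is None:
                first_silent = hop
        else:
            last_reply = (hop, fields[0])
            # A later reply means the earlier silence was only a hop
            # that declines to answer, not a break in the path.
            first_silent = None
    return last_reply, first_silent
```

For the sample above, the trace goes quiet after the second router, so the link beyond 10.0.1.1 is a sensible place to start. A silent hop is not proof of a fault, since some routers and firewalls simply do not answer these probes.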

Analyze the data and work on a solution

Once you have defined the problem and collected the information, you need to analyze it. This can be simple or complex, depending on the data present. Analyzing the available data is an important step in solving a network problem, as it indicates which methodology to start with.

Top-down and bottom-up approaches

These methods are exactly what they sound like: troubleshooting from the top of the OSI model down, or from the bottom of the OSI model up. Working with these methods can be effective because, in general, if a layer works, the layers beneath it usually work properly too. This will not always be true, but in most cases it will be. The downside is that if insufficient information has been collected, starting at the wrong end of the model can create an unnecessary amount of extra work. That's why collecting detailed information and analyzing it is so important! If the problem is at the application layer and troubleshooting starts from the physical layer, it will take a lot of time and effort to confirm that the other six layers work before reaching the real problem. Depending on network access, it can sometimes be difficult, if not impossible, to verify the higher layers of the OSI model, which must be taken into consideration before selecting this approach.

Divide and conquer

Often the most effective methodology when information is limited, this approach starts in the middle of the OSI model, usually at the network layer, and works outward. This is where the ping and traceroute commands come in. The positive or negative outcome of the ping test guides the troubleshooting up or down the model. If the ping works correctly, there is probably a problem at a higher layer. Likewise, if the ping fails, there is a problem at layer 3 or lower. This helps administrators find a path to the problem quickly and get to work on a solution.
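To make the three approaches concrete, here is a minimal Python sketch that returns the order in which the OSI layers would be checked. The function name and the approach labels are assumptions made for illustration. After a failed ping in the divide-and-conquer case, the sketch starts at the network layer and works downward, which is one reasonable reading of "layer 3 or lower".

```python
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]


def layer_order(approach, ping_ok=None):
    """Return the layers to check, in order, for the chosen approach."""
    if approach == "bottom-up":
        return list(OSI_LAYERS)
    if approach == "top-down":
        return list(reversed(OSI_LAYERS))
    if approach == "divide-and-conquer":
        if ping_ok is None:
            raise ValueError("divide-and-conquer needs a ping result")
        network = OSI_LAYERS.index("network")
        if ping_ok:
            # Layers 1 to 3 are presumed to work, so move up from transport.
            return OSI_LAYERS[network + 1:]
        # Ping failed: the fault is at layer 3 or lower, so work downward.
        return list(reversed(OSI_LAYERS[:network + 1]))
    raise ValueError(f"unknown approach: {approach}")
```

With a successful ping, divide and conquer leaves four layers to check instead of seven, which is where the time saving comes from.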

Improvisation and other methods

A handful of methods fall into this category, and they should generally only be used when the information gathered indicates a very specific problem. Another reason to opt for one of these methods first is that the same network problem appears regularly and a fix is already known. If there is a high probability that the problem will be quickly identified and resolved this way, it will save time and resources compared to other methods. Knowing a particular network will help administrators decide if this is the correct way to go about resolving a problem.


Be flexible

Every network is different, every problem is different, and administrators need to be able to adapt to an evolving network environment in order to diagnose and resolve network problems quickly and effectively. While a consistently followed and well-documented troubleshooting plan will help keep everyone on the same page to quickly resolve potential problems, flexibility is needed to speed up response and correction times. Understanding when not to follow procedures is essential to maintaining a large network.

Solve recurring problems

All networks will experience a significant number of errors and problems. However, if the same problem constantly rears its ugly head, finding a permanent solution is important. If a router fails regularly, for example, it may be time to replace it. Redundancy can help mitigate, but not solve, recurring network problems. Likewise, stop-gap or quick-fix solutions should be replaced by long-term solutions as early as possible to avoid future headaches. Getting ahead of a problem is often the best way to solve it.
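Spotting a repeat offender can be as simple as counting incidents. The Python sketch below assumes a hypothetical incident log kept as (device, symptom) pairs. The device names and the threshold of three are made up for illustration.

```python
from collections import Counter


def recurring_problems(incidents, threshold=3):
    """Return ((device, symptom), count) pairs seen at least `threshold` times."""
    counts = Counter(incidents)
    # most_common() lists the most frequent pairs first.
    return [(key, n) for key, n in counts.most_common() if n >= threshold]


# Hypothetical incident log: (device, symptom) pairs.
incident_log = [
    ("router-3", "reboot"),
    ("switch-1", "port flap"),
    ("router-3", "reboot"),
    ("router-3", "reboot"),
]
```

Here recurring_problems(incident_log) flags router-3, the kind of device that may be due for replacement rather than another quick fix.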

The human factor and harmful intrusions

People make mistakes. They forget to plug things in, turn them on, or set them up correctly, or they just don't know how to make something work. The best way to combat human error is through knowledge and practice. An experienced user will cause far fewer network nightmares than an untrained one. Always consider the human factor when analyzing data and finding a solution to a problem.

Likewise, humans sometimes have unscrupulous goals when they access a network. Always follow best security practices and keep in mind that sometimes network errors can come from malicious sources designed to stop the service. These types of attacks come in many forms and the best way to prevent them is through education and proactive defense.

Network monitoring software

There is a multitude of networking software that helps monitor, diagnose and troubleshoot large networks. From open-source tools available for free on the Internet to full-service business-oriented options, there is a software solution to help almost any administrator manage their network. Using these tools can speed up the resolution of network problems by taking work that consumes a lot of time and human resources and putting it in the hands of the software.
