- Author: Dan Young, CCIE, PMP and VP of Operations at StormWind
Having a team, or even one individual, who can troubleshoot spectacularly is game changing. Persistent issues can be solved, acute outages can be mitigated, and life is better when you have confidence that anything that goes wrong can be fixed. With so many diverse systems running in a typical infrastructure, there is a greater need to be able to troubleshoot them.
Troubleshooting is a skill that naturally improves with training. It is learned through osmosis when we teach technologies. It is learned deliberately when we teach how to debug, inspect settings and review log files. It is also taught indirectly through methodologies created in a vacuum by organizations like CompTIA, Microsoft and Cisco. These methodologies, which can be overly complex and impractical, are often married to certifications like Network+ and A+. You rarely see people follow a 7- or 12-step troubleshooting process perfectly, step by step. Nevertheless, training can offer a few key elements that are common to any good troubleshooting methodology and that help an individual answer the following questions:
- What systems are affected?
- Did something change?
- What are the symptoms?
- What can explain those symptoms?
- What diagnostic steps can validate your suspicions?
- How can the problem be safely remediated?
- Are there consequences to the remediation?
- How can you test remediation steps?
- Once a fix has been applied, what must be tested to ensure the problem is resolved?
- How do you document the problem and fix?
Several facets make preparation for troubleshooting successful, including:
- An understanding of how a system works: This may be the most crucial aspect of successfully troubleshooting a system. Truly knowing how something works gives the troubleshooter’s intuition what it needs to kick in. The alternative is guesswork, which can make matters worse.
- Familiarity with diagnostic tools: Where do you go to look at logs? How can you parse those logs? What tests can you run to see where a fault lies?
- A good foundation in troubleshooting methodologies: Do you have folks who assume that a fix worked without verifying it? Do secondary issues come up when folks are feverishly working to fix something? Do they get flustered under the pressure of an outage?
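As a small illustration of the diagnostic tools item above, here is a minimal Python sketch of parsing a log to see where errors cluster. The log format, the `summarize_errors` function name and the sample entries are all assumptions made up for this example; real systems log in many different formats, so the pattern would need adjusting.

```python
import re
from collections import Counter

# Hypothetical log format, assumed for illustration only:
# "<timestamp> <LEVEL> <component>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<component>[\w.-]+):\s+(?P<message>.*)$"
)


def summarize_errors(lines):
    """Count ERROR and CRITICAL entries per component.

    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match and match.group("level") in {"ERROR", "CRITICAL"}:
            counts[match.group("component")] += 1
    return counts


# Made-up sample data, not taken from any real system.
SAMPLE_LOG = [
    "2024-01-15T10:00:01Z INFO dhcpd: lease renewed for 10.0.0.15",
    "2024-01-15T10:00:07Z ERROR dns-resolver: upstream timeout",
    "2024-01-15T10:00:09Z ERROR dns-resolver: upstream timeout",
    "2024-01-15T10:00:12Z CRITICAL core-switch: interface Gi0/1 down",
    "free-form line that does not follow the format",
]

if __name__ == "__main__":
    # Print components with the most errors first.
    for component, count in summarize_errors(SAMPLE_LOG).most_common():
        print(f"{component}: {count}")
```

A summary like this only points to where to look first; it does not replace understanding how the affected system works.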
If someone on your team wants to get better at troubleshooting, they may want to look to training. The right training depends on which aspect of troubleshooting needs to be addressed. Some common topics include:
- Systems Understanding: Take vendor-specific training. Be mindful of prerequisites and avoid jumping in too deep. This is taught in courses such as Cisco CCNA/CCNP and Microsoft Server, backoffice, Azure and Windows classes.
- Diagnostic Tools: Diagnostic tools sometimes get glossed over in entry-level, vendor-specific classes. Look to intermediate/professional-level classes for deeper coverage of these tools.
- Troubleshooting Methodologies: There aren’t classes that spend the bulk of their time teaching how to troubleshoot. We do have a few classes that introduce troubleshooting methods, such as:
- CompTIA Network+: This class teaches a viable albeit lengthy troubleshooting method. It is valuable for anyone who doesn’t have a working knowledge of the “do’s and do-nots” of sound troubleshooting approaches.
- Windows 10 Troubleshooting: Organizations spend so much time troubleshooting Windows 10 issues that Microsoft created a whole course devoted to the subject. We offer this class, and it is a good option for folks who need to brush up on troubleshooting.
- Cisco ENARSI: This course is the logical replacement for the retired Cisco CCNP TSHOOT course. It is for CCNP-level individuals who need a deeper understanding of troubleshooting IP routing. Not for the faint of heart (or those who haven’t passed their CCNA and ENCOR exams).
Ultimately, training helps get people out of the most dangerous mode of troubleshooting: YouTubing, Googling and guesswork. A little Googling may be appropriate, but it should not be the main way things get resolved. For quicker and more efficient troubleshooting, good training is the best bet.