#software-engineering
How to hunt down a bug or an issue
- Find out what the error is. It can manifest in different ways. Sometimes it’s an error traceback; sometimes a network call not propagating through, and sometimes a pod crashing repeatedly.
- Look one level deeper. However the error manifests, look into it, like really look into it.
- If it is a traceback, figure out which library is causing it
- If it’s a network call, understand the path it can potentially take before reaching the respective server
- If it’s a pod not starting, look at its logs and events
- Sometimes the errors are generic. That’s when you take a step back and get a 360-degree view of all the systems involved
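For the traceback case, a minimal Python sketch of how to see which package each frame belongs to. `frames_by_package` is a name made up for this note, and the `json.loads` call is only a stand-in for whatever call is failing.

```python
import json


def frames_by_package(exc):
    """List (top-level package, file, line, function) per frame, outermost first."""
    rows = []
    tb = exc.__traceback__
    while tb is not None:
        frame = tb.tb_frame
        # __name__ in the frame's globals is the dotted module path, e.g. "json.decoder"
        module = frame.f_globals.get("__name__", "?")
        rows.append(
            (module.split(".")[0], frame.f_code.co_filename, tb.tb_lineno, frame.f_code.co_name)
        )
        tb = tb.tb_next
    return rows


# Stand-in failure: malformed JSON raises from inside the json package.
try:
    json.loads("{not valid json")
except json.JSONDecodeError as exc:
    rows = frames_by_package(exc)

for package, filename, lineno, func in rows:
    print(f"{package:12} {func:20} {filename}:{lineno}")

# The last row is the innermost frame, i.e. where the exception was raised.
print("raised inside:", rows[-1][0])
```

The innermost package is where the error surfaced, not necessarily where the bug lives. The frames above it show what your own code passed in.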
- Questions to ask yourself
- Have I seen this before? Then life is easy peasy
- Have I seen something like this before?
- Searching the internet
- A simple Google search
- GitHub issues
- Community forums
- Experimentation
- Based on the above notes, come up with a series of hypotheses for the potential causes of the issue
- Devise a way to test them. Sometimes it’s entering the prod system and running some scripts, sometimes it’s changing some configuration
- A different error is progress, but know when you’re complicating the issue
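One way to keep the experimentation honest is to write each hypothesis down next to the check that would support or rule it out. A rough Python sketch, assuming a network-style issue; `Hypothesis`, `resolution_fails`, `port_refuses`, and `run` are names invented for this note, and the local listener only exists so the example runs offline.

```python
import socket
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hypothesis:
    claim: str
    check: Callable[[], bool]  # True when the evidence supports the claim


def resolution_fails(host):
    """True if the host name cannot be resolved."""
    try:
        socket.getaddrinfo(host, None)
        return False
    except socket.gaierror:
        return True


def port_refuses(host, port, timeout=2.0):
    """True if a TCP connection to host:port cannot be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return False
    except OSError:
        return True


def run(hypotheses):
    """Run every check and record a verdict per claim."""
    results = []
    for h in hypotheses:
        try:
            verdict = "supported" if h.check() else "ruled out"
        except Exception as exc:  # a broken check is not evidence either way
            verdict = f"inconclusive ({type(exc).__name__})"
        results.append((h.claim, verdict))
    return results


# Stand-in "server": a listener on an ephemeral local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
port = listener.getsockname()[1]

hypotheses = [
    Hypothesis("name resolution fails", lambda: resolution_fails("127.0.0.1")),
    Hypothesis("nothing is listening on the port", lambda: port_refuses("127.0.0.1", port)),
]

before = run(hypotheses)
listener.close()  # change exactly one thing, then re-run the same checks
after = run(hypotheses)

for label, results in (("before", before), ("after", after)):
    for claim, verdict in results:
        print(f"{label}: {claim}: {verdict}")
```

Changing one thing between runs is what makes a flipped verdict meaningful. If several things change at once, a different error no longer tells you much.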
- Communicate
- People like answers, not process
- But it is better to communicate that what you’re doing is an experiment
- Feelings
- Some issues have the power to take you over
- Like a process running 24/7
- Take a break
- Talk to someone
- Rubber duck your process
- Writing/Bug report
- Revisit your notes
- Answer the damn question
- Add details later
- Why should every software engineer do production support?
- Understand the needs of the customer
- Understand the system beyond the context of their development
- Trains your brain to work through issues more quickly
- Debugging a GitHub Action
- Script out the individual steps as bash scripts
- Spend time setting up the open-source tool Act
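A small sketch of the "script out the steps" idea: each workflow step becomes a bash snippet, run locally in order with the same environment, stopping at the first failure. The step names, commands, and the `WORKDIR` variable are placeholders made up for this note; in practice the `run:` blocks from the workflow file would go in `STEPS`. It assumes `bash` is on the PATH.

```python
import os
import subprocess

# Hypothetical steps standing in for the workflow's `run:` blocks.
STEPS = [
    ("show workspace", 'echo "workspace is $WORKDIR"'),
    ("build", 'echo "building"'),
    ("test", 'echo "tests failed" >&2; exit 3'),
    ("deploy", 'echo "never reached"'),
]


def run_steps(steps, env_overrides):
    """Run each step with bash; stop at the first non-zero exit code."""
    env = {**os.environ, **env_overrides}
    outcomes = []
    for name, script in steps:
        proc = subprocess.run(
            ["bash", "-eo", "pipefail", "-c", script],
            env=env,
            capture_output=True,
            text=True,
        )
        outcomes.append((name, proc.returncode, proc.stdout.strip()))
        if proc.returncode != 0:
            # Surface stderr of the failing step, then skip the remaining steps.
            print(f"step '{name}' failed with exit code {proc.returncode}: {proc.stderr.strip()}")
            break
    return outcomes


outcomes = run_steps(STEPS, {"WORKDIR": "/tmp/demo"})
for name, code, out in outcomes:
    print(f"{name}: exit {code} {out}")
```

Once a step fails locally, it can be iterated on by itself without waiting for a full workflow run.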