Description
The underlying premise of this ability is that there are people who are trying to use software and have somehow, some way, become stuck. These folks need help getting unstuck.
The scenario could be…
… a cross-functional product squad that recently built software and is trying to understand how a defect made it into production.
… an incident response team doing some sleuth work to understand the impact of a performance issue.
… a tier-1 support rep trying to help their customer reset their password.
… a non-coding member of a squad trying to figure out why a customer’s video upload isn’t working.
The ultimate goal of the skill of software investigation is getting…
… the stuck, the undiscovered, the frustrating…
… unstuck, revealed, and encouraging.
The goal is NOT to always solve the problem… but instead, the goal is to ensure we know more after you’ve wielded your powers of technical sleuthing than before, and that there is a path forward where before there was none.
There are 3 constructs that make up great software investigation:
- Ownership
  - All parties involved (requester, anyone helping, as well as whoever it may be escalated to) are able to (strongly) agree to “I am confident this case is understood and am confident in how it is being handled”.
- Severity Context
  - Severity is a function of Urgency and Importance
  - Risk = Impact * Likelihood
  - Scope, Likelihood, and Impact (see more information on these here) of the case are known well enough that even the submitter understands why the full case resolution is being prioritized the way it is.
- Problem Evaluation Depth
  - Exposed
    - UI
    - Network/console
    - API
  - Guarded
    - Database
    - Code
    - Infrastructure
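The relationship Risk = Impact * Likelihood from the Severity Context construct can be sketched in a few lines of Python. This is only an illustration: the 1-to-5 scales, the `CaseAssessment` class, and the `rank_by_risk` helper are assumptions made for this example and are not part of any Lessonly system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CaseAssessment:
    """Hypothetical scoring of one case; both scales are assumed to run 1-5."""
    impact: int      # 1 = negligible, 5 = severe (assumed scale)
    likelihood: int  # 1 = rare, 5 = near certain (assumed scale)

    def __post_init__(self) -> None:
        for name in ("impact", "likelihood"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    def risk(self) -> int:
        # Risk = Impact * Likelihood, as stated in the construct above.
        return self.impact * self.likelihood


def rank_by_risk(cases: Dict[str, CaseAssessment]) -> List[str]:
    """Return case names ordered from highest to lowest risk."""
    return sorted(cases, key=lambda name: cases[name].risk(), reverse=True)


cases = {
    "password reset fails": CaseAssessment(impact=2, likelihood=5),
    "video upload fails": CaseAssessment(impact=3, likelihood=2),
    "data lost on save": CaseAssessment(impact=5, likelihood=3),
}
ranking = rank_by_risk(cases)
```

Writing the two inputs down separately, rather than a single gut-feel score, makes it easier for the submitter to see why a case is prioritized the way it is.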
Notes:
- Note about cases: You will see this ability refer to a “case”. A case is anything that is a problem that needs to be solved. That can come in the form of a question in Slack, an incident, a Zendesk ticket, or any other medium where clarity around a Lessonly-built/managed software problem is needed.
- Note about “case classification”: In most organizations, Lessonly included, there is either a universal or team-level way to classify the cases that come in. Some use severity, some a combination of impact and likelihood, others take an algorithmic approach. In all scenarios, the classification is based on your knowledge and skill in investigation. Therefore, in the milestones below we will discuss how trusted your classifications are as you make your way through the journey of software investigation.
- Note about escalations: Escalations aren’t inherently bad. Escalations that leave the team feeling more knowledgeable and appreciative are great escalations. We say this to say… the greatest software investigators cannot and will not solve all problems… they will always help and know when to pull others in for help.
- Note about other relevant abilities: Communication, Customer Service, and all technical abilities will undoubtedly help you on your software investigation journey. However, we believe software investigation is deserving of its own focus.
- Note on ownership: Ownership mutates based on your role, i.e.:
- Engineers may own a case through to resolution, and therefore the key stakeholders for them are the product manager and the tier-1 rep
- Tier-1 may own a case only until escalated to tier-2, and therefore the key stakeholders for them are the tier-2 reps and the customers
- Tier-2 may own a case only until escalated to tier-3/4, and therefore the key stakeholders for them are the tier-1 rep, tier-3/4 rep, and maybe the product manager
Milestone 1
I have observed this person showing a consistent, comfortable, continuous, and clear positive impact to a squad when wielding this ability, and therefore I would put them in situations where they can employ this ability with only a small amount of guidance.
Overall:
- At this milestone, you can tackle the more obvious cases that we run into while operating a software-as-a-service business.
Ownership:
- At this milestone, it is critical that you’ve shown that you know your limits (when to escalate / ask for help). You know how, when, and where to write up bug reports (see bug writing ability for more details).
Severity Context:
- Your input into how a case should be prioritized is valued, but it needs verification from someone at Milestone 2 or higher before it is deemed the official priority.
Problem Evaluation Depth:
- You are known for being able to replicate issues (truly replicating in a test or labs environment… imitating the user with the issue is great, but does NOT satisfy the requirement of replication).
- You are known for being able to isolate (to a user, browser, client, piece of data, or system).
- You are aware of developer tools within browsers but may not be acquainted with the console and network features yet.
- If you have “guarded” access (logs, HoneyBadger reports, etc.), you are able to use it to help identify a potential root cause.
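Isolating an issue to a user, browser, client, piece of data, or system can be approached as a simple comparison of failing and working attempts. The sketch below assumes each attempt has been written down as a small dictionary of dimensions; the `isolate` function and the field names are hypothetical and exist only for this illustration.

```python
from typing import Dict, List, Tuple

Report = Dict[str, str]  # e.g. {"user": "a", "browser": "Safari"}


def isolate(failing: List[Report], passing: List[Report]) -> List[Tuple[str, str]]:
    """Return (dimension, value) pairs present in every failing report
    and absent from every passing report."""
    if not failing:
        return []
    # Start with everything true of the first failure, then keep only
    # what is also true of every other failure.
    common = set(failing[0].items())
    for report in failing[1:]:
        common &= set(report.items())
    # Anything also seen in a working attempt cannot be the culprit on its own.
    seen_passing = set()
    for report in passing:
        seen_passing |= set(report.items())
    return sorted(common - seen_passing)


failing = [
    {"user": "a", "browser": "Safari", "client": "acme"},
    {"user": "b", "browser": "Safari", "client": "globex"},
]
passing = [
    {"user": "a", "browser": "Chrome", "client": "acme"},
]
suspects = isolate(failing, passing)
```

Here the only factor shared by both failures and missing from the working attempt is the browser, which suggests where to try replicating next. An empty result means more attempts are needed, or that the cause is a combination of factors.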
Milestone 2
I have observed this person showing a consistent, comfortable, continuous, and clear positive impact to a squad when wielding this ability, and therefore I would put them in situations where they can employ this ability with no assistance, as well as being a trusted active or passive mentor to others.
Overall:
- At this milestone, you are trusted to handle some escalated cases and/or issues with the software misbehaving during the continuous integration process (right before or right after deployment). You are known to be able to identify the root cause of most priority 5, 4, and 3 cases.
Ownership:
- You are able to own a case, regardless of origin, from start to finish (finish, defined as either resolution, de-prioritization, or transfer of ownership/escalation).
Severity Context:
- Your input into how a case should be classified is trusted as the official classification within the system of record (Zendesk or Clubhouse), but that classification may ultimately be updated by someone with more context.
Problem Evaluation Depth:
- You have found issues within all of the exposed surfaces (UI, browser dev tools, and API, using tools such as Postman). You are known for either asking questions about or seeking answers to concerns that can only be found by looking into the logs of the guarded surfaces (code, database, infrastructure).
- If you do have access to the guarded surfaces, you are sometimes called on to see if you can reproduce an issue using guarded surface-exclusive data manipulation techniques (such as modifying data in SQL or a server console).
Milestone 3
I have observed this person showing a consistent, comfortable, continuous, and clear positive impact to multiple squads when wielding this ability, and therefore I would put them in situations where they can employ this ability as well as being considered an expert within this discipline.
Overall:
- At this milestone, you are trusted to handle all but the most severe and complex of cases and/or issues with the software misbehaving during the continuous integration process (right before or right after deployment). You are seen as the expert on your squad, or are called upon when other squads need help identifying the root cause or establishing priority of a case.
Ownership:
- You are able to own a case, regardless of origin, from start to finish (finish, defined as either resolution, de-prioritization, or transfer of ownership/escalation).
Severity Context:
- Your input into how a case should be prioritized is trusted as the official priority within the system of record (Zendesk or Clubhouse), but that classification may ultimately be updated by someone with more context.
Problem Evaluation Depth:
- Even when a case is with a vendor, or is an infrastructure-only or code-only issue, you leave no rock unturned. You are known for not leaving any question unasked. This is why everyone from experienced engineers to the least technically adept go to you for help.
- At this milestone it is still not required for you to have direct code, database, or system access to be able to lead a team to the resolution of a problem. The coders, database administrators, and system admins value your ability to identify issues even without access.
Milestone 4
I have observed this person showing a consistent, comfortable, continuous, and clear positive impact to a squad when wielding this ability, and therefore I would put them in situations where they can not only employ this ability but where they set the tone for this at the company level.
Overall:
- At this milestone, you are likely the last line of defense for all issues. You are often on the incident response team for your ability to sniff out where an issue likely originates and/or for your ability to think of deeply technical ways to identify the root cause of an issue. This ability is not about always fixing the issue (if it needs code, data, or system changes, the requirements of those are covered in other abilities), but if you are here you are the example of how we troubleshoot at Lessonly. You are likely to be in the group helping assess others’ ability levels within technical investigation.
Ownership:
- You are able to own a case, regardless of origin, from start to finish.
Severity Context:
- Your input into how a case should be prioritized is trusted as the official priority within the system of record (Zendesk or Clubhouse).
Problem Evaluation Depth:
- Nearly no case is too tough for you. You are the last line of defense. However, you also know your limits and won’t spin your wheels but will have suggestions on what we can do to get the problem solved (including but not limited to calling in consultants).
Milestone 5
I have observed this person showing a consistent, comfortable, continuous, and clear positive impact to not just internal teams but the community/industry in general when wielding this ability, and they are recognized by the community/industry as an expert.
Overall:
- At this milestone, you are known/published in the industry on new and exciting ways to recognize, evaluate, and ultimately solve complex problems within complex systems.
Ownership:
- You are able to own a case, regardless of origin, from start to finish.
Severity Context:
- Your input into how a case should be prioritized is trusted as the official priority within the system of record (Zendesk or Clubhouse).
Problem Evaluation Depth:
- Nearly no case is too tough for you. You are the last line of defense. However, you also know your limits and won’t spin your wheels, but will have suggestions on what we can do to get the problem solved (including but not limited to calling in consultants).
Configuration Health
- ✅ Associated with 9 roles
- ✅ Has been referenced in 1 piece of public recognition
- ℹ️ No one has achieved a milestone on this ability
- ⛔️ Last updated: about 5 years ago
- ℹ️ Never conversed about
Role & Position Requirements
- Back End Engineers must be milestone 1+
- Front End Engineers must be milestone 1+
- Implementation Engineers must be milestone 1+
- Incident Remediation Leads must be milestone 2+
- On Call Application Engineers must be milestone 2+
- Tier 2 Escalation Engineers must be milestone 1+
- Tier 3 Escalation Engineers must be milestone 2+
- Tier 4 Escalation Engineers must be milestone 3+
- Triage Application Engineers must be milestone 1+
Examples / Observations
An observation relating to Software Investigation has not been publicly recognized yet.