The System is Down. Whose Fault is It, Anyway?
Blaming and punishing people when things go wrong impedes learning and poisons the organizational culture, because people learn to hide problems until they become catastrophic. Instead of playing "the blame game" when something goes wrong, practice "Zero Blame."
Building a Zero Blame company culture requires a commitment from everyone in the organization, from executives down to individual contributors. In a Zero Blame culture, individuals and teams are rewarded for taking ownership of issues and for exposing problems early, when they can be resolved with less pain.
Us vs. Them
Teams in non-DevOps organizations are frequently siloed by function. This results in an adversarial dynamic where teams compete for resources and do not share responsibilities. In fact, one team trying to fulfill its own tasks may make it more difficult for the teams around it to succeed. A perfect example is when the operations team is responsible for responding to incidents 24×7, even though many of those incidents result from problematic code written by the development team. It would be natural for the operations team to resent the unfairness of that situation.
This "Us vs. Them" mentality erodes trust between teams, between management and staff, or between different office locations. Inequality between groups can exacerbate it, whether one group has more power, better perks, or an exemption from responding to on-call incidents in the middle of the night.
One of the first steps to escaping the blame game is to create shared responsibilities. Make it everyone’s responsibility to respond to operational incidents. This defuses the Us vs. Them dynamic between operations and development.
Sharing incident response also gives developers faster feedback and an incentive to write more resilient, reliable code that reduces future incidents.
In today’s increasingly complex systems, a failure is rarely the fault of one person or team. Compared to the slower-paced waterfall development lifecycle, DevOps moves incredibly fast, which means numerous changes can ship before a problem surfaces. It is important to realize that a specific problem may have several contributing causes and numerous changes behind it.
Searching for a person or team to blame for an incident is unproductive and harms employee morale. A culture that focuses on blame can also impede overall productivity; studies have shown that culture influences productivity. Instead, examine why your processes were unable to prevent the failure.
Focus your efforts on auditing the system, rewarding employees who find problems early, and avoiding complacency after improving one process or area of your value stream.
Want to learn more about DevOps? Check out our post People, Processes, and Tools: The Foundation of DevOps.