Incident Response with Emil Stolarsky
As a system becomes more complex, the chance of failure increases. At a large enough scale, failures are inevitable. Incident response is the practice of preparing for and effectively recovering from these failures. An engineering team can use checklists and runbooks to minimize failures, and it can put a plan in place for responding to the failures that still occur. It can also use postmortems to reflect on a failure and ...