“Site reliability through controlled disruption”
Chaos Engineering (CE) was pioneered by companies like Netflix and Amazon to proactively test how systems respond in the presence of failure, so that problems can be identified and fixed before they become outages. With this approach, complex distributed systems can be made more reliable and resilient.
During this one-day course, you will be introduced to Chaos Engineering and given the tools and techniques to get started with it in your own organisation.
This course is written and delivered by Mikolaj Pawlikowski, author of the book Chaos Engineering: Site reliability through controlled disruption (Manning). Mikolaj leads a team of SREs managing Kubernetes at Bloomberg. He first took up CE as a surprisingly effective sleeping aid: the more failures his team simulated during working hours, the fewer outages happened while they were asleep.