Presented by

  • Marc MERLIN

    http://marc.merlins.org/

    Marc has been using Linux since 0.99pl15f (Slackware 1.1.2, 1994). He has worked for various tech companies in Silicon Valley, including Network Appliance, SGI, VA Linux, Sourceforge.net, and Google since 2002, as both a server sysadmin and a software engineer. He has hacked on mail software such as exim, mailman, SpamAssassin and SA-Exim, maintained various Linux distributions at Google and elsewhere, and given talks about some of those projects, and others, at Linux conferences since 2001 (LCA, OLS, LinuxCon, Usenix/LISA). For fun, when he's not trying to beat his mythtv into submission, or hacking misterhouse so that a motion sensor and video camera can trigger a blender to scare the cat off the kitchen counter, he goes snowboarding and mountain biking, races cars, and flies RC or full-size planes.

Abstract

This talk will look into failures I've encountered in multiple fields, and into what I've learned from reading about other people's failures, a common practice in aviation that has saved countless lives by keeping people from re-creating failures and accidents out of ignorance. Hopefully you will also sharpen your spidey sense for things that could go wrong, so that you ask the right questions or implement the right procedures or fixes before downtime makes them necessary. As they say in aviation, "experience is a cruel teacher: she gives you the test first, and if you survive, then you get to learn the lesson".

Examples:

- how to avoid spectacular LiPo fires or circuit burns
- when aviation goes wrong, from AF447 and QF32 to Boeing's 737 MAX, and more
- failures and avoiding failures in production at Google, including how automation can go wrong
- why mkdir -p 0755 /path/to/dir can take you down hard
- you know binary drivers suck, but do you need more examples? If so, come on by
- why this temporary fix will bite you hard soon after
- a problem is not actually fixed until it's root caused
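
The abstract does not say what actually went wrong with mkdir -p 0755 /path/to/dir, so the following is only a sketch of one plain fact about that command line, not a reconstruction of the incident. Without the -m option, 0755 is not a mode: it is just another directory name to create, and the real target gets whatever mode the umask allows. The scratch directory and the relative paths below are hypothetical, chosen so the sketch is safe to run.

```shell
#!/bin/sh
# Sketch only: scratch directory and relative paths are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Mistyped form: with no -m, "0755" is treated as a directory operand.
mkdir -p 0755 path/to/dir
stray_created=no
[ -d 0755 ] && stray_created=yes
echo "stray directory named 0755 created: $stray_created"

# Intended form: -m sets the mode of the final directory, regardless of umask.
rmdir 0755
umask 077
mkdir -p -m 0755 other/dir
intended_created=no
[ -d other/dir ] && intended_created=yes
# Show the resulting permissions of the parent and the final directory.
ls -ld other other/dir

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Note that -m applies only to the last directory in the path; parents created by -p still follow the umask, which is why the sketch lists both other and other/dir.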