Sound the alarm: Recovering devices at scale

Through many years with Nerves at SmartRent, we've seen a lot of failure modes for devices, from the plain to the novel. At the scale we operate, we need most devices to recover from failure on their own. A while back we made a bet on Erlang alarms as a strategy for improving robustness and recovering from failure. The alarmist library is open source and provides a solid alarm handler, along with a lot of lessons learned. We will look at how it builds on the failure recovery of Nerves and the fault tolerance of Erlang, and what it brings us beyond those fundamentals, all based on real experiences at scale.

Frank Hunleth

Frank Hunleth is an embedded systems programmer, OSS maintainer, and Nerves core team member. At his day job, he uses Elixir and Nerves at SmartRent, a company that provides smart home automation for rental properties. When not in front of a computer, he loves running and spending time with his family.