An Artificially Intelligent Home

Explorations in Home Automation

Using YAMNet to Monitor Sounds

November 1, 2024

In the Chicago area, the tornado alert sirens are tested at 10:00am on the first Tuesday of every month. They also go off if a tornado has been spotted in the area, which is our cue to head to the basement just in case. I have cameras outside, and it turns out they have pretty good microphones. I decided to see if I could create something analogous to (but much less sophisticated than) Frigate, an awesome and very mature system that detects certain objects visually and notifies Home Assistant via MQTT, where you can set up automations to act on detections. I’d really like to have lights flash and sounds go off in my house if, say, the tornado siren goes off at 2am. It’s not something I’d like to sleep through. That, in addition to wanting to learn how to create an add-on, is what motivated everything below.

Yamcam - a Home Assistant Add-on

With this motivation, and also because I wanted to learn how to do it, I developed Yamcam, a Home Assistant add-on that uses TensorFlow Lite and the YAMNet sound model to characterize sounds detected by microphones on networked cameras. It does not record or keep any sound samples after analyzing them. It continually takes 0.975s samples from RTSP feeds using FFmpeg and passes them to the YAMNet sound classifier model, which returns a score for each of its 521 sound classes.
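To make the sampling step concrete, here is a minimal sketch of how one 0.975s window might be prepared for the model. It is not Yamcam's actual code: the function names, the FFmpeg options chosen, and the example URL are all assumptions for illustration. It relies on YAMNet expecting 16 kHz mono audio scaled to the range -1.0 to 1.0, which makes a 0.975s window 15,600 samples long.

```python
import struct

SAMPLE_RATE = 16_000          # YAMNet expects 16 kHz mono audio
WINDOW_SECONDS = 0.975        # one classification window
WINDOW_SAMPLES = round(SAMPLE_RATE * WINDOW_SECONDS)  # 15,600 samples


def build_ffmpeg_command(rtsp_url):
    """Hypothetical helper: FFmpeg arguments that write one window of
    raw 16-bit little-endian mono PCM to stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-rtsp_transport", "tcp",
        "-i", rtsp_url,
        "-t", str(WINDOW_SECONDS),
        "-ac", "1",
        "-ar", str(SAMPLE_RATE),
        "-f", "s16le",
        "-",
    ]


def pcm_to_waveform(raw):
    """Convert raw s16le bytes to floats in [-1.0, 1.0], trimmed or
    zero-padded to exactly WINDOW_SAMPLES values."""
    count = len(raw) // 2
    ints = struct.unpack(f"<{count}h", raw[:count * 2])
    samples = [value / 32768.0 for value in ints][:WINDOW_SAMPLES]
    samples.extend([0.0] * (WINDOW_SAMPLES - len(samples)))
    return samples
```

In a real loop, the command would be run with something like `subprocess.run(..., capture_output=True)`, and the resulting list converted to a float32 array before being handed to the TensorFlow Lite interpreter.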

I didn’t want to make sense of 521 different sound classes, so I grouped them into about a dozen groups, such as birds, vehicles, animals, people, music, alerts, etc. This way I can set an automation up to act on, say, people sounds without having to check all 70-80 YAMNet sound classes that are various people sounds.
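As a rough illustration of the grouping idea, the sketch below collapses per-class scores into per-group scores. The group names follow the ones mentioned above, but the class-to-group table is a tiny made-up sample, and taking the highest class score as the group score is my assumption here, not necessarily how Yamcam combines them.

```python
# Illustrative subset only; the real mapping covers all 521 classes.
CLASS_TO_GROUP = {
    "Speech": "people",
    "Laughter": "people",
    "Bird": "birds",
    "Dog": "animals",
    "Car": "vehicles",
    "Siren": "alerts",
    "Civil defense siren": "alerts",
}


def group_scores(class_scores, class_to_group=CLASS_TO_GROUP):
    """Collapse {class_name: score} into {group: score}, keeping the
    highest class score seen for each group. Unmapped classes are ignored."""
    grouped = {}
    for name, score in class_scores.items():
        group = class_to_group.get(name)
        if group is None:
            continue
        grouped[group] = max(grouped.get(group, 0.0), score)
    return grouped
```

With this shape, an automation only needs to look at a single `people` score rather than dozens of individual class scores.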

Like Frigate, Yamcam uses MQTT to communicate sound events to Home Assistant, where the parameters for determining the start and stop of a sound event are configurable. If you want to try it, all of the code and instructions are available in the CeC-HA-Addons repo.
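One way to think about configurable start and stop parameters is as a small state machine fed one group score per window. The sketch below is an assumption about how such logic could look, not Yamcam's implementation; the class name `EventTracker` and the parameters `threshold`, `start_windows`, and `stop_windows` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EventTracker:
    """Tracks one sound group. An event starts after `start_windows`
    consecutive windows at or above `threshold`, and stops after
    `stop_windows` consecutive windows below it."""
    threshold: float = 0.5
    start_windows: int = 2
    stop_windows: int = 3
    active: bool = False
    _hits: int = 0
    _misses: int = 0

    def update(self, score):
        """Feed one window's score; return "start", "stop", or None."""
        if score >= self.threshold:
            self._hits += 1
            self._misses = 0
        else:
            self._misses += 1
            self._hits = 0
        if not self.active and self._hits >= self.start_windows:
            self.active = True
            return "start"
        if self.active and self._misses >= self.stop_windows:
            self.active = False
            return "stop"
        return None
```

Each "start" or "stop" returned by `update` is the point at which a message would be published over MQTT for Home Assistant to act on.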

YAMNet-based Sound Profiler (YSP) - a Stand-alone tool

Yamcam also logs all of the sounds and sound events it detects (with appropriate thresholds set in its config file), and I decided it would be useful to have a version that can run stand-alone rather than only as a HASS add-on. This version does not report events; it just creates a log file so you can characterize your sound mix over time. Of course, you could integrate it into some other notification system. The stand-alone version, which admittedly has only been tested on macOS, is called YAMNet-based Sound Profiler (YSP).

SoundViz - Creates Reports from YSP or Yamcam logs

Finally, I wrote a reporting tool, SoundViz, which creates a PDF report from the log files that either Yamcam or YSP produces. These CSV files can run to millions of rows, so the tool (again, only tested so far on macOS) checks how many cores are available and runs in parallel to speed things up.
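The core-count idea can be sketched as follows: split the rows into one chunk per core, count each chunk in a separate process, and merge the partial results. This is a simplified stand-in rather than SoundViz's code; the `(timestamp, group)` row layout and the function names are assumptions for illustration.

```python
import os
from collections import Counter
from concurrent.futures import ProcessPoolExecutor


def count_groups(rows):
    """Count how often each sound group appears in one chunk of
    (timestamp, group) rows."""
    return Counter(group for _timestamp, group in rows)


def split_chunks(rows, n_chunks):
    """Split rows into at most n_chunks lists of near-equal size."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def summarize(rows, workers=None):
    """Count groups across all rows, using one process per core by default."""
    workers = workers or os.cpu_count() or 1
    chunks = split_chunks(rows, workers)
    if workers == 1 or len(chunks) <= 1:
        partials = [count_groups(chunk) for chunk in chunks]
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            partials = list(pool.map(count_groups, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

When running with more than one worker on macOS, call `summarize` from under an `if __name__ == "__main__":` guard, since worker processes there are started by re-importing the main module.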

Want to Try it?

The README files in the repositories for the Yamcam HASS add-on and the YSP standalone tool go into much more detail and include step-by-step instructions to get them running.