
The Rhasspy [0] author recently got hired by Mycroft to work on satellites and a fully local setup. Rhasspy requires a lot of manual work, but replacing Alexa is already possible. I’m somewhat stuck with the current hardware availability issues, but I have a Pi 3 satellite that does wakeword detection (this is supposed to be handled by a Pi Zero 2 W in the future) and sends the audio to the MQTT server running on a Pi 4. The Rhasspy instance also running there picks up the data, does STT and intent recognition, sends the intent to Home Assistant, and then sends TTS back to the satellite.

My main software issue is currently how to replicate the music functionality: playing music on the satellite that requested it, and lowering the volume when it recognizes the wakeword. Preselection of "commands" for band and genre names should be easily scriptable afterwards.
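For the volume-lowering part, a minimal sketch of the logic, assuming the satellite player can see Rhasspy's Hermes MQTT messages and that the `hermes/hotword/<wakewordId>/detected` and `hermes/dialogueManager/sessionEnded` payloads carry a `siteId` field. The function name and the volume levels are made up for illustration, and actually setting the volume on the player is left out:

```python
import json

# Hypothetical volume levels in percent; pick whatever suits the room.
DUCKED_VOLUME = 20
NORMAL_VOLUME = 70


def volume_for_message(topic, payload, site_id, current):
    """Return the volume satellite `site_id` should use after one MQTT message.

    Assumes Hermes-style topics and a JSON payload with a "siteId" field.
    Messages for other satellites leave the volume unchanged.
    """
    data = json.loads(payload)
    if data.get("siteId") != site_id:
        return current

    parts = topic.split("/")
    # hermes/hotword/<wakewordId>/detected -> duck while the user is speaking
    if len(parts) == 4 and parts[:2] == ["hermes", "hotword"] and parts[3] == "detected":
        return DUCKED_VOLUME
    # Session finished -> restore the normal volume
    if topic == "hermes/dialogueManager/sessionEnded":
        return NORMAL_VOLUME
    return current
```

An MQTT client callback on each satellite could call this for every message and pass the result to whatever controls the local player.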

In a quiet room, I have no issues with wakeword detection using a PlayStation Eye camera. (I wanted the Seeed USB microphone array, but between discovering it and starting to buy hardware, the supply chain bit once again.)

[0]: https://rhasspy.readthedocs.io/en/latest/



Didn't realize Rhasspy already has satellite support. I shall have to check that out!

I've got a home server and a Seeed array, so it would be ideal to split the mic (on a Raspberry Pi) from the processing.


Playing music from a Plex server is a major use case for me, and I have given up on Rhasspy because I couldn’t get all the pieces to work together (I have the mic array HAT and a Synology I can run recognition on). Do you have a write-up of your setup?


> Playing music from a Plex server is a major use case for me, and I have given up on Rhasspy because I couldn’t get all the pieces to work together (I have the mic array HAT and a Synology I can run recognition on). Do you have a write-up of your setup?

I have not gotten it working yet, and have not worked on it enough (the lack of hardware makes everything theoretical, which kills my motivation). The way I understand it, there’ll either be a casting server on the satellite, or a PulseAudio/PipeWire server reachable via the network. But I have next to no experience with consumer Linux, so the configuration of those parts is… hard.

But there are many tutorials for playing multi-room audio (with Icecast or something). I just assumed it would be easier without multi-room, as I don’t need it, but it turns out it’s not ;)


And how well does STT work when the room is not quiet anymore, e.g. when music is playing?


My understanding is that the Seeed array would work better than the PS Eye, but at the volume I normally listen to music at, it still works okay.


Yeah, we aren't using the Seeed array in the final Mark II. But we have used the same XMOS XVF-3510 to perform acoustic echo cancellation. That means, even with music blasting out of the speakers, you can still wake the device from across the room.

Put simply, you can think of it as subtracting the audio being output from the audio coming in from the microphone.
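As a toy illustration of that subtraction idea (not how the XVF-3510 actually does it, which isn't described here), a normalized LMS adaptive filter can learn how the speaker signal shows up at the microphone and subtract its estimate. The function name, tap count, and step size below are arbitrary choices for the sketch:

```python
def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal.

    mic: microphone samples (echo plus anything else in the room)
    ref: samples being sent to the speaker, same length and timing as mic
    Returns the residual signal, i.e. mic with the estimated echo removed.
    """
    weights = [0.0] * taps   # current estimate of the speaker-to-mic path
    recent = [0.0] * taps    # latest reference samples, newest first
    residual = []
    for d, x in zip(mic, ref):
        recent = [x] + recent[:-1]
        # Predict the echo from the recent speaker output
        echo_estimate = sum(w * r for w, r in zip(weights, recent))
        e = d - echo_estimate
        # Normalized LMS update: step scaled by the reference energy
        energy = sum(r * r for r in recent) + eps
        weights = [w + mu * e * r / energy for w, r in zip(weights, recent)]
        residual.append(e)
    return residual
```

A real room needs far more taps, and real cancellers also have to cope with someone talking while the filter adapts, which this sketch ignores.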


I was talking about Rhasspy though. As nice as your devices look, tiny satellites are mandatory for me to replace Alexa.



