~ writing/index.md

Putting an AI assistant on Signal (with eyes)

Why Signal

My family already lives in Signal. Asking them to install another chat app just to talk to a house assistant was a non-starter, so I aimed Jarvis at a Signal "Note to Self" thread first — I text myself, it answers in the same thread. If that worked, a shared family group was the next step. The nice part is that Signal's own end-to-end encryption stays intact. The messy part is getting an AI behind a Signal account without handing the message stream to a cloud bot service.

The self-hosted bridge

I run a Signal CLI bridge in native mode on the homelab and linked it as a secondary device named "jarvis." A linked device is load-bearing: it receives a copy of everything on the account, not just the messages meant for the bot. That means filtering has to happen on my side before anything reaches the assistant.

I added two guards in the receiver. First, only my own number is allowed to issue commands. Second, only true "Note to Self" sync messages count. Without the second guard, the bot would see the synced copies of messages I send to other people and could reply to those people as me. That is exactly the kind of bug that sounds hypothetical until it is not. I also kept the bridge inside the LAN; no cloud relay holds the plaintext.

Making it reply

The existing Signal channel was send-only. connect() did nothing, and on_message() never fired. I wrote a background poll-loop receiver that long-polls the receive endpoint, parses both dataMessage and syncMessage.sentMessage, and dispatches to handlers. I built it test-first and ended up with about twenty-five tests before I trusted it enough to link a real device.

Once the plumbing was live, I verified end-to-end by asking whether a service was listed in the homelab docs and watching it query the repo tool and answer with the exact line. That sounds small, but it proves the whole loop: Signal → bridge → orchestrator → tool → response → Signal.

Adding eyes

Photos were a separate trilogy of bugs.

First, attachments: download the image from the bridge, embed it as a data URL on the message, and serialize it correctly for whichever engine is running. Second, and more embarrassing, I had threaded the image through a channel agent class that turned out to be dead code. The live path goes through the system orchestrator. Instrumentation that never fired exposed it; the class looked right, but nothing ever instantiated it. I moved the images parameter through the real call chain instead.

Third, the model itself: the text agent I had been using is text-only and returns a 500 the moment you hand it an image_url. No amount of plumbing fixes a model that cannot see. I routed image-bearing messages to a small local vision model — about seven billion parameters running on the homelab mini — with a config override for the day I want to swap it.

What breaks

Session linking is manual and fragile. Rate limits from Signal's servers are real and not negotiable. Group chats add another dimension: you are not just routing by sender, you are routing by group ID, and a bot that replies in the wrong thread is a social bug, not a technical one. I started with Note-to-Self because the blast radius is exactly one.

Deployment also bit me. A dependency sync without the server extras stripped the web framework and left the service in a crash loop. Rollback repeated the same mistake. The install script uses the full extras; now any redeploy uses the same command.

Takeaways

  • Start with a single, low-risk thread. Note-to-Self is the cheapest production environment you can get.
  • Treat a linked device as a full account tap. Allow-list sender numbers and guard the message type, or you will reply as yourself to people who never asked the bot anything.
  • Verify the live code path before you build features on it. A class that looks correct but is never instantiated is just well-formatted dead code.
  • Separate vision from reasoning. Text agents are cheap and good at tools; vision needs its own model, and the routing layer should know the difference.
  • Test the rollback, not just the deploy. The second outage taught me more than the first.

Self-hosting the Signal bridge costs more effort than a cloud bot, but the privacy model is different: the plaintext lives on hardware I control, behind my own filters, and Signal's encryption still covers the wire. That tradeoff is worth it for a house assistant that actually lives in the house.

← back to writing