note / March 22, 2026
Why I like local AI
VOCO runs Qwen3 on a laptop with no API keys. Here's why that matters more than you think.
01VOCO started from a simple annoyance: every assistant I tried assumed a cloud connection, a paid API key, and an obedient internet connection. I wanted something that still worked when all of those assumptions died.
02The first time I watched Qwen3:4b run a Playwright automation task on a laptop with no internet, it felt less like a demo and more like a small engineering victory. The machine was fully offline, the browser still moved, and nobody had to ask permission from a vendor.
03That experience changed the shape of the project. Offline-first is not just about privacy. It also makes you stop treating the model as a magic oracle and start treating it like one component in a system that should be measured, routed, and sometimes ignored.
04The ML router in VOCO exists because an LLM should not be the first thing touching every command. TF-IDF plus a Random Forest gets the obvious cases out of the way. The model only wakes up when there is actually something ambiguous to do.
05That distinction matters because open-loop planners are fragile in the real world. They sound impressive right up until the browser changes, a modal appears, or a button moves six pixels and the whole plan falls apart. Closed-loop execution is uglier, but ugliness is often what makes it survive contact with reality.
06There is also the unromantic benefit of not needing an API key. No rate limits. No surprise bills. No silent quota ceiling deciding whether your week goes smoothly. Reproducibility becomes your problem again, which is annoying, but also honest.
07Privacy is not the only reason I like local AI, but it is the easiest one to explain. Data stays on the device, the inference stack is yours, and the whole thing feels less like borrowing a service and more like owning a machine.
08The bigger shift is psychological. Once you own the model runtime, you stop asking whether the cloud allows a task and start asking whether your own system is disciplined enough to do it. That is a much better question.