Phase 1 is the live demo: OpenRouter provides inference so the system can run without dedicated local GPUs, while self-hosted Appwrite handles functions, persistence, analytics, and secret management. Phase 2 moves inference on-prem to Ollama once the right hardware is available.
The agent orchestration layer is portable by design. The Appwrite Functions that broker inference keep the OpenRouter credentials server-side and out of the browser, so clients only ever talk to the Function endpoint. Once dedicated GPU hardware is in place, the same agent behavior can target self-hosted Ollama on-prem without changing the user-facing experience.
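Because OpenRouter and Ollama both expose an OpenAI-compatible `/chat/completions` endpoint, the Phase 1 → Phase 2 swap can be a configuration change inside the server-side Function rather than a rewrite. A minimal TypeScript sketch of that idea (the function name, model IDs, and default host are illustrative, not taken from the actual codebase):

```typescript
// Provider-agnostic chat request builder. OpenRouter serves an
// OpenAI-compatible API at openrouter.ai/api/v1, and Ollama serves
// one at <host>/v1, so only the base URL, auth header, and model
// name differ between the two phases.

type Provider = "openrouter" | "ollama";

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  prompt: string,
  provider: Provider,
  apiKey?: string,                       // OpenRouter key, held server-side
  ollamaHost = "http://localhost:11434"  // assumed default Ollama host
): ChatRequest {
  const base =
    provider === "openrouter"
      ? "https://openrouter.ai/api/v1"
      : `${ollamaHost}/v1`; // Ollama's OpenAI-compatible endpoint

  const headers: Record<string, string> = {
    "Content-Type": "application/json",
  };
  if (provider === "openrouter" && apiKey) {
    // The key lives in the Function's environment, never in the browser.
    headers["Authorization"] = `Bearer ${apiKey}`;
  }

  return {
    url: `${base}/chat/completions`,
    headers,
    body: JSON.stringify({
      // Illustrative model names; real deployments would configure these.
      model: provider === "openrouter" ? "meta-llama/llama-3.1-8b-instruct" : "llama3.1",
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

A Function would read the provider choice and key from its environment and forward the built request with `fetch`; switching phases then means flipping one environment variable, with the browser-facing contract untouched.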