The problem
AI assistants forget everything between sessions. Adding persistent, searchable memory typically means either a cloud vector database like Pinecone or Weaviate, with API costs and nontrivial setup, or embedding PyTorch and a full ML stack into your app, which is overkill for a VPS deployment and impossible on Render's free tier.
The goal: self-contained semantic memory that any service can call over HTTP, runs on commodity hardware, and requires zero external ML infrastructure.
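The storage half of that goal can be sketched with nothing but the standard library: embeddings packed into SQLite blobs and searched by cosine similarity. This is a minimal sketch, not gerald-core's actual code; the table schema, class name, and toy 3-dimensional embeddings are assumptions, and the production path would hand the similarity query to the sqlite-vec extension instead of scanning in Python.

```python
import sqlite3
import struct
import math

def serialize(vec):
    # Pack a float list into a compact binary blob for SQLite storage.
    return struct.pack(f"{len(vec)}f", *vec)

def deserialize(blob):
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Hypothetical sketch of a self-contained semantic memory store."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories "
            "(id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")

    def add(self, text, embedding):
        self.db.execute(
            "INSERT INTO memories (text, embedding) VALUES (?, ?)",
            (text, serialize(embedding)))
        self.db.commit()

    def search(self, query_embedding, k=3):
        # Brute-force scan; sqlite-vec replaces this with an indexed query.
        rows = self.db.execute("SELECT text, embedding FROM memories").fetchall()
        scored = [(cosine(query_embedding, deserialize(blob)), text)
                  for text, blob in rows]
        scored.sort(reverse=True)
        return [text for _, text in scored[:k]]

store = MemoryStore()
store.add("User prefers dark mode", [0.9, 0.1, 0.0])
store.add("User's name is Sam", [0.0, 0.8, 0.6])
print(store.search([1.0, 0.0, 0.0], k=1))  # → ['User prefers dark mode']
```

Everything above runs on commodity hardware with zero external dependencies, which is the whole point: the heavy lifting is a table scan and some arithmetic, not a GPU.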
Architecture
Two services: gerald-core (Flask + sqlite-vec, handling storage and search) and ai-hub-backend (FastAPI, handling Claude API calls and debate orchestration). They communicate over HTTP; gerald-core never touches the Anthropic API directly.
Both services are deployed on Render. The Flutter AI Hub app talks to both from a single client layer: gerald-core for settings and memory, ai-hub-backend for streaming AI responses. SSE keeps the Flutter UI live during long generations.
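The SSE wire format itself is trivial, which is why it suits long generations: each event is a `data:` line followed by a blank line, and the client renders deltas as they arrive. A stdlib-only sketch of the framing the FastAPI backend would stream (the `delta` field name and the `[DONE]` sentinel are assumptions, borrowed from common streaming-API conventions, not taken from the post):

```python
import json

def sse_stream(chunks):
    """Format text chunks as Server-Sent Events.

    Sketch only: the real endpoint would wrap a generator like this in a
    FastAPI StreamingResponse with media type text/event-stream.
    """
    for chunk in chunks:
        # One event per chunk: a "data:" line terminated by a blank line.
        yield f"data: {json.dumps({'delta': chunk})}\n\n"
    yield "data: [DONE]\n\n"  # end-of-stream sentinel (assumed convention)

events = list(sse_stream(["Hel", "lo"]))
print(events[0])  # → data: {"delta": "Hel"}
```

On the Flutter side, the client concatenates each `delta` into the visible response, so the UI updates token by token instead of waiting for the full generation.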
Need AI memory or backend infrastructure?
Start a conversation