The Case for Universal AI Memory (and How I Built a Rough First Draft)

My #1 frustration when I started using LLMs (primarily Claude) heavily personally and at work, was the lack of universal memory across projects. I'd share a useful bit of context in one project, only to have to completely re-explain it in another.

So as a project to improve my vibe coding skills, I thought it would be fun to build my own: enter lodis.ai. Rather than use something off the shelf (and yes, there are plenty), I figured I'd learn more building it myself.

Here's what I learned along the way:

Don't jump straight into building: it would've been much more efficient to spend my tokens and time on research, planning, and scoping.
... But get to building!: never would've learned these lessons without a working prototype that I've been using for weeks.
Design better tests: I had Claude Code build a benchmark inspired by Google's MRCR to compare lodis vs context-stuffing on my real memory data. Super enlightening about (1) how bad the initial retrieval was, and (2) the fixes + methodology to iterate. Three rounds later, recall went from 17% to 83%.
The LLM IS your frontend: as a consumer PM I obsess over UI/UX, button placement, look and feel. But once I stopped building dashboard features and let Claude Desktop/Mobile be the UI, the product got way better, way faster.

If you want to give it a try, I built 2 flavors that integrate with Claude:

Local-only for the privacy-conscious
Cloud-based for those working cross-device (desktop + mobile)

And here's my honest product review of it in advance:

Loving it

Core functionality: save a detail once, retrieve it from anywhere. When it's wrong, correct it once in the normal flow of chat.
Memory types: categories (people, projects, snippets, entities) and durations (canonical, ephemeral) make storage and retrieval efficient for you + Claude.

Needs work

Performant retrieval: ridiculously hard to do well, and I'm not an infra engineer. Graph densification, re-ranking, decay, query caching, and on it goes. Still not happy with it.
Overriding the system default: developing a reliable harness and system prompt that replaces Claude's own native memory tool has been an uphill battle.
Portability: my real ambition is memory that travels from Claude to Gemini to ChatGPT. MCP gets you partway, but real cross-vendor portability needs more. That's the next frontier.

Still Broken

Visibility/Auditability: a dashboard works until it's overloaded. Mine has 4200+ memories now, the dashboard takes too long to load and can't be reasonably audited.

If you've got thoughts, feedback, or want to try the not-free-for-me cloud-version: comment, email me (james@sunriselabs.ai), or better yet, send a PR ;)