Design
An extremely simple inference-time memory module that treats memory management as a series of LLM calls and agent loops over a markdown-based file tree. We plan to build it around the Pi agent harness and package the nanomem project as an integrated Pi extension. The memory state is stored locally and encrypted on-device, and all user queries can be augmented with the minimal relevant context before being sent to the model.
Selective memory
This setup supports selective disclosure of memory content, giving users control over what information is shared with the model. Rather than injecting a persistent global memory into every interaction, memory is retrieved at inference time and only the minimal relevant context is surfaced.

P2P Memory Sync
The memory files stay local on the user’s machine as a markdown tree. If the user chooses to sync these memory trees peer to peer, they can be semantically merged across devices using a local model driven by agents, keeping user memory consistent across machines without compromising data ownership or requiring centralized management.
