and how it might work in WATCHFIT®
While it may look like a simple “notepad” that stores and recalls user-specific facts, ChatGPT’s memory architecture is significantly more advanced, particularly in how it decides what to store, how to store it, and when to surface it.
✅ High-level Summary
- The memory system stores long-term user-relevant information (e.g., project names, preferences, collaborators) in the form of structured entries.
- These entries are not part of the immediate token context; instead, they’re retrieved and inserted into the prompt dynamically when the model decides they’re relevant.
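To make "structured entries" concrete, here is a minimal sketch of what one entry and its store could look like. The schema (`MemoryEntry`, `entry_id`, `tags`, `updated_at`) is purely hypothetical and chosen for illustration; nothing here is ChatGPT's actual format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MemoryEntry:
    """One long-term fact about the user (hypothetical schema)."""
    entry_id: str
    text: str  # summarized statement, not a raw chat quote
    tags: list[str] = field(default_factory=list)
    updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# The store lives outside the prompt, so entries cost no context tokens
# until something retrieves them and inserts them into the prompt.
store: dict[str, MemoryEntry] = {}

entry = MemoryEntry(
    entry_id="project-name",
    text="Paolo is working on a wearable fitness tracker called WatchFit",
    tags=["project", "watchfit"],
)
store[entry.entry_id] = entry
```

Keeping a stable `entry_id` per fact is one way to make later updates and deletions straightforward.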
⚙️ Key Mechanisms Behind Memory
1. Saliency-Based Capture
The system uses internal heuristics and semantic filters to decide:
- What’s worth persisting beyond the current conversation.
- When an existing memory should be updated or deleted.
This capture step is not rule-based; it relies on natural language understanding to judge intent and importance.
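The capture step can be pictured as two parts: a language-understanding component that emits a decision, and a small piece of plumbing that applies it. The sketch below covers only the second part. `Action`, `CaptureDecision`, and `apply_decision` are hypothetical names, and the component that would produce the decision is deliberately not shown, since the text above describes it as model-driven rather than rule-based.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    SAVE = "save"
    UPDATE = "update"
    DELETE = "delete"
    IGNORE = "ignore"


@dataclass
class CaptureDecision:
    """Output of the (not shown) language-understanding step."""
    action: Action
    key: Optional[str] = None   # which entry is affected
    text: Optional[str] = None  # summarized fact to persist


def apply_decision(store: dict[str, str], decision: CaptureDecision) -> None:
    """Apply a capture decision to a key -> summary store."""
    if decision.action is Action.IGNORE:
        return  # nothing worth persisting beyond this conversation
    if decision.key is None:
        raise ValueError("save, update, and delete need a key")
    if decision.action is Action.DELETE:
        store.pop(decision.key, None)
        return
    if decision.text is None:
        raise ValueError("save and update need text")
    store[decision.key] = decision.text  # SAVE and UPDATE both overwrite
```

For example, `apply_decision(store, CaptureDecision(Action.SAVE, "project", "Paolo is working on WatchFit"))` adds an entry, and a later `Action.DELETE` with the same key removes it.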
2. Semantic Compression & Canonicalization
Captured memory entries are not raw quotes from chat, but summarized representations (e.g., “Paolo is working on a wearable fitness tracker called WatchFit”).
- This ensures that memory remains compact and interpretable.
- It also avoids bloating with noisy or redundant data.
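The summarization itself would be done by a language model, but the "avoid redundant data" half can be sketched in a few lines. The example below assumes summaries already exist and only shows how trivially different phrasings might be collapsed to one canonical key. `canonical_key` and `add_unique` are hypothetical helpers, and plain string normalization is a stand-in for whatever semantic comparison a real system would use.

```python
import re
import unicodedata


def canonical_key(text: str) -> str:
    """Normalize a summary so trivially different phrasings compare equal."""
    text = unicodedata.normalize("NFKC", text).casefold()
    text = re.sub(r"[^\w\s]", "", text)       # drop punctuation
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace


def add_unique(entries: dict[str, str], summary: str) -> bool:
    """Store the summary unless an equivalent one exists.

    Returns True if the summary was added, False if it was a duplicate.
    """
    key = canonical_key(summary)
    if key in entries:
        return False
    entries[key] = summary
    return True
```

With this in place, adding "Paolo is working on a wearable fitness tracker called WatchFit" twice, the second time with different casing or trailing punctuation, leaves a single entry in the store.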
3. Contextual Injection
Memory entries are injected into the system prompt when:
- They are deemed relevant to the current user input.
- They add useful grounding without causing contradiction or redundancy.
This is handled by a retrieval step that selects entries per request rather than keeping every entry always-on; once injected, the model attends to them like any other prompt text.
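A toy version of relevance-gated injection is sketched below. It scores each memory against the user input with bag-of-words cosine similarity, which is a simple stand-in for the embedding-based retrieval a production system would more plausibly use. The function names, the `threshold` and `top_k` defaults, and the "Known facts about the user" wording are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(memories: list[str], user_input: str,
             threshold: float = 0.15, top_k: int = 3) -> list[str]:
    """Return up to top_k memories whose similarity meets the threshold."""
    query = bag_of_words(user_input)
    scored = [(cosine(query, bag_of_words(m)), m) for m in memories]
    relevant = [pair for pair in scored if pair[0] >= threshold]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in relevant[:top_k]]


def build_system_prompt(base: str, memories: list[str], user_input: str) -> str:
    """Append relevant memories to the base prompt; leave it untouched otherwise."""
    hits = retrieve(memories, user_input)
    if not hits:
        return base
    lines = "\n".join(f"- {m}" for m in hits)
    return f"{base}\n\nKnown facts about the user:\n{lines}"
```

A question that mentions the WatchFit tracker would pull in the WatchFit entry, while an unrelated request such as a translation would leave the system prompt unchanged. Word overlap is a crude signal (shared stop words can inflate scores), which is one reason semantic embeddings are the more likely choice in practice.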
📚 Memory vs Context Window
- The context window is a sliding window of roughly 128k tokens (for GPT-4o): it acts as short-term memory, and the oldest content drops out once the limit is reached.
- The long-term memory is stored outside this window and must be explicitly retrieved and merged into the prompt.
- This separation allows for both recency and continuity, without overwhelming the model with stale context.
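The interplay between the two can be sketched as a token-budget problem: retrieved memories are placed first, then as many recent turns as still fit. The example below is an illustration under simplifying assumptions. `count_tokens` approximates tokens by whitespace-separated words instead of using a real tokenizer, and `assemble_prompt` is a hypothetical helper, not a description of how ChatGPT actually builds its prompt.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def assemble_prompt(memories: list[str], turns: list[str], budget: int) -> list[str]:
    """Return memories first, then the most recent turns that fit the budget."""
    used = sum(count_tokens(m) for m in memories)
    if used > budget:
        raise ValueError("retrieved memories alone exceed the token budget")
    kept: list[str] = []
    for turn in reversed(turns):  # walk from newest to oldest
        cost = count_tokens(turn)
        if used + cost > budget:
            break  # this turn and everything older falls out of the window
        kept.append(turn)
        used += cost
    return memories + kept[::-1]  # restore chronological order
```

Because memories are short summaries, they take a small, predictable slice of the budget, leaving most of the window for the recent conversation.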
🛡️ Privacy and Control
- Users can view, delete, or edit memory entries.
- Memory doesn’t store sensitive or private data unless the user explicitly enters it and the system deems it relevant.
- The user is notified when memory is active and has full control over it.