ChatGPT’s Memory System — Under the Hood

9 May 2025 - article, updates

ChatGPT’s Memory System — Under the Hood

and how it might work in WATCHFIT®

While it may look like a simple “notepad” that stores and recalls user-specific facts, ChatGPT’s memory architecture is considerably more sophisticated, particularly in how it decides what to store, how to store it, and when to surface it.


✅ High-level Summary

  • The memory system stores long-term user-relevant information (e.g., project names, preferences, collaborators) in the form of structured entries.
  • These entries are not part of the immediate token context; instead, they are retrieved and inserted into the prompt dynamically when they are judged relevant to the current request.

⚙️ Key Mechanisms Behind Memory

1. Saliency-Based Capture

The system uses internal heuristics and semantic filters to decide:

  • What’s worth persisting beyond the current conversation.
  • When an existing memory should be updated or deleted.

This is not a fixed rule set; the decision relies on natural language understanding to judge intent and importance.
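To make the capture decision concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `decide_capture` and `toy_saliency`, the cue list, and the threshold are hypothetical, and the keyword scorer is only a stand-in for the language-understanding judgment described above, not how ChatGPT actually scores messages.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CaptureDecision:
    action: str  # one of "store", "update", "delete", "ignore"
    reason: str


def toy_saliency(message: str) -> float:
    """Placeholder scorer: counts crude cues that a message states a durable fact.

    A real system would use a learned model here; this is only a stand-in.
    """
    cues = ("i am", "i'm", "my ", "i prefer", "i work", "call me")
    text = message.lower()
    hits = sum(cue in text for cue in cues)
    return min(1.0, hits / 2)


def decide_capture(
    message: str,
    known_topics: set[str],
    scorer: Callable[[str], float] = toy_saliency,
    threshold: float = 0.5,
) -> CaptureDecision:
    text = message.lower()
    # An explicit request to forget takes priority over any saliency score.
    if "forget" in text:
        return CaptureDecision("delete", "user asked to forget something")
    score = scorer(message)
    if score < threshold:
        return CaptureDecision("ignore", f"saliency {score:.2f} below threshold")
    # If the message mentions something already remembered, refresh that entry.
    for topic in known_topics:
        if topic.lower() in text:
            return CaptureDecision("update", f"mentions known topic {topic!r}")
    return CaptureDecision("store", f"saliency {score:.2f} and no matching entry")
```

The scorer is passed in as a parameter so that the placeholder can be swapped for a model-based judgment without changing the surrounding logic.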

2. Semantic Compression & Canonicalization

Captured memory entries are not raw quotes from chat, but summarized representations (e.g., “Paolo is working on a wearable fitness tracker called WatchFit”).

  • This ensures that memory remains compact and interpretable.
  • It also avoids bloating the memory store with noisy or redundant data.
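One way to picture a canonical entry is as a small structured record keyed by a normalized subject and attribute, so that a restated fact overwrites the old entry instead of piling up next to it. The sketch below assumes that layout. `MemoryEntry`, `MemoryStore`, and `canonical_key` are hypothetical names, and the summarization step itself (turning chat text into the record) is left out because it would be done by a model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def canonical_key(subject: str, attribute: str) -> str:
    """Normalize case and whitespace so equivalent facts share one key."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return f"{norm(subject)}::{norm(attribute)}"


@dataclass
class MemoryEntry:
    subject: str
    attribute: str
    value: str
    updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def render(self) -> str:
        """Produce the compact, human-readable summary form."""
        return f"{self.subject} {self.attribute} {self.value}."


class MemoryStore:
    def __init__(self) -> None:
        self._entries: dict[str, MemoryEntry] = {}

    def upsert(self, entry: MemoryEntry) -> bool:
        """Insert or overwrite; returns True only if the key was new."""
        key = canonical_key(entry.subject, entry.attribute)
        is_new = key not in self._entries
        self._entries[key] = entry
        return is_new

    def all(self) -> list[MemoryEntry]:
        return list(self._entries.values())
```

For example, storing `MemoryEntry("Paolo", "is working on", "a wearable fitness tracker called WatchFit")` and later a differently cased restatement of the same subject and attribute leaves a single entry in the store.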

3. Contextual Injection

Memory entries are injected into the system prompt when:

  • They are deemed relevant to the current user input.
  • They add useful grounding without causing contradiction or redundancy.

This is handled by a retrieval mechanism that selects entries per request; the model then attends to them like any other prompt text, rather than every memory being always-on.


📚 Memory vs Context Window

  • The context window is a sliding window of ~128k tokens (for GPT-4o), representing short-term memory.
  • The long-term memory is stored outside this window and must be explicitly retrieved and merged into the prompt.
  • This separation allows for both recency and continuity, without overwhelming the model with stale context.
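The split between the two kinds of memory can be illustrated with a small prompt assembler: retrieved long-term entries are merged in first, and the remaining budget is filled with the most recent conversation turns. This is a sketch under simplifying assumptions. `assemble_prompt` and `count_tokens` are hypothetical names, and whitespace-separated words stand in for real tokenizer tokens.

```python
def count_tokens(text: str) -> int:
    # Assumption: one whitespace-separated word counts as one token.
    return len(text.split())


def assemble_prompt(memories: list[str], history: list[str], budget: int) -> list[str]:
    """Merge long-term memories with as many recent turns as the budget allows."""
    used = sum(count_tokens(m) for m in memories)
    kept: list[str] = []
    # Walk the history from newest to oldest so recency wins when space runs out.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    # Restore chronological order for the turns that fit.
    return list(memories) + list(reversed(kept))
```

With a budget of 8, one three-word memory, and turns of five, three, and two words, the oldest turn is dropped while the memory and the two newest turns are kept. That is the "recency plus continuity" trade-off in miniature.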

🛡️ Privacy and Control

  • Users can view, delete, or edit memory entries.
  • Memory doesn’t store sensitive or private data unless the user explicitly provides it and it is deemed relevant.
  • The user is notified when memory is active and has full control over it.
