Engram: How Persistent AI Memory Turns Every Interaction Into Organisational Intelligence
Standard enterprise AI resets to zero after every session — after 500 days of use it knows no more about your organisation than on day one. Engram extracts structured knowledge from every interaction, classifies it into 9 memory types, and matures it through 4 lifecycle stages. Organisation-specific AI accuracy climbs from roughly 15% to 92%. Gravity wells self-organise related memories without configuration. Memory is LLM-agnostic and belongs to your organisation, not to OpenAI or Anthropic.
Enterprise AI has an architecture problem. The standard deployment — LLM behind a chat interface — is fundamentally stateless. Each conversation starts from zero. The AI is brilliant in the abstract and ignorant of your organisation specifically.
After 500 days of deployment, a stateless AI knows no more about your company than it did on day one. The people, decisions, processes, and context that make your organisation run exist nowhere in the system. Every user rebuilds context from scratch with every session.
The financial cost is measurable. IDC research puts the average knowledge worker at 2.5 hours per day spent searching for information they or a colleague already created (IDC White Paper, 2020). Across 250 employees and roughly 250 working days a year, that is more than 150,000 working hours lost annually to context reconstruction — knowledge that should already be in the system.
Why Conversation History Does Not Solve This
Conversation history is the default workaround. Append recent messages to the context window and hope the model retains useful signal. The approach has four fundamental failures.
First, a transcript captures what was said, not what was learned. It cannot distinguish between a passing remark and a board-level decision. Both appear as flat text with no structural difference.
Second, context windows have hard limits. GPT-4o supports 128,000 tokens — roughly 90,000 words. That sounds large until you consider that a single busy team generates far more than that in a year of interactions. History falls off the edge.
Third, even within the context window, retrieval degrades. A 2023 study led by Stanford researchers found that language models perform well on information at the beginning or end of their context but struggle significantly with information placed in the middle. As Nelson F. Liu and colleagues concluded in their analysis: "Language models are not reliably able to retrieve and use information from their context window, particularly when that information appears in the middle of long inputs." (Liu et al., "Lost in the Middle: How Language Models Use Long Contexts," arXiv:2307.03172, July 2023.)
Fourth, conversation history is session-scoped and user-scoped. The insight your product director generated on Monday is invisible to the sales rep on Friday — because it lives in a different user's conversation history, in a different session, possibly in a different application entirely.
For a deeper look at why AI accuracy on organisation-specific questions defaults to around 15% without memory infrastructure, see our analysis of why enterprise AI gets 15% of org-specific answers right.
What Engram Does Differently
Engram is a persistent memory architecture. It operates alongside your LLM — extraction, classification, maturity management, and retrieval happen automatically. No user needs to tag memories or manage knowledge manually.
At the end of every interaction, Engram's extraction engine analyses the conversation and identifies items worth persisting. Each item is typed, scored for confidence, given a maturity stage, and linked to related items already in the memory store. The next session begins with a structured knowledge injection — not a raw message dump.
"Large language models have revolutionised AI, but are constrained by limited context windows, hindering their utility in tasks like extended conversations and document analysis. We propose virtual context management — a technique inspired by hierarchical memory systems in traditional operating systems — to extend the effectively unlimited external context."
— Charles Packer et al., "MemGPT: Towards LLMs as Operating Systems," UC Berkeley / Carnegie Mellon, arXiv:2310.08560, October 2023. The foundational research establishing virtual memory management for LLMs.
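Engram's internal schema is not published in this article, so the following is only a minimal sketch of what a typed, confidence-scored, stage-tagged memory record as described above might look like. Every field name and value here is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# Illustrative only: field names and types are assumptions, not Engram's actual schema.
@dataclass
class MemoryRecord:
    kind: str                      # one of the nine types, e.g. "decision"
    content: str                   # the knowledge item itself
    confidence: float              # extraction confidence, 0.0 to 1.0
    maturity: str                  # "ephemeral" | "working" | "consolidated" | "crystallised"
    scope: str                     # "session" | "user_private" | "product" | "cross_product"
    created_at: datetime = field(default_factory=datetime.utcnow)
    related_ids: List[str] = field(default_factory=list)  # links to existing memories

example = MemoryRecord(
    kind="decision",
    content="Board approved SE Asia pause pending DPDPA clarity (March 2026)",
    confidence=0.93,
    maturity="ephemeral",
    scope="cross_product",
)
```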
Nine Memory Kinds: Because Not All Knowledge Is the Same
A fact ("We are regulated under MiFID II") needs to be stored, recalled, and used differently from a preference ("Sarah prefers bullet-point summaries") or a decision ("We paused the SE Asia expansion pending regulatory clarity"). Treating all of these as equivalent text blobs — as conversation history does — removes any ability to prioritise the right kind of information for a given query.
Engram classifies extracted knowledge into 9 distinct memory types, each with its own extraction logic, decay behaviour, and retrieval weighting:
| Memory Kind | What It Stores | Default Persistence | Example |
|---|---|---|---|
| Fact | Persistent truths about the organisation, its operations, or environment | Long-lasting — rarely changes | "We hold a PCI-DSS Level 1 certification" |
| Entity | People, clients, products, teams — with current attributes and live state | Ongoing — updated as state changes | "Client: Azara Fintech, renewal Q3 2026, key contact: Nadia Khalil" |
| Decision | Choices made, with rationale, who made them, and when | Permanent — decisions don't expire | "Board approved SE Asia pause pending DPDPA clarity (March 2026)" |
| Event | Significant occurrences tied to time, context, and participants | Permanent — historical record | "Security incident: API key exposed in dev environment, July 2025" |
| Insight | Patterns and learnings surfaced from accumulated usage | Medium-term — refined over time | "Q4 planning requests always spike in late September — prepare templates early" |
| Preference | Individual working styles, communication preferences, tool habits | Ongoing — updated as behaviour changes | "Marco: prefers numbered lists over prose; always wants sources linked" |
| Relationship | Connections between entities — dependencies, hierarchies, associations | Ongoing — updated as relationships evolve | "Product team depends on Platform for API versioning decisions" |
| Procedure | How things get done — workflows, approval paths, escalation logic | Stable — changes less frequently | "Contracts above €50K require CFO sign-off before legal drafting begins" |
| Context | Situational awareness that shapes how other information is interpreted | Short-to-medium — reflects current situation | "Company is in acquisition review — all external communications need legal pre-approval" |
This classification means retrieval is intentional. A legal query will weight Facts and Decisions heavily. A user-facing task will surface Preferences and Procedures. The system matches memory type to query intent rather than returning everything that matches semantically.
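To make that concrete, here is a hedged sketch of intent-based type weighting. The intents, weights, and default baseline are invented for illustration; the article only states that memory type is matched to query intent.

```python
# Hypothetical weights for illustration; the real weightings are not published.
INTENT_TYPE_WEIGHTS = {
    "legal":     {"fact": 1.0, "decision": 0.9, "procedure": 0.6, "preference": 0.1},
    "user_task": {"preference": 1.0, "procedure": 0.9, "context": 0.7, "fact": 0.4},
}

def weight_for(intent: str, memory_kind: str) -> float:
    """Return how strongly a memory type should count for a given query intent."""
    return INTENT_TYPE_WEIGHTS.get(intent, {}).get(memory_kind, 0.3)  # assumed baseline

# A legal query boosts Facts and Decisions; a user-facing task boosts Preferences.
assert weight_for("legal", "decision") > weight_for("legal", "preference")
assert weight_for("user_task", "preference") > weight_for("user_task", "fact")
```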
The Four Maturity Stages: Memory That Earns Its Place
A project reference in a single conversation deserves different treatment than a fact that has been recalled and confirmed dozens of times across many users. Engram models memory maturity through four lifecycle stages, promoting and demoting automatically based on recall patterns:
Stage 1 — Ephemeral (TTL: 1 hour): A passing reference in a single conversation. Extracted but not yet validated. Decays unless recalled within the hour.
Stage 2 — Working (TTL: 24 hours): Recalled at least once after creation. Activation score rising. Active context for the current work cycle.
Stage 3 — Consolidated (TTL: 30 days): Recalled repeatedly across multiple users or sessions. Strong association network forming. High retrieval priority.
Stage 4 — Crystallised (TTL: Permanent): Foundational organisational knowledge. Highest activation and association density. Never decays passively. Survives staff turnover.
No manual curation is required to move memories between stages. Promotion happens when recall frequency crosses a threshold. Demotion happens when a memory goes unrecalled and its activation score decays below the stage minimum.
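A minimal sketch of that promotion and demotion loop might look like the following. The recall thresholds, activation increments, and decay rate are assumptions; only the four stage names and the promote-on-recall, demote-on-decay behaviour come from the description above.

```python
from dataclasses import dataclass

# Thresholds and decay values are illustrative assumptions, not Engram's actual parameters.
STAGES = ["ephemeral", "working", "consolidated", "crystallised"]
PROMOTION_RECALLS = {"ephemeral": 1, "working": 5, "consolidated": 25}  # recalls needed to move up
STAGE_FLOOR = {"ephemeral": 0.05, "working": 0.2, "consolidated": 0.4}  # minimum activation to stay

@dataclass
class MemoryState:
    stage: str = "ephemeral"
    recall_count: int = 0
    activation: float = 0.1

def on_recall(m: MemoryState) -> None:
    """Recall raises activation and may promote the memory to the next stage."""
    m.recall_count += 1
    m.activation = min(1.0, m.activation + 0.1)
    needed = PROMOTION_RECALLS.get(m.stage)
    if needed is not None and m.recall_count >= needed:
        m.stage = STAGES[STAGES.index(m.stage) + 1]

def on_decay_tick(m: MemoryState, decay: float = 0.9) -> None:
    """Passive decay; a memory that falls below its stage floor is demoted."""
    if m.stage == "crystallised":
        return  # crystallised memories never decay passively
    m.activation *= decay
    if m.activation < STAGE_FLOOR[m.stage] and m.stage != "ephemeral":
        m.stage = STAGES[STAGES.index(m.stage) - 1]
        m.recall_count = 0  # require fresh recalls before re-promotion
```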
Crystallised memories represent institutional knowledge — the facts, decisions, and procedures that define how the organisation works. When a senior employee leaves, their crystallised knowledge stays in the system. Onboarding a replacement takes 4–5 weeks less because the AI already knows what the departing employee knew.
Contradiction detection runs at every extraction pass. When a new memory conflicts with an existing Consolidated or Crystallised memory, the system flags the conflict and surfaces it to the admin console rather than silently overwriting. A 2025 Forrester survey found that 43% of enterprise AI governance failures trace back to stale or contradictory knowledge propagating unchecked (Forrester Research, "The Enterprise AI Governance Gap," Q2 2025).
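As a rough illustration of the flag-rather-than-overwrite behaviour, a simplified conflict check could look like this. The similarity heuristic and threshold are assumptions, and the memory objects are assumed to carry the kind, maturity, and content fields sketched earlier.

```python
from difflib import SequenceMatcher

def conflicts(new_content: str, existing_content: str, threshold: float = 0.6) -> bool:
    """Flag when a new extraction covers the same ground but asserts different content."""
    similarity = SequenceMatcher(None, new_content.lower(), existing_content.lower()).ratio()
    return similarity > threshold and new_content.strip().lower() != existing_content.strip().lower()

def review_queue(new_memory, store):
    """Surface conflicts with Consolidated or Crystallised memories instead of overwriting them."""
    return [m for m in store
            if m.maturity in ("consolidated", "crystallised")
            and m.kind == new_memory.kind
            and conflicts(new_memory.content, m.content)]
```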
Five-Factor Recall Scoring: How Engram Decides What to Surface
When a query arrives, Engram scores every candidate memory across five weighted dimensions: semantic relevance to the query, recall frequency, recency of access, association density, and maturity stage. The weights balance semantic relevance with organisational importance to produce a ranked retrieval list.
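The exact formula is not published here, so the sketch below simply combines the five named dimensions with assumed weights and decay curves.

```python
import math
import time
from typing import Optional

# The article names the dimensions; the weights and curves below are illustrative assumptions.
WEIGHTS = {"relevance": 0.35, "frequency": 0.20, "recency": 0.15, "associations": 0.15, "maturity": 0.15}
MATURITY_SCORE = {"ephemeral": 0.1, "working": 0.4, "consolidated": 0.7, "crystallised": 1.0}

def recall_score(relevance: float, recall_count: int, last_access_ts: float,
                 association_count: int, maturity: str, now: Optional[float] = None) -> float:
    """Combine the five dimensions into a single ranking score in [0, 1]."""
    now = now if now is not None else time.time()
    frequency = 1 - math.exp(-recall_count / 10)           # saturating recall frequency
    recency = math.exp(-(now - last_access_ts) / 86_400)   # decays over roughly a day
    associations = 1 - math.exp(-association_count / 5)    # saturating association density
    return (WEIGHTS["relevance"] * relevance
            + WEIGHTS["frequency"] * frequency
            + WEIGHTS["recency"] * recency
            + WEIGHTS["associations"] * associations
            + WEIGHTS["maturity"] * MATURITY_SCORE[maturity])
```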
The result is a ranked list that balances relevance (what matches the query semantically) with importance (what the organisation has found repeatedly useful, recently accessed, and structurally connected to other knowledge). A Crystallised Decision will outrank an Ephemeral Context item when both match the query equally — because the Decision has proven itself important to the organisation over time.
Gravity Wells: Self-Organising Knowledge Clusters
As memories are recalled together repeatedly, Engram detects the pattern and forms gravity wells — clusters of strongly associated memories that orbit a central attractor concept. No configuration is required. The clustering emerges from usage.
A gravity well forms around a key client account over time. The client entity, renewal date, key contact, outstanding decisions, contractual dependencies, team preferences, and past event records all cluster together. When any one element is queried, the entire cluster becomes available for context injection.
This mechanism answers questions that purely semantic search cannot. "What do we need to know before Monday's client call?" returns the right context because Engram activates the gravity well associated with that client and surfaces its connected knowledge — including the preference memory noting that the client's procurement director prefers concise executive summaries over detailed analysis.
Gravity wells also reveal organisational structure that was never explicitly documented. If the legal team's decisions consistently co-occur with product roadmap discussions, Engram surfaces that dependency as a relationship memory. Teams discover cross-functional dependencies they did not consciously register.
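The clustering mechanism is described only as emerging from memories being recalled together, so the simplest possible sketch of that idea is pairwise co-recall counting with a threshold. The threshold, identifiers, and data structures below are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations

co_recall = defaultdict(int)  # (memory_id_a, memory_id_b) -> times recalled together

def record_recall_set(memory_ids: list[str]) -> None:
    """Every pair of memories recalled in the same retrieval strengthens their association."""
    for a, b in combinations(sorted(memory_ids), 2):
        co_recall[(a, b)] += 1

def gravity_well(seed_id: str, threshold: int = 3) -> set[str]:
    """Memories whose association with the seed has crossed the threshold form its cluster."""
    cluster = {seed_id}
    for (a, b), count in co_recall.items():
        if count >= threshold and seed_id in (a, b):
            cluster.update((a, b))
    return cluster

# Hypothetical identifiers: repeated co-recall of a client's entity, contact, and decisions
# gradually makes them retrievable as one cluster.
record_recall_set(["client:azara", "contact:nadia", "decision:renewal-terms"])
```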
Four Privacy Scopes: From Session to Organisation
Engram memory operates at four privacy levels, controlling who can retrieve what:
- Session: Exists only within the current conversation. Ephemeral by definition — cleared when the session closes.
- User private: Visible only to the individual user. Preferences, personal workflow habits, and user-specific context live here.
- Product: Shared within a specific application or workspace. Team-level decisions, project context, and shared procedures.
- Cross-product: Available across all applications and AI interactions in the organisation. Organisational facts, crystallised decisions, and institutional knowledge.
The cross-product scope is what enables the scenario where "the AI helping your sales rep on Monday already knows what your product team decided on Friday." A Decision memory written at cross-product scope is available to any user in any application — without requiring anyone to explicitly share or forward it.
Role-based access control governs scope promotion. An individual user can write session and user-private memories freely. Writing to product or cross-product scope requires explicit permission — preventing low-confidence extractions from polluting the organisational knowledge base.
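A minimal sketch of that permission check follows. The scope names come from the list above; the role names and the mapping from role to maximum writable scope are assumptions for illustration.

```python
from enum import IntEnum

class Scope(IntEnum):
    SESSION = 0
    USER_PRIVATE = 1
    PRODUCT = 2
    CROSS_PRODUCT = 3

def max_writable_scope(role: str) -> Scope:
    """Hypothetical role mapping: everyone writes private memories, elevated roles write shared ones."""
    return {
        "member": Scope.USER_PRIVATE,
        "team_admin": Scope.PRODUCT,
        "org_admin": Scope.CROSS_PRODUCT,
    }.get(role, Scope.USER_PRIVATE)

def can_write(role: str, target: Scope) -> bool:
    return target <= max_writable_scope(role)

assert can_write("member", Scope.USER_PRIVATE)
assert not can_write("member", Scope.CROSS_PRODUCT)  # requires explicit permission
```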
Memory With Any LLM
Engram memory is injected as structured context at inference time, making it LLM-agnostic. GPT-4o, Claude 3 Opus, Gemini, or a self-hosted open-source model — the memory architecture is identical. Switching LLM providers does not erase organisational memory.
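Because the memory arrives as rendered context rather than through any vendor-specific feature, swapping providers means swapping only the completion call. In the sketch below, call_llm is a placeholder for whichever provider client you use; the rendering format is an assumption.

```python
# Provider-agnostic sketch: memories are rendered into the prompt, so the same store
# works with any chat-completion API. `call_llm` is a placeholder, not a real SDK call.
def render_memory_block(memories: list[dict]) -> str:
    lines = [f"[{m['kind'].upper()} | {m['maturity']}] {m['content']}" for m in memories]
    return "Organisational memory (retrieved by Engram):\n" + "\n".join(lines)

def answer(query: str, memories: list[dict], call_llm) -> str:
    """Pass any provider's chat endpoint in as call_llm(system=..., user=...)."""
    return call_llm(system=render_memory_block(memories), user=query)
```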
This matters for enterprise governance. Your institutional knowledge does not belong to OpenAI, Anthropic, or any other model vendor. It lives in your Engram store, controlled by your organisation, exportable at any time. Vendor lock-in risk is eliminated at the knowledge layer. For enterprises evaluating multi-model deployment strategy, see our guide to multi-model enterprise AI and BYOK architectures.
Enterprise Use Cases: What Persistent Memory Unlocks
The practical difference between stateless and memory-enabled AI is visible in three categories of enterprise task:
Onboarding: A new hire asks "How do we handle contract renewals for enterprise clients?" A stateless AI returns a generic answer. An Engram-enabled AI returns your specific renewal procedure, the preference notes for key clients, and the decision log explaining why you moved to a 90-day advance notice model in 2025. Onboarding ramp-up accelerates by 4–5 weeks.
Institutional continuity: A senior employee leaves. Their knowledge of client history, vendor negotiation context, and internal process rationale is crystallised across hundreds of Engram memories. The replacement inherits it. Nothing is lost in the transition.
Cross-functional coordination: The legal team makes a decision that affects product roadmap. The decision is written to cross-product scope. The next time the product team queries the AI about roadmap constraints, the legal decision surfaces automatically — without anyone sending an email or updating a wiki page.
For organisations already running RAG pipelines, Engram adds a complementary layer. RAG retrieves from static documents. Engram retrieves from dynamic, interaction-derived knowledge — decisions made last week, preferences established last month, context from this morning's all-hands. Together, they cover both institutional documentation and living organisational knowledge. See our explanation of how RAG works for enterprise knowledge retrieval for how the two systems complement each other.
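A hedged sketch of that combination: search_documents stands in for a RAG pipeline over static documents, and recall_memories stands in for an Engram-style memory store. Both are placeholders, not real APIs.

```python
def build_context(query: str, search_documents, recall_memories, k: int = 5) -> str:
    """Merge document retrieval and interaction-derived memory into one context block."""
    doc_chunks = search_documents(query, limit=k)   # static institutional documents
    memories = recall_memories(query, limit=k)      # living, interaction-derived knowledge
    return ("Reference documents:\n" + "\n".join(doc_chunks)
            + "\n\nOrganisational memory:\n" + "\n".join(memories))
```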
Context Tax Elimination
Every time a user has to re-explain their team's context to an AI — re-stating their role, their current project, their preferences, their company's situation — they are paying a context tax. Research on enterprise AI adoption identifies context reconstruction as one of the largest hidden productivity costs in current deployments. The context tax costs approximately 50 hours per employee per year in re-explanation overhead alone.
Engram eliminates this tax at the persistent memory layer. The AI already knows the user's preferences from Preference memories. It already knows the team's current project context from Working-stage Context memories. It already knows the organisation's regulatory environment from Crystallised Fact memories. The user opens a session and gets a capable answer immediately.
Governance and Audit: Memory With Accountability
Enterprise deployments require memory governance alongside memory capability. Engram provides full auditability of memory operations.
Every memory write, update, and deletion is logged with timestamp, user, session ID, confidence score, and scope. Admins can query the memory audit log to see what knowledge is held at any scope level. Memories can be reviewed, corrected, or removed through the admin console. GDPR right-to-erasure requests can be fulfilled by removing all user-private memories for a specific individual.
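As an illustration, an append-only audit entry carrying the fields listed above might be serialised like this; the exact format is an assumption.

```python
import json
from datetime import datetime, timezone

# Field names mirror those listed above (timestamp, user, session, confidence, scope).
def audit_entry(operation: str, memory_id: str, user_id: str, session_id: str,
                confidence: float, scope: str) -> str:
    """One append-only log line per memory write, update, or deletion."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "op": operation,           # "write" | "update" | "delete"
        "memory_id": memory_id,
        "user_id": user_id,
        "session_id": session_id,
        "confidence": confidence,
        "scope": scope,
    })
```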
Content policies apply to memory extraction. The same rules that govern what the AI can discuss in chat also govern what can be extracted and retained as memory — preventing sensitive information (PII, regulated financial data, privileged legal content) from persisting beyond its intended scope. For the full framework for deploying content policies across an enterprise AI platform, see our guide to enterprise AI content policy and team governance.
This combination — persistent knowledge with governance controls — is what separates Engram from simple chat history scrolling. The organisation gains compounding institutional intelligence. The compliance team gains a fully auditable memory ledger.
Frequently Asked Questions
What is Engram in enterprise AI?
Engram is a persistent memory architecture that extracts structured knowledge from every AI interaction, classifies it into 9 memory types, and matures it through 4 lifecycle stages. Organisation-specific AI accuracy climbs from roughly 15% to 92%. Memory is LLM-agnostic and owned by the organisation.
How is Engram different from conversation history?
Conversation history is a flat transcript — it stores what was said, not what was learned. It cannot be searched semantically, shared across users, or structured by importance. Engram extracts typed knowledge items, organises them by maturity and relevance, and makes them retrievable across every session, user, and application.
What are the 9 Engram memory types?
Fact, Entity, Decision, Event, Insight, Preference, Relationship, Procedure, and Context. Each has distinct extraction logic, decay behaviour, and retrieval weighting — so a compliance fact is stored and surfaced differently from a personal working-style preference.
What are Engram gravity wells?
Self-organising clusters of related memories that form when memories are recalled together repeatedly. No configuration required — clustering emerges from usage. A gravity well for a client account automatically collects the entity, renewal date, key contacts, decisions, and team preferences for that account.
Does Engram work with any LLM?
Yes. Memory is injected as structured context at inference time — compatible with GPT-4o, Claude, Gemini, or any self-hosted model. Organisational memory does not belong to any LLM vendor and survives provider switches.
How does Engram handle incorrect or outdated memories?
Contradiction detection flags conflicts between new extractions and existing Consolidated or Crystallised memories for admin review. Ephemeral and Working memories decay automatically if not recalled. Crystallised memories can be corrected or removed via the admin console, with a full audit trail of the change.
