I run 60+ projects through synthesis engineering — sustained human-AI collaboration with Claude Code as the primary development partner. Each project has a CONTEXT.md file that the AI reads at the start of every session. For months, this worked. Then it stopped working.
The largest CONTEXT.md reached 1,014 lines. It combined session logs from December, team rosters that hadn’t changed in weeks, completed task checklists, architecture decisions made months ago, and the three tasks I actually needed to work on today. All of it was loaded into every session, consuming tokens and diluting the AI’s attention.
An audit of all 60+ projects revealed this wasn’t an isolated problem. Eighteen projects had CONTEXT.md files over 500 lines. The median was 178 lines and growing. There was no mechanism for information to leave these files once written, except by closing the entire project and starting fresh.
The problem is structural, and I built a structural solution: the tiered context architecture.
## The root cause
A single context file serves four functions with fundamentally different lifecycles:
| Function | How often it’s needed | How it grows | What it should do |
|---|---|---|---|
| Working memory — current state, active tasks | Every session | Stays constant | Refresh frequently |
| Reference facts — team, URLs, architecture | Most sessions, changes rarely | Slow, update-in-place | Stay current |
| Session history — what happened when | Rarely after 1 week | Unbounded append | Archive |
| Completed records — done checklists | Almost never | Unbounded append | Delete |
Combining all four in one file means the file grows linearly with session count. A project that runs for 15 sessions at 60 lines per session produces a 900-line context file. Most of those 900 lines aren’t needed today.
This is the hot/warm/cold data problem from database engineering. When you put hot transactional data and cold analytics data in the same table with no partitioning, performance degrades. The same thing happens when you put hot working memory and cold session archives in the same file with no lifecycle management.
## The architecture
Three tiers, each matching a different information lifecycle:
```
project/
├── CONTEXT.md      # Working memory (budget: ≤150 lines)
├── REFERENCE.md    # Stable facts (updated in place)
├── sessions/       # Session history (archived monthly)
│   └── YYYY-MM.md
└── [other files]
```
CONTEXT.md is working memory. It contains what the AI needs for today’s session: current project state, active tasks with priorities, and a brief summary of the last one or two sessions. It has a hard budget of 150 lines. Everything else lives elsewhere.
REFERENCE.md is semantic memory. Team rosters, repository URLs, deployment configuration, architecture decisions. This file is updated in place — when a team member leaves, you change the roster, you don’t append a dated note. It’s loaded on demand when the AI needs reference details, not every session.
sessions/ is episodic memory. Monthly archive files containing session logs moved from CONTEXT.md when they age past a week. Append-only. Rarely read, but searchable when historical context is needed.
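The layout above can be scaffolded in a few lines. This is an illustrative sketch, not part of the runbooks; the starter headers are placeholders:

```python
from pathlib import Path

def scaffold(project_dir: str) -> None:
    """Create the three-tier layout for a project (idempotent; safe to re-run)."""
    root = Path(project_dir)
    (root / "sessions").mkdir(parents=True, exist_ok=True)  # episodic tier
    for name, header in [
        ("CONTEXT.md", "# Working Context\n"),    # working memory, budget ≤150 lines
        ("REFERENCE.md", "# Reference Facts\n"),  # semantic tier, updated in place
    ]:
        path = root / name
        if not path.exists():                     # never clobber an existing file
            path.write_text(header)

scaffold("my-project")
```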
## Why these map to human memory
This isn’t a metaphor — it’s a design principle:
| Human memory type | Synthesis equivalent | Properties |
|---|---|---|
| Working memory | CONTEXT.md | Small capacity, constantly refreshed, what you’re thinking about right now |
| Semantic memory | REFERENCE.md | Facts and relationships, updated when facts change, not time-stamped |
| Episodic memory | sessions/ | Chronological events, “what happened when,” rarely recalled |
| Procedural memory | CLAUDE.md + _lessons/ | How to do things, patterns, rules |
Each memory type has different storage, retrieval, and maintenance characteristics. Treating them identically is like using a single data structure for a cache, a database, and a log file.
## Or, if you prefer systems analogies
| Cache level | Synthesis equivalent | Size | Load frequency |
|---|---|---|---|
| L1 cache | CONTEXT.md | ≤150 lines | Every session |
| L2 cache | REFERENCE.md | ≤300 lines | On demand |
| L3 cache | sessions/ | Unbounded | When searched |
| Disk | Transcripts, raw files | Unbounded | Explicit read |
## The archival protocol
The architecture only works if information moves between tiers. This is garbage collection for context.
At session start, if CONTEXT.md exceeds 120 lines:
- Archive first. Write completed tasks and old session summaries to sessions/YYYY-MM.md. Move stable facts to REFERENCE.md.
- Verify. Confirm the archived content exists in its destination file.
- Only then rewrite CONTEXT.md with the archived content removed.
- Check budget. Verify CONTEXT.md is under 150 lines.
The principle: content must exist in the destination before it’s removed from the source. Two-phase commit. Never delete from CONTEXT.md based on the assumption that something was “already captured” — verify first.
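The archive-then-verify discipline can be sketched as follows. Function and file names here are illustrative, not taken from the runbooks:

```python
from datetime import date
from pathlib import Path

def archive_cold_lines(context: Path, cold: list[str]) -> None:
    """Two-phase move: write to the archive, verify, only then rewrite the source."""
    archive = context.parent / "sessions" / f"{date.today():%Y-%m}.md"
    archive.parent.mkdir(parents=True, exist_ok=True)

    # Phase 1: append the cold content to the monthly archive.
    with archive.open("a") as f:
        f.write("\n".join(cold) + "\n")

    # Verify: every archived line must now exist in the destination file.
    archived = archive.read_text()
    assert all(line in archived for line in cold), "archive incomplete; aborting"

    # Phase 2: only now rewrite CONTEXT.md with the cold lines removed.
    keep = [l for l in context.read_text().splitlines() if l not in cold]
    context.write_text("\n".join(keep) + "\n")

ctx = Path("demo/CONTEXT.md")
ctx.parent.mkdir(parents=True, exist_ok=True)
ctx.write_text("## Active tasks\n1. [ ] Ship v2\n- [x] Old completed task\n")
archive_cold_lines(ctx, ["- [x] Old completed task"])
```

If the verification assertion fires, CONTEXT.md is left untouched, which is exactly the two-phase-commit property the protocol requires.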
The question for every piece of content: is this needed for today’s work?
- Yes → stays in CONTEXT.md
- No, but it’s a stable fact → REFERENCE.md
- No, but it’s a session record → sessions/
- No, and it’s a reusable lesson → _lessons/
- No, and it’s none of the above → delete it
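That decision tree is small enough to write down directly. A sketch, with the classification flags as placeholders you would answer per item:

```python
def triage(item: str, needed_today: bool, stable_fact: bool,
           session_record: bool, reusable_lesson: bool) -> str:
    """Route one piece of content to its destination tier."""
    if needed_today:
        return "CONTEXT.md"    # working memory: stays put
    if stable_fact:
        return "REFERENCE.md"  # semantic memory: update in place
    if session_record:
        return "sessions/"     # episodic archive: append-only
    if reusable_lesson:
        return "_lessons/"     # procedural memory
    return "delete"            # none of the above

triage("Q3 team roster", needed_today=False, stable_fact=True,
       session_record=False, reusable_lesson=False)  # → "REFERENCE.md"
```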
## Results
Applied to 62 projects in a single session:
| Metric | Before | After |
|---|---|---|
| Largest CONTEXT.md | 1,014 lines | 128 lines |
| Projects over 150 lines | 18 | 0 |
| Projects with REFERENCE.md | 1 | 21 |
| Projects with session archives | 1 | 14 |
Not every project needs all three tiers. 41 projects are small enough that CONTEXT.md alone is sufficient. The architecture scales down as well as up — you create REFERENCE.md and sessions/ when you need them, not before.
## How this differs from tool-native memory
Claude Code has MEMORY.md — an automatic memory file that persists facts across conversations. ChatGPT has its own persistent memory. These are useful, and the tiered context architecture coexists with them without conflict. But they solve different problems.
| | Tool-native memory (MEMORY.md, etc.) | Tiered context architecture |
|---|---|---|
| Scope | Per-tool, per-directory | Per-project, tool-agnostic |
| Maintenance | Automatic (AI decides what to save) | Deliberate (human-AI collaboration) |
| Structure | Flat, append-oriented | Three tiers by information lifecycle |
| Budget | Usually none — grows until truncated | 150-line hard limit on working memory |
| Lifecycle | None — facts accumulate | Garbage collection at session boundaries |
| Portability | Tied to one tool | Works with any AI assistant |
| Project awareness | Minimal — knows recent facts | Full — knows state, tasks, history, reference |
Tool-native memory is a useful supplement. It catches things you didn’t explicitly save. But it’s not a substitute for deliberate context management any more than auto-save is a substitute for version control.
## Getting started
### Step 1: Set up the structure
Create a projects/ directory in your workspace. Add index.yaml with your first project:
```yaml
projects:
  - id: my-project
    name: My Project Name
    status: active
    description: Brief description
    last_session: 2026-03-04
```
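A quick sanity check of that schema, assuming PyYAML is available (the field names are the ones shown above):

```python
import yaml  # PyYAML — an assumed dependency, not required by the architecture

INDEX = """\
projects:
  - id: my-project
    name: My Project Name
    status: active
    description: Brief description
    last_session: 2026-03-04
"""

index = yaml.safe_load(INDEX)
active = [p["id"] for p in index["projects"] if p["status"] == "active"]
print(active)  # → ['my-project']
```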
### Step 2: Create CONTEXT.md
In projects/my-project/, create CONTEXT.md:
```markdown
# My Project — Working Context

**Phase:** Initial
**Last session:** 2026-03-04

---

## Current State

[What exists, what doesn't, starting conditions]

## What's Next

1. [ ] [First task]
2. [ ] [Second task]

---

*This file follows the Tiered Context Architecture. Budget: ≤150 lines.*
```
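Stamping out that starter template can be automated. A sketch, with an illustrative function name and a trimmed version of the template above:

```python
from datetime import date
from pathlib import Path

# Starter template from Step 2; {name} and {today} are filled in per project.
TEMPLATE = """\
# {name} — Working Context

**Phase:** Initial
**Last session:** {today}

---

## Current State

[What exists, what doesn't, starting conditions]

## What's Next

1. [ ] [First task]
2. [ ] [Second task]

---

*This file follows the Tiered Context Architecture. Budget: ≤150 lines.*
"""

def init_context(project_dir: str, name: str) -> Path:
    """Write a fresh CONTEXT.md, stamped with today's date."""
    path = Path(project_dir) / "CONTEXT.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(TEMPLATE.format(name=name, today=date.today()))
    return path

init_context("my-project", "My Project")
```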
### Step 3: Add instructions to your AI assistant
Add context lifecycle instructions to your CLAUDE.md (or equivalent configuration for your AI tool):
```markdown
## Context Lifecycle

After completing ANY significant task:

1. Update CONTEXT.md immediately (budget: ≤150 lines)
2. Move stable facts to REFERENCE.md
3. Archive session logs older than 1 week to sessions/
4. Update index.yaml last_session date
5. Commit to git
```
### Step 4: Let the tiers emerge naturally
Don’t create REFERENCE.md or sessions/ on day one. Use CONTEXT.md alone until it approaches 120 lines. When it does, that’s when you:
- Extract stable facts (team, URLs, architecture) into REFERENCE.md
- Move old session logs into sessions/YYYY-MM.md
- Trim CONTEXT.md back to lean working memory
The tiers emerge from need, not planning.
### Step 5: Run garbage collection at session boundaries
Every time you start a new session on a project:
- Read CONTEXT.md.
- Identify cold content: completed tasks, old session logs, stable facts.
- Archive first: write completed tasks and old logs to sessions/, move stable facts to REFERENCE.md.
- Verify the archived content exists in its destination.
- Only then remove the archived content from CONTEXT.md.
- Now start working.
This takes 2-3 minutes and keeps everything clean.
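The session-start trigger is easy to automate. A sketch using the thresholds from the protocol above (the path and messages are illustrative):

```python
from pathlib import Path

SOFT_LIMIT = 120  # session-start trigger for garbage collection
HARD_LIMIT = 150  # working-memory budget

def needs_gc(context: Path) -> bool:
    """Return True when CONTEXT.md has crossed the soft limit."""
    lines = len(context.read_text().splitlines())
    if lines > HARD_LIMIT:
        print(f"{context}: {lines} lines, over budget: archive now")
    elif lines > SOFT_LIMIT:
        print(f"{context}: {lines} lines, nearing budget: archive this session")
    return lines > SOFT_LIMIT

ctx = Path("CONTEXT.md")
ctx.write_text("\n".join(f"line {i}" for i in range(130)) + "\n")
needs_gc(ctx)  # 130 lines → returns True
```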
## Where this is going
The tiered architecture is stage 3 of an evolution:
1. Ad hoc — Re-explain everything each session. Most AI users today.
2. Monolithic — Single context file that grows forever. The pre-March-2026 approach.
3. Tiered — Working memory + reference + archive with lifecycle management. Current.
4. Compiled — Context automatically assembled from live project state. Future.
Stage 4 is the long-term vision: the AI’s context at session start is compiled from live project state, code, and history rather than manually maintained. But stage 3 is the 80/20 solution that makes long-running AI-assisted projects sustainable today.
## The complete system
The full implementation is available as open-source runbooks:
- Context lifecycle management — Templates, decision trees, migration guides, and quality metrics
- Synthesis project management — The broader project management system that the tiered architecture is part of
Both are MIT licensed and part of the Ragbot.AI open-source project.
The tiered context architecture is part of synthesis project management, a discipline within synthesis engineering.
Rajiv Pant is President of Flatiron Software and Snapshot AI, where he leads organizational growth and AI innovation. He is former Chief Product & Technology Officer at The Wall Street Journal, The New York Times, and Hearst Magazines. Earlier in his career, he headed technology for Condé Nast’s brands including Reddit. Rajiv coined the terms “synthesis engineering” and “synthesis coding” to describe the systematic integration of human expertise with AI capabilities in professional software development. Connect with him on LinkedIn or read more at rajiv.com.