The Problem: Sessions Ending Mid-Task

I’ve been hitting Claude Pro’s 200k token limit repeatedly. Not just occasionally—nearly every session.

The frustrating part? Sessions would end right when I needed to update logs. I’d complete 90% of the work, run out of tokens trying to document it, and start the next session with incomplete context.

Typical failed session:

0-50k tokens:   Read documentation, get oriented
50-150k tokens: Work on features
150-200k:       Try to update logs... out of tokens ❌

Next session starts with: “What was I working on again?”

This isn’t sustainable for a 100-day project.

Understanding Token Consumption

First, I analyzed where tokens go. Turns out context loading is the killer.

Token usage breakdown:

  • Context per message: 10k-15k tokens
  • File reads: 1k-10k per file
  • Response generation: 2k-10k
  • Reading same files multiple times: Massive waste

A 200-message conversation at 10k tokens of context each consumes 2M tokens on repeated context loading alone.

Key insight: Every message reloads the entire context. That blog post you read once? Loaded 50 more times as context.

Solution 1: Session Log Maintenance Automation

Problem: SESSION_LOG.md grew to 890 lines. Every time Claude read it at session start, that was 20k tokens gone before any work began.

Created maintain_session_log.py:

# Keeps only 3 most recent sessions
# Archives older sessions to backup/
# Preserves structure and continuity

Impact:

  • Before: 890 lines = 20k tokens per session start
  • After: ~200 lines = 5k tokens per session start
  • Savings: 15k tokens per session

The script runs automatically at session end. Older sessions move to backup/SESSION_LOG.md for reference.
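The post doesn't show the script itself, but the described behavior can be sketched roughly like this. This is my own reconstruction, not the author's code: it assumes each session in SESSION_LOG.md starts with a `## Session` heading, which may not match the real log format.

```python
# Hypothetical sketch of maintain_session_log.py: keep the 3 most
# recent sessions, append the rest to backup/SESSION_LOG.md.
import re
from pathlib import Path

KEEP = 3  # number of recent sessions to retain in the main log

def trim_session_log(log_path="SESSION_LOG.md", backup_dir="backup"):
    text = Path(log_path).read_text()
    # Split into a header plus one chunk per "## Session" heading
    # (zero-width lookahead keeps the headings attached to their chunks).
    parts = re.split(r"(?m)^(?=## Session)", text)
    header, sessions = parts[0], parts[1:]
    if len(sessions) <= KEEP:
        return 0  # nothing to archive
    archived, recent = sessions[:-KEEP], sessions[-KEEP:]
    Path(backup_dir).mkdir(exist_ok=True)
    with (Path(backup_dir) / "SESSION_LOG.md").open("a") as f:
        f.writelines(archived)  # preserve older sessions for reference
    Path(log_path).write_text(header + "".join(recent))
    return len(archived)
```

Splitting on headings rather than line counts is what preserves structure: a session never gets cut in half when it moves to the archive.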

Solution 2: Token Management System

Built a complete protocol with clear thresholds:

Token Usage Zones:

  • 🟢 0-150k (Safe): Work normally, update logs incrementally
  • 🟡 150k-160k (Wind Down): Finish current task, don’t start new work
  • 🟠 160k-170k (Update NOW): MANDATORY log update
  • 🔴 170k-180k (Wrap Up): Final summary, handoff notes
  • ⛔ 180k+ (Emergency): Minimal communication only

The Key Innovation: Incremental Updates

Old way:

Work for 150k tokens → Try to update logs at end → Out of tokens ❌

New way:

Task 1 done → Update log [20k tokens]
Task 2 done → Update log [40k tokens]
Task 3 done → Update log [60k tokens]
Logs always current! ✅

Update as you go, not at the end. By 160k tokens, everything’s already documented.
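"Update as you go" can be as simple as appending one line to the log the moment a task finishes. A hedged sketch, where the entry format and function name are assumptions rather than the project's actual convention:

```python
# Append a short, timestamped entry to the session log after each task.
from datetime import datetime
from pathlib import Path

def log_task_done(task: str, tokens_used: int, log_path="SESSION_LOG.md"):
    entry = (
        f"- [{datetime.now():%H:%M}] Done: {task} "
        f"(~{tokens_used // 1000}k tokens used)\n"
    )
    with Path(log_path).open("a") as f:
        f.write(entry)
```

Because each append is a few dozen tokens, the cost of documenting is spread across the session instead of landing all at once at 160k.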

Solution 3: Emergency Template Generator

Created emergency_log_update.py for when tokens are running low:

python3 emergency_log_update.py

Generates a pre-formatted template in seconds. Just fill in the blanks:

  • What was completed
  • What’s incomplete
  • Next session should…

Takes less than 2 minutes even at 180k tokens. Ensures logs always get updated.
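The speed comes from the script doing almost nothing: it only emits a fill-in-the-blanks template. A speculative reconstruction of emergency_log_update.py, with section names mirroring the bullets above but not taken from the actual script:

```python
# Print a pre-formatted handoff template; the human fills in the blanks.
from datetime import date

TEMPLATE = """\
## Emergency Handoff — {today}

### Completed
- [fill in]

### Incomplete
- [fill in]

### Next session should
- [fill in]
"""

def emergency_template() -> str:
    return TEMPLATE.format(today=date.today().isoformat())

if __name__ == "__main__":
    print(emergency_template())
```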

Solution 4: Token Optimization

Enhanced .claudeignore to exclude:

  • node_modules, dist, build files (50k+ tokens saved)
  • Images, PDFs, videos (30k+ tokens saved)
  • Backup and archive folders (20k+ tokens saved)
  • Generated assets (read sources instead)

Total automatic savings: ~100k tokens per session!
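For reference, a .claudeignore along these lines would cover the categories above. The exact patterns are my guess, not the author's file:

```
# Dependencies and build output
node_modules/
dist/
build/

# Binary assets
*.png
*.jpg
*.pdf
*.mp4

# Backups and archives
backup/
archive/
*.zip
```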

Best Practices Documented:

#1: Never read files twice (saves 30k+)

Bad:  Read file.py → Edit → Read again → Edit
Good: Read file.py once → Edit multiple times

#2: Use Task tool for exploration (saves 40k+)

Bad:  "How does X work?" → Read 10 files manually (50k tokens)
Good: Task/Explore agent investigates → Reports back (5k tokens)

#3: Grep before Read (saves 20k+)

Bad:  Read 5 files to find function (30k tokens)
Good: Grep → Found it → Read that one file (5.5k tokens)

The Complete System

Created comprehensive documentation:

TOKEN_MANAGEMENT.md - Emergency protocols

  • Visual token zones (🟢🟡🟠🔴⛔)
  • Step-by-step instructions for each threshold
  • Session workflow diagram

TOKEN_OPTIMIZATION.md - Best practices guide

  • Common token wasters and fixes
  • Tool usage patterns (Read vs Grep vs Task)
  • Token budgeting strategies
  • Before/after comparisons

QUICK_START_TOKEN_OPTIMIZATION.md - TL;DR version

  • Top 5 token savers
  • Quick reference checklist
  • Expected results

Updated CLAUDE.md

Added a Token Management & Session Closing Protocol section that instructs AI agents to:

  • Monitor tokens proactively
  • Update logs incrementally throughout session
  • Stop new work at 150k tokens
  • MANDATORY update at 160k tokens
  • Use Task tool for exploration instead of manual file reading

Now the AI follows these protocols automatically.

Results

Before optimization:

Sessions: Frequently incomplete ❌
Token usage: 200k with no buffer
Documentation: Often missing
Continuity: Poor between sessions

After optimization:

Sessions: Complete with 20-40k buffer ✅
Work capacity: 2-3x more per session
Documentation: Always updated
Continuity: Seamless handoffs

Actual token savings per session:

  • Enhanced .claudeignore: 100k automatically
  • No duplicate reads: 30k saved
  • Task tool usage: 40k saved per exploration
  • Incremental updates: Ensures completion

Total: 50-70% more work capacity!

Key Learnings

1. Token limits aren’t about doing more—they’re about completing what you start

A clean 80k session that finishes properly beats an incomplete 200k session every time.

2. Incremental updates are superior to batch updates

Update logs after each task when you have tokens to spare. Don’t wait until the end.

3. Context loading is the hidden killer

You don’t see it, but every message reloads everything. Aggressive .claudeignore is essential.

4. Automation saves cognitive overhead

Don’t make developers remember to update logs. Make it automatic. Build systems, not checklists.

5. Tools should be fast in emergencies

The emergency template script works in <2 minutes even at 180k tokens. When you’re running out, speed matters.

What Changed

Files Created:

  • maintain_session_log.py - Auto-archives old sessions
  • emergency_log_update.py - Emergency template generator
  • TOKEN_MANAGEMENT.md - Emergency protocols
  • TOKEN_OPTIMIZATION.md - Best practices guide
  • QUICK_START_TOKEN_OPTIMIZATION.md - Quick reference

Files Updated:

  • CLAUDE.md - Added token management protocol
  • .claudeignore - Enhanced with comprehensive exclusions
  • SESSION_LOG.md - Updated with Day 17 work
  • PROJECT_STATUS.md - Added Days 14-17 milestones

Philosophy

The goal isn’t to cram everything into one session.

The goal is to make continuous progress with clear continuity.

Better to have 2-3 complete sessions with good handoff than 1 incomplete session with missing documentation.

Next Steps

This system is ready for the remaining 83 days of the project. Every session will now:

  • Start with minimal context (enhanced .claudeignore)
  • Update logs incrementally throughout
  • Complete properly with documentation
  • Hand off cleanly to the next session

The token limit problem is solved.

Time to build more stories.

To be continued…