---
phase: 01-model-interface
plan: 02
type: execute
wave: 1
depends_on: []
files_modified: ["src/models/context_manager.py", "src/models/conversation.py"]
autonomous: true
must_haves:
  truths:
    - "Conversation history is stored and retrieved correctly"
    - "Context window is managed to prevent overflow"
    - "Old messages are compressed when approaching limits"
  artifacts:
    - path: "src/models/context_manager.py"
      provides: "Conversation context and memory management"
      min_lines: 60
    - path: "src/models/conversation.py"
      provides: "Message data structures and types"
      min_lines: 30
  key_links:
    - from: "src/models/context_manager.py"
      to: "src/models/conversation.py"
      via: "import conversation types"
      pattern: "from.*conversation import"
    - from: "src/models/context_manager.py"
      to: "future model manager"
      via: "context passing interface"
      pattern: "def get_context_for_model"
---
<objective>
Implement conversation context management and memory system.
Purpose: Create the foundation for managing conversation history, context windows, and memory compression before model switching logic is added.
Output: Working context manager with message storage, compression, and token budget management.
</objective>
<execution_context>
@~/.opencode/get-shit-done/workflows/execute-plan.md
@~/.opencode/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/01-model-interface/01-RESEARCH.md
@.planning/phases/01-model-interface/01-CONTEXT.md
@.planning/codebase/ARCHITECTURE.md
@.planning/codebase/STRUCTURE.md
</context>
<tasks>
<task type="auto">
<name>Task 1: Create conversation data structures</name>
<files>src/models/conversation.py</files>
<action>
Create conversation data models following the research architecture:
1. Define Message class with role, content, timestamp, metadata
2. Define Conversation class to manage the message sequence
3. Define ContextBudget class for token budget tracking
4. Include message importance scoring for compression decisions
5. Add Pydantic models for validation and serialization
6. Support message types: user, assistant, system, tool_call
Key classes:
- Message: role, content, timestamp, token_count, importance_score
- Conversation: messages list, metadata, total_tokens
- ContextBudget: max_tokens, used_tokens, available_tokens
- MessageMetadata: source, context, priority flags
Use Pydantic BaseModel throughout for type safety, validation, and serialization, with complete type hints (a hedged sketch follows this task).
</action>
<verify>python -c "from src.models.conversation import Message, Conversation; msg = Message(role='user', content='test'); print(msg.role)"</verify>
<done>Conversation data structures support message creation and management</done>
</task>
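A minimal sketch of what Task 1's structures might look like, assuming Pydantic v2. The field names follow the task; the Role enum, the pinned flag, and the default values are illustrative assumptions, not requirements from the research docs:

```python
# Hedged sketch of src/models/conversation.py; field names follow the task,
# defaults and the Role enum are illustrative assumptions.
from datetime import datetime, timezone
from enum import Enum
from typing import Optional

from pydantic import BaseModel, Field


class Role(str, Enum):
    USER = "user"
    ASSISTANT = "assistant"
    SYSTEM = "system"
    TOOL_CALL = "tool_call"


class MessageMetadata(BaseModel):
    source: Optional[str] = None
    context: Optional[str] = None
    pinned: bool = False  # priority flag: exempt from compression


class Message(BaseModel):
    role: Role
    content: str
    timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    token_count: int = 0
    importance_score: float = 0.0
    metadata: MessageMetadata = Field(default_factory=MessageMetadata)


class ContextBudget(BaseModel):
    max_tokens: int
    used_tokens: int = 0

    @property
    def available_tokens(self) -> int:
        return max(self.max_tokens - self.used_tokens, 0)


class Conversation(BaseModel):
    messages: list[Message] = Field(default_factory=list)
    metadata: dict = Field(default_factory=dict)

    @property
    def total_tokens(self) -> int:
        return sum(m.token_count for m in self.messages)
```

Because Role subclasses str, Pydantic coerces role='user' into Role.USER, so the verify command above runs unchanged against this sketch.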
<task type="auto">
<name>Task 2: Implement context manager with compression</name>
<files>src/models/context_manager.py</files>
<action>
Create ContextManager class following research patterns:
1. Implement sliding window context management
2. Add hybrid compression: summarize old messages, preserve recent ones
3. Trigger compression at 70% of context window (from CONTEXT.md)
4. Prioritize user instructions and explicit requests during compression
5. Implement semantic importance scoring for message retention
6. Support different model context sizes (adaptive based on model)
Key methods:
- add_message(message): Add message to conversation, check compression need
- get_context_for_model(model_key): Return context within model's token limit
- compress_conversation(target_ratio): Apply hybrid compression strategy
- estimate_tokens(text): Estimate token count for text (approximate)
- get_conversation_summary(): Generate summary of compressed messages
Heed the anti-patterns flagged in research: don't ignore context window overflow, and use proven compression algorithms (a hedged sketch follows this task).
</action>
<verify>python -c "from src.models.context_manager import ContextManager; cm = ContextManager(); print(cm.add_message) and hasattr(cm, 'compress_conversation')"</verify>
<done>Context manager handles conversation history with intelligent compression</done>
</task>
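A hedged sketch of how Task 2's methods might fit together, assuming the Task 1 models sketched above. The 70% trigger comes from CONTEXT.md; the chars-per-token heuristic, the keep-last-four rule, and importance-ranked retention (dropping rather than summarizing) are illustrative simplifications, not the required algorithm:

```python
# Hedged sketch of src/models/context_manager.py, assuming the Task 1 models above.
from src.models.conversation import ContextBudget, Conversation, Message

COMPRESSION_THRESHOLD = 0.70  # compress at 70% of the window, per CONTEXT.md


def estimate_tokens(text: str) -> int:
    """Cheap heuristic: roughly 4 characters per token for English text."""
    return max(len(text) // 4, 1)


class ContextManager:
    def __init__(self, max_tokens: int = 8192) -> None:
        self.conversation = Conversation()
        self.budget = ContextBudget(max_tokens=max_tokens)

    def add_message(self, message: Message) -> None:
        """Append a message, then compress if past the 70% threshold."""
        if message.token_count == 0:
            message.token_count = estimate_tokens(message.content)
        self.conversation.messages.append(message)
        self.budget.used_tokens = self.conversation.total_tokens
        if self.budget.used_tokens > COMPRESSION_THRESHOLD * self.budget.max_tokens:
            self.compress_conversation(target_ratio=0.5)

    def compress_conversation(self, target_ratio: float) -> None:
        """Hybrid placeholder: keep recent turns verbatim, retain older
        messages by importance until the target budget is hit. A real pass
        would summarize the dropped messages rather than discard them."""
        target = int(self.budget.max_tokens * target_ratio)
        messages = self.conversation.messages
        recent = messages[-4:]  # always preserve the most recent turns
        used = sum(m.token_count for m in recent)
        kept = []
        for m in sorted(messages[:-4], key=lambda m: m.importance_score, reverse=True):
            if used + m.token_count > target:
                break
            kept.append(m)
            used += m.token_count
        self.conversation.messages = sorted(kept, key=lambda m: m.timestamp) + recent
        self.budget.used_tokens = self.conversation.total_tokens

    def get_context_for_model(self, model_key: str, model_max_tokens: int = 8192) -> list[Message]:
        """Return the newest messages that fit the target model's window.
        A real implementation would look model_key up in a model registry."""
        out: list[Message] = []
        used = 0
        for m in reversed(self.conversation.messages):
            if used + m.token_count > model_max_tokens:
                break
            out.append(m)
            used += m.token_count
        return list(reversed(out))
```

A real compression pass would replace the dropped span with a get_conversation_summary() result; the sketch omits that method to stay model-free.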
</tasks>
<verification>
Verify conversation management (an illustrative end-to-end check follows this block):
1. Messages can be added and retrieved from conversation
2. Context compression triggers at correct thresholds
3. Important messages are preserved during compression
4. Token estimation approximates actual counts closely enough for budgeting
5. Context adapts to different model window sizes
</verification>
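As a quick sanity pass over these points, assuming the sketches above, something like the following could run once both tasks are done (attribute names like cm.budget come from those sketches, not from the plan itself; the tiny window exists only to force compression):

```python
# Illustrative end-to-end check against the verification list, under the
# assumptions of the earlier sketches.
from src.models.context_manager import ContextManager, estimate_tokens
from src.models.conversation import Message

cm = ContextManager(max_tokens=64)  # deliberately tiny window to force compression
for i in range(20):
    cm.add_message(Message(role="user", content=f"message number {i}",
                           importance_score=float(i)))

assert cm.budget.used_tokens <= cm.budget.max_tokens                 # 2: no overflow
assert cm.conversation.messages[-1].content == "message number 19"   # 3: recent turns kept
assert estimate_tokens("roughly twenty chars") == 5                  # 4: ~4 chars/token
print("context management checks passed")
```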
<success_criteria>
Conversation context system operational:
- Message storage and retrieval works correctly
- Context window management prevents overflow
- Intelligent compression preserves important information
- System ready for integration with model switching
</success_criteria>
<output>
After completion, create `.planning/phases/01-model-interface/01-02-SUMMARY.md`
</output>