Mai/.planning/phases/03-resource-management/03-03-PLAN.md
Mai Development 1e071398ff
docs(03): create phase plan
Phase 3: Resource Management
- 4 plan(s) in 2 wave(s)
- 2 parallel, 2 sequential
- Ready for execution
2026-01-27 17:58:09 -05:00

---
phase: 03-resource-management
plan: 03
type: execute
wave: 2
depends_on: [03-01, 03-02]
files_modified: [src/resource/scaling.py, src/models/model_manager.py]
autonomous: true
user_setup: []
must_haves:
  truths:
    - "Proactive scaling prevents performance degradation before it impacts users"
    - "Hybrid monitoring combines continuous checks with pre-flight validation"
    - "Graceful degradation completes current tasks before model switching"
  artifacts:
    - path: "src/resource/scaling.py"
      provides: "Proactive scaling algorithms with hybrid monitoring"
      min_lines: 150
    - path: "src/models/model_manager.py"
      provides: "Enhanced model manager with proactive scaling integration"
      contains: "ProactiveScaler"
      min_lines: 650
  key_links:
    - from: "src/resource/scaling.py"
      to: "src/models/resource_monitor.py"
      via: "Resource monitoring for scaling decisions"
      pattern: "ResourceMonitor"
    - from: "src/resource/scaling.py"
      to: "src/resource/tiers.py"
      via: "Hardware tier-based scaling thresholds"
      pattern: "HardwareTierDetector"
    - from: "src/models/model_manager.py"
      to: "src/resource/scaling.py"
      via: "Proactive scaling integration"
      pattern: "ProactiveScaler"
---
<objective>
Implement proactive scaling algorithms that combine continuous background monitoring with pre-flight checks to prevent performance degradation before it impacts users, with graceful degradation cascades and stabilization periods.
Purpose: Enable Mai to anticipate resource constraints and scale models proactively while maintaining smooth user experience.
Output: Proactive scaling system with hybrid monitoring, graceful degradation, and intelligent stabilization.
</objective>
<execution_context>
@~/.opencode/get-shit-done/workflows/execute-plan.md
@~/.opencode/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Enhanced components from previous plans
@src/models/resource_monitor.py
@src/resource/tiers.py
# Research-based scaling patterns
@.planning/phases/03-resource-management/03-RESEARCH.md
</context>
<tasks>
<task type="auto">
<name>Implement ProactiveScaler class</name>
<files>src/resource/scaling.py</files>
<action>Create the ProactiveScaler class implementing hybrid monitoring and proactive scaling:
1. **Hybrid Monitoring Architecture:**
- Continuous background monitoring thread/task
- Pre-flight checks before each model operation
- Resource trend analysis with configurable windows
- Performance metrics tracking (response times, failure rates)
2. **Proactive Scaling Logic:**
- Scale at 80% resource usage (configurable per tier)
- Consider overall system load context
- Implement stabilization periods (5 minutes for upgrades)
- Prevent thrashing with hysteresis
3. **Graceful Degradation Cascade:**
- Complete current task at lower quality
- Switch to smaller model after completion
- Notify user of capability changes
- Suggest resource optimizations
4. **Key Methods:**
- start_continuous_monitoring(): Background monitoring loop
- check_preflight_resources(): Quick validation before operations
- analyze_resource_trends(): Predictive scaling decisions
- initiate_graceful_degradation(): Controlled capability reduction
- should_upgrade_model(): Check if resources allow upgrade
5. **Integration Points:**
- Use enhanced ResourceMonitor for accurate metrics
- Use HardwareTierDetector for tier-specific thresholds
- Provide callbacks for model switching
- Log scaling decisions with context
Include proper async handling for background monitoring and thread-safe state management.</action>
<verify>python -c "from src.resource.scaling import ProactiveScaler; ps = ProactiveScaler(); print('ProactiveScaler initialized:', hasattr(ps, 'check_preflight_resources'))" confirms the class structure</verify>
<done>ProactiveScaler implements hybrid monitoring with graceful degradation</done>
</task>
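As an illustration of the "resource trend analysis with configurable windows" item in the task above, the following is a minimal sketch of one way `analyze_resource_trends()` could estimate how quickly usage is approaching a threshold. The class name `TrendWindow`, its methods, and the choice of a least-squares slope are assumptions made for this example; the plan itself does not prescribe a specific algorithm.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class TrendWindow:
    """Hypothetical helper: sliding window of (timestamp, usage%) samples."""

    def __init__(self, max_samples: int = 60):
        # The window size is the "configurable window"; 60 is an arbitrary default.
        self._samples: Deque[Tuple[float, float]] = deque(maxlen=max_samples)

    def add(self, timestamp_s: float, usage_percent: float) -> None:
        self._samples.append((timestamp_s, usage_percent))

    def slope(self) -> float:
        """Least-squares slope in percent per second; 0.0 with too little data."""
        n = len(self._samples)
        if n < 2:
            return 0.0
        mean_t = sum(t for t, _ in self._samples) / n
        mean_u = sum(u for _, u in self._samples) / n
        var_t = sum((t - mean_t) ** 2 for t, _ in self._samples)
        if var_t == 0.0:
            return 0.0
        cov = sum((t - mean_t) * (u - mean_u) for t, u in self._samples)
        return cov / var_t

    def seconds_until(self, threshold_percent: float) -> Optional[float]:
        """Projected seconds until usage reaches the threshold, or None if
        usage is flat or falling (or there are no samples yet)."""
        if not self._samples:
            return None
        _, latest = self._samples[-1]
        if latest >= threshold_percent:
            return 0.0
        rate = self.slope()
        if rate <= 0.0:
            return None
        return (threshold_percent - latest) / rate


if __name__ == "__main__":
    window = TrendWindow(max_samples=10)
    for second, usage in enumerate([50.0, 52.0, 54.0, 56.0, 58.0]):
        window.add(float(second), usage)
    print("slope (%/s):", window.slope())
    print("seconds until 80%:", window.seconds_until(80.0))
```

A scaler could compare the projected time against the duration of a typical request and begin degradation only when the threshold is likely to be crossed mid-task. That policy is a design choice left open by the plan.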
<task type="auto">
<name>Integrate proactive scaling into ModelManager</name>
<files>src/models/model_manager.py</files>
<action>Enhance ModelManager to integrate proactive scaling:
1. **Add ProactiveScaler Integration:**
- Import and initialize ProactiveScaler in __init__
- Start continuous monitoring on initialization
- Pass resource monitor and tier detector references
2. **Enhance generate_response with Proactive Scaling:**
- Add pre-flight resource check before generation
- Implement graceful degradation if resources are constrained
- Use proactive scaling recommendations for model selection
- Track performance metrics for scaling decisions
3. **Update Model Selection Logic:**
- Incorporate tier-based preferences
- Use scaling thresholds from HardwareTierDetector
- Factor in trend analysis predictions
- Apply stabilization periods for upgrades
4. **Add Resource-Constrained Handling:**
- Complete current response with smaller model if needed
- Switch models proactively based on scaling predictions
- Handle resource exhaustion gracefully
- Maintain conversation context through switches
5. **Performance Tracking:**
- Track response times and failure rates
- Monitor resource usage during generation
- Feed metrics back to ProactiveScaler
- Adjust scaling behavior based on observed performance
6. **Cleanup and Shutdown:**
- Stop continuous monitoring in shutdown()
- Clean up scaling state and resources
- Log scaling decisions and outcomes
Ensure backward compatibility and maintain silent switching behavior per Phase 1 decisions.</action>
<verify>python -c "from src.models.model_manager import ModelManager; mm = ModelManager(); print('Proactive scaling integrated:', hasattr(mm, '_proactive_scaler'))" confirms integration</verify>
<done>ModelManager integrates proactive scaling for intelligent resource management</done>
</task>
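The pre-flight check and "complete the current task, then switch" behaviour described in the task above could look roughly like the sketch below. Everything here is hypothetical: `PreflightResult`, `check_preflight`, `DeferredSwitcher`, the memory figures, and the `degraded` flag are illustrative names and values, not the actual ModelManager or ProactiveScaler API. It shows one interpretation, in which the in-flight task finishes on the current model at reduced quality and the switch to a smaller model happens only afterwards.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PreflightResult:
    ok: bool
    reason: str = ""


def check_preflight(available_mb: float, required_mb: float,
                    usage_percent: float,
                    max_usage_percent: float = 80.0) -> PreflightResult:
    """Quick validation before an operation (thresholds are assumptions)."""
    if available_mb < required_mb:
        return PreflightResult(False, "insufficient memory")
    if usage_percent >= max_usage_percent:
        return PreflightResult(False, "usage above threshold")
    return PreflightResult(True)


class DeferredSwitcher:
    """Hypothetical helper: defers a model switch until the task completes."""

    def __init__(self, current_model: str, smaller_model: str):
        self.current_model = current_model
        self._smaller_model = smaller_model
        self._pending: Optional[str] = None

    def run(self, task: Callable[[str, bool], str],
            preflight: PreflightResult) -> str:
        degraded = not preflight.ok
        if degraded:
            # Queue the switch instead of interrupting the in-flight task.
            self._pending = self._smaller_model
        try:
            # The task still runs on the current model; `degraded` lets it
            # reduce quality (for example, a shorter response).
            return task(self.current_model, degraded)
        finally:
            # Apply the queued switch only after the task has finished.
            if self._pending is not None:
                self.current_model = self._pending
                self._pending = None


if __name__ == "__main__":
    def fake_generate(model: str, degraded: bool) -> str:
        return f"{model} ({'degraded' if degraded else 'full'})"

    switcher = DeferredSwitcher("large-model", "small-model")
    print(switcher.run(fake_generate, check_preflight(4096, 2048, 50.0)))
    print(switcher.run(fake_generate, check_preflight(1024, 2048, 50.0)))
    print("model for next request:", switcher.current_model)
```

Because the switch is applied in a `finally` block, it also takes effect when the task raises, which keeps the next request from retrying on a model that no longer fits.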
</tasks>
<verification>
Test proactive scaling behavior under various scenarios:
- Gradual resource increase (should detect and upgrade after stabilization)
- Sudden resource decrease (should immediately degrade gracefully)
- Stable resource usage (should not trigger unnecessary switches)
- Mixed workload patterns (should adapt scaling thresholds appropriately)
Verify stabilization periods prevent thrashing and graceful degradation maintains user experience.
</verification>
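To check that stabilization periods prevent thrashing without waiting five real minutes, the decision logic can take an injectable clock. The sketch below is an assumption-laden illustration: `ScalingPolicy`, `ScalingDecider`, and the 60% upgrade bound are hypothetical, and only the 80% trigger and the 5-minute upgrade stabilization come from the plan.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ScalingPolicy:
    downgrade_at: float = 80.0      # from the plan: scale at 80% usage
    upgrade_below: float = 60.0     # assumed lower bound of the hysteresis band
    stabilization_s: float = 300.0  # from the plan: 5 minutes for upgrades


class ScalingDecider:
    """Hypothetical decision helper with hysteresis and a stabilization timer."""

    def __init__(self, policy: ScalingPolicy,
                 clock: Callable[[], float] = time.monotonic):
        self._policy = policy
        self._clock = clock
        self._lock = threading.Lock()
        self._headroom_since: Optional[float] = None

    def decide(self, usage_percent: float) -> str:
        """Return 'downgrade', 'upgrade', or 'hold' for one usage sample."""
        with self._lock:
            now = self._clock()
            if usage_percent >= self._policy.downgrade_at:
                # Downgrades are immediate and reset the upgrade timer.
                self._headroom_since = None
                return "downgrade"
            if usage_percent < self._policy.upgrade_below:
                if self._headroom_since is None:
                    self._headroom_since = now
                if now - self._headroom_since >= self._policy.stabilization_s:
                    self._headroom_since = None  # restart timer after upgrading
                    return "upgrade"
                return "hold"
            # Inside the hysteresis band: neither scale nor accumulate headroom.
            self._headroom_since = None
            return "hold"


if __name__ == "__main__":
    fake_now = [0.0]
    decider = ScalingDecider(ScalingPolicy(), clock=lambda: fake_now[0])
    # Usage oscillating every 30 seconds should never trigger an upgrade.
    for step in range(20):
        fake_now[0] = step * 30.0
        usage = 50.0 if step % 2 == 0 else 70.0
        assert decider.decide(usage) == "hold"
    print("no thrashing under oscillating load")
```

The same fake-clock approach extends to the other scenarios listed above, for example feeding a sudden jump past 80% and asserting an immediate `"downgrade"`.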
<success_criteria>
ProactiveScaler successfully combines continuous monitoring with pre-flight checks, implements graceful degradation cascades, respects stabilization periods, and integrates seamlessly with ModelManager for intelligent resource management.
</success_criteria>
<output>
After completion, create `.planning/phases/03-resource-management/03-03-SUMMARY.md`
</output>