---
phase: 01-model-interface
plan: 01
type: execute
wave: 1
depends_on: []
files_modified: ["src/models/__init__.py", "src/models/lmstudio_adapter.py", "src/models/resource_monitor.py", "config/models.yaml", "requirements.txt", "pyproject.toml"]
autonomous: true

must_haves:
  truths:
    - "LM Studio client can connect and list available models"
    - "System resources (CPU/RAM/GPU) are monitored in real-time"
    - "Configuration defines models and their resource requirements"
  artifacts:
    - path: "src/models/lmstudio_adapter.py"
      provides: "LM Studio client and model discovery"
      min_lines: 50
    - path: "src/models/resource_monitor.py"
      provides: "System resource monitoring"
      min_lines: 40
    - path: "config/models.yaml"
      provides: "Model definitions and resource profiles"
      contains: "models:"
  key_links:
    - from: "src/models/lmstudio_adapter.py"
      to: "LM Studio server"
      via: "lmstudio-python SDK"
      pattern: "import lmstudio"
    - from: "src/models/resource_monitor.py"
      to: "system APIs"
      via: "psutil library"
      pattern: "import psutil"
---

<objective>
Establish LM Studio connectivity and resource monitoring foundation.

Purpose: Create the core infrastructure for model discovery and system resource tracking, enabling intelligent model selection in later plans.
Output: Working LM Studio client, resource monitor, and model configuration system.
</objective>

<execution_context>
@~/.opencode/get-shit-done/workflows/execute-plan.md
@~/.opencode/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/01-model-interface/01-RESEARCH.md
@.planning/phases/01-model-interface/01-CONTEXT.md
@.planning/codebase/ARCHITECTURE.md
@.planning/codebase/STRUCTURE.md
@.planning/codebase/STACK.md
</context>

<tasks>

<task type="auto">
<name>Task 1: Create project foundation and dependencies</name>
<files>requirements.txt, pyproject.toml, src/models/__init__.py</files>
<action>
Create the Python project structure with the required dependencies:
1. Create pyproject.toml with project metadata and the lmstudio, psutil, and pydantic dependencies
2. Create requirements.txt as a fallback for plain pip installs
3. Create src/models/__init__.py with proper imports and version info
4. Create the basic src/ directory structure if it does not exist
5. Set up the Python package structure following PEP 518

Dependencies from research:
- lmstudio >= 1.0.1 (official LM Studio SDK)
- psutil >= 6.1.0 (system resource monitoring)
- pydantic >= 2.10 (configuration validation)
- gpu-tracker >= 5.0.1 (GPU monitoring, optional)

Follow packaging best practices with proper metadata, authors, and optional dependencies.
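
As a minimal sketch only, the dependency section of pyproject.toml could mirror the versions above; the project name, Python floor, and setuptools backend are assumptions, not decisions made in this plan:

```toml
# Sketch only: name, requires-python, and build backend are placeholders.
[build-system]
requires = ["setuptools>=68"]
build-backend = "setuptools.build_meta"

[project]
name = "model-interface"            # placeholder project name
version = "0.1.0"
requires-python = ">=3.10"          # assumed floor
dependencies = [
    "lmstudio>=1.0.1",   # official LM Studio SDK
    "psutil>=6.1.0",     # system resource monitoring
    "pydantic>=2.10",    # configuration validation
]

[project.optional-dependencies]
gpu = ["gpu-tracker>=5.0.1"]        # optional GPU/VRAM monitoring
```

requirements.txt can then simply repeat the `dependencies` list for pip installs without the build step.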
</action>
<verify>pip install -e . succeeds and imports work: python -c "import lmstudio, psutil, pydantic"</verify>
<done>Project structure created with all dependencies installable via pip</done>
</task>

<task type="auto">
<name>Task 2: Implement LM Studio adapter and model discovery</name>
<files>src/models/lmstudio_adapter.py</files>
<action>
Create the LM Studio client following the research patterns:
1. Implement an LMStudioAdapter class using the lmstudio-python SDK
2. Add a context manager for safe client handling: get_client()
3. Implement list_models() using lms.list_downloaded_models()
4. Add a load_model() method with error handling and fallback logic
5. Include model validation and capability detection
6. Follow Pattern 1 from research: Model Client Factory

Key methods:
- __init__: Initialize client configuration
- list_models(): Return a list of (model_key, display_name, size_gb) tuples
- load_model(model_key): Load a model with timeout and error handling
- unload_model(model_key): Clean up model resources
- get_model_info(model_key): Get model metadata and context window

Use proper error handling for LM Studio not running, model loading failures, and network issues.
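
A structural sketch of the adapter, assuming the SDK is imported as `lms` per the research pattern; only `lms.list_downloaded_models()` comes from this plan, and the attribute names on the returned objects are placeholders to confirm against the SDK:

```python
# Sketch of src/models/lmstudio_adapter.py (structure only, not the final code).
# LMStudioError and the field access on downloaded-model entries are
# illustrative; confirm the real names against the lmstudio-python SDK.
import lmstudio as lms


class LMStudioError(RuntimeError):
    """Raised when LM Studio is unreachable or a model operation fails."""


class LMStudioAdapter:
    """Thin wrapper around the LM Studio SDK for model discovery and loading."""

    def __init__(self, load_timeout_s: float = 60.0) -> None:
        self.load_timeout_s = load_timeout_s  # used by load_model()

    def list_models(self) -> list[tuple[str, str, float]]:
        """Return (model_key, display_name, size_gb) for every downloaded model."""
        try:
            downloaded = lms.list_downloaded_models()
        except Exception as exc:  # LM Studio not running, connection refused, ...
            raise LMStudioError(f"cannot reach LM Studio: {exc}") from exc
        models = []
        for entry in downloaded:
            key = getattr(entry, "model_key", str(entry))  # placeholder field name
            models.append((key, key, 0.0))                 # size_gb filled in later
        return models

    def load_model(self, model_key: str):
        """Load a model with timeout and fallback handling (Pattern 1 in 01-RESEARCH.md)."""
        raise NotImplementedError

    def unload_model(self, model_key: str) -> None:
        """Release the model's resources once it is no longer needed."""
        raise NotImplementedError
```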
</action>
<verify>Unit test passes: python -c "from src.models.lmstudio_adapter import LMStudioAdapter; adapter = LMStudioAdapter(); print(len(adapter.list_models()) >= 0)"</verify>
<done>LM Studio adapter can connect and list available models, handles errors gracefully</done>
</task>

<task type="auto">
<name>Task 3: Implement system resource monitoring</name>
<files>src/models/resource_monitor.py</files>
<action>
Create a ResourceMonitor class following the research patterns:
1. Monitor CPU usage (psutil.cpu_percent)
2. Track available memory (psutil.virtual_memory)
3. Monitor GPU VRAM if available (gpu-tracker library)
4. Provide a resource snapshot with current usage and availability
5. Add resource trend tracking for load prediction
6. Implement should_switch_model() logic based on thresholds

Key methods:
- get_current_resources(): Return a dict with memory_percent, cpu_percent, available_memory_gb, gpu_vram_gb
- get_resource_trend(window_minutes=5): Return the resource usage trend
- can_load_model(model_size_gb): Check whether enough resources are available
- is_system_overloaded(): Return True if resource usage exceeds thresholds

Follow Pattern 2 from research: Resource-Aware Model Selection.
Set sensible thresholds: 80% memory or CPU usage triggers model downgrading.
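
A minimal sketch of the monitor built on psutil; GPU VRAM and trend tracking are stubbed here because the gpu-tracker wiring and Pattern 2 details live in the research doc, and the 20% headroom factor is an assumption:

```python
# Sketch of src/models/resource_monitor.py using psutil only; GPU VRAM and
# trend tracking are left as stubs. Threshold defaults mirror the 80% rule above.
import psutil


class ResourceMonitor:
    """Snapshots CPU/RAM usage and answers 'can we load this model?' questions."""

    def __init__(self, memory_threshold: float = 80.0, cpu_threshold: float = 80.0) -> None:
        self.memory_threshold = memory_threshold  # percent of RAM considered overloaded
        self.cpu_threshold = cpu_threshold        # percent of CPU considered overloaded

    def get_current_resources(self) -> dict[str, float]:
        vm = psutil.virtual_memory()
        return {
            "memory_percent": vm.percent,
            "cpu_percent": psutil.cpu_percent(interval=0.1),
            "available_memory_gb": vm.available / 1024**3,
            "gpu_vram_gb": 0.0,  # filled in when the optional gpu-tracker path is wired up
        }

    def can_load_model(self, model_size_gb: float) -> bool:
        """Rough check: enough free RAM for the model plus 20% headroom (assumption)."""
        snapshot = self.get_current_resources()
        return snapshot["available_memory_gb"] >= model_size_gb * 1.2

    def is_system_overloaded(self) -> bool:
        snapshot = self.get_current_resources()
        return (snapshot["memory_percent"] > self.memory_threshold
                or snapshot["cpu_percent"] > self.cpu_threshold)
```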
</action>
<verify>python -c "from src.models.resource_monitor import ResourceMonitor; monitor = ResourceMonitor(); print('memory_percent' in monitor.get_current_resources())"</verify>
<done>Resource monitor provides real-time system metrics and trend analysis</done>
</task>

<task type="auto">
<name>Task 4: Create model configuration system</name>
<files>config/models.yaml</files>
<action>
Create the model configuration following the research architecture:
1. Define model categories by capability tier (small, medium, large)
2. Specify resource requirements for each model
3. Set context window sizes and token limits
4. Define model switching rules and fallback chains
5. Include model metadata (display names, descriptions)

Example structure:

```yaml
models:
  - key: "qwen/qwen3-4b-2507"
    display_name: "Qwen3 4B"
    category: "medium"
    min_memory_gb: 4
    min_vram_gb: 2
    context_window: 8192
    capabilities: ["chat", "reasoning"]
  - key: "qwen/qwen2.5-7b-instruct"
    display_name: "Qwen2.5 7B Instruct"
    category: "large"
    min_memory_gb: 8
    min_vram_gb: 4
    context_window: 32768
    capabilities: ["chat", "reasoning", "analysis"]
```

Include fallback chains for graceful degradation when resources are constrained.
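
One possible shape for those chains, purely illustrative since this plan does not fix the schema (the key names below are assumptions):

```yaml
# Illustrative only: ordered fallback chains, largest to smallest, consulted
# when the preferred model cannot be loaded under current resource pressure.
fallback_chains:
  default:
    - "qwen/qwen2.5-7b-instruct"
    - "qwen/qwen3-4b-2507"
switching_rules:
  downgrade_when:
    memory_percent_above: 80
    cpu_percent_above: 80
```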
</action>
<verify>YAML validation passes: python -c "import yaml; yaml.safe_load(open('config/models.yaml'))"</verify>
<done>Model configuration defines available models with resource requirements and fallback chains</done>
</task>

</tasks>

<verification>
Verify core connectivity and monitoring (a combined smoke-test sketch follows the list):
1. LM Studio adapter can list available models
2. Resource monitor returns valid system metrics
3. Model configuration loads without errors
4. All dependencies import correctly
5. Error handling works when LM Studio is not running
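
A throwaway script covering these checks; the module paths follow the tasks above, and the idea that an unreachable LM Studio surfaces as an exception from list_models() is an assumption about the adapter's behaviour:

```python
# Smoke test for Phase 01 plan 01. Run from the repo root after Tasks 1-4.
import yaml

import lmstudio, psutil, pydantic  # noqa: F401  (check 4: dependencies import)

from src.models.lmstudio_adapter import LMStudioAdapter
from src.models.resource_monitor import ResourceMonitor

# Check 2: resource monitor returns valid system metrics
snapshot = ResourceMonitor().get_current_resources()
assert "memory_percent" in snapshot and "cpu_percent" in snapshot

# Check 3: model configuration loads without errors and defines models
with open("config/models.yaml") as fh:
    assert "models" in yaml.safe_load(fh)

# Checks 1 and 5: adapter lists models, or fails cleanly when LM Studio is down
adapter = LMStudioAdapter()
try:
    print(f"{len(adapter.list_models())} models available")
except Exception as exc:  # expected when the LM Studio server is not running
    print(f"LM Studio unreachable (expected if the server is stopped): {exc}")
```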
</verification>

<success_criteria>
Core infrastructure ready for model management:
- LM Studio client connects and discovers models
- System resources are monitored in real-time
- Model configuration defines resource requirements
- Foundation supports intelligent model switching
</success_criteria>

<output>
After completion, create `.planning/phases/01-model-interface/01-01-SUMMARY.md`
</output>
|
</output> |