Mai/.planning/phases/02-safety-sandboxing/02-01-PLAN.md
Mai Development f7d263e173
docs(02): create phase plan
Phase 02: Safety & Sandboxing
- 4 plans in 3 waves
- Security assessment, sandbox execution, audit logging, integration
- Wave 1 parallel: assessment (02-01) + sandbox (02-02)
- Wave 2: audit logging (02-03)
- Wave 3: integration (02-04)
- Ready for execution
2026-01-27 14:28:35 -05:00


---
phase: 02-safety-sandboxing
plan: 01
type: execute
wave: 1
depends_on: []
files_modified: [src/security/__init__.py, src/security/assessor.py, requirements.txt, config/security.yaml]
autonomous: true
must_haves:
  truths:
    - "Security assessment runs before any code execution"
    - "Code is categorized as LOW/MEDIUM/HIGH/BLOCKED"
    - "Assessment is fast and doesn't block user workflow"
  artifacts:
    - path: "src/security/assessor.py"
      provides: "Security assessment engine"
      min_lines: 40
    - path: "requirements.txt"
      provides: "Security analysis dependencies"
      contains: "bandit, semgrep"
    - path: "config/security.yaml"
      provides: "Security assessment policies"
      contains: "BLOCKED, HIGH, MEDIUM, LOW"
  key_links:
    - from: "src/security/assessor.py"
      to: "bandit CLI"
      via: "subprocess.run"
      pattern: "bandit.*-f.*json"
    - from: "src/security/assessor.py"
      to: "semgrep CLI"
      via: "subprocess.run"
      pattern: "semgrep.*--config"
---
<objective>
Create multi-level security assessment infrastructure to analyze code before execution.
Purpose: Prevent malicious or unsafe code from executing by implementing configurable security assessment with Bandit and Semgrep integration.
Output: Working security assessor that categorizes code as LOW/MEDIUM/HIGH/BLOCKED with specific thresholds.
</objective>
<execution_context>
@~/.opencode/get-shit-done/workflows/execute-plan.md
@~/.opencode/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Research references
@.planning/phases/02-safety-sandboxing/02-RESEARCH.md
</context>
<tasks>
<task type="auto">
<name>Task 1: Create security assessment module</name>
<files>src/security/__init__.py, src/security/assessor.py</files>
<action>Create a SecurityAssessor class with an assess(code: str) method that runs both Bandit and Semgrep. Use subprocess to run bandit -f json - (code passed on stdin) and semgrep --config=p/python --json (code written to a temporary file that is passed as the scan target). Parse the JSON results and categorize them by severity per the CONTEXT.md decisions (BLOCKED for malicious patterns and known threats, HIGH for privileged access attempts). Return a SecurityLevel enum value together with the detailed findings.</action>
<verify>python -c "from src.security.assessor import SecurityAssessor; print('SecurityAssessor imported successfully')"</verify>
<done>SecurityAssessor class runs Bandit and Semgrep, returns correct severity levels, handles malformed input gracefully</done>
</task>
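A minimal sketch of how Task 1 could be structured, offered as a starting point rather than the required implementation. The `Assessment` dataclass, the `categorize` and `_run_json` helpers, the severity-to-level mapping, and the two Bandit test IDs used as BLOCKED triggers are all illustrative assumptions; the real triggers come from the CONTEXT.md decisions and `config/security.yaml`.

```python
import json
import os
import subprocess
import tempfile
from dataclasses import dataclass, field
from enum import IntEnum


class SecurityLevel(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    BLOCKED = 3


@dataclass
class Assessment:
    """Hypothetical result type: the overall level plus normalized findings."""
    level: SecurityLevel
    findings: list = field(default_factory=list)


# Assumption: placeholder mapping from tool severities to plan levels.
# Bandit reports LOW/MEDIUM/HIGH; Semgrep reports INFO/WARNING/ERROR.
_SEVERITY_TO_LEVEL = {
    "LOW": SecurityLevel.LOW, "INFO": SecurityLevel.LOW,
    "MEDIUM": SecurityLevel.MEDIUM, "WARNING": SecurityLevel.MEDIUM,
    "HIGH": SecurityLevel.HIGH, "ERROR": SecurityLevel.HIGH,
}
# Assumption: placeholder BLOCKED triggers (Bandit shell-execution checks).
_BLOCKED_IDS = frozenset({"B602", "B605"})


def _run_json(cmd, stdin_text=None):
    """Run a CLI tool and return the 'results' list from its JSON output.

    Returns [] if the tool is missing, times out, or prints invalid JSON.
    """
    try:
        proc = subprocess.run(cmd, input=stdin_text, capture_output=True,
                              text=True, timeout=30)
        return json.loads(proc.stdout).get("results", [])
    except (OSError, subprocess.TimeoutExpired, ValueError, AttributeError):
        return []


def categorize(findings):
    """Return the highest level implied by the normalized findings."""
    level = SecurityLevel.LOW
    for finding in findings:
        if finding["id"] in _BLOCKED_IDS:
            return SecurityLevel.BLOCKED
        mapped = _SEVERITY_TO_LEVEL.get(finding["severity"].upper(),
                                        SecurityLevel.MEDIUM)
        level = max(level, mapped)
    return level


class SecurityAssessor:
    def assess(self, code: str) -> Assessment:
        # Bandit reads the code from stdin when the target is "-".
        findings = [
            {"tool": "bandit", "id": r.get("test_id", ""),
             "severity": r.get("issue_severity", ""),
             "message": r.get("issue_text", "")}
            for r in _run_json(["bandit", "-f", "json", "-q", "-"], code)
        ]
        # Semgrep is given a temporary file as its scan target.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
            tmp.write(code)
        try:
            cmd = ["semgrep", "--config=p/python", "--json", tmp.name]
            for r in _run_json(cmd):
                extra = r.get("extra", {})
                findings.append({"tool": "semgrep", "id": r.get("check_id", ""),
                                 "severity": extra.get("severity", ""),
                                 "message": extra.get("message", "")})
        finally:
            os.unlink(tmp.name)
        return Assessment(categorize(findings), findings)
```

Note that this sketch treats a missing or failing tool as "no findings", which is fail-open. A production assessor would probably want to record tool errors and refuse to report LOW when either scanner did not run.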
<task type="auto">
<name>Task 2: Add security dependencies and configuration</name>
<files>requirements.txt, config/security.yaml</files>
<action>Add bandit>=1.7.7, semgrep>=1.99 to requirements.txt. Create config/security.yaml with security assessment policies: BLOCKED triggers (malicious patterns, known threats), HIGH triggers (admin/root access, system file modifications), threshold levels, and trusted code patterns. Follow CONTEXT.md decisions for user override requirements.</action>
<verify>pip install -r requirements.txt && python -c "import bandit, semgrep; print('Security dependencies installed')"</verify>
<done>Security analysis tools install successfully, configuration file defines assessment policies matching CONTEXT.md decisions</done>
</task>
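One way to check that `config/security.yaml` defines all four levels is a small validator run against the parsed file. The sketch below is an assumption about the policy shape, not the decided schema: the `levels`, `triggers`, and `trusted_patterns` keys and the trigger names are hypothetical, and `EXAMPLE_POLICY` stands in for whatever a YAML loader would return.

```python
REQUIRED_LEVELS = ("BLOCKED", "HIGH", "MEDIUM", "LOW")

# Assumption: hypothetical shape of the parsed config/security.yaml.
EXAMPLE_POLICY = {
    "levels": {
        "BLOCKED": {"triggers": ["malicious_patterns", "known_threats"]},
        "HIGH": {"triggers": ["admin_root_access", "system_file_modification"]},
        "MEDIUM": {"triggers": []},
        "LOW": {"triggers": []},
    },
    "trusted_patterns": [],
}


def validate_policy(policy):
    """Return a list of problems found in the policy (empty if it is valid)."""
    levels = policy.get("levels")
    if not isinstance(levels, dict):
        return ["missing 'levels' mapping"]
    problems = []
    for name in REQUIRED_LEVELS:
        entry = levels.get(name)
        if entry is None:
            problems.append(f"missing level: {name}")
        elif not isinstance(entry, dict) or not isinstance(entry.get("triggers"), list):
            problems.append(f"level {name} needs a 'triggers' list")
    if not isinstance(policy.get("trusted_patterns", []), list):
        problems.append("'trusted_patterns' must be a list")
    return problems
```

Returning a list of problems instead of raising on the first one lets the executor report every gap in the configuration in a single pass.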
</tasks>
<verification>
- SecurityAssessor class successfully imports and runs analysis
- Bandit and Semgrep can be executed via subprocess
- Security levels align with CONTEXT.md decisions (BLOCKED, HIGH, MEDIUM, LOW)
- Configuration file exists with correct policy definitions
- Analysis completes in under 5 seconds for typical code
</verification>
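The timing criterion above can be checked with a small wrapper around any callable, for example the assessor's `assess` method. This is a sketch only: `check_time_budget` is a hypothetical helper name, and the 5-second default simply mirrors the figure in the verification list.

```python
import time


def check_time_budget(func, *args, budget_seconds=5.0):
    """Call func(*args) and report whether it finished within the budget.

    Returns (result, elapsed_seconds, within_budget).
    """
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= budget_seconds
```

A single measurement is noisy, and Semgrep's first run may include fetching the rule pack, so it is probably worth timing a warm run as well.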
<success_criteria>
Security assessment infrastructure ready to categorize code by severity before execution, with both static analysis tools integrated and user-configurable policies.
</success_criteria>
<output>
After completion, create `.planning/phases/02-safety-sandboxing/02-01-SUMMARY.md`
</output>