Initial commit: NOVA - Neuro-Optimizing Versatile Agent
Complete transformer LLM built from scratch with: Core Features: - Full transformer architecture (RoPE, RMSNorm, SwiGLU, KV-cache) - SentencePiece tokenizer (BPE/Unigram) - Training pipeline (AMP, gradient checkpointing, DDP) - Persona system with personality matrix (NO AI disclosure by default) - Genetic evolution (NOVA-EVO) for hyperparameter optimization - Legal-only data pipeline with license tracking - Chat interface (CLI + REST API) - Conversation memory (SQLite) Model Sizes: - 125M, 350M, 1.3B, 3B parameters - Local-first, runs on CPU or GPU - Python 3.10.6+, PyTorch 2.0+ Personas: - girlfriend_gentle (high warmth, high empathy) - girlfriend_playful (high humor, high playfulness) - girlfriend_supportive (balanced, default) Documentation: - Complete README with quickstart - Model card with ethical considerations - Privacy documentation (local-first, zero telemetry) - Data licenses and attribution - Contributing guide Infrastructure: - GitHub Actions CI/CD - Comprehensive test suite - Quickstart script - CLI tool License: Apache 2.0 🤖 Generated with Claude Code https://claude.com/claude-code Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
88
.gitignore
vendored
Normal file
88
.gitignore
vendored
Normal file
@@ -0,0 +1,88 @@
|
||||
# Python
|
||||
__pycache__/
|
||||
*.py[cod]
|
||||
*$py.class
|
||||
*.so
|
||||
.Python
|
||||
build/
|
||||
develop-eggs/
|
||||
dist/
|
||||
downloads/
|
||||
eggs/
|
||||
.eggs/
|
||||
lib/
|
||||
lib64/
|
||||
parts/
|
||||
sdist/
|
||||
var/
|
||||
wheels/
|
||||
pip-wheel-metadata/
|
||||
share/python-wheels/
|
||||
*.egg-info/
|
||||
.installed.cfg
|
||||
*.egg
|
||||
MANIFEST
|
||||
|
||||
# PyTorch
|
||||
*.pt
|
||||
*.pth
|
||||
*.ckpt
|
||||
checkpoints/
|
||||
*.safetensors
|
||||
!configs/**/*.safetensors
|
||||
|
||||
# Virtual environments
|
||||
venv/
|
||||
ENV/
|
||||
env/
|
||||
.venv
|
||||
|
||||
# IDEs
|
||||
.vscode/
|
||||
.idea/
|
||||
*.swp
|
||||
*.swo
|
||||
*~
|
||||
|
||||
# Jupyter
|
||||
.ipynb_checkpoints/
|
||||
*.ipynb
|
||||
|
||||
# Data
|
||||
data/raw/
|
||||
data/processed/
|
||||
*.arrow
|
||||
*.parquet
|
||||
*.bin
|
||||
*.idx
|
||||
|
||||
# Tokenizer training
|
||||
*.model
|
||||
*.vocab
|
||||
!nova_tokenizer/pretrained/*.model
|
||||
!nova_tokenizer/pretrained/*.vocab
|
||||
|
||||
# Logs
|
||||
logs/
|
||||
*.log
|
||||
wandb/
|
||||
tensorboard/
|
||||
|
||||
# OS
|
||||
.DS_Store
|
||||
Thumbs.db
|
||||
desktop.ini
|
||||
|
||||
# Evolution
|
||||
nova_evo/populations/
|
||||
nova_evo/hall_of_fame/
|
||||
!nova_evo/hall_of_fame/.gitkeep
|
||||
|
||||
# Temporary
|
||||
tmp/
|
||||
temp/
|
||||
*.tmp
|
||||
|
||||
# Large files tracked by Git LFS
|
||||
*.gguf
|
||||
*.ggml
|
Reference in New Issue
Block a user