NOVA Model Card
Model Details
- Name: NOVA (Neuro-Optimizing Versatile Agent)
- Version: 0.1.0
- Date: 2025
- License: Apache 2.0
- Type: Decoder-only transformer language model
Model Sizes
NOVA comes in four sizes:
| Size | Parameters | Layers | Hidden Size | Attention Heads | Context Length |
|------|------------|--------|-------------|-----------------|----------------|
| 125M | 125M | 12 | 768 | 12 | 2048 |
| 350M | 350M | 24 | 1024 | 16 | 2048 |
| 1.3B | 1.3B | 24 | 2048 | 32 (8 KV) | 2048 |
| 3B | 3B | 32 | 2560 | 32 (8 KV) | 4096 |
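For illustration, these size presets could be expressed as a small configuration table in Python. The `NovaConfig` name and its field names below are hypothetical, not the project's actual API; parameter counts follow the table above.

```python
from dataclasses import dataclass

@dataclass
class NovaConfig:
    """Illustrative preset for one NOVA model size (names are hypothetical)."""
    n_layers: int
    d_model: int
    n_heads: int
    n_kv_heads: int      # equals n_heads unless grouped-query attention is used
    context_length: int

# Presets mirroring the table above.
PRESETS = {
    "125M": NovaConfig(n_layers=12, d_model=768,  n_heads=12, n_kv_heads=12, context_length=2048),
    "350M": NovaConfig(n_layers=24, d_model=1024, n_heads=16, n_kv_heads=16, context_length=2048),
    "1.3B": NovaConfig(n_layers=24, d_model=2048, n_heads=32, n_kv_heads=8,  context_length=2048),
    "3B":   NovaConfig(n_layers=32, d_model=2560, n_heads=32, n_kv_heads=8,  context_length=4096),
}
```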
Architecture
- Positional Encoding: RoPE (Rotary Position Embedding)
- Normalization: RMSNorm (default) or LayerNorm
- Activation: SwiGLU (default), GeGLU, or GELU
- Attention: Multi-head with optional grouped-query attention (GQA)
- Features: KV-cache, gradient checkpointing, Flash Attention support
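A minimal PyTorch sketch of two of the components listed above (RMSNorm and a SwiGLU feed-forward block), written for illustration rather than taken from NOVA's source:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square normalization: scale by the RMS of activations, no mean-centering."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: SiLU-gated linear unit followed by a down-projection."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_down(nn.functional.silu(self.w_gate(x)) * self.w_up(x))
```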
Intended Use
Primary Use Cases
- Personal companion AI: Conversational agent with customizable personas
- Local inference: Privacy-focused applications on consumer hardware
- Research: Transformer architecture experimentation
- Education: Learning about modern LLM implementation
Out of Scope
- Production deployment without safety measures: Additional content filtering recommended
- High-stakes decisions: Not suitable for medical, legal, or financial advice
- Scalable services: Designed for local/personal use, not cloud deployment
Training Data
NOVA uses only legally licensed datasets:
Approved Sources
- Public Domain: Project Gutenberg books
- CC0/CC-BY: Wikipedia, OpenWebText, C4 corpus
- Open Licensed: The Pile (ArXiv), OSI-approved code datasets
License Tracking
All training data sources are logged in `license_ledger.json` with:
- Source name and URL
- License type
- Download date
- Data provenance
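A sketch of how such a ledger entry might be written, assuming the ledger is a simple JSON list of records; the `record_source` helper and exact field names are illustrative, not NOVA's actual schema:

```python
import json
from datetime import date
from pathlib import Path

LEDGER = Path("license_ledger.json")

def record_source(name: str, url: str, license_type: str, provenance: str) -> None:
    """Append one data-source record to the ledger (fields mirror the list above)."""
    entries = json.loads(LEDGER.read_text()) if LEDGER.exists() else []
    entries.append({
        "source": name,
        "url": url,
        "license": license_type,
        "download_date": date.today().isoformat(),
        "provenance": provenance,
    })
    LEDGER.write_text(json.dumps(entries, indent=2))

record_source(
    name="Project Gutenberg",
    url="https://www.gutenberg.org",
    license_type="Public Domain",
    provenance="bulk mirror download",
)
```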
Exclusions
- No scraped data without verified licenses
- No copyrighted material
- No personally identifiable information (PII)
- No user data without explicit consent
Training Procedure
Hyperparameters
Default training configuration (125M):
```yaml
batch_size: 8
gradient_accumulation: 4
learning_rate: 3e-4
weight_decay: 0.1
warmup_steps: 1000
max_steps: 100000
optimizer: AdamW
lr_schedule: cosine with warmup
```
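The cosine-with-warmup schedule above can be sketched as a simple function of the step count; the minimum learning rate is an assumption, not part of the listed configuration:

```python
import math

def lr_at_step(step: int, base_lr: float = 3e-4,
               warmup_steps: int = 1000, max_steps: int = 100_000,
               min_lr: float = 3e-5) -> float:
    """Linear warmup to base_lr, then cosine decay toward min_lr (min_lr is assumed)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * min(progress, 1.0)))
```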
Hardware
- Minimum: CPU (4+ cores), 8GB RAM
- Recommended: NVIDIA GPU (8GB+ VRAM), 16GB+ RAM
- Optimal: NVIDIA GPU (24GB+ VRAM), 32GB+ RAM
Optimizations
- Mixed Precision: AMP (Automatic Mixed Precision) on GPU
- Gradient Checkpointing: Reduces memory usage
- Distributed Training: DDP (DistributedDataParallel) support
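A hedged sketch of how AMP and gradient accumulation combine into a single optimizer update; it assumes a model whose forward pass returns an object with a `.loss` attribute and is not NOVA's actual trainer:

```python
import torch

scaler = torch.cuda.amp.GradScaler()
accum_steps = 4  # gradient_accumulation from the configuration above

def train_step(model, optimizer, batches):
    """One optimizer update over `accum_steps` micro-batches with mixed precision."""
    optimizer.zero_grad(set_to_none=True)
    for input_ids, labels in batches:
        with torch.cuda.amp.autocast():
            loss = model(input_ids, labels=labels).loss / accum_steps  # HF-style output assumed
        scaler.scale(loss).backward()
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # clip value is an assumption
    scaler.step(optimizer)
    scaler.update()
```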
Evaluation
Metrics
- Perplexity: Language modeling quality
- Latency: Inference speed (tokens/second)
- Memory: Peak RAM/VRAM usage
- Persona Adherence: Style consistency with selected persona
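Perplexity is the exponential of the mean token-level cross-entropy; a short reference implementation:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(logits: torch.Tensor, targets: torch.Tensor) -> float:
    """Perplexity = exp(mean cross-entropy); logits: (batch, seq, vocab), targets: (batch, seq)."""
    loss = F.cross_entropy(logits.flatten(0, 1), targets.flatten())
    return math.exp(loss.item())
```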
Benchmarks
(To be added as pre-trained models become available)
Persona System
Design Philosophy
NOVA includes a personality matrix system for controllable conversational style:
- No AI Disclosure by Default: `always_disclose: false`
- Private Use Context: Designed for personal, local deployment
- Customizable: Users can create custom personas
Personality Traits
Eight traits (0.0-1.0) that modulate generation:
- Warmth
- Humor
- Empathy
- Decisiveness
- Creativity
- Intimacy
- Playfulness
- Formality
Default Personas
- girlfriend_gentle: High warmth, high empathy
- girlfriend_playful: High humor, high playfulness
- girlfriend_supportive: Balanced traits (default)
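For illustration, a persona could be represented as a flat record of the eight traits plus the disclosure flag. The `Persona` class and the specific trait values assigned to the named personas below are assumptions, not NOVA's shipped configuration:

```python
from dataclasses import dataclass

@dataclass
class Persona:
    """Illustrative persona record; fields follow the traits listed above."""
    name: str
    warmth: float = 0.5
    humor: float = 0.5
    empathy: float = 0.5
    decisiveness: float = 0.5
    creativity: float = 0.5
    intimacy: float = 0.5
    playfulness: float = 0.5
    formality: float = 0.5
    always_disclose: bool = False  # project default; enable for public or shared deployments

girlfriend_supportive = Persona(name="girlfriend_supportive")                      # balanced defaults
girlfriend_gentle = Persona(name="girlfriend_gentle", warmth=0.9, empathy=0.9)     # values assumed
girlfriend_playful = Persona(name="girlfriend_playful", humor=0.9, playfulness=0.9)
```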
Ethical Considerations
Privacy
- Local-First: All processing on-device
- No Telemetry: Zero data collection
- User Control: Complete control over data and models
Bias and Fairness
- Training Data Bias: Inherits biases from source datasets
- Mitigation: Use diverse, openly licensed sources
- Ongoing Work: Bias evaluation and mitigation strategies
Content Safety
- Basic Filters: Profanity and unsafe content detection
- Limitations: Not a complete safety solution
- Recommendation: Additional filtering for public-facing use
AI Disclosure
- Configurable: `always_disclose` setting in persona config
- Default: false (for private, personal use)
- Recommendation: Enable for any public or shared deployment
Limitations
Technical
- Small Context: 2048-4096 tokens (not suitable for long documents)
- Compute: Smaller models may have lower quality than larger LLMs
- Hallucination: May generate factually incorrect information
Use Case
- Not a knowledge base: May not have up-to-date information
- Not a specialist: General-purpose, not domain-specific
- Not production-ready (as-is): Requires additional safety/filtering
Evolutionary Algorithm (NOVA-EVO)
Purpose
Optional genetic algorithm for automatic configuration optimization:
- Hyperparameter Search: Learning rate, batch size, warmup
- Architecture Search: Activation, normalization, positional encoding
- Multi-Objective: Optimizes loss, latency, memory simultaneously
Fitness Metrics
- Loss/Perplexity: 50% weight
- Latency: 20% weight
- Memory: 20% weight
- Quality: 10% weight
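A minimal sketch of the weighted combination, assuming each component score is normalized to [0, 1] with higher meaning better; the normalization step itself is not shown:

```python
def fitness(loss_score: float, latency_score: float,
            memory_score: float, quality_score: float) -> float:
    """Multi-objective fitness using the weights listed above."""
    return (0.5 * loss_score +
            0.2 * latency_score +
            0.2 * memory_score +
            0.1 * quality_score)
```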
Compute Budget
- Small: 20 individuals, 10 generations (~6-12 hours)
- Medium: 40 individuals, 20 generations (~24-48 hours)
- Large: 100 individuals, 50 generations (~1-2 weeks)
Contact
For questions, issues, or contributions:
- GitHub: github.com/yourusername/nova
- Issues: github.com/yourusername/nova/issues
Citation
```bibtex
@software{nova2025,
  title   = {NOVA: Neuro-Optimizing Versatile Agent},
  author  = {NOVA Project Contributors},
  year    = {2025},
  url     = {https://github.com/yourusername/nova},
  license = {Apache-2.0}
}
```
Acknowledgments
- Transformer architecture inspired by GPT, LLaMA, and modern LLM research
- RoPE, RMSNorm, SwiGLU from recent papers (Su et al., Zhang et al., Shazeer et al.)
- Open source community for datasets and tools
Last Updated: 2025
Model Card Version: 1.0