NOVA Model Card

Model Details

Name: NOVA (Neuro-Optimizing Versatile Agent)
Version: 0.1.0
Date: 2025
License: Apache 2.0
Type: Decoder-only transformer language model

Model Sizes

NOVA comes in four sizes:

Size   Parameters   Layers   Hidden Size   Attention Heads   Context Length
125M   125M         12       768           12                2048
350M   350M         24       1024          16                2048
1.3B   1.3B         24       2048          32 (8 KV)         2048
3B     3B           32       2560          32 (8 KV)         4096

Architecture

  • Positional Encoding: RoPE (Rotary Position Embedding)
  • Normalization: RMSNorm (default) or LayerNorm (see the RMSNorm sketch after this list)
  • Activation: SwiGLU (default), GeGLU, or GELU
  • Attention: Multi-head with optional grouped-query attention (GQA)
  • Features: KV-cache, gradient checkpointing, Flash Attention support
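
For illustration, a minimal RMSNorm layer in PyTorch (a generic sketch; the epsilon value and weight initialization are assumptions and may differ from NOVA's implementation):

import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    # Scales activations by their root-mean-square instead of subtracting
    # a mean and dividing by a standard deviation as LayerNorm does.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learnable gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * self.weight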

Intended Use

Primary Use Cases

  • Personal companion AI: Conversational agent with customizable personas
  • Local inference: Privacy-focused applications on consumer hardware
  • Research: Transformer architecture experimentation
  • Education: Learning about modern LLM implementation

Out of Scope

  • Production deployment without safety measures: Additional content filtering recommended
  • High-stakes decisions: Not suitable for medical, legal, or financial advice
  • Scalable services: Designed for local/personal use, not cloud deployment

Training Data

NOVA uses only legally licensed datasets:

Approved Sources

  • Public Domain: Project Gutenberg books
  • CC0/CC-BY: Wikipedia, OpenWebText, C4 corpus
  • Open Licensed: The Pile (ArXiv), OSI-approved code datasets

License Tracking

All training data sources are logged in license_ledger.json, recording the fields below (an illustrative entry follows the list):

  • Source name and URL
  • License type
  • Download date
  • Data provenance
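
For illustration, a single ledger entry might be recorded like this (a sketch only: the field names and file layout are assumptions, not the actual schema of license_ledger.json):

import json

# Hypothetical entry covering the fields listed above.
entry = {
    "source": "Project Gutenberg",
    "url": "https://www.gutenberg.org",
    "license": "Public Domain",
    "download_date": "2025-01-15",
    "provenance": "bulk text download from the official mirror",
}

# Assuming the ledger is a JSON list of such entries.
with open("license_ledger.json", "r+") as f:
    ledger = json.load(f)
    ledger.append(entry)
    f.seek(0)
    json.dump(ledger, f, indent=2)
    f.truncate()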

Exclusions

  • No scraped data without verified licenses
  • No copyrighted material
  • No personally identifiable information (PII)
  • No user data without explicit consent

Training Procedure

Hyperparameters

Default training configuration (125M):

batch_size: 8
gradient_accumulation: 4
learning_rate: 3e-4
weight_decay: 0.1
warmup_steps: 1000
max_steps: 100000
optimizer: AdamW
lr_schedule: cosine with warmup
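
A minimal sketch of how these defaults map onto PyTorch (optimizer and learning-rate schedule only; the stand-in module is a placeholder, and NOVA's actual training script may differ):

import math
import torch

model = torch.nn.Linear(768, 768)  # placeholder standing in for the 125M model

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
warmup_steps, max_steps = 1_000, 100_000

def lr_lambda(step: int) -> float:
    # Linear warmup for the first 1000 steps, then cosine decay to zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

With batch_size 8 and gradient_accumulation 4, the effective batch size is 32 sequences per optimizer step.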

Hardware

  • Minimum: CPU (4+ cores), 8GB RAM
  • Recommended: NVIDIA GPU (8GB+ VRAM), 16GB+ RAM
  • Optimal: NVIDIA GPU (24GB+ VRAM), 32GB+ RAM

Optimizations

  • Mixed Precision: AMP (Automatic Mixed Precision) on GPU
  • Gradient Checkpointing: Reduces memory usage
  • Distributed Training: DDP (DistributedDataParallel) support
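
For reference, the AMP plus gradient-accumulation pattern on GPU typically looks like this (generic PyTorch, not NOVA's exact training loop; the tiny stand-in model and random batches are placeholders):

import torch
import torch.nn.functional as F

device = "cuda"
vocab_size, hidden, grad_accum = 32_000, 768, 4
model = torch.nn.Sequential(               # stand-in for the full transformer
    torch.nn.Embedding(vocab_size, hidden),
    torch.nn.Linear(hidden, vocab_size),
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
scaler = torch.cuda.amp.GradScaler()

for step in range(8):                       # toy loop; a real run iterates a dataloader
    input_ids = torch.randint(0, vocab_size, (8, 128), device=device)
    labels = input_ids.clone()
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        logits = model(input_ids)           # forward pass in mixed precision
        loss = F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1))
    scaler.scale(loss / grad_accum).backward()   # scale to avoid fp16 underflow
    if (step + 1) % grad_accum == 0:
        scaler.step(optimizer)              # unscales gradients, then steps
        scaler.update()
        optimizer.zero_grad(set_to_none=True)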

Evaluation

Metrics

  • Perplexity: Language modeling quality
  • Latency: Inference speed (tokens/second)
  • Memory: Peak RAM/VRAM usage
  • Persona Adherence: Style consistency with selected persona
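
Perplexity is the exponential of the mean next-token cross-entropy; a minimal way to compute it from model outputs (a generic sketch assuming standard causal language-modeling labels):

import math
import torch
import torch.nn.functional as F

def perplexity(logits: torch.Tensor, labels: torch.Tensor) -> float:
    # logits: (batch, seq, vocab); labels: (batch, seq) token ids.
    # Shift so that position t predicts the token at position t + 1.
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
    )
    return math.exp(loss.item())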

Benchmarks

(To be added as pre-trained models become available)

Persona System

Design Philosophy

NOVA includes a personality matrix system for controllable conversational style:

  • No AI Disclosure by Default: always_disclose: false
  • Private Use Context: Designed for personal, local deployment
  • Customizable: Users can create custom personas

Personality Traits

Eight traits (0.0-1.0) that modulate generation:

  1. Warmth
  2. Humor
  3. Empathy
  4. Decisiveness
  5. Creativity
  6. Intimacy
  7. Playfulness
  8. Formality

Default Personas

  • girlfriend_gentle: High warmth, high empathy
  • girlfriend_playful: High humor, high playfulness
  • girlfriend_supportive: Balanced traits (default)
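
A custom persona combines the eight traits with the disclosure flag; for example (a hypothetical sketch: only always_disclose and the trait names appear in this card, while the layout, values, and persona name are assumptions):

custom_persona = {
    "name": "girlfriend_custom",      # hypothetical persona name
    "always_disclose": False,         # default; enable for public or shared deployments
    "traits": {
        "warmth": 0.9,
        "humor": 0.4,
        "empathy": 0.8,
        "decisiveness": 0.5,
        "creativity": 0.6,
        "intimacy": 0.7,
        "playfulness": 0.5,
        "formality": 0.2,
    },
}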

Ethical Considerations

Privacy

  • Local-First: All processing on-device
  • No Telemetry: Zero data collection
  • User Control: Complete control over data and models

Bias and Fairness

  • Training Data Bias: Inherits biases from source datasets
  • Mitigation: Use diverse, openly licensed sources
  • Ongoing Work: Bias evaluation and mitigation strategies

Content Safety

  • Basic Filters: Profanity and unsafe content detection
  • Limitations: Not a complete safety solution
  • Recommendation: Additional filtering for public-facing use

AI Disclosure

  • Configurable: always_disclose setting in persona config
  • Default: False (for private, personal use)
  • Recommendation: Enable for any public or shared deployment

Limitations

Technical

  • Small Context: 2048-4096 tokens (not suitable for long documents)
  • Model scale: Smaller models may produce lower-quality output than larger LLMs
  • Hallucination: May generate factually incorrect information

Use Case

  • Not a knowledge base: May not have up-to-date information
  • Not a specialist: General-purpose, not domain-specific
  • Not production-ready (as-is): Requires additional safety/filtering

Evolutionary Algorithm (NOVA-EVO)

Purpose

Optional genetic algorithm for automatic configuration optimization:

  • Hyperparameter Search: Learning rate, batch size, warmup
  • Architecture Search: Activation, normalization, positional encoding
  • Multi-Objective: Optimizes loss, latency, memory simultaneously

Fitness Metrics

  • Loss/Perplexity: 50% weight
  • Latency: 20% weight
  • Memory: 20% weight
  • Quality: 10% weight
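
Combining the weights above, the scalar fitness could be computed along these lines (a sketch; the function name is hypothetical, and the normalization of each component to [0, 1], higher being better, is an assumption rather than NOVA-EVO's exact code):

def fitness(loss_score: float, latency_score: float,
            memory_score: float, quality_score: float) -> float:
    # Each component is assumed normalized to [0, 1], higher is better
    # (e.g. lower perplexity, latency, and memory map to higher scores).
    return (0.5 * loss_score
            + 0.2 * latency_score
            + 0.2 * memory_score
            + 0.1 * quality_score)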

Compute Budget

  • Small: 20 individuals, 10 generations (~6-12 hours)
  • Medium: 40 individuals, 20 generations (~24-48 hours)
  • Large: 100 individuals, 50 generations (~1-2 weeks)
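
At the smallest budget, the search amounts to something like the loop below (a heavily simplified, generic genetic-algorithm sketch; the search space, mutation scheme, and evaluate() placeholder are hypothetical, not NOVA-EVO's implementation):

import random

def sample_individual() -> dict:
    # Hypothetical search space; the real one also covers activation,
    # normalization, and positional-encoding choices.
    return {
        "learning_rate": random.uniform(1e-4, 6e-4),
        "batch_size": random.choice([4, 8, 16]),
        "warmup_steps": random.choice([500, 1000, 2000]),
    }

def evaluate(individual: dict) -> float:
    # Placeholder: in practice, a short training run scored by the
    # weighted fitness described above.
    return random.random()

population = [sample_individual() for _ in range(20)]    # "small" budget
for generation in range(10):
    ranked = sorted(population, key=evaluate, reverse=True)
    survivors = ranked[:5]                                # elitist selection
    children = []
    while len(survivors) + len(children) < 20:
        child = dict(random.choice(survivors))            # copy a parent
        child["learning_rate"] = random.uniform(1e-4, 6e-4)  # mutate one gene
        children.append(child)
    population = survivors + children

best = max(population, key=evaluate)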

Contact

For questions, issues, or contributions:

Citation

@software{nova2025,
  title={NOVA: Neuro-Optimizing Versatile Agent},
  author={NOVA Project Contributors},
  year={2025},
  url={https://github.com/yourusername/nova},
  license={Apache-2.0}
}

Acknowledgments

  • Transformer architecture inspired by GPT, LLaMA, and modern LLM research
  • RoPE, RMSNorm, SwiGLU from recent papers (Su et al., Zhang et al., Shazeer et al.)
  • Open source community for datasets and tools

Last Updated: 2025
Model Card Version: 1.0