Complete transformer LLM built from scratch with: Core Features: - Full transformer architecture (RoPE, RMSNorm, SwiGLU, KV-cache) - SentencePiece tokenizer (BPE/Unigram) - Training pipeline (AMP, gradient checkpointing, DDP) - Persona system with personality matrix (NO AI disclosure by default) - Genetic evolution (NOVA-EVO) for hyperparameter optimization - Legal-only data pipeline with license tracking - Chat interface (CLI + REST API) - Conversation memory (SQLite) Model Sizes: - 125M, 350M, 1.3B, 3B parameters - Local-first, runs on CPU or GPU - Python 3.10.6+, PyTorch 2.0+ Personas: - girlfriend_gentle (high warmth, high empathy) - girlfriend_playful (high humor, high playfulness) - girlfriend_supportive (balanced, default) Documentation: - Complete README with quickstart - Model card with ethical considerations - Privacy documentation (local-first, zero telemetry) - Data licenses and attribution - Contributing guide Infrastructure: - GitHub Actions CI/CD - Comprehensive test suite - Quickstart script - CLI tool License: Apache 2.0 🤖 Generated with Claude Code https://claude.com/claude-code Co-Authored-By: Claude <noreply@anthropic.com>
228 lines
4.2 KiB
Markdown
228 lines
4.2 KiB
Markdown
# Contributing to NOVA
|
|
|
|
Thank you for your interest in contributing to NOVA! This document provides guidelines for contributing.
|
|
|
|
---
|
|
|
|
## How to Contribute
|
|
|
|
### Reporting Issues
|
|
|
|
**Bug Reports:**
|
|
1. Check existing issues first
|
|
2. Use the bug report template
|
|
3. Include:
|
|
- Python version
|
|
- OS and hardware
|
|
- Steps to reproduce
|
|
- Expected vs actual behavior
|
|
- Error messages/logs
|
|
|
|
**Feature Requests:**
|
|
1. Check if already proposed
|
|
2. Explain the use case
|
|
3. Describe the desired behavior
|
|
|
|
### Code Contributions
|
|
|
|
**Setup Development Environment:**
|
|
|
|
```bash
|
|
# Fork and clone
|
|
git clone https://github.com/yourusername/nova.git
|
|
cd nova
|
|
|
|
# Create venv
|
|
python -m venv venv
|
|
source venv/bin/activate # Windows: venv\Scripts\activate
|
|
|
|
# Install dev dependencies
|
|
pip install -r requirements.txt
|
|
pip install -e .[dev]
|
|
```
|
|
|
|
**Before Submitting:**
|
|
|
|
1. **Run Tests:**
|
|
```bash
|
|
pytest tests/ -v
|
|
```
|
|
|
|
2. **Lint Code:**
|
|
```bash
|
|
ruff check .
|
|
black --check .
|
|
```
|
|
|
|
3. **Format Code:**
|
|
```bash
|
|
black nova_core/ nova_tokenizer/ nova_train/ nova_evo/ nova_chat/
|
|
```
|
|
|
|
4. **Type Check (optional but recommended):**
|
|
```bash
|
|
mypy nova_core/ --ignore-missing-imports
|
|
```
|
|
|
|
### Pull Request Process
|
|
|
|
1. **Branch Naming:**
|
|
- `feature/description` for new features
|
|
- `fix/description` for bug fixes
|
|
- `docs/description` for documentation
|
|
|
|
2. **Commit Messages:**
|
|
- Clear, descriptive messages
|
|
- Reference issues: `Fix #123: Description`
|
|
|
|
3. **PR Description:**
|
|
- What changed
|
|
- Why the change
|
|
- Testing performed
|
|
- Screenshots (if UI changes)
|
|
|
|
4. **Review Process:**
|
|
- CI must pass
|
|
- At least one approval required
|
|
- Address review feedback
|
|
|
|
---
|
|
|
|
## Development Guidelines
|
|
|
|
### Code Style
|
|
|
|
**Python:**
|
|
- Follow PEP 8
|
|
- Use Black formatter (line length 100)
|
|
- Type hints encouraged
|
|
- Docstrings for public APIs
|
|
|
|
**Example:**
|
|
```python
|
|
def example_function(param: str, optional: int = 0) -> bool:
|
|
"""
|
|
Brief description.
|
|
|
|
Args:
|
|
param: Description
|
|
optional: Description (default: 0)
|
|
|
|
Returns:
|
|
Description
|
|
"""
|
|
return True
|
|
```
|
|
|
|
### Testing
|
|
|
|
**Write Tests For:**
|
|
- New features
|
|
- Bug fixes
|
|
- Public APIs
|
|
|
|
**Test Locations:**
|
|
- `tests/test_core.py` - Core transformer
|
|
- `tests/test_tokenizer.py` - Tokenizer
|
|
- `tests/test_persona.py` - Persona system
|
|
- `tests/test_<module>.py` - Other modules
|
|
|
|
**Run Tests:**
|
|
```bash
|
|
# All tests
|
|
pytest
|
|
|
|
# Specific file
|
|
pytest tests/test_core.py
|
|
|
|
# With coverage
|
|
pytest --cov=nova_core
|
|
```
|
|
|
|
### Documentation
|
|
|
|
**Update Docs For:**
|
|
- API changes
|
|
- New features
|
|
- Configuration options
|
|
|
|
**Documentation Files:**
|
|
- `README.md` - Main documentation
|
|
- `docs/MODEL_CARD.md` - Model information
|
|
- `docs/PRIVACY_LOCAL.md` - Privacy details
|
|
- `docs/DATA_LICENSES.md` - Data licensing
|
|
|
|
---
|
|
|
|
## Contribution Areas
|
|
|
|
### High Priority
|
|
|
|
- **Pre-trained Models:** Training and releasing checkpoints
|
|
- **Export Tools:** GGUF converter, quantization improvements
|
|
- **Evaluation Suite:** Comprehensive benchmarks
|
|
- **Dataset Downloaders:** Legal dataset acquisition scripts
|
|
|
|
### Medium Priority
|
|
|
|
- **LoRA Support:** Fine-tuning with adapters
|
|
- **Multi-language:** Support for non-English
|
|
- **Performance:** Optimization improvements
|
|
- **Tests:** Increase coverage
|
|
|
|
### Documentation
|
|
|
|
- **Tutorials:** Step-by-step guides
|
|
- **Examples:** Real-world use cases
|
|
- **API Docs:** Complete API documentation
|
|
- **Architecture:** Deep-dive technical docs
|
|
|
|
---
|
|
|
|
## License
|
|
|
|
By contributing, you agree that your contributions will be licensed under Apache License 2.0.
|
|
|
|
---
|
|
|
|
## Code of Conduct
|
|
|
|
### Our Pledge
|
|
|
|
- Be respectful and inclusive
|
|
- Welcome newcomers
|
|
- Focus on constructive feedback
|
|
- Assume good intentions
|
|
|
|
### Unacceptable Behavior
|
|
|
|
- Harassment or discrimination
|
|
- Trolling or insulting comments
|
|
- Publishing others' private information
|
|
- Other unprofessional conduct
|
|
|
|
### Enforcement
|
|
|
|
Violations can be reported to project maintainers. All complaints will be reviewed and investigated.
|
|
|
|
---
|
|
|
|
## Questions?
|
|
|
|
- **Discussions:** GitHub Discussions
|
|
- **Issues:** GitHub Issues
|
|
- **General:** Open an issue with the "question" label
|
|
|
|
---
|
|
|
|
## Recognition
|
|
|
|
Contributors will be:
|
|
- Listed in CONTRIBUTORS.md
|
|
- Mentioned in release notes
|
|
- Credited for significant features
|
|
|
|
---
|
|
|
|
Thank you for contributing to NOVA! 🌟
|