feat: Add train/val split script and update gitignore

- Added 03_train_val_split.py to create deterministic train/validation splits from data.txt or fallback text
- Updated .gitignore to un-comment .vscode/ directory exclusion
- Changed data.txt pattern to *.txt for better file matching in gitignore
- Script handles UTF-8 text loading with newline normalization and writes train.txt/val.txt files
- Includes doctest examples and proper type hints
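
The script itself is not part of the diff shown here, so the following is only a minimal sketch of the behaviour the message describes: UTF-8 loading with newline normalization, a fallback text, a deterministic split, and train.txt/val.txt output. The names `load_text`, `split_text`, `write_split`, and `FALLBACK_TEXT`, the 10% validation fraction, and the contiguous (unshuffled) split are all assumptions, not details taken from 03_train_val_split.py.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical fallback corpus; the real script's fallback text is not shown.
FALLBACK_TEXT = "The quick brown fox jumps over the lazy dog.\n" * 10


def load_text(path: Path, fallback: str = FALLBACK_TEXT) -> str:
    """Read UTF-8 text, falling back to a built-in string if the file is missing.

    Line endings are normalized to LF so the split is the same on every platform.
    """
    text = path.read_text(encoding="utf-8") if path.exists() else fallback
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_text(text: str, val_fraction: float = 0.1) -> tuple[str, str]:
    """Split text into a leading train part and a trailing validation part.

    No randomness is involved, so the same input always gives the same split.

    >>> split_text("abcdefghij", val_fraction=0.2)
    ('abcdefgh', 'ij')
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    cut = len(text) - int(len(text) * val_fraction)
    return text[:cut], text[cut:]


def write_split(src: Path, out_dir: Path, val_fraction: float = 0.1) -> None:
    """Load src (or the fallback) and write train.txt and val.txt into out_dir."""
    train, val = split_text(load_text(src), val_fraction)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "train.txt").write_text(train, encoding="utf-8")
    (out_dir / "val.txt").write_text(val, encoding="utf-8")
```

Under these assumptions, `write_split(Path("data.txt"), Path("."))` would produce both output files next to the script, and `python -m doctest` on the file would exercise the example in `split_text`.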
.gitignore (vendored): 4 lines changed
@@ -199,7 +199,7 @@ cython_debug/
 # that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
 # and can be added to the global gitignore or merged into this file. However, if you prefer,
 # you could uncomment the following to ignore the entire vscode folder
-# .vscode/
+.vscode/
 
 # Ruff stuff:
 .ruff_cache/
@@ -216,4 +216,4 @@ __marimo__/
 .streamlit/secrets.toml
 
 # Data/Material that should not be synced
-data.txt
+*.txt
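
Widening the ignore pattern from data.txt to *.txt affects more than the input file. The sketch below approximates how a slash-free gitignore pattern matches on the file name alone, using Python's `fnmatchcase`. It is an illustration, not Git's actual matcher, and `requirements.txt` is a hypothetical file name chosen to show the side effect. `git check-ignore -v <path>` remains the authoritative check.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def ignored_by_basename(path: str, pattern: str) -> bool:
    """Approximate a slash-free gitignore pattern by matching only the file name."""
    return fnmatchcase(PurePosixPath(path).name, pattern)


# data.txt, train.txt and val.txt come from the commit message;
# requirements.txt is a hypothetical example of an unrelated text file.
candidates = [
    "data.txt",
    "train.txt",
    "val.txt",
    "requirements.txt",
    "03_train_val_split.py",
]

old_ignored = [p for p in candidates if ignored_by_basename(p, "data.txt")]
new_ignored = [p for p in candidates if ignored_by_basename(p, "*.txt")]

print(old_ignored)  # ['data.txt']
print(new_ignored)  # ['data.txt', 'train.txt', 'val.txt', 'requirements.txt']
```

The generated train.txt and val.txt are covered by the new pattern, which fits the "should not be synced" comment. Any other untracked .txt file would be ignored as well, while files already tracked by Git are unaffected by ignore rules.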