TheArtist Music Transformer — LoRA Adapter (Classical)
LoRA adapter that conditions the F1 base (PearlLeeStudio/TheArtist-MusicTransformer-ft-pop80) toward classical chord progressions.
This model was presented in the paper How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling.
- GitHub: PearlLeeStudio/TheArtist
- Demo: Watch TheArtist in action on YouTube — interactive staff editor, MIDI input, AI generation with live progress, and per-genre LoRA playback across the 13-genre vocabulary.
This release is the best-rank snapshot from a 5-point rank sweep (r ∈ {4, 8, 16, 32, 64}); see §Rank sweep below for the full table and selection criterion.
Base checkpoint note (2026-06-11): This adapter was trained on the released F1 base checkpoint. Full SHA-256 weight-hash verification (2026-06-11) shows that this checkpoint matches the pop-only Phase-0 baseline: best-checkpoint selection in the F1 run silently retained the pre-fine-tuning weights (see the Erratum on the base card). The adapter and all evaluation numbers on this card remain self-consistent with that released base, because every "F1 base" column was measured against the exact weights this adapter was trained on and is served with. The adaptation gains shown here are therefore gains over a pure-pop harmonic prior.
Adapter summary
| Field | Value |
|---|---|
| Base model | PearlLeeStudio/TheArtist-MusicTransformer-ft-pop80 (F1, 25.6M params) |
| Adapter type | LoRA (Q/K/V projections) |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| LoRA dropout | 0.05 |
| Target modules | w_q, w_k, w_v |
| Trainable parameters | |
| Adapter file size | ~6.0 MB |
| Base vocabulary | 351 tokens (jazz/pop) |
| Vocabulary extension | +8 genre tokens (embedding_extension.pt) |
| Training epochs | 8 |
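As a rough cross-check on the table, the size of the LoRA matrices can be estimated from the rank and the layer dimensions. The sketch below assumes that each of w_q, w_k, and w_v is a square 512×512 projection in each of the 8 layers (d_model=512 and n_layers=8 come from the Usage code; the square shape and fp32 storage are assumptions, not figures confirmed by this card). It counts only the LoRA A/B matrices and ignores the vocabulary-extension embeddings.

```python
# Back-of-envelope estimate of LoRA size; not the project's own accounting.
D_MODEL = 512                     # from the Usage section
N_LAYERS = 8                      # from the Usage section
TARGETS = ("w_q", "w_k", "w_v")   # target modules from the summary table
RANK = 64
ALPHA = 128
BYTES_PER_PARAM = 4               # assumption: weights stored as fp32


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


per_module = lora_param_count(D_MODEL, D_MODEL, RANK)
total = per_module * len(TARGETS) * N_LAYERS
scaling = ALPHA / RANK            # LoRA update is scaled by alpha / r
size_mib = total * BYTES_PER_PARAM / 2**20

print(f"per module: {per_module:,}")   # 65,536
print(f"total LoRA: {total:,}")        # 1,572,864
print(f"scaling:    {scaling}")        # 2.0
print(f"fp32 size:  {size_mib} MiB")   # 6.0
```

Under these assumptions the estimate comes to roughly 1.57M LoRA parameters, which is in the same range as the listed adapter file size if that figure is read as MiB. This is an observation from the arithmetic, not a reported number.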
Training data
Source
371 chord-progression sequences in the classical (Bach chorale) subset (296 train / 37 val / 38 test), drawn from the public-domain music21 corpus rather than from Chordonomicon. These Bach chorales are license-clean for commercial use; the CC BY-NC restriction below applies only because, at runtime, this LoRA is loaded on top of an F1 base that was itself trained on Chordonomicon.
Filter rule
Bach chorales from music21.corpus. The Chordonomicon filter does NOT apply, because Chordonomicon has no classical genre.
Splits (song-level, seed=42, 80/10/10)
| Partition | Songs | Used for |
|---|---|---|
| train | 296 | this LoRA's training (12-key augmented → 3,552 sequences) |
| val | 37 | rank-sweep eval + best-epoch selection during training |
| test | 38 | held aside for future paired analysis |
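The partition sizes in the table can be reproduced with a simple seeded, song-level split. The following is a minimal sketch, not the project's actual splitting code: the function name `split_songs`, the placeholder song IDs, and the floor-based rounding (train and val rounded down, remainder to test) are all assumptions, chosen because they yield 296 / 37 / 38 for 371 songs.

```python
import random


def split_songs(song_ids, seed=42, train_frac=0.8, val_frac=0.1):
    """Shuffle song IDs with a fixed seed, then cut into train/val/test.

    Splitting by song (before key augmentation) keeps all transpositions
    of one chorale inside a single partition.
    """
    ids = sorted(song_ids)              # fixed starting order for determinism
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_frac)   # floor
    n_val = int(len(ids) * val_frac)       # floor
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]           # remainder
    return train, val, test


# Hypothetical IDs standing in for the 371 chorales.
songs = [f"chorale_{i:03d}" for i in range(371)]
train, val, test = split_songs(songs)

N_KEYS = 12
augmented_train = len(train) * N_KEYS   # each train song in all 12 keys

print(len(train), len(val), len(test))  # 296 37 38
print(augmented_train)                  # 3552
```

Only the train partition is multiplied by 12 here, matching the table, which reports validation metrics without key augmentation.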
Vocabulary
- Base: 351 tokens (jazz/pop chord vocab from the F1 base model)
- Extension: +8 [GENRE:X] tokens covering 8 new genres (this LoRA adds the [GENRE:classical] token)
- Final vocab: 359 tokens (stored alongside the adapter in embedding_extension.pt)
Reproducibility
# 1. classical = Bach chorales from the public-domain music21 corpus (NOT Chordonomicon)
# 2. Extract this genre subset (pulls the Bach chorales from music21.corpus)
uv run python ai/training/extract_genre_subsets.py --genres classical --merge
# 3. Train the LoRA at the released rank
uv run python ai/training/lora_train.py --config ai/training/configs/lora/classical_r64.yaml
Hyperparameters: 8 epochs · batch 32 × accum 2 · lr 3e-4 · 1-epoch warmup · AMP fp16 · best.pt selected by min val_loss.
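To make the hyperparameter line concrete, the sketch below works out the effective batch size and the number of optimizer steps implied by 3,552 training sequences. It assumes that the last partial batch is kept, that a trailing incomplete accumulation group still triggers an optimizer step, and that warmup is linear and followed by a constant rate. None of these details is stated on this card, and `lr_at` is a hypothetical helper.

```python
import math

TRAIN_SEQUENCES = 3552   # 296 songs x 12 keys
BATCH_SIZE = 32
ACCUM_STEPS = 2
EPOCHS = 8
WARMUP_EPOCHS = 1
PEAK_LR = 3e-4

effective_batch = BATCH_SIZE * ACCUM_STEPS
micro_batches = math.ceil(TRAIN_SEQUENCES / BATCH_SIZE)    # forward/backward passes
steps_per_epoch = math.ceil(micro_batches / ACCUM_STEPS)   # optimizer updates
total_steps = steps_per_epoch * EPOCHS
warmup_steps = steps_per_epoch * WARMUP_EPOCHS


def lr_at(step: int) -> float:
    """Assumed schedule: linear warmup to PEAK_LR, then constant."""
    return PEAK_LR * min(1.0, (step + 1) / warmup_steps)


print(effective_batch)   # 64
print(micro_batches)     # 111
print(steps_per_epoch)   # 56
print(total_steps)       # 448
```

With only a few hundred updates in total, a one-epoch warmup covers about an eighth of training under these assumptions.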
Rank sweep
The released adapter is the best-rank snapshot from training the same LoRA recipe at five different ranks. Numbers are validation-set token-level metrics (no key augmentation).
| Rank | val_loss | val_top1 (%) | val_top5 (%) | Δtop1 vs F1 |
|---|---|---|---|---|
| r=4 | 1.3663 | 58.15 | 85.21 | +14.61 |
| r=8 | 1.3486 | 58.74 | 85.62 | +15.20 |
| r=16 | 1.3333 | 59.61 | 85.51 | +16.07 |
| r=32 | 1.3174 | 60.08 | 86.09 | +16.54 |
| r=64 | 1.3071 | 60.55 | 86.09 | +17.01 ← selected |
Selection criterion: minimum validation cross-entropy loss; val_top1 as tiebreaker.
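The selection rule can be expressed as a sort key over the sweep table. This is an illustrative sketch using the numbers above; the dictionary layout is an assumption and not the project's own result format.

```python
# Rank-sweep results copied from the table above.
sweep = [
    {"rank": 4,  "val_loss": 1.3663, "val_top1": 58.15},
    {"rank": 8,  "val_loss": 1.3486, "val_top1": 58.74},
    {"rank": 16, "val_loss": 1.3333, "val_top1": 59.61},
    {"rank": 32, "val_loss": 1.3174, "val_top1": 60.08},
    {"rank": 64, "val_loss": 1.3071, "val_top1": 60.55},
]

F1_BASE_TOP1 = 43.54  # F1 base top-1 from the Evaluation table

# Minimum val_loss first; on a tie, the higher val_top1 wins
# (negated so that min() prefers it).
best = min(sweep, key=lambda row: (row["val_loss"], -row["val_top1"]))
delta_top1 = round(best["val_top1"] - F1_BASE_TOP1, 2)

print(best["rank"])   # 64
print(delta_top1)     # 17.01
```

In this sweep val_loss decreases monotonically with rank, so the tiebreaker never comes into play.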
Evaluation
Validation token-level metrics on the genre-specific val split (37 sequences, no key augmentation).
| Metric | F1 base alone | F1 + this LoRA | Δ |
|---|---|---|---|
| Top-1 accuracy (%) | 43.54 | 60.55 | +17.01 |
| Top-5 accuracy (%) | 72.82 | 86.09 | +13.27 |
| Cross-entropy loss | 2.8653 | 1.3071 | -1.5582 |
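For readers unfamiliar with the three metrics, the sketch below shows one common way to compute token-level top-1 accuracy, top-k accuracy, and mean cross-entropy from per-position logits. It is a pure-Python illustration and not the project's evaluation script; in particular, averaging per token (rather than per sequence) and skipping positions via `ignore_id` are assumptions.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_metrics(logits_per_pos, targets, k=5, ignore_id=None):
    """Return top-1 (%), top-k (%), and mean cross-entropy over tokens."""
    n = top1 = topk = 0
    nll = 0.0
    for logits, target in zip(logits_per_pos, targets):
        if target == ignore_id:          # e.g. padding positions
            continue
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        top1 += ranked[0] == target
        topk += target in ranked[:k]
        nll -= log_softmax(logits)[target]
        n += 1
    if n == 0:
        raise ValueError("no scorable tokens")
    return {"top1": 100.0 * top1 / n, "topk": 100.0 * topk / n, "loss": nll / n}


# Toy example: vocabulary of 4 tokens, 2 positions.
logits = [[2.0, 1.0, 0.0, -1.0],   # target 1 is the 2nd-ranked token
          [0.0, 0.0, 3.0, 0.0]]    # target 2 is the top-ranked token
metrics = token_metrics(logits, targets=[1, 2], k=2)
print(metrics["top1"], metrics["topk"])   # 50.0 100.0
```

A model that spreads probability uniformly over V tokens scores a loss of ln(V), which gives a useful reference point when reading the cross-entropy rows.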
Real-song eval
Mean validation top-1/top-5/cross-entropy on 10 held-out real classical songs.
| Model | Top-1 (%) | Top-5 (%) | val_loss |
|---|---|---|---|
| F1 base alone | 49.55 | 81.17 | 2.2389 |
| F1 + this LoRA | 61.88 | 88.87 | 1.2489 |
| Δ | +12.33 | +7.70 | -0.9900 |
Usage
Both the base repo and this LoRA repo ship the project's model.py and tokenizer.py at the repo root, so external users can load this adapter end-to-end without cloning anything from GitHub.
Required dependencies: torch, huggingface_hub, peft, safetensors.
import sys
import torch
import torch.nn as nn
from huggingface_hub import snapshot_download
from peft import PeftModel
# 1. Download the base + LoRA repos. Both bundle model.py and tokenizer.py.
base_dir = snapshot_download(repo_id="PearlLeeStudio/TheArtist-MusicTransformer-ft-pop80")
lora_dir = snapshot_download(repo_id="PearlLeeStudio/TheArtist-MusicTransformer-lora-classical")
sys.path.insert(0, base_dir) # so the next two imports resolve
from model import MusicTransformer
from tokenizer import ChordTokenizer
# 2. Extended tokenizer (351 base + 8 new genre tokens = 359).
tokenizer = ChordTokenizer(include_extra_genres=True)
# 3. Build the model at the BASE vocab size (351)
BASE_VOCAB = 351
model = MusicTransformer(
    vocab_size=BASE_VOCAB,
    d_model=512, n_heads=8, d_ff=2048, n_layers=8,
    max_seq_len=256, dropout=0.0, pad_id=tokenizer.pad_id,
)
ckpt = torch.load(f"{base_dir}/best.pt", map_location="cpu", weights_only=False)
model.load_state_dict(ckpt["model_state_dict"])
# 4. Grow token_emb + out_proj from 351 -> 359
def _grow_to_extended_vocab(m, new_vocab, none_id):
    d = m.token_emb.embedding_dim
    new_emb = nn.Embedding(new_vocab, d, padding_idx=m.token_emb.padding_idx)
    with torch.no_grad():
        new_emb.weight[:m.token_emb.num_embeddings] = m.token_emb.weight
        for i in range(m.token_emb.num_embeddings, new_vocab):
            new_emb.weight[i] = m.token_emb.weight[none_id]
    m.token_emb = new_emb
    new_out = nn.Linear(d, new_vocab, bias=False)
    with torch.no_grad():
        new_out.weight[:m.out_proj.out_features] = m.out_proj.weight
        for i in range(m.out_proj.out_features, new_vocab):
            new_out.weight[i] = m.out_proj.weight[none_id]
    m.out_proj = new_out
_grow_to_extended_vocab(model, tokenizer.vocab_size, tokenizer.encode_genre("none"))
ext = torch.load(f"{lora_dir}/embedding_extension.pt",
                 map_location="cpu", weights_only=False)
model.token_emb.load_state_dict(ext["token_emb_state"])
model.out_proj.load_state_dict(ext["out_proj_state"])
# 5. Apply the LoRA adapter
model = PeftModel.from_pretrained(model, f"{lora_dir}/adapter")
model.eval()
# 6. Generate a classical continuation.
song = {
    "key": "Cmaj", "time_signature": "4/4", "genre": "classical",
    "bars": [["Cmaj7"], ["Fmaj7"]],
}
prompt_ids = tokenizer.encode_sequence(song)[:-1]
ids = torch.tensor([prompt_ids])
with torch.no_grad():
    for _ in range(32):
        logits = model(ids)
        next_id = torch.multinomial(
            torch.softmax(logits[:, -1, :] / 0.8, dim=-1), 1,
        )
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_id:
            break
print(tokenizer.decode(ids[0].tolist()))
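The generation loop above divides the last-position logits by 0.8 before the softmax, which is temperature sampling. The dependency-free sketch below illustrates what that division does to the distribution. The helper names are hypothetical and the logits are made-up values, not model output.

```python
import math
import random


def softmax_with_temperature(logits, temperature=0.8):
    """Softmax over logits / temperature (values below 1.0 sharpen the distribution)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]


def sample_token(probs, rng):
    """Draw one token index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.0]                 # made-up scores for 3 tokens
plain = softmax_with_temperature(toy_logits, temperature=1.0)
sharper = softmax_with_temperature(toy_logits, temperature=0.8)

rng = random.Random(0)
token = sample_token(sharper, rng)

# The most likely token gains probability mass at temperature 0.8.
print(sharper[0] > plain[0])   # True
```

Lowering the temperature further makes continuations more predictable, while values above 1.0 flatten the distribution and produce more varied chord choices.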
License and use
The adapter weights are released under CC BY-NC 4.0. Permitted: research, paper replication, portfolio, demo. Not permitted: commercial deployment without separate licensing of the upstream data. The classical training data (Bach chorales from the public-domain music21 corpus) is itself license-clean, but the CC BY-NC restriction still applies because, at runtime, this LoRA is loaded on top of an F1 base that was trained on Chordonomicon.
Citation
@misc{lee2026chordmix,
  title = {Empirical Study of Pop and Jazz Mix Ratios for Genre-Adaptive Chord Generation},
  author = {Lee, Jinju},
  year = {2026},
  eprint = {2605.04998},
  archivePrefix = {arXiv}
}
@misc{lee2026chordtimeseries,
  title = {How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity?},
  author = {Lee, Jinju},
  year = {2026},
  eprint = {2606.07334},
  archivePrefix = {arXiv}
}