# Common Lisp Macro Transformations
A fine-tuning dataset for training models to generate Common Lisp macros. Each example is a (before-code) → (macro-definition) → (after-expansion) triple.
## Idea
Instead of fine-tuning a model to "write code", fine-tune it to generate CL macros: code that writes code. The model learns to recognize AST patterns and generate transformations, not final output.
## Sources
- *Let Over Lambda*: Doug Hoyte's production macro collection (thephoeron/let-over-lambda)
- *On Lisp*: Paul Graham's classic Common Lisp macro utilities
## Dataset Structure
Each record contains:
- `instruction`: Task description with the code pattern to address
- `input`: The "before" code showing the pattern that needs a macro
- `output`: The `defmacro` form that solves it
- `category`: Macro category (capture-management, anaphoric, dispatch, control-flow, DSL, compiler-macro, efficiency, scope)
- `technique`: Comma-separated techniques used (gensym, nested-backquote, dlambda, anaphor, code-walking, symbol-macrolet, defsetf, tagbody-go, once-only, macrolet, compiler-macro, recursive-expansion)
- `complexity`: basic, intermediate, or advanced
- `quality_score`: Classifier score from 0.0 to 1.0
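A record in this schema might look like the following sketch. The field names come from the list above; the concrete values (built around the classic `aif` anaphoric macro from *On Lisp*) are illustrative, not an actual row from the dataset:

```python
import json

# Hypothetical record matching the documented schema; values are
# illustrative only (aif is a well-known anaphoric macro).
record = {
    "instruction": "Write an anaphoric macro that binds the test result to IT.",
    "input": "(let ((val (lookup key))) (if val (use val) (fallback)))",
    "output": "(defmacro aif (test then &optional else)\n"
              "  `(let ((it ,test))\n"
              "     (if it ,then ,else)))",
    "category": "anaphoric",
    "technique": "anaphor",
    "complexity": "basic",
    "quality_score": 0.9,
}

# Each record is serialized as one line of the JSONL file.
line = json.dumps(record)
print(json.loads(line)["category"])  # → anaphoric
```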
## Categories
| Category | Description | Examples |
|---|---|---|
| capture-management | Hygienic macro writing utilities | defmacro/g!, defmacro!, with-gensyms |
| anaphoric | Deliberate variable capture for conciseness | aif, alambda, alet, aand |
| dispatch | Keyword-based dispatch and inter-closure protocols | dlambda, pandoriclet, with-pandoric |
| control-flow | New evaluation semantics via macros | nlet-tail, condlet, if-match, choose |
| DSL | Domain-specific embedded languages | defunits, _f (generalized setf), dbind |
| compiler-macro | Compile-time optimization of function calls | fformat compiler macro |
| efficiency | Performance-oriented macro techniques | sortf (sorting networks) |
| scope | Lexical scope manipulation | pandoric-eval |
## Use for Fine-tuning
The data is in instruction-input-output JSONL format, ready for fine-tuning:

```python
from datasets import load_dataset

ds = load_dataset("j14i/cl-macros", split="train")
```
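Since `quality_score` is a continuous classifier score, a common preprocessing step is to drop low-scoring records before training. A minimal stdlib sketch over raw JSONL lines; the 0.8 threshold and the `filter_records` helper are assumptions for illustration, not part of the dataset:

```python
import json

def filter_records(lines, min_score=0.8, complexity=None):
    """Keep JSONL records at or above a quality threshold and,
    optionally, of a single complexity level."""
    kept = []
    for line in lines:
        rec = json.loads(line)
        if rec["quality_score"] < min_score:
            continue
        if complexity is not None and rec["complexity"] != complexity:
            continue
        kept.append(rec)
    return kept

# Two toy records: only the first clears the default threshold.
sample = [
    json.dumps({"quality_score": 0.95, "complexity": "advanced"}),
    json.dumps({"quality_score": 0.40, "complexity": "basic"}),
]
print(len(filter_records(sample)))  # → 1
```

With the `datasets` library loaded as above, the equivalent filter is `ds.filter(lambda r: r["quality_score"] >= 0.8)`.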
Target model size: ≤ 30B parameters; the domain is narrow (pattern matching on ASTs and their transformations), so a smaller model suffices.