---
library_name: transformers
tags:
- llama-3.2
- causal-lm
- code
- python
- peft
- qlora
---

# Model Card for llama32-1b-python-docstrings-qlora

A parameter-efficient LoRA adapter, fine-tuned with QLoRA on top of `meta-llama/Llama-3.2-1B-Instruct`, for generating concise one-line Python docstrings from function bodies.

## Model Details

### Model Description

- **Developed by:** Abdullah Al-Housni
- **Model type:** Causal language model with LoRA/QLoRA adapters
- **Language(s):** Python code as input, English docstrings as output
- **License:** Same as `meta-llama/Llama-3.2-1B-Instruct` (Meta Llama 3.2 Community License)
- **Finetuned from model:** `meta-llama/Llama-3.2-1B-Instruct`

The model is trained to take a Python function definition and generate a concise, one-line docstring describing what the function does.

## Uses

### Direct Use

- Automatically generate one-line Python docstrings for functions.
- Improve or bootstrap documentation in Python codebases.
- Educational use for learning how to summarize code behavior.

Typical usage pattern:

- Input: Python function body (source code).
- Output: single-sentence English description suitable as a docstring, as in the example below.
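
For illustration, a typical input/output pair might look like the following; the docstring shown is an example of the kind of output the adapter is intended to produce, not a guaranteed generation:

```python
# Example input: an undocumented Python function
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32

# Example of the kind of one-line docstring the model aims to generate:
# "Convert a temperature from Celsius to Fahrenheit."
```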

### Out-of-Scope Use

- Generating full, multi-paragraph API documentation.
- Security auditing or correctness guarantees for code.
- Use outside Python (e.g., other programming languages) without additional fine-tuning.
- Any safety-critical application where incorrect summaries could cause harm.

## Bias, Risks, and Limitations

- The model can produce **incorrect or incomplete summaries**, especially for complex or ambiguous functions.
- It may imitate noisy or low-quality patterns from the training data (e.g., overly short or cryptic docstrings).
- It does **not** understand project-specific context, invariants, or business logic; outputs should be reviewed by a human developer.

### Recommendations

- Use the model as an **assistive tool**, not an authoritative source.
- Always review and edit generated docstrings before committing them to production code.
- For non-Python or highly domain-specific code, consider additional fine-tuning on in-domain examples.

## How to Get Started with the Model

Example with 🤗 Transformers and PEFT (LoRA adapter):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "meta-llama/Llama-3.2-1B-Instruct"
adapter_id = "Abdul1102/llama32-1b-python-docstrings-qlora"

# Load the base model and attach the fine-tuned LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)


def make_prompt(code: str) -> str:
    # Instruction, the function source, then an opening triple quote for the docstring
    return f'Write a one-line Python docstring for this function:\n\n{code}\n\n"""'


code = "def add(a, b):\n    return a + b"
inputs = tokenizer(make_prompt(code), return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```
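
Note that decoding `outputs[0]` returns the prompt together with the generated continuation. A minimal way to keep only the newly generated text, reusing the `inputs` and `outputs` variables from the example above:

```python
# Decode only the tokens generated after the prompt
prompt_len = inputs["input_ids"].shape[-1]
generated = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(generated.strip().strip('"'))  # the one-line docstring text
```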

## Training Details

### Training Data

- Dataset: Python subset of CodeSearchNet (`Nan-Do/code-search-net-python`)
- Inputs: `code` column (full Python function body)
- Targets: first non-empty line of the `docstring` column (see the preparation sketch below)
- A filtered subset of ~1,000–2,000 examples was used for efficient QLoRA fine-tuning
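
The preprocessing can be sketched roughly as follows. This is illustrative only: the column names `code` and `docstring` come from the dataset listed above, but the exact filtering, split, and subset size are approximations, and the helper `first_nonempty_line` is a hypothetical name.

```python
from datasets import load_dataset

ds = load_dataset("Nan-Do/code-search-net-python", split="train")

def first_nonempty_line(example):
    # Target = first non-empty line of the original docstring
    lines = [line.strip() for line in example["docstring"].splitlines()]
    return {"target": next((line for line in lines if line), "")}

ds = ds.map(first_nonempty_line)
ds = ds.filter(lambda ex: len(ex["target"]) > 0)  # drop examples with empty targets
ds = ds.shuffle(seed=42).select(range(2000))      # small subset for QLoRA fine-tuning
```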

### Training Procedure

- Objective: Causal language modeling (predict the docstring continuation)
- Method: QLoRA (4-bit quantized base model with LoRA adapters); a loading sketch follows below
- Precision: 4-bit quantized weights, bf16 compute
- Epochs: 1
- Max sequence length: 256–512 tokens
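
For reference, loading the base model in the 4-bit/bf16 configuration described above typically looks like the sketch below. This is a generic QLoRA-style setup, not the exact training script; the `nf4` quantization type and double quantization are common QLoRA defaults assumed here rather than details stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",              # assumed QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute
    bnb_4bit_use_double_quant=True,         # assumed QLoRA default
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```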

#### Training Hyperparameters

- Learning rate: ~2e-4 (adapter weights only)
- Epochs: 1
- Optimizer: AdamW via Hugging Face `Trainer` (see the configuration sketch after this list)
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
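
Continuing from the loading sketch above, an adapter and trainer configuration matching these hyperparameters could look roughly like this. The `target_modules`, batch size, and other unlisted arguments are assumptions for illustration and were not reported in the training description.

```python
from peft import LoraConfig, get_peft_model
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                 # LoRA rank
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    # target modules are an assumption; common choices for Llama-style models
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
peft_model = get_peft_model(base_model, lora_config)

training_args = TrainingArguments(
    output_dir="llama32-1b-python-docstrings-qlora",
    learning_rate=2e-4,
    num_train_epochs=1,
    per_device_train_batch_size=4,  # assumption, not reported in the card
    bf16=True,
    logging_steps=10,
)
```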

---

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

A held-out test split from the same CodeSearchNet Python dataset, using the same `code` → one-line docstring mapping as training.

#### Factors

- Function size and complexity
- Variety in docstring writing styles
- Presence of short or noisy docstrings

#### Metrics

- BLEU (sacreBLEU): strict n-gram overlap, sensitive to paraphrasing
- ROUGE (ROUGE-1 / ROUGE-2 / ROUGE-L): overlap-based and better suited to short summaries (both metrics computed as sketched below)
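
Both metrics can be computed with the Hugging Face `evaluate` library. A minimal sketch, assuming `predictions` and `references` are parallel lists of generated and reference docstrings (the exact evaluation script may differ):

```python
import evaluate

predictions = ["Add two numbers and return the result."]
references = ["Return the sum of two numbers."]

bleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")

# sacreBLEU expects a list of reference lists per prediction
bleu_score = bleu.compute(predictions=predictions, references=[[r] for r in references])
rouge_scores = rouge.compute(predictions=predictions, references=references)

print(bleu_score["score"])     # corpus-level BLEU
print(rouge_scores["rougeL"])  # ROUGE-L F1
```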

### Results

Approximate performance on ~50 held-out samples:

- BLEU: ~12.4
- ROUGE-1: ~0.78
- ROUGE-2: ~0.74
- ROUGE-L: ~0.78

#### Summary

The model frequently reproduces or closely paraphrases the correct docstring. Occasional failures include echoing part of the prompt or returning an empty string. This is strong performance for a 1B model trained briefly on a small dataset.

---

## Model Examination

Not applicable.

---

## Environmental Impact

- Hardware Type: Google Colab GPU (T4/L4)
- Hours Used: ~0.5–1 hour total
- Cloud Provider: Google Colab
- Compute Region: US
- Carbon Emitted: Not estimated (very low due to minimal training time)

---

## Technical Specifications

### Model Architecture and Objective

- Base model: Llama 3.2 1B Instruct
- Architecture: Decoder-only transformer
- Objective: Causal language modeling
- Parameter-efficient fine-tuning using LoRA (rank 16)
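
If a standalone, adapter-free checkpoint is preferred for deployment, a LoRA adapter like this one can typically be merged back into the base weights with PEFT. A minimal sketch, assuming the full-precision (non-quantized) `model` and `tokenizer` objects from the quick-start example above:

```python
# Merge the LoRA weights into the base model and save a standalone copy
merged_model = model.merge_and_unload()  # `model` is the PeftModel from the quick-start example
merged_model.save_pretrained("llama32-1b-python-docstrings-merged")
tokenizer.save_pretrained("llama32-1b-python-docstrings-merged")
```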

### Compute Infrastructure

#### Hardware

Single Google Colab GPU (T4 or L4)

#### Software

- Python
- PyTorch
- Hugging Face Transformers
- PEFT
- bitsandbytes
- Datasets

---

## Citation

Not applicable.

---

## Glossary

Not applicable.

---

## More Information

See the Hugging Face model page for updates or usage examples.

---

## Model Card Authors

Abdullah Al-Housni

---

## Model Card Contact

Available through the Hugging Face model repository.