	---
	license: apache-2.0
	language:
	- en
	tags:
	- code
	- codeact
	- python
	- mlx
	- lora
	base_model: Qwen/Qwen2.5-3B
	pipeline_tag: text-generation
	---

	# CodeAct Fine-tuned Qwen2.5-3B

	A LoRA fine-tune of Qwen2.5-3B for Python code generation. Each response includes the model's reasoning, executable code, a final answer, and a self-evaluation of that answer.

	## Model Description

	This model was fine-tuned using the CodeAct approach with:
	- Base Model: Qwen/Qwen2.5-3B
	- Training Method: LoRA (Low-Rank Adaptation)
	- Training Data: 100 curated Python programming examples
	- Categories: Math, Strings, Lists, Algorithms, Data Structures

	## Usage

	### With MLX (Apple Silicon)
	```python
	from mlx_lm import load, generate

	model, tokenizer = load("Phoenix21/codeact-qwen2.5-3b")
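	# Or load the base model and apply the adapter on top. mlx_lm reads
	# adapter_path as a local directory, so the adapter files from this repo
	# may need to be downloaded first ("path/to/adapter" is a placeholder):
	# model, tokenizer = load("Qwen/Qwen2.5-3B", adapter_path="path/to/adapter")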

	response = generate(model, tokenizer, prompt="Calculate the factorial of 5", max_tokens=200)
	print(response)
	```

	### With PyTorch (CUDA/CPU)
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	from peft import PeftModel

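	base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
	# Attach the LoRA adapter weights to the base model
	model = PeftModel.from_pretrained(base_model, "Phoenix21/codeact-qwen2.5-3b")
	tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)

	# Generate a response for a sample prompt
	inputs = tokenizer("Calculate the factorial of 5", return_tensors="pt")
	outputs = model.generate(**inputs, max_new_tokens=200)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))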
	```

	### Interactive Demo
	```bash
	# Auto-detect backend (MLX/CUDA/CPU)
	python interactive_universal.py

	# Force specific backend
	python interactive_universal.py --backend cuda
	python interactive_universal.py --backend mlx
	python interactive_universal.py --backend cpu
	```

	## Training Details

	- Iterations: 500
	- Batch Size: 1
	- LoRA Layers: 16
	- Learning Rate: 1e-5
	- Platform: Apple M3 (MLX)

	## Response Format

	The model organizes each response into four tagged sections:
	- `<thought>reasoning</thought>` - Chain of thought
	- `<execute>code</execute>` - Python code to execute
	- `<solution>answer</solution>` - Final answer
	- `<feedback>assessment</feedback>` - Self-evaluation
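
Because the sections are delimited by plain tags, a response can be split apart with a regular expression. The following is a minimal sketch, not part of this repository: `parse_response` and `sample` are hypothetical names, and it assumes each tag appears at most once and is properly closed.

```python
import re

# Tag names taken from the Response Format list.
TAGS = ("thought", "execute", "solution", "feedback")


def parse_response(text: str) -> dict[str, str]:
    """Return the stripped content of each tag found in a response."""
    sections = {}
    for tag in TAGS:
        # DOTALL lets a section span several lines; .*? stops at the first close tag.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        if match:
            sections[tag] = match.group(1).strip()
    return sections


# Hypothetical response used only to exercise the parser.
sample = (
    "<thought>Add the two numbers</thought>\n"
    "<execute>\nprint(2 + 3)\n</execute>\n"
    "<solution>The sum is 5</solution>"
)
parsed = parse_response(sample)
print(parsed["execute"])
print("feedback" in parsed)
```

Missing sections are simply absent from the returned dictionary, so callers can check for a tag before using it.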

	## Example

	Input: "Calculate the sum of squares from 1 to 10"

	Output:
	```
	<thought>Sum of squares formula: n(n+1)(2n+1)/6</thought>

	<execute>
	n = 10
	result = n * (n + 1) * (2 * n + 1) // 6
	print(result)
	</execute>

	<solution>Sum of squares from 1 to 10 is 385</solution>

	<feedback>
	score: 10
	correctness: correct
	efficiency: excellent
	explanation: Used O(1) formula instead of O(n) loop
	</feedback>
	```
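
The example output can be checked mechanically. The sketch below assumes the response text is available as a string; `run_execute_block` and `parse_feedback` are hypothetical helpers, not part of this repository. It runs the `<execute>` block, compares the printed value against a brute-force sum, and reads the feedback fields. Calling `exec` on model output is unsafe outside a sandbox, so treat this as an illustration only.

```python
import contextlib
import io
import re

# The example response from this section, copied as a string.
example_output = """<thought>Sum of squares formula: n(n+1)(2n+1)/6</thought>

<execute>
n = 10
result = n * (n + 1) * (2 * n + 1) // 6
print(result)
</execute>

<solution>Sum of squares from 1 to 10 is 385</solution>

<feedback>
score: 10
correctness: correct
efficiency: excellent
explanation: Used O(1) formula instead of O(n) loop
</feedback>
"""


def run_execute_block(text: str) -> str:
    """Run the code inside <execute> and return what it printed."""
    match = re.search(r"<execute>(.*?)</execute>", text, re.DOTALL)
    if match is None:
        raise ValueError("no <execute> block found")
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(match.group(1), {})  # WARNING: only run trusted or sandboxed code
    return buffer.getvalue().strip()


def parse_feedback(text: str) -> dict[str, str]:
    """Split the 'key: value' lines of the <feedback> block into a dict."""
    match = re.search(r"<feedback>(.*?)</feedback>", text, re.DOTALL)
    fields = {}
    if match:
        for line in match.group(1).strip().splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


printed = run_execute_block(example_output)
brute_force = sum(i * i for i in range(1, 11))  # O(n) reference computation
print(printed, brute_force)

feedback = parse_feedback(example_output)
print(feedback["score"], feedback["correctness"])
```

Comparing against an independent computation is one way to sanity-check the model's self-reported `correctness` field rather than trusting it directly.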

	## License

	Apache 2.0