---
license: apache-2.0
language:
- en
tags:
- code
- codeact
- python
- mlx
- lora
base_model: Qwen/Qwen2.5-3B
pipeline_tag: text-generation
---

# CodeAct Fine-tuned Qwen2.5-3B

A fine-tuned version of Qwen2.5-3B for code generation with self-evaluation feedback.

## Model Description

This model was fine-tuned using the CodeAct approach with:
- **Base Model:** Qwen/Qwen2.5-3B
- **Training Method:** LoRA (Low-Rank Adaptation)
- **Training Data:** 100 curated Python programming examples
- **Categories:** Math, Strings, Lists, Algorithms, Data Structures

## Usage

### With MLX (Apple Silicon)
```python
from mlx_lm import load, generate

model, tokenizer = load("Phoenix21/codeact-qwen2.5-3b")
# Or with adapter:
# model, tokenizer = load("Qwen/Qwen2.5-3B", adapter_path="Phoenix21/codeact-qwen2.5-3b")

response = generate(model, tokenizer, prompt="Calculate factorial of 5", max_tokens=200)
print(response)
```

### With PyTorch (CUDA/CPU)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, "Phoenix21/codeact-qwen2.5-3b")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B", trust_remote_code=True)
```
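Once the adapter is loaded, generation follows the standard `transformers` workflow. The snippet below continues the code above; the prompt and generation settings are illustrative, not part of the released card:

```python
import torch

# Move to GPU if one is available (requires a CUDA build of PyTorch).
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

inputs = tokenizer("Calculate factorial of 5", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```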

### Interactive Demo
```bash
# Auto-detect backend (MLX/CUDA/CPU)
python interactive_universal.py

# Force specific backend
python interactive_universal.py --backend cuda
python interactive_universal.py --backend mlx
python interactive_universal.py --backend cpu
```

## Training Details

- **Iterations:** 500
- **Batch Size:** 1
- **LoRA Layers:** 16
- **Learning Rate:** 1e-5
- **Platform:** Apple M3 (MLX)
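For reference, a run with these hyperparameters would look roughly like the `mlx_lm.lora` invocation below. This is a reconstruction from the numbers above, not the exact command used; flag names vary between mlx-lm releases (older versions spell `--num-layers` as `--lora-layers`), and the data path is assumed:

```bash
# Hypothetical reconstruction of the fine-tuning run; --data points to a
# directory containing train.jsonl/valid.jsonl (path not given in this card).
python -m mlx_lm.lora \
  --model Qwen/Qwen2.5-3B \
  --train \
  --data ./data \
  --iters 500 \
  --batch-size 1 \
  --num-layers 16 \
  --learning-rate 1e-5
```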

## Response Format

The model uses structured tags:
- `<thought>reasoning</thought>` - Chain of thought
- `<execute>code</execute>` - Python code to execute
- `<solution>answer</solution>` - Final answer
- `<feedback>assessment</feedback>` - Self-evaluation
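Because the tags are plain text, downstream code can pull them out with a simple regex. A minimal sketch (this helper is illustrative and not shipped with the model):

```python
import re

def extract_tag(response: str, tag: str) -> str | None:
    """Return the contents of the first <tag>...</tag> block, if present."""
    match = re.search(rf"<{tag}>(.*?)</{tag}>", response, re.DOTALL)
    return match.group(1).strip() if match else None

response = "<thought>5! = 120</thought>\n<execute>print(120)</execute>\n<solution>120</solution>"
print(extract_tag(response, "execute"))   # -> print(120)
print(extract_tag(response, "solution"))  # -> 120
```

In practice, any code recovered from `<execute>` should be run in a sandboxed interpreter rather than passed to `exec()` directly.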

## Example

**Input:** "Calculate the sum of squares from 1 to 10"

**Output:**
```
<thought>Sum of squares formula: n(n+1)(2n+1)/6</thought>

<execute>
n = 10
result = n * (n + 1) * (2 * n + 1) // 6
print(result)
</execute>

<solution>Sum of squares from 1 to 10 is 385</solution>

<feedback>
score: 10
correctness: correct
efficiency: excellent
explanation: Used O(1) formula instead of O(n) loop
</feedback>
```

## License

Apache 2.0