Add information on project to readme
Browse files
README.md
CHANGED
|
@@ -1,3 +1,106 @@
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
-
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# BitAgent-8B
|
| 6 |
+
|
| 7 |
+
BitAgent-8B is an open-source, tool-calling language model fine-tuned and incentivized on [Bittensor](https://bittensor.com) Subnet #20 -- BitAgent. This model was trained for complex function-calling tasks, drawing on decentralized AI efforts and an open, community-driven training approach.
|
| 8 |
+
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
## Overview
|
| 12 |
+
|
| 13 |
+
**BitAgent-8B** arose from the collaborative efforts within Bittensor Subnet #20. It leverages:
|
| 14 |
+
- **Decentralized AI**: Community-driven hosting and validation that provide continuous training signals, ensuring that the model adapts to a wide range of function-calling and workflow-building tasks.
|
| 15 |
+
- **Offset by Several Competitions**: The model’s parameters have been refined in multiple training competitions, leveraging evolving multi-turn dialogue tasks.
|
| 16 |
+
- **Broad Tool-Calling Agency**: It handles a diverse set of functions spanning financial calculations, workflow management, deployment scripts, and more.
|
| 17 |
+
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## Key Features
|
| 21 |
+
- **Enhanced Tool Usage**: Fine-tuned to select, utilize, and chain functions from a toolset effectively.
|
| 22 |
+
- **Multi-Turn Dialogue**: Maintains context across turns, enabling complex tasks with step-by-step interactions.
|
| 23 |
+
- **BFCL-Style Adherence**: Engineered for strong performance on the [Berkeley Function Calling Leaderboard](https://gorilla.cs.berkeley.edu/leaderboard.html).
|
| 24 |
+
- **Decentralized & Community Driven**: Developed and hosted on a global, miner-supported network (Subnet #20 on Bittensor), encouraging open contribution and verifying model performance without reliance on centralized resources.
|
| 25 |
+
|
| 26 |
+
---
|
| 27 |
+
|
| 28 |
+
## Performance on BFCL
|
| 29 |
+
BitAgent-8B secured a **top-10 rank (6th place)** on the BFCL, ahead of notable commercial models such as:
|
| 30 |
+
- 4o Mini
|
| 31 |
+
- Gemini
|
| 32 |
+
- Qwen
|
| 33 |
+
- DeepSeek
|
| 34 |
+
- Claude
|
| 35 |
+
|
| 36 |
+
While some high-ranking small-form models on BFCL appear to overfit specific function-calling tasks, **BitAgent-8B** was purposefully trained to preserve broad generalization. This emphasis on diverse tasks ensures robust, consistent performance across a variety of real-world use cases.
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
+
## Open-Source Incentive Training
|
| 41 |
+
**BitAgent-8B** was developed with an **incentive mechanism** on Bittensor Subnet #20:
|
| 42 |
+
- Miners contributed compute to fine-tune and host candidate models.
|
| 43 |
+
- Validators continuously tested these models on domain-specific tool-calling prompts.
|
| 44 |
+
- Reward signals drove iterative improvements and overcame overfitting pitfalls.
|
| 45 |
+
|
| 46 |
+
This ecosystem ensures BitAgent-8B remains:
|
| 47 |
+
1. **Adaptable** to new tasks.
|
| 48 |
+
2. **Decentralized** in design, with no single entity controlling its training pipeline.
|
| 49 |
+
3. **Transparent** in performance metrics, with all scoring data publicly available on the BFCL.
|
| 50 |
+
|
| 51 |
+
For additional background on Subnet #20 and details about the training setup, see the [Subnet 20 Readme](https://github.com/RogueTensor/bitagent_subnet) (or refer to the documentation you already have).
|
| 52 |
+
|
| 53 |
+
---
|
| 54 |
+
|
| 55 |
+
## Installation & Usage
|
| 56 |
+
|
| 57 |
+
Below is a minimal example of loading **BitAgent-8B** using Hugging Face `transformers`. Adjust paths and parameters according to your environment:
|
| 58 |
+
|
| 59 |
+
```python
|
| 60 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer
|
| 61 |
+
import torch
|
| 62 |
+
|
| 63 |
+
model_id = "BitAgent/Bitagent-8b"
|
| 64 |
+
|
| 65 |
+
tokenizer = AutoTokenizer.from_pretrained(model_id)
|
| 66 |
+
model = AutoModelForCausalLM.from_pretrained(
|
| 67 |
+
model_id,
|
| 68 |
+
torch_dtype="auto",
|
| 69 |
+
device_map="auto"
|
| 70 |
+
)
|
| 71 |
+
|
| 72 |
+
# Example usage in Python
|
| 73 |
+
# (Please adapt to your environment or use Bittensor's APIs accordingly)
|
| 74 |
+
prompt = """You are an expert in function calling.
|
| 75 |
+
You are given a question and a set of possible tools.
|
| 76 |
+
You must decide if a tool should be invoked.
|
| 77 |
+
Format tool calls strictly as: [tool_name(param=value, param2=value2)]
|
| 78 |
+
If no tool is relevant or required parameters are missing, please respond that the request can't be fulfilled."""
|
| 79 |
+
|
| 80 |
+
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
|
| 81 |
+
outputs = model.generate(inputs["input_ids"], max_new_tokens=100, do_sample=False)
|
| 82 |
+
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
|
| 83 |
+
```
|
| 84 |
+
|
| 85 |
+
## Sample Prompt & Invocation Structure
|
| 86 |
+
|
| 87 |
+
Following the BFCL-style function invocation structure, BitAgent-8B expects prompts that look like this:
|
| 88 |
+
### Prompt Template
|
| 89 |
+
```
|
| 90 |
+
"You are an expert in composing functions.
|
| 91 |
+
You are given a question and a set of possible functions.
|
| 92 |
+
Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
|
| 93 |
+
If none of the functions can be used, point it out.
|
| 94 |
+
If the given question lacks the parameters required by any function, also point it out.
|
| 95 |
+
You should only return the function call in tools call sections.
|
| 96 |
+
|
| 97 |
+
If you decide to invoke any of the function(s),
|
| 98 |
+
you MUST put it in the format of [func_name(params_name1=params_value1, params_name2=params_value2...)].
|
| 99 |
+
You SHOULD NOT include any other text in the response.
|
| 100 |
+
Here is a list of functions in JSON format that you can invoke:
|
| 101 |
+
|
| 102 |
+
{functions}"```
|
| 103 |
+
|
| 104 |
+
|
| 105 |
+
## License
|
| 106 |
+
BitAgent-8B is open-sourced under the Apache 2.0 License
|