Instructions to use cognAI/lil-c3po with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use cognAI/lil-c3po with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="cognAI/lil-c3po")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("cognAI/lil-c3po")
model = AutoModelForCausalLM.from_pretrained("cognAI/lil-c3po")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use cognAI/lil-c3po with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "cognAI/lil-c3po"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cognAI/lil-c3po",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
Use Docker
```shell
docker model run hf.co/cognAI/lil-c3po
```
- SGLang
How to use cognAI/lil-c3po with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "cognAI/lil-c3po" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cognAI/lil-c3po",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
Use Docker images
```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "cognAI/lil-c3po" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cognAI/lil-c3po",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
- Docker Model Runner
How to use cognAI/lil-c3po with Docker Model Runner:
```shell
docker model run hf.co/cognAI/lil-c3po
```
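The curl calls to the OpenAI-compatible `/v1/completions` endpoint can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the base URL is an assumption that depends on which server you started (port 8000 for the vLLM example, port 30000 for the SGLang examples).

```python
import json
import urllib.request

MODEL_ID = "cognAI/lil-c3po"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call, assuming a server is already running locally:
# print(complete("http://localhost:8000", "Once upon a time,"))
```

Separating request construction from sending keeps the payload easy to inspect without a running server.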
deepnight-research/lil-c3po

Model Details:
lil-c3po is an open-source large language model (LLM) produced by a linear merge of two fine-tuned Mistral-7B models, internally referred to as c3-1 and c3-2. Both models were developed in-house, and the merge is intended to combine their characteristics for better performance and utility.
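To make "linear merge" concrete, here is a toy sketch of the general technique: each parameter of the merged model is a weighted average of the matching parameters in the two source models. This is an illustration only, not the authors' merge script. The equal 0.5/0.5 weighting, the function name `linear_merge`, and the tiny stand-in checkpoints are all assumptions; the model card does not state the actual merge weights or tooling.

```python
def linear_merge(state_a, state_b, weight_a=0.5):
    """Return weight_a * state_a + (1 - weight_a) * state_b, parameter by parameter."""
    if state_a.keys() != state_b.keys():
        raise ValueError("both checkpoints must expose the same parameter names")
    weight_b = 1.0 - weight_a
    merged = {}
    for name, values_a in state_a.items():
        values_b = state_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [weight_a * a + weight_b * b for a, b in zip(values_a, values_b)]
    return merged


# Toy stand-ins for two checkpoints: each parameter is a flat list of floats.
c3_1 = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
c3_2 = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = linear_merge(c3_1, c3_2)
print(merged)  # {'layer.weight': [2.0, 3.0], 'layer.bias': [0.5]}
```

A linear merge only makes sense when both checkpoints share the same architecture and parameter names, which holds here because both are fine-tunes of Mistral-7B.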
Model Architecture:
lil-c3po inherits its architecture from c3-1 and c3-2, including Grouped-Query Attention, Sliding-Window Attention, and a byte-fallback BPE tokenizer. The merge aims to combine the strengths of both models for improved language understanding and generation.
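The two attention features named above can be illustrated with a small sketch. Sliding-window attention limits each position to a fixed number of recent positions, and grouped-query attention lets several query heads share one key/value head. The function names, the toy sizes, and the exact window boundary convention are assumptions chosen for readability; they are not taken from the model's configuration.

```python
def sliding_window_causal_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j."""
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Grouped-query attention: consecutive query heads share one key/value head."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Each row is a query position; "x" marks the key positions it can see.
mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# With 8 query heads and 2 key/value heads, heads 0-3 share KV head 0
# and heads 4-7 share KV head 1.
head_map = [kv_head_for_query_head(h, 8, 2) for h in range(8)]
print(head_map)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

In this sketch, position 4 sees only positions 2 through 4, so attention cost per token is bounded by the window rather than by the full sequence length.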
Training Details:
- The first model, internally referred to as c3-1, is a 7B-parameter large language model fine-tuned on the Intel Gaudi 2 processor. It was trained with the Direct Preference Optimization (DPO) method and is designed to perform well across a variety of language tasks.
- The second model, denoted as c3-2, is an instruction fine-tuned version of Mistral-7B. The instruction tuning improves how well it understands and follows instructions.
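DPO, the method named for c3-1 above, trains on pairs of preferred and rejected responses and compares the fine-tuned policy against a frozen reference model. The sketch below computes the standard DPO loss for a single pair from sequence log-probabilities. It is a generic illustration of the objective, not the training code used for c3-1; the function name `dpo_loss` and the value `beta=0.1` are assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities."""
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy raises the chosen response and lowers the rejected one: loss drops.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline > improved)  # True
```

The loss falls as the policy assigns relatively more probability to the preferred response than the reference does, which is the signal DPO optimizes without training a separate reward model.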
License:
lil-c3po is released under the MIT license, fostering open-source collaboration and innovation.
Intended Use:
This merged model is suitable for a broad range of language-related tasks, inheriting the capabilities of the fine-tuned c3-1 and c3-2 models.
Out-of-Scope Uses:
lil-c3po is versatile, but in most cases specific tasks may require additional fine-tuning. The model should not be used to intentionally create hostile or alienating environments for people.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 68.03 |
| AI2 Reasoning Challenge (25-Shot) | 65.02 |
| HellaSwag (10-Shot) | 84.45 |
| MMLU (5-Shot) | 62.36 |
| TruthfulQA (0-shot) | 68.73 |
| Winogrande (5-shot) | 79.16 |
| GSM8k (5-shot) | 48.45 |
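As a quick check, the "Avg." row is the unweighted mean of the six benchmark scores in the table. The snippet below reproduces it from the listed values.

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 65.02,
    "HellaSwag (10-Shot)": 84.45,
    "MMLU (5-Shot)": 62.36,
    "TruthfulQA (0-shot)": 68.73,
    "Winogrande (5-shot)": 79.16,
    "GSM8k (5-shot)": 48.45,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 68.03
```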
Evaluation results
| Benchmark | Split | Metric | Value (Open LLM Leaderboard) |
|---|---|---|---|
| AI2 Reasoning Challenge (25-Shot) | test | normalized accuracy | 65.02 |
| HellaSwag (10-Shot) | validation | normalized accuracy | 84.45 |
| MMLU (5-Shot) | test | accuracy | 62.36 |
| TruthfulQA (0-shot) | validation | mc2 | 68.73 |
| Winogrande (5-shot) | validation | accuracy | 79.16 |
| GSM8k (5-shot) | test | accuracy | 48.45 |