| license: apache-2.0 | |
| language: | |
| - en | |
| base_model: | |
| - trillionlabs/Gravity-16B-A3B-Base | |
| tags: | |
| - medical | |
| - clinical | |
| - mixture-of-experts | |
| - conversational | |
| - sft | |
| library_name: transformers | |
| pipeline_tag: text-generation | |
| <p align="center"> | |
| <img src="banner.png" alt="L1" style="width: 80%;"> | |
| </p> | |
| # Learning Unit 1 | |
| **L1** (Learning Unit 1) is the first language model from [Lunit](https://www.lunit.io) and the Lunit Consortium, purpose-built for the medical domain. Derived from [Gravity-16B-A3B-Base](https://huggingface.co/trillionlabs/Gravity-16B-A3B-Base), L1 is designed for clinical reasoning and decision support. | |
| ### Key Highlights | |
| * **Medical-Domain Specialized**: Developed specifically for clinical reasoning and medical decision support | |
| * **Efficient MoE**: Only 3B of 16.24B total parameters are active per token, giving fast inference with high capacity | |
| * **Thinking Model**: Performs step-by-step reasoning in `<think>` tags before generating the final answer | |
| > **Note:** L1 reasons internally using `<think>...</think>` blocks before producing a response. This chain-of-thought process improves answer quality but consumes additional tokens. Set `max_tokens` accordingly (recommended: 2048+). | |
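Because the reasoning and the final answer arrive in one string, client code usually needs to separate them. The following is a minimal sketch, assuming the decoded output contains the literal `<think>` and `</think>` tags described in the note; `split_thinking` is a hypothetical helper, not part of the model's code.

```python
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def split_thinking(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw model output.

    Assumes at most one <think>...</think> block appears in the text.
    """
    start = text.find(OPEN_TAG)
    if start == -1:
        # No reasoning block at all: everything is the answer.
        return "", text.strip()
    end = text.find(CLOSE_TAG, start)
    if end == -1:
        # Block was opened but never closed, which typically means the
        # generation was cut off by the token limit before an answer.
        return text[start + len(OPEN_TAG):].strip(), ""
    reasoning = text[start + len(OPEN_TAG):end].strip()
    answer = (text[:start] + text[end + len(CLOSE_TAG):]).strip()
    return reasoning, answer


raw = "<think>Recall the Sepsis-3 definition.</think>Sepsis is defined as ..."
reasoning, answer = split_thinking(raw)
print("reasoning:", reasoning)
print("answer:", answer)
```

An empty answer with non-empty reasoning is a useful signal that `max_tokens` was too small for the request.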
| ### Model Specifications | |
| - Type: Causal Language Model | |
| - Base Model: [Gravity-16B-A3B-Base](https://huggingface.co/trillionlabs/Gravity-16B-A3B-Base) from Trillion Labs and Lunit Consortium | |
| - Architecture: GravityMoE (Sparse Mixture-of-Experts with MLA) | |
| - Total Parameters: 16.24B | |
| - Active Parameters: 3B | |
| - Number of Layers: 28 | |
| - Attention Heads: 16 | |
| - KV Heads: 16 | |
| - Hidden Size: 2048 | |
| - MoE Intermediate Size: 1408 | |
| - Routed Experts: 64 (top-8 selection) | |
| - Shared Experts: 1 | |
| - Context Length: 32,768 tokens | |
| - Vocabulary Size: 151,552 | |
| - Tokenizer: GLM-4.5 | |
| - Precision: bf16 | |
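A few sizing figures follow directly from the specifications above. This is a back-of-the-envelope sketch: it counts weights only (no KV cache, activations, or framework overhead) and assumes the shared expert is always active alongside the top-8 routed experts, which is the usual MoE convention but is not stated in this card.

```python
# Values copied from the specification list.
TOTAL_PARAMS = 16.24e9
ACTIVE_PARAMS = 3e9
ROUTED_EXPERTS = 64
TOP_K = 8
SHARED_EXPERTS = 1
BYTES_PER_PARAM_BF16 = 2  # bf16 stores each parameter in 16 bits

# Share of parameters that participate in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Memory needed just to hold the weights in bf16.
weight_bytes = TOTAL_PARAMS * BYTES_PER_PARAM_BF16
weight_gib = weight_bytes / 2**30

# Experts evaluated per token in one MoE layer (assumption: shared expert always on).
experts_used = TOP_K + SHARED_EXPERTS
experts_total = ROUTED_EXPERTS + SHARED_EXPERTS

print(f"active fraction: {active_fraction:.1%}")
print(f"bf16 weights: {weight_bytes / 1e9:.2f} GB ({weight_gib:.2f} GiB)")
print(f"experts per token: {experts_used} of {experts_total}")
```

So although only about 18.5% of the parameters are used per token, all of the weights (roughly 32.5 GB in bf16) still have to be resident in memory.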
| ## Quickstart | |
| ### SGLang (Recommended) | |
| **Install:** | |
| ```bash | |
| pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python" | |
| ``` | |
| **Launch server:** | |
| ```bash | |
| python -m sglang.launch_server \ | |
| --model-path learning-unit/L1-16B-A3B \ | |
| --port 9006 --host 0.0.0.0 \ | |
| --tp 1 --dtype bfloat16 --trust-remote-code \ | |
| --attention-backend triton \ | |
| --moe-runner-backend triton | |
| ``` | |
| **Query:** | |
| ```bash | |
| curl -X POST http://localhost:9006/v1/chat/completions \ | |
| -H "Content-Type: application/json" \ | |
| -d '{ | |
| "model": "learning-unit/L1-16B-A3B", | |
| "messages": [ | |
| {"role": "user", "content": "What are the diagnostic criteria for sepsis?"} | |
| ], | |
| "max_tokens": 2048 | |
| }' | |
| ``` | |
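The same request can be sent from Python. The sketch below uses only the standard library and assumes the server was launched as shown above (port 9006) and returns the standard OpenAI-style chat completion body; `build_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: host and port match the launch command above.
BASE_URL = "http://localhost:9006/v1/chat/completions"
MODEL = "learning-unit/L1-16B-A3B"


def build_request(prompt: str, max_tokens: int = 2048) -> urllib.request.Request:
    """Build the POST request for a single-turn chat completion."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, max_tokens: int = 2048, timeout: float = 600.0) -> str:
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(build_request(prompt, max_tokens), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Requires a running server:
# print(ask("What are the diagnostic criteria for sepsis?"))
```

The generous timeout is deliberate: responses include the reasoning block, so a 2048-token completion can take noticeably longer than a short answer.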
| ### Transformers | |
| **Install:** | |
| ```bash | |
| pip install "transformers>=5.0" torch | |
| ``` | |
| ```python | |
| import torch | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| model_name = "learning-unit/L1-16B-A3B" | |
| # Load the weights in bf16 and let Accelerate place them on the available devices | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     model_name, | |
|     dtype=torch.bfloat16, | |
|     device_map="auto", | |
|     trust_remote_code=True, | |
| ) | |
| tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True) | |
| messages = [ | |
|     {"role": "user", "content": "What are the diagnostic criteria for sepsis?"} | |
| ] | |
| # Render the chat template to a prompt string that ends with the assistant turn | |
| text = tokenizer.apply_chat_template( | |
|     messages, | |
|     tokenize=False, | |
|     add_generation_prompt=True, | |
| ) | |
| model_inputs = tokenizer([text], return_tensors="pt").to(model.device) | |
| # Sample a completion; 2048 new tokens leaves room for the <think> block | |
| generated_ids = model.generate( | |
|     **model_inputs, | |
|     max_new_tokens=2048, | |
|     temperature=0.7, | |
|     do_sample=True, | |
| ) | |
| # Drop the prompt tokens so that only the newly generated tokens are decoded | |
| generated_ids = [ | |
|     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids) | |
| ] | |
| response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0] | |
| print(response) | |
| ``` | |
| ## Examples | |
| L1 is specialized for the medical domain and covers a wide range of clinical scenarios. Below are representative examples from real-world clinical use cases. | |
| ### Medical Q&A | |
| > A 45-year-old woman with lupus nephritis on mycophenolate and prednisone develops fever, dry cough, and bilateral ground-glass opacities on chest CT. Her CD4 count is 180. What is your differential diagnosis and recommended workup? | |
| ### Patient Education | |
| > I have diabetes and use insulin daily. What is the proper way to store insulin at home? | |
| ### Clinical Documentation | |
| > Please draft an overnight progress note. Patient labs: RBC 4.5, WBC 8. Vitals: HR 82, BP 118/76, RR 15, Temp 37.1. Nurse reports stable overnight. Plan: continue antibiotics, recheck labs in the morning. | |
| ### Emergency Triage | |
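| > Perform KTAS triage for the following emergency department patient and suggest an initial diagnosis and differential diagnoses. A 78-year-old woman arrived at the emergency department by 119 ambulance. At around 22:00 she suddenly developed left-sided facial droop and slurred speech. She complains of headache and has a history of hypertension. Vital signs: blood pressure 172/88, heart rate 92, respiratory rate 14, temperature 36.8, oxygen saturation 98%; she is alert. There is no limb weakness. | |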
| ### Adverse Drug Reaction (ADR) Causality Assessment | |
| > Assess causality for the following patient's adverse drug reaction (ADR) using the WHO-UMC criteria. An 80-year-old woman hospitalized for bronchiectasis received moxifloxacin 400mg IV. During administration she newly developed generalized skin itching; after the drug was discontinued, the patient herself reported that the itching was subsiding, and she subsequently recovered. Rechallenge was not performed. She has no prior history of drug allergy, and no other concomitant medication or skin condition likely to cause itching was identified. | |
| ## Benchmark | |
| All benchmarks were evaluated using [CoEval](https://github.com/lunit-io/CoEval), Lunit's open-source medical LLM evaluation framework. Evaluations use greedy decoding (temperature=0). To reproduce these results: | |
| ```bash | |
| git clone https://github.com/lunit-io/CoEval.git | |
| cd CoEval | |
| ``` | |
| Refer to the [CoEval Quickstart](https://github.com/lunit-io/CoEval#quickstart) for setup and evaluation instructions. | |
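Note that the benchmark setting (greedy decoding) differs from the sampling setting used in the Quickstart (`temperature=0.7`, `do_sample=True`). The sketch below contrasts the two; it is not CoEval's configuration, and both function names are hypothetical helpers.

```python
def generation_kwargs(greedy: bool, max_new_tokens: int = 2048) -> dict:
    """Keyword arguments for `model.generate()` in Transformers."""
    if greedy:
        # do_sample=False makes generate() pick the most likely token at each step.
        return {"max_new_tokens": max_new_tokens, "do_sample": False}
    # Sampling values copied from the Quickstart example.
    return {"max_new_tokens": max_new_tokens, "do_sample": True, "temperature": 0.7}


def chat_payload_overrides(greedy: bool, max_tokens: int = 2048) -> dict:
    """Extra fields for an OpenAI-compatible /v1/chat/completions payload."""
    return {"temperature": 0.0 if greedy else 0.7, "max_tokens": max_tokens}


print(generation_kwargs(greedy=True))
print(chat_payload_overrides(greedy=True))
```

With Transformers, the dictionary is unpacked into the call, for example `model.generate(**model_inputs, **generation_kwargs(greedy=True))`.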
| ### MCQA Benchmarks | |
| | Model | [PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA) | [AttrBench](https://huggingface.co/datasets/osunlp/AttributionBench) | [MedQA](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options) | [CareQA](https://huggingface.co/datasets/HPAI-BSC/CareQA) | [HeadQA](https://huggingface.co/datasets/alesi12/head_qa_v2) | [MedMCQA](https://huggingface.co/datasets/lighteval/med_mcqa) | [MMLU-Pro (Health)](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) | [M-ARC](https://huggingface.co/datasets/mkieffer/M-ARC) | [MetaMedQA](https://huggingface.co/datasets/maximegmd/MetaMedQA) | [MedHallu](https://huggingface.co/datasets/UTAustin-AIHealth/MedHallu) | [MedCalc](https://huggingface.co/datasets/ncbi/MedCalc-Bench) | [MedBullets](https://huggingface.co/datasets/mkieffer/Medbullets) 4-opt | [MedBullets](https://huggingface.co/datasets/mkieffer/Medbullets) 5-opt | [MedXpertQA](https://huggingface.co/datasets/TsinghuaC3I/MedXpertQA)-R | [MedXpertQA](https://huggingface.co/datasets/TsinghuaC3I/MedXpertQA)-U | Weighted Avg. | |
| |:---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:| | |
| | GPT-OSS-120B | 78.00 | 76.10 | 91.10 | 91.00 | 88.40 | 74.80 | 74.60 | 40.00 | 76.50 | 83.50 | 30.30 | 84.70 | 82.10 | 35.60 | 32.90 | 79.43 | | |
| | GPT-OSS-20B | 75.80 | 74.80 | 83.90 | 84.80 | 83.30 | 65.40 | 70.50 | 31.00 | 70.10 | 81.30 | 29.20 | 73.40 | 70.50 | 24.70 | 21.20 | 73.38 | | |
| | Qwen3.5-122B | 76.40 | 55.68 | 87.80 | 86.40 | 84.00 | 74.40 | 73.00 | 59.00 | 73.90 | 37.50 | 53.70 | 79.20 | 79.50 | 35.90 | 35.30 | 75.08 | | |
| | MedGemma-27B | 73.40 | 74.80 | 84.40 | 85.00 | 83.80 | 71.90 | 73.00 | 48.00 | 69.60 | 81.40 | 24.10 | 73.70 | 68.80 | 19.10 | 20.50 | 73.99 | | |
| | Gemma4-26B-A4B | 76.40 | 72.00 | 81.80 | 84.50 | 82.30 | 67.30 | 73.50 | 67.00 | 71.50 | 86.50 | 45.60 | 73.70 | 67.50 | 45.10 | 39.20 | 75.34 | | |
| | L1-16B-A3B | 84.20 | 78.40 | 85.50 | 88.20 | 85.80 | 76.70 | 74.90 | 82.00 | 73.10 | 76.10 | 43.90 | 78.90 | 70.80 | 27.50 | 29.20 | 77.74 | | |
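The average in the last column is not a plain mean of the per-benchmark scores. As a quick check, the sketch below recomputes the unweighted mean of the L1-16B-A3B row, which comes out near 70.35 rather than the reported 77.74. The card does not state the weights, so `weighted_mean` is a hypothetical helper and the equal weights passed to it are placeholders (per-benchmark sample counts would be a natural choice, but that is an assumption).

```python
from statistics import mean

# L1-16B-A3B scores copied from the table above (average column excluded).
l1_scores = {
    "PubMedQA": 84.20, "AttrBench": 78.40, "MedQA": 85.50, "CareQA": 88.20,
    "HeadQA": 85.80, "MedMCQA": 76.70, "MMLU-Pro (Health)": 74.90,
    "M-ARC": 82.00, "MetaMedQA": 73.10, "MedHallu": 76.10, "MedCalc": 43.90,
    "MedBullets 4-opt": 78.90, "MedBullets 5-opt": 70.80,
    "MedXpertQA-R": 27.50, "MedXpertQA-U": 29.20,
}
REPORTED_AVG = 77.74  # last column of the table


def weighted_mean(scores: dict, weights: dict) -> float:
    """Weighted mean of scores; `weights` maps benchmark name to its weight."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


unweighted = mean(l1_scores.values())
equal_weights = {name: 1 for name in l1_scores}  # placeholder weights

print(f"unweighted mean:  {unweighted:.2f}")
print(f"reported average: {REPORTED_AVG:.2f}")
print(f"equal-weight check: {weighted_mean(l1_scores, equal_weights):.2f}")
```

The gap between the two numbers indicates that larger, higher-scoring benchmarks carry more weight than small, hard ones such as MedXpertQA.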
| ### Chat Task | |
| | Model | [HealthBench-Consensus](https://github.com/openai/simple-evals) | | |
| |:---|:---:| | |
| | GPT-OSS-120B | 90.60 | | |
| | GPT-OSS-20B | 78.70 | | |
| | Qwen3.5-122B | 92.20 | | |
| | MedGemma-27B | 90.70 | | |
| | Gemma4-26B-A4B | 92.60 | | |
| | L1-16B-A3B | 93.50 | | |
| ## Citation | |
| ```bibtex | |
| @misc{lunit2026l1, | |
| title={L1: The First Clinical Language Model by Lunit}, | |
| author={Lunit}, | |
| year={2026}, | |
| url={https://huggingface.co/learning-unit/L1-16B-A3B} | |
| } | |
| ``` | |
| ## Limitations | |
| - **Not a substitute for professional medical judgment.** L1 may generate factually incorrect, incomplete, or outdated clinical information. All outputs should be verified by qualified healthcare professionals. | |
| - **Thinking overhead.** Chain-of-thought reasoning in `<think>` tags increases token consumption and latency compared to non-thinking models of similar size. | |
| - **Context length.** Maximum context length is 32,768 tokens. | |
| - **No real-time knowledge.** The model's knowledge is limited to its training data cutoff and does not reflect the latest medical guidelines or drug approvals. | |
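The thinking overhead and the context limit interact: a long prompt leaves less room for reasoning and answer tokens. The following is a minimal sketch of a budget check, assuming prompt and generated tokens share the same 32,768-token window (the usual behaviour for causal language models); `clamp_max_new_tokens` is a hypothetical helper.

```python
CONTEXT_LENGTH = 32_768  # maximum context length stated above


def clamp_max_new_tokens(
    prompt_tokens: int,
    requested: int = 2048,
    context_length: int = CONTEXT_LENGTH,
) -> int:
    """Return the largest generation budget that still fits in the context window."""
    if prompt_tokens >= context_length:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens, which leaves no room "
            f"in a {context_length}-token context"
        )
    return min(requested, context_length - prompt_tokens)


print(clamp_max_new_tokens(1_000))   # short prompt: full request fits
print(clamp_max_new_tokens(32_000))  # long prompt: budget is reduced
```

With the Transformers example, `prompt_tokens` would be `model_inputs.input_ids.shape[1]`, and the result can be passed as `max_new_tokens`.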
| ## Acknowledgements | |
| This work was supported by the Domain-Specific Foundation Model Project (인공지능 특화 파운데이션 모델 프로젝트), funded by the Ministry of Science and ICT (과학기술정보통신부) and managed by the National IT Industry Promotion Agency (NIPA). | |
| L1 is a collaborative effort by the following consortium members: | |
| **Industry** | |
| - Lunit | |
| - Trillion Labs | |
| - SK Biopharmaceuticals | |
| - Kakao Healthcare | |
| - AIGEN Sciences | |
| - D-Circle | |
| - Rebellions | |
| - Standigm | |
| **Academia** | |
| - Prof. Choi Yun-jae's Lab from KAIST | |
| - Prof. Hong Seung-hoon's Lab from KAIST | |
| - Prof. Jung Yu-seong's Lab from SNU | |
| - Prof. Kim Hyun-woo's Lab from KAIST | |
| - Prof. Kim Tae-gyun's Lab from KAIST | |
| - Prof. Ye Jong-cheol's Lab from KAIST | |
| **Hospitals** | |
| - NHIS Ilsan Hospital | |
| - Ewha Womans University Seoul Hospital | |
| - Keimyung University Dongsan Medical Center | |
| - Konyang University Hospital | |
| - Korea University Research & Business Foundation | |
| - Kyung Hee University Hospital at Gangdong | |
| - Kyung Hee University Medical Center | |
| - Pusan National University Yangsan Hospital | |
| - Yongin Severance Hospital | |
| <p align="center"> | |
| <img src="consortium.png" alt="Consortium Members" style="width: 80%;"> | |
| </p> | |
| ## License | |
| This model is licensed under the [Apache 2.0 License](LICENSE). | |
| ## Contact | |
| - Taesoo Kim (김태수): [taesoo.kim@lunit.io](mailto:taesoo.kim@lunit.io) | |
| - Donggeun Yoo (유동근): [dgyoo@lunit.io](mailto:dgyoo@lunit.io) | |