160 GB
74 files
Updated 7 days ago
| Name | Size | Xet hash |
|---|---|---|
| README.md | 951 Bytes | e43f9389 |
| config.json | 991 Bytes | 2c9dafb0 |
| convert.py | 7.08 kB | 969b192b |
| generate.py | 6.3 kB | 01b1677a |
| kernel.py | 22.2 kB | e353d1ed |
| model.py | 38.6 kB | 17243ed6 |
| requirements.txt | 92 Bytes | 0acba707 |
Inference code for DeepSeek models
First, convert the Hugging Face model weight files to this project's format:
export EXPERTS=256
export MP=4
export CONFIG=config.json
python convert.py --hf-ckpt-path ${HF_CKPT_PATH} --save-path ${SAVE_PATH} --n-experts ${EXPERTS} --model-parallel ${MP}
Then chat with the DeepSeek model interactively:
torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --interactive
Or run batch inference on prompts from a file:
torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --input-file ${FILE}
Or run multi-node inference:
torchrun --nnodes ${NODES} --nproc-per-node $((MP / NODES)) --node-rank $RANK --master-addr $ADDR generate.py --ckpt-path ${SAVE_PATH} --config ${CONFIG} --input-file ${FILE}
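The multi-node command computes the per-node process count with shell integer division, `$((MP / NODES))`, which silently truncates when `MP` is not a multiple of `NODES`. The following is a minimal sketch, not part of this repository, that builds the same command line and makes that check explicit. The helper name `torchrun_command` and its parameters are hypothetical, and the sketch assumes every node should run an equal share of the `MP` processes.

```python
import shlex


def torchrun_command(mp, nodes, rank, addr, ckpt_path, config, input_file):
    """Build the multi-node torchrun command line as a single string.

    Hypothetical helper: mirrors the command shown above, but raises
    instead of truncating when mp is not divisible by nodes.
    """
    if nodes <= 0 or mp % nodes != 0:
        raise ValueError(f"MP={mp} must be a positive multiple of NODES={nodes}")
    args = [
        "torchrun",
        "--nnodes", str(nodes),
        "--nproc-per-node", str(mp // nodes),  # equal share of processes per node
        "--node-rank", str(rank),
        "--master-addr", addr,
        "generate.py",
        "--ckpt-path", ckpt_path,
        "--config", config,
        "--input-file", input_file,
    ]
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)
```

Run it once per node with that node's rank, for example `torchrun_command(8, 2, 0, "10.0.0.1", "ckpt", "config.json", "prompts.txt")` on the first of two nodes, and execute the returned string in a shell.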
To use fp8 instead of fp4, remove "expert_dtype": "fp4" from config.json and pass --expert-dtype fp8 to convert.py.
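The config edit can also be scripted rather than done by hand. The sketch below is illustrative only: the helper name `drop_expert_dtype` is hypothetical, and it assumes `"expert_dtype"` is a top-level key of the JSON object in config.json, which the source does not confirm.

```python
import json
from pathlib import Path


def drop_expert_dtype(config_path):
    """Remove a top-level "expert_dtype" entry from a JSON config file.

    Returns True if the key was present and removed, False otherwise.
    Assumption: the key sits at the top level of the JSON object.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    if "expert_dtype" not in config:
        return False  # nothing to do; file left untouched
    del config["expert_dtype"]
    # Rewrite the file; key order is preserved, formatting is normalized.
    path.write_text(json.dumps(config, indent=4) + "\n")
    return True
```

After calling `drop_expert_dtype("config.json")`, rerun convert.py with `--expert-dtype fp8` added to the conversion command shown earlier. Keep a backup of the original file if you may want to switch back to fp4.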
- Last updated: May 22
- Pre-warmed CDN: US, EU