---
model-index:
- name: Kulyk-UK-EN
  results:
  - task:
      type: text-generation
    dataset:
      type: facebook/flores
      name: FLORES
      split: devtest
    metrics:
    - type: bleu
      value: 36.27
      name: BLEU
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
- uk
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- edge
datasets:
- lang-uk/FiftyFiveShades
base_model:
- LiquidAI/LFM2-350M
---

A lightweight model for machine translation from Ukrainian to English, built on the recently published LFM2 model.

Use the [demo](https://huggingface.co/spaces/Yehor/uk-en-translator) to test it, or see the inference sketch at the end of this card.

There is also a model for the opposite direction: [kulyk-en-uk](https://huggingface.co/Yehor/kulyk-en-uk)

**Run with Docker (CPU)**:

```
docker run -p 3000:3000 --rm ghcr.io/egorsmkv/kulyk-rust:latest
```

**Run using Apptainer (CUDA)**:

1. Run it in a shell:

```
apptainer shell --nv ./kulyk.sif

Apptainer> /opt/entrypoints/kulyk --verbose --n-len 1024 --model-path-ue /project/models/kulyk-uk-en.gguf --model-path-eu /project/models/kulyk-en-uk.gguf
```

2. Run it as a web service:

```
apptainer instance start --nv ./kulyk.sif kulyk-ws

# then open http://localhost:3000
```

**Facts**:

- Fine-tuned for 1.4 epochs on 40M samples (filtered by a quality metric) out of ~53.5M
- 354M parameters
- Requires 1 GB of RAM to run with bf16
- BLEU on FLORES-200: 36.27 (see the evaluation sketch at the end of this card)
- Tokens per second: 229.93 (bs=1), 1664.40 (bs=10), 8392.48 (bs=64)
- License: lfm1.0

**Info**:

- The model is named after [Sergiy Kulyk](https://en.wikipedia.org/wiki/Sergiy_Kulyk), who served as chargé d'affaires of Ukraine in the United States

**Training Info** (see the hedged config sketch at the end of this card):

- Learning rate: 3e-5
- Learning rate scheduler type: cosine
- Warmup ratio: 0.05
- Max length: 2048
- Batch size: 10
- `packed=True`
- Sentences <= 1000 characters
- Gradient accumulation steps: 4
- Flash Attention 2
- Time per epoch: 32 hours
- 2x NVIDIA RTX 3090 Ti (24 GB)
- `accelerate` with DeepSpeed
- Memory usage: 22.212-22.458 GB
- torch 2.7.1

**Acknowledgements**:

- [Dmytro Chaplynskyi](https://huggingface.co/dchaplinsky) for providing the compute to train this model
- [lang-uk](https://huggingface.co/lang-uk) members for their compilation of different MT datasets
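
**Inference sketch (Python)**:

The card does not include a usage snippet, so here is a minimal, hedged sketch using the standard `transformers` API, with the repo id taken from this card's name. The prompt format (a plain user message holding the source sentence, via the chat template) is an assumption; verify it against the tokenizer config before relying on it.

```
# Hedged sketch, not an official snippet. Assumption: the source
# sentence is passed as a plain user message via the chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yehor/kulyk-uk-en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card cites ~1 GB of RAM in bf16
    device_map="auto",
)

sentence = "Привіт! Як твої справи?"
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": sentence}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Greedy decoding is usually sufficient for MT; cap new tokens to bound latency.
output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```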
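
**Evaluation sketch (Python)**:

The exact pipeline behind the FLORES-200 BLEU of 36.27 is not described on this card. A common setup scores detokenized model outputs against the FLORES `devtest` references with `sacrebleu`; the sketch below assumes that library and uses placeholder data.

```
# Hedged sketch: sacrebleu is an assumption, not the authors' documented setup.
import sacrebleu

hypotheses = ["Hello! How are you?"]          # model outputs, one per source sentence
references = [["Hello! How are you doing?"]]  # one reference stream (FLORES devtest)

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```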
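
**Training config sketch (Python)**:

The card lists hyperparameters but not the trainer. As one hedged mapping, the listed values fit Hugging Face `TrainingArguments`, which `accelerate` + DeepSpeed can drive. Packing (`packed=True`) and the 2048-token max length belong to the data pipeline rather than these arguments, and the DeepSpeed config path below is hypothetical.

```
# Hedged sketch only: the actual training stack is not documented on this card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="kulyk-uk-en",
    learning_rate=3e-5,              # Learning rate: 3e-5
    lr_scheduler_type="cosine",      # cosine scheduler
    warmup_ratio=0.05,               # Warmup ratio: 0.05
    per_device_train_batch_size=10,  # Batch size: 10
    gradient_accumulation_steps=4,   # Gradient accumulation steps: 4
    num_train_epochs=1.4,            # ~1.4 epochs over the filtered data
    bf16=True,                       # bf16, per the card
    deepspeed="ds_config.json",      # hypothetical DeepSpeed config path
)
```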