LDARNet-110M

Pretrained LDARNet checkpoint (~110M parameters) for genomic modeling, with learnable DNA tokenization (dynamic chunking) and BiMamba-2 layers.

Paper: arXiv:2606.04552
Code: ICML-LDARNet

Files

model_ckpt_110m.pt: masked language modeling (MLM) checkpoint with the LDarConfig embedded

Download

Clone the code repo (ICML-LDARNet) and install its dependencies, then download the weights:

huggingface-cli download darlednik/LDARNet-110M model_ckpt_110m.pt --local-dir models_ckpts

Load

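import torch
from ldar.utils.ckpt import load_ldar_from_ckpt  # provided by the code repo

# Returns the model together with the config stored in the checkpoint.
model, cfg = load_ldar_from_ckpt(
    "models_ckpts/model_ckpt_110m.pt",  # path used in the download step
    device="cuda",
    dtype=torch.bfloat16,
)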
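
As a quick sanity check after loading, the parameter count can be compared against the ~110M figure. The helper below is a minimal sketch and not part of the repo: `count_parameters` is a hypothetical name, and it assumes the returned `model` exposes `parameters()` the way a `torch.nn.Module` does.

```python
def count_parameters(model, trainable_only=False):
    """Sum element counts over model.parameters().

    Assumes each parameter has .numel() and .requires_grad,
    as torch.nn.Parameter does.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )

# Usage, once `model` has been loaded as shown above:
# print(f"{count_parameters(model) / 1e6:.1f}M parameters")
```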

Architecture

| Component | Layout | `d_model` |
|---|---|---|
| Encoder | `m3t1`: 3× BiMamba-2 + 1 local-attention layer | 512 |
| Backbone | `M10`: 10× BiMamba-2 (+ SwiGLU) | 768 |
| Decoder | `m4`: 4× BiMamba-2 | 512 |
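
The layout strings in the table read as letter and count pairs. The following is an illustrative sketch of that reading only: `parse_layout` and `LAYER_KINDS` are hypothetical names, and the repo's own config parsing may differ.

```python
import re

# Layer codes as described in the table above. The mapping is an
# illustrative assumption, not the repo's own parser.
LAYER_KINDS = {
    "m": "BiMamba-2",
    "M": "BiMamba-2 + SwiGLU",
    "t": "local attention",
}

def parse_layout(layout):
    """Split a layout string such as 'm3t1' into (layer kind, count) pairs."""
    pairs = re.findall(r"([A-Za-z])(\d+)", layout)
    # Reject empty strings and anything the letter+count pattern did not cover.
    if not pairs or "".join(k + n for k, n in pairs) != layout:
        raise ValueError(f"unrecognized layout string: {layout!r}")
    unknown = [k for k, _ in pairs if k not in LAYER_KINDS]
    if unknown:
        raise ValueError(f"unknown layer code(s): {unknown}")
    return [(LAYER_KINDS[k], int(n)) for k, n in pairs]

for name, layout in [("encoder", "m3t1"), ("backbone", "M10"), ("decoder", "m4")]:
    stages = parse_layout(layout)
    print(name, stages, "total layers:", sum(n for _, n in stages))
```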

Compression ratio N = 4
Byte vocabulary: {A, C, G, T, N, [MASK], <pad>}
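
To make the two settings above concrete, here is a minimal sketch of base-level encoding and of the sequence length the backbone would see. Several things are assumptions: the model card lists the vocabulary symbols but not their ids, so the id order below is hypothetical; `encode` and `expected_chunks` are illustrative helpers, not repo functions; and N = 4 is read here as a target average ratio, since dynamic chunking is learned and the actual chunk count may vary per sequence.

```python
import math

# Hypothetical id assignment: the symbols come from the model card,
# the order (and therefore the ids) is an assumption.
VOCAB = ["A", "C", "G", "T", "N", "[MASK]", "<pad>"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

def encode(seq, length=None):
    """Map a DNA string to ids, one per base.

    Characters outside A/C/G/T are mapped to N. If `length` is given,
    the result is right-padded with <pad> up to that length.
    """
    n_id = TOKEN_TO_ID["N"]
    ids = [TOKEN_TO_ID.get(base, n_id) for base in seq.upper()]
    if length is not None:
        if len(ids) > length:
            raise ValueError("sequence is longer than the requested length")
        ids += [TOKEN_TO_ID["<pad>"]] * (length - len(ids))
    return ids

def expected_chunks(num_bases, ratio=4):
    """Approximate number of chunks for a target compression ratio."""
    return math.ceil(num_bases / ratio)

print(encode("ACGTN"))        # ids under the assumed ordering
print(expected_chunks(1024))  # approximate backbone length for 1024 bases
```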

Citation

@misc{ledneva2026ldarnetdnaadaptiverepresentation,
      title={LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling},
      author={Daria Ledneva and Denis Kuznetsov},
      year={2026},
      eprint={2606.04552},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2606.04552},
}
