LDARNet-110M

Pretrained LDARNet (~110M params) with learnable DNA tokenization (dynamic chunking + BiMamba-2).

Files

  • model_ckpt_110m.pt: masked language modeling (MLM) checkpoint with the LDarConfig embedded

Download

Clone the code repo and install dependencies, then download the weights:

huggingface-cli download darlednik/LDARNet-110M model_ckpt_110m.pt --local-dir models_ckpts
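
If you prefer to fetch the weights from Python, the command above can be approximated with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch, not part of the LDARNet code base: the helper name `download_checkpoint` and the module-level constants are illustrative, and the repo id, filename, and target directory are copied from the CLI command.

```python
# Values taken from the CLI command in this model card.
REPO_ID = "darlednik/LDARNet-110M"
FILENAME = "model_ckpt_110m.pt"
LOCAL_DIR = "models_ckpts"


def download_checkpoint(local_dir: str = LOCAL_DIR) -> str:
    """Fetch the checkpoint from the Hub and return its local file path.

    Hypothetical helper for illustration; it is not shipped with LDARNet.
    """
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)


# Usage (requires network access):
# ckpt_path = download_checkpoint()
```

With the default arguments the file should land at `models_ckpts/model_ckpt_110m.pt`, which is the path the Load snippet expects.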

Load

import torch
from ldar.utils.ckpt import load_ldar_from_ckpt

model, cfg = load_ldar_from_ckpt(
    "models_ckpts/model_ckpt_110m.pt",
    device="cuda",
    dtype=torch.bfloat16,
)

Architecture

Component | Layout                                       | d_model
Encoder   | m3t1: 3× BiMamba-2 + 1 local-attention layer | 512
Backbone  | M10: 10× BiMamba-2 (+ SwiGLU)                | 768
Decoder   | m4: 4× BiMamba-2                             | 512
  • Compression ratio N = 4
  • Byte vocabulary: {A, C, G, T, N, [MASK], <pad>}
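
To make the two bullets above concrete, here is a rough sketch of base-level encoding and of what a compression ratio of 4 implies for sequence length. It rests on assumptions: the id order simply follows the vocabulary list in this card (the checkpoint's real mapping may differ), `encode` and `expected_chunks` are hypothetical helpers rather than LDARNet APIs, and N = 4 is read as a target average of about four bases per learned chunk. Because chunking is dynamic, the actual chunk count depends on the input.

```python
from typing import List, Optional

# Assumed id order: copied from the vocabulary list in this model card.
# The real token-to-id mapping is defined by the checkpoint, not by this sketch.
VOCAB = ["A", "C", "G", "T", "N", "[MASK]", "<pad>"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def encode(seq: str, length: Optional[int] = None) -> List[int]:
    """Map a DNA string to one id per base; unrecognized symbols become N."""
    unknown = TOKEN_TO_ID["N"]
    ids = [TOKEN_TO_ID.get(base, unknown) for base in seq.upper()]
    if length is not None:
        # Truncate, then right-pad with <pad> up to the requested length.
        ids = ids[:length]
        ids += [TOKEN_TO_ID["<pad>"]] * (length - len(ids))
    return ids


def expected_chunks(num_bases: int, ratio: int = 4) -> int:
    """Estimate how many chunks reach the backbone at a target ratio."""
    return -(-num_bases // ratio)  # ceiling division


ids = encode("ACGTN")
chunks = expected_chunks(1024)
```

Under these assumptions, a 1024-base input would reach the backbone as roughly 256 chunks.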

Citation

@misc{ledneva2026ldarnetdnaadaptiverepresentation,
      title={LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling},
      author={Daria Ledneva and Denis Kuznetsov},
      year={2026},
      eprint={2606.04552},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2606.04552},
}