M-LSD-tiny — LiteRT (on-device line segment detection, fully-GPU)

M-LSD (NAVER, AAAI 2022) is a lightweight real-time line segment detector, converted here to LiteRT and running entirely on the CompiledModel GPU backend (ML Drift) on Android. It detects straight line segments such as building edges, document borders, wireframes, and room layouts. The tiny variant (MobileNetV2 backbone, 0.62M parameters) is 1.4 MB in fp16.

On-device (Pixel 8a, Tensor G3 — verified)

metric	value
nodes on GPU	99 / 99 LITERT_CL (full residency)
inference	~2 ms (512×512)
size	1.4 MB (fp16)
accuracy	device-vs-PyTorch corr 0.997 (127 vs 128 lines decoded)

image[1,4,512,512] (RGB + ones channel, scaled to [-1,1]) →[GPU: MobileNetV2 U-Net]→ tpMap[1,9,256,256]
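The input tensor above can be built with a few lines of NumPy. This is a minimal sketch, not the author's code: the function name `preprocess` is hypothetical, the resize to 512×512 is left to the caller, and it assumes the scaling is applied after the ones channel is appended (so that channel is scaled as well), which is the order in which this card lists the steps.

```python
import numpy as np


def preprocess(rgb_512: np.ndarray) -> np.ndarray:
    """Turn a resized uint8 RGB image (512, 512, 3) into a float32 NCHW tensor.

    Hypothetical helper; resizing is assumed to have happened already.
    """
    if rgb_512.shape != (512, 512, 3):
        raise ValueError(f"expected (512, 512, 3), got {rgb_512.shape}")

    # Append a 4th channel of ones (HWC layout, 4 channels).
    ones = np.ones((512, 512, 1), dtype=np.float32)
    x = np.concatenate([rgb_512.astype(np.float32), ones], axis=-1)

    # Assumption: scaling comes after the append, so the ones channel
    # is scaled too. RGB values 0..255 map to [-1, 1].
    x = x / 127.5 - 1.0

    # HWC -> CHW, then add the batch dimension: (1, 4, 512, 512).
    return np.transpose(x, (2, 0, 1))[None]
```

The returned array has shape (1, 4, 512, 512) and dtype float32, matching the `image` input in the diagram.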

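The output is a "TP map": channel 0 is the line-center heatmap and channels 1–4 hold the start- and end-point displacements. Decoding runs on the host: apply a sigmoid and 3×3 NMS to the center map, add the displacements to each surviving center to get the endpoints, then multiply by 2 to map from the 256×256 output back to 512×512 input coordinates.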

How it converts (litert-torch)

The model is a pure CNN encoder-decoder and needs a single re-authoring step: the decoder's F.interpolate(bilinear, align_corners=True) calls become align_corners=False, because the Mali delegate bans align_corners=True combined with half-pixel. MobileNetV2 has no max-pool (it downsamples with strided convs, so no PADV2 is emitted), and the upsampling lowers to RESIZE_BILINEAR rather than a transposed conv, so the graph is fully GPU-clean. Result: no banned ops, all tensors ≤4D, tflite-vs-torch corr 1.0, device-vs-torch corr 0.997.
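To make the align_corners change concrete, the sketch below compares the two sampling conventions on a 1-D signal in plain Python. It is an illustration only, not litert-torch or delegate code, and the helper names `src_coord` and `resize_bilinear_1d` are hypothetical. With align_corners=True the first and last output samples land exactly on the first and last input samples. With align_corners=False the output grid is aligned on pixel centers (half-pixel offset) and clamped at the borders.

```python
def src_coord(i: int, in_size: int, out_size: int, align_corners: bool) -> float:
    """Source coordinate sampled by output index i (hypothetical helper)."""
    if align_corners:
        if out_size == 1:
            return 0.0
        # Corner samples of input and output coincide.
        return i * (in_size - 1) / (out_size - 1)
    # Half-pixel convention: align pixel centers, clamp below at 0.
    return max((i + 0.5) * in_size / out_size - 0.5, 0.0)


def resize_bilinear_1d(values, out_size: int, align_corners: bool):
    """Linear interpolation of a 1-D list under either convention."""
    n = len(values)
    out = []
    for i in range(out_size):
        s = src_coord(i, n, out_size, align_corners)
        lo = min(int(s), n - 1)   # left neighbour
        hi = min(lo + 1, n - 1)   # right neighbour, clamped at the border
        t = s - lo                # interpolation weight
        out.append(values[lo] * (1 - t) + values[hi] * t)
    return out


print(resize_bilinear_1d([0.0, 1.0], 4, align_corners=True))
print(resize_bilinear_1d([0.0, 1.0], 4, align_corners=False))
```

Upsampling `[0.0, 1.0]` to four samples gives evenly spaced values from 0 to 1 in the first case and `[0.0, 0.25, 0.75, 1.0]` in the second, so the two settings sample the decoder features on slightly different grids.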

Preprocessing & decode

Preprocessing: resize to 512×512, append a 4th channel of ones, scale with (x/127.5) - 1, and lay out as NCHW. Decode: sigmoid the center map, apply 3×3 max NMS, threshold at 0.10, convert displacements to endpoints, filter by length, and multiply by 2 to return to 512-space.
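The decode steps can be sketched in NumPy as follows. This is a hedged illustration rather than the shipped host code: `decode_tp_map` is a hypothetical name, the displacement channel order (dx_start, dy_start, dx_end, dy_end) and the `min_len` default are assumptions not stated in this card, and the length filter is assumed to be applied in 256×256 map space before the ×2 scaling.

```python
import numpy as np


def decode_tp_map(tp_map: np.ndarray, score_thr: float = 0.10,
                  min_len: float = 20.0) -> np.ndarray:
    """Decode a TP map (1, 9, H, W) into an (N, 4) array of segments
    [x_start, y_start, x_end, y_end] in input-image pixels (hypothetical helper)."""
    center = tp_map[0, 0]
    disp = tp_map[0, 1:5]  # assumed order: dx_start, dy_start, dx_end, dy_end

    # Sigmoid on the center heatmap.
    heat = 1.0 / (1.0 + np.exp(-center))

    # 3x3 max NMS: a pixel survives if it equals the max of its neighbourhood.
    h, w = heat.shape
    padded = np.pad(heat, 1, mode="constant", constant_values=-np.inf)
    shifted = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    local_max = np.max(shifted, axis=0)
    peaks = (heat == local_max) & (heat > score_thr)

    # Displacements at each surviving center give the two endpoints.
    ys, xs = np.nonzero(peaks)
    d = disp[:, ys, xs]  # shape (4, N)
    lines = np.stack([xs + d[0], ys + d[1], xs + d[2], ys + d[3]], axis=1)

    # Drop short segments (measured in map space), then scale x2 to 512-space.
    length = np.hypot(lines[:, 2] - lines[:, 0], lines[:, 3] - lines[:, 1])
    return 2.0 * lines[length > min_len]
```

Calling `decode_tp_map(tp_map)` on the model output returns one row per detected segment. Lowering `score_thr` or `min_len` keeps more, weaker, or shorter segments.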

License

Apache-2.0. Upstream: navervision/mlsd; PyTorch port lhwcv/mlsd_pytorch.
