Segformer-Base: Optimized for Qualcomm Devices

Segformer-Base is a machine learning model that predicts masks and classes of objects in an image.

This model is based on the implementation of Segformer-Base found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

Runtime	Precision	Chipset	SDK Versions	Download
ONNX	float	Universal	QAIRT 2.42, ONNX Runtime 1.24.3	Download
ONNX	w8a16	Universal	QAIRT 2.42, ONNX Runtime 1.24.3	Download
ONNX	w8a8	Universal	QAIRT 2.42, ONNX Runtime 1.24.3	Download
QNN_DLC	float	Universal	QAIRT 2.43	Download
TFLITE	float	Universal	QAIRT 2.43, TFLite 2.19.1	Download
TFLITE	w8a8	Universal	QAIRT 2.43, TFLite 2.19.1	Download

For more device-specific assets and performance metrics, visit Segformer-Base on Qualcomm® AI Hub.

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

Custom weights (e.g., fine-tuned checkpoints)
Custom input shapes
Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for Segformer-Base on GitHub for usage instructions.

Model Details

Model Type: Semantic segmentation
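
In semantic segmentation, each pixel receives one score per class, and the predicted mask is the per-pixel argmax over those scores. The following is a minimal sketch of that post-processing step using NumPy. The logits layout (batch, classes, height, width) and the helper name logits_to_mask are assumptions made for illustration, not the documented output format of the exported assets; check the output spec of the file you actually download.

```python
import numpy as np


def logits_to_mask(logits: np.ndarray) -> np.ndarray:
    """Convert per-class scores into a class-index mask.

    Assumes logits has shape (batch, num_classes, height, width).
    Returns an integer array of shape (batch, height, width).
    """
    if logits.ndim != 4:
        raise ValueError(f"expected 4-D logits, got shape {logits.shape}")
    # The class axis is axis 1 under the assumed layout.
    return np.argmax(logits, axis=1)


# Illustrative random scores: 150 classes on a small 4x4 grid.
rng = np.random.default_rng(0)
example_logits = rng.standard_normal((1, 150, 4, 4)).astype(np.float32)
example_mask = logits_to_mask(example_logits)
print(example_mask.shape)  # (1, 4, 4)
```

If the real output uses a channels-last layout, only the axis argument changes.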

Model Stats:

Model checkpoint: nvidia/segformer-b0-finetuned-ade-512-512
Input resolution: 512x512
Number of output classes: 150
Number of parameters: 3.75M
Model size (float): 14.4 MB
Model size (w8a16): 4.57 MB
Model size (w8a8): 3.90 MB
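
As a rough sanity check on the sizes above, weight storage can be estimated as the parameter count times the bytes per weight. The sketch below assumes 4 bytes per float weight and 1 byte per 8-bit weight. It ignores graph metadata, quantization parameters, and any layers left unquantized, so it is not expected to match the listed file sizes exactly. The function name is hypothetical.

```python
def estimate_weight_size_mb(num_params: float, bits_per_weight: int) -> float:
    """Estimate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


NUM_PARAMS = 3.75e6  # parameter count from the stats above

float_mb = estimate_weight_size_mb(NUM_PARAMS, 32)  # float32 weights
int8_mb = estimate_weight_size_mb(NUM_PARAMS, 8)    # 8-bit weights

print(f"float estimate: {float_mb:.2f} MB")  # 15.00 MB
print(f"8-bit estimate: {int8_mb:.2f} MB")   # 3.75 MB
```

Both estimates land close to the listed float and w8a8 sizes, which suggests the files are dominated by weights.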

Performance Summary

Model	Runtime	Precision	Chipset	Inference Time (ms)	Peak Memory Range (MB)	Primary Compute Unit
Segformer-Base	ONNX	float	Snapdragon® 8 Elite Gen 5 Mobile	74.11 ms	24 - 216 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® X2 Elite	72.551 ms	34 - 34 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® X Elite	112.531 ms	33 - 33 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® 8 Gen 3 Mobile	82.136 ms	25 - 256 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS8550 (Proxy)	108.436 ms	19 - 28 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS9075	113.253 ms	23 - 26 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® 8 Elite For Galaxy Mobile	74.201 ms	23 - 214 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 8 Elite Gen 5 Mobile	6.779 ms	14 - 221 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® X2 Elite	6.819 ms	13 - 13 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® X Elite	15.321 ms	18 - 18 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 8 Gen 3 Mobile	10.353 ms	12 - 252 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS6490	742.54 ms	385 - 391 MB	CPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS8550 (Proxy)	14.858 ms	9 - 16 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS9075	20.798 ms	12 - 15 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCM6690	353.439 ms	333 - 344 MB	CPU
Segformer-Base	ONNX	w8a16	Snapdragon® 8 Elite For Galaxy Mobile	8.554 ms	13 - 217 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 7 Gen 4 Mobile	318.687 ms	366 - 378 MB	CPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Elite Gen 5 Mobile	4.572 ms	6 - 202 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® X2 Elite	4.602 ms	4 - 4 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® X Elite	11.675 ms	9 - 9 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Gen 3 Mobile	7.597 ms	6 - 231 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS6490	277.259 ms	194 - 202 MB	CPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS8550 (Proxy)	10.978 ms	2 - 13 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS9075	11.572 ms	7 - 10 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCM6690	175.328 ms	195 - 207 MB	CPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Elite For Galaxy Mobile	5.575 ms	8 - 204 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 7 Gen 4 Mobile	157.633 ms	128 - 139 MB	CPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Elite Gen 5 Mobile	73.829 ms	3 - 194 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® X2 Elite	73.271 ms	3 - 3 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® X Elite	114.593 ms	3 - 3 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Gen 3 Mobile	83.847 ms	0 - 226 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8275 (Proxy)	210.561 ms	0 - 185 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8550 (Proxy)	110.179 ms	3 - 78 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® SA8775P	112.414 ms	1 - 183 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS9075	113.595 ms	3 - 17 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8450 (Proxy)	122.184 ms	2 - 223 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® SA7255P	210.561 ms	0 - 185 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® SA8295P	122.632 ms	3 - 184 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Elite For Galaxy Mobile	74.571 ms	3 - 196 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Elite Gen 5 Mobile	73.979 ms	16 - 211 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Gen 3 Mobile	82.7 ms	8 - 235 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8275 (Proxy)	210.505 ms	10 - 197 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8550 (Proxy)	110.047 ms	9 - 12 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® SA8775P	112.394 ms	10 - 194 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS9075	112.427 ms	8 - 30 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8450 (Proxy)	123.061 ms	0 - 222 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® SA7255P	210.505 ms	10 - 197 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® SA8295P	122.654 ms	9 - 194 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Elite For Galaxy Mobile	74.914 ms	9 - 194 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Elite Gen 5 Mobile	4.401 ms	2 - 185 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Gen 3 Mobile	7.549 ms	2 - 209 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS6490	126.202 ms	15 - 50 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8275 (Proxy)	19.666 ms	2 - 177 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8550 (Proxy)	10.884 ms	2 - 6 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA8775P	11.475 ms	2 - 178 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS9075	11.357 ms	0 - 10 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCM6690	95.908 ms	13 - 176 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8450 (Proxy)	14.871 ms	2 - 210 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA7255P	19.666 ms	2 - 177 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA8295P	13.593 ms	2 - 182 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Elite For Galaxy Mobile	5.53 ms	0 - 173 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 7 Gen 4 Mobile	39.272 ms	15 - 63 MB	NPU
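
To compare configurations, the rows above can be parsed as tab-separated values. The sketch below copies three rows from the table for one chipset, then reports each run's speedup over the TFLite float run and the implied number of inferences per second. The helper name parse_latencies is hypothetical, and the column order is assumed to match the table header.

```python
import csv
import io

# Three rows copied from the table above (Snapdragon 8 Gen 3 Mobile).
ROWS = (
    "Segformer-Base\tTFLITE\tfloat\tSnapdragon® 8 Gen 3 Mobile\t82.7 ms\t8 - 235 MB\tNPU\n"
    "Segformer-Base\tTFLITE\tw8a8\tSnapdragon® 8 Gen 3 Mobile\t7.549 ms\t2 - 209 MB\tNPU\n"
    "Segformer-Base\tONNX\tw8a16\tSnapdragon® 8 Gen 3 Mobile\t10.353 ms\t12 - 252 MB\tNPU\n"
)


def parse_latencies(text: str) -> dict:
    """Map (runtime, precision) to inference time in milliseconds."""
    latencies = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        _model, runtime, precision, _chipset, time_ms, _memory, _unit = row
        # Cells look like "82.7 ms"; keep only the numeric part.
        latencies[(runtime, precision)] = float(time_ms.split()[0])
    return latencies


latencies = parse_latencies(ROWS)
baseline = latencies[("TFLITE", "float")]

# Print the fastest configuration first.
for (runtime, precision), ms in sorted(latencies.items(), key=lambda item: item[1]):
    print(
        f"{runtime} {precision}: {ms:.3f} ms, "
        f"{baseline / ms:.1f}x vs TFLite float, "
        f"{1000 / ms:.0f} inferences/s"
    )
```

The same parsing applies to the full table if it is saved as a tab-separated file without the header row.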

License

The license for the original implementation of Segformer-Base can be found here.

References

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers (arXiv:2105.15203, published May 31, 2021)

Community

Join our AI Hub Slack community to collaborate, post questions, and learn more about on-device AI.
For questions or feedback, please reach out to us.