Instructions to use Qwen/Qwen3-VL-Reranker-8B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Transformers
  - sentence-transformers
- Notebooks
  - Google Colab
  - Kaggle

How to use Qwen/Qwen3-VL-Reranker-8B with Transformers:

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-Reranker-8B")
model = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen3-VL-Reranker-8B")
```

How to use Qwen/Qwen3-VL-Reranker-8B with sentence-transformers:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("Qwen/Qwen3-VL-Reranker-8B")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
```
Integrate with Sentence Transformers v5.4
Hello!
Pull Request overview
- Integrate this model using a Sentence Transformers `CrossEncoder`
Details
This PR adds the configuration files needed to load this model directly as a CrossEncoder via Sentence Transformers. The model uses an any-to-any Transformer with a LogitScore head that computes the logit difference between the "yes" and "no" tokens, i.e. the model's confidence that a document is relevant to a query. The model supports text, image, video, and multimodal (e.g. combinations of the previous) inputs via a structured message format.
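The `LogitScore` idea can be pictured with a tiny sketch: the relevance score is simply the logit the model assigns to the "yes" token minus the logit for the "no" token at the final position. The token ids and logit values below are made up for illustration; this is not the library's implementation:

```python
# Illustrative sketch of a logit-difference score (not the actual LogitScore code).
import numpy as np

def logit_score(logits: np.ndarray, yes_id: int, no_id: int) -> float:
    """Relevance score = logit("yes") - logit("no") at the final position."""
    return float(logits[yes_id] - logits[no_id])

# Toy logits over a 5-token vocabulary; ids 3 ("yes") and 4 ("no") are made up.
logits = np.array([0.1, -0.2, 0.0, 2.0, 0.5])
print(logit_score(logits, yes_id=3, no_id=4))  # 1.5
```

A larger (more positive) difference means the model is more confident the document answers the query, which matches the unbounded scores seen in the example output further down.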
A custom `additional_chat_templates/reranker.jinja` maps Sentence Transformers' structured messages (with `"query"` and `"document"` roles) to the model's expected format with the `<Instruct>`, `<Query>`, and `<Document>` fields, including the system prompt for yes/no judgment. The template includes a default instruction ("Given a search query, retrieve relevant candidates that answer the query.") as a fallback when no prompt is provided. `unpad_inputs` is set to `false` as Qwen3 can't flatten inputs nicely.
Added files:
- `modules.json`: pipeline: `Transformer` & `LogitScore`
- `sentence_bert_config.json`: `any-to-any` task, structured message format, multimodal config
- `config_sentence_transformers.json`: default prompt ("Retrieve text relevant to the user's query."), Identity activation
- `additional_chat_templates/reranker.jinja`: custom template for the reranker format
- `1_LogitScore/config.json`: yes/no token IDs
Once the Sentence Transformers v5.4 release is out, the model can be used immediately like so:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("Qwen/Qwen3-VL-Reranker-8B", revision="refs/pr/9")

query = "A woman playing with her dog on a beach at sunset."
documents = [
    "A woman shares a joyful moment with her golden retriever on a sun-drenched beach at sunset, as the dog offers its paw in a heartwarming display of companionship and trust.",
    "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
    {
        "text": "A woman shares a joyful moment with her golden retriever on a sun-drenched beach at sunset, as the dog offers its paw in a heartwarming display of companionship and trust.",
        "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
    },
]

prompt = "Retrieve images or text relevant to the user's query."

pairs = [(query, doc) for doc in documents]
scores = model.predict(pairs, prompt=prompt)
print(scores)
# [1.3125, 0.25, 0.4375]

rankings = model.rank(query, documents, prompt=prompt)
print(rankings)
# [{'corpus_id': 0, 'score': 1.3125}, {'corpus_id': 2, 'score': 0.4375}, {'corpus_id': 1, 'score': 0.25}]
```
And after merging, the revision argument can be dropped.
Note that none of the existing behaviour is affected or changed; this PR only adds an additional way to run the model in a familiar and common format.
If you are able to merge this before tomorrow's Sentence Transformers v5.4 release, then I can include this model in my blogpost and documentation without the `revision` argument. Otherwise, I'll document it with the `revision` and drop that later.
- Tom Aarsen