perf: reduce per-query latency - shared model, single encode, float32 LLM
d624b44, hodfa840 committed 16 days ago