Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings
Abstract
Text embeddings from large language models are enhanced by EmbedFilter, a linear transformation that reduces the influence of high-frequency tokens and improves semantic representations while enabling dimensionality reduction.
Large language models exhibit impressive zero-shot capabilities across a wide range of downstream tasks. However, they struggle to function as off-the-shelf embedding models, leading to suboptimal performance on massive text embedding benchmarks. In this paper, we identify a potential cause underlying this deficiency. Our motivation stems from an unexpected observation: text embeddings tend to align with frequent but uninformative tokens when projected onto the vocabulary space. We argue that this excessive expression of high-frequency tokens suppresses the model's ability to capture nuanced semantics. To address this, we introduce EmbedFilter, a simple linear transformation designed to refine text embeddings derived from LLMs directly. Specifically, we uncover that the unembedding matrix within LLMs encodes a latent space that actively writes these frequent tokens into the embedding space. By filtering out this subspace, EmbedFilter suppresses the influence of high-frequency tokens, thereby enhancing semantic representations. As a compelling byproduct, this enables an inherent dimensionality reduction, lowering index storage and speeding up retrieval while fully preserving the refined embedding quality. Our experiments across multiple LLM backbones demonstrate that LLMs equipped with EmbedFilter achieve superior zero-shot downstream performance even with significantly reduced embedding dimensions. We hope our findings provide deeper insights into the mechanisms of LLM-based representations and inspire more principled designs to improve text embedding training. Our code is available at https://github.com/CentreChen/EmbFilter.
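The observation in the abstract (embeddings aligning with frequent tokens once projected onto the vocabulary space) can be illustrated with a minimal sketch. This is not the authors' code: the function name `top_aligned_tokens`, the toy vocabulary, and the identity-like unembedding matrix are all hypothetical, chosen only so the result is easy to verify by hand.

```python
import numpy as np

def top_aligned_tokens(embedding, unembed, vocab, k=3):
    """Return the k tokens with the largest logits for a text embedding.

    unembed has shape (vocab_size, hidden_dim), so unembed @ embedding
    gives one logit per vocabulary token (a logit-lens style readout).
    """
    logits = unembed @ embedding
    top = np.argsort(-logits)[:k]
    return [vocab[i] for i in top]

# Toy setup (hypothetical): each token owns one axis of the hidden space.
vocab = ["the", ",", "of", "retrieval", "protein", "guitar"]
hidden_dim = 8
unembed = np.eye(len(vocab), hidden_dim)

# A simulated text embedding dominated by the direction of "the",
# with a weaker component along the informative token "retrieval".
embedding = 3.0 * unembed[0] + 0.5 * unembed[3]

print(top_aligned_tokens(embedding, unembed, vocab, k=2))
# → ['the', 'retrieval']
```

With a real model, `unembed` would be the output projection weights and `embedding` a pooled hidden state; how the paper pools and normalizes before this projection is not specified in the text above.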
Community
We show that the unembedding matrix within LLMs serves as an overlooked feature extractor for free. It encodes a latent semantic space; filtering out its effects from the primary text embeddings markedly improves zero-shot representation performance. We also empirically confirm that this can be achieved through a simple linear transformation, which results in a reduction in vector dimensionality as a bonus.
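One way to picture a linear filter that also reduces dimensionality is sketched below. It assumes the filtered subspace is spanned by the leading right singular vectors of the unembedding matrix and that `k` is chosen by hand; the paper's actual criterion for selecting directions may differ. The names `build_filter` and `apply_filter` are hypothetical, and the data is random toy data.

```python
import numpy as np

def build_filter(unembed, k):
    """Return a (hidden_dim - k, hidden_dim) matrix that drops the top-k
    right singular directions of `unembed` and keeps the remaining ones.

    Assumes unembed has shape (vocab_size, hidden_dim) with
    vocab_size >= hidden_dim, so the rows of vt span the hidden space.
    """
    _, _, vt = np.linalg.svd(unembed, full_matrices=False)
    return vt[k:]

def apply_filter(embeddings, filt):
    """Map (n, hidden_dim) embeddings to (n, hidden_dim - k), then L2-normalize."""
    reduced = embeddings @ filt.T
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    return reduced / np.clip(norms, 1e-12, None)

# Toy data (hypothetical): a shared direction dominates both matrices.
rng = np.random.default_rng(0)
vocab_size, hidden_dim, k = 50, 16, 4
common = rng.normal(size=hidden_dim)
unembed = rng.normal(size=(vocab_size, hidden_dim)) + 5.0 * common
embeddings = rng.normal(size=(3, hidden_dim)) + 10.0 * common

filt = build_filter(unembed, k)
filtered = apply_filter(embeddings, filt)
print(filt.shape, filtered.shape)  # (12, 16) (3, 12)
```

Because the filter is a single matrix multiply, it can be computed once per backbone and applied to stored vectors offline; the output has `hidden_dim - k` coordinates, which is where the storage saving would come from under these assumptions.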
Cool paper - I liked the way "Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings" frames the problem without making it feel too abstract.
Curious if you think this would still work once the setup gets messier in the wild?
I made a podcast on it with ResearchPod, it makes it easy to get the key concepts on the go:
https://researchpod.app/episode/e08d64a5-f5d5-4555-9d1f-db797b88cc1b
Thanks for checking it out! We found this phenomenon holds up across a wide range of LLMs and even embedding models trained via contrastive learning (where the unembed matrix wasn't part of the training).
To your point about the "wild" — we've actually used these insights to develop a new training framework aimed at improving text embedding models. We hope this will help boost embedding capabilities across much "messier," real-world scenarios.
Stay tuned, it's coming soon!
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- ReverseEOL: Improving Training-free Text Embeddings via Text Reversal in Decoder-only LLMs (2026)
- ACE: Anisotropy-Controllable Embedding for LLM-enhanced Sequential Recommendation (2026)
- Embedding-based In-Context Prompt Training for Enhancing LLMs as Text Encoders (2026)
- Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization (2026)
- DocRetriever: A Plug-and-Play Framework for Multimodal Document Retrieval with Comprehensive Benchmark (2026)
- ReaLM: Residual Quantization Bridges Knowledge Graph Embeddings and Large Language Models (2025)
- SLQ: Bridging Modalities via Shared Latent Queries for Retrieval with Frozen MLLMs (2026)
the neat thing is treating the unembedding matrix as a latent edge spectrum that actively writes frequent tokens into embedding space, and then using a small linear filter to suppress that subspace. EmbedFilter is parameter-free and can be dropped on top of different backbones without retraining, which makes it a very practical plug-in and it doubles as an implicit dimensionality reducer. my worry to test would be how stable that edge spectrum is across vocab changes, languages, or domain-specific jargon; if the token frequency profile shifts, would the filter misfire and start erasing real semantic cues? the arxivlens breakdown helped me parse the method details, there’s a solid walkthrough on arXivLens that covers the logit lens and the edge spectrum: https://arxivlens.com/PaperView/Details/your-unembedding-matrix-is-secretly-a-feature-lens-for-text-embeddings-7023-1a8cfa36
Thank you for sharing the paper. Nice work! I have a few questions that I hope can help sharpen the contribution.
First, when I did a quick check, I noticed that the Moore–Penrose pseudoinverse of the unembedding matrix seemed quite close to the input embedding matrix under a few simple measures. I was wondering whether you have compared EmbedFilter with a version that uses the input embedding matrix directly. If the results turn out similar, the method could become even simpler, and it might also help us see whether the improvement comes specifically from the unembedding perspective, or from a broader effect shared by both embedding matrices.
Second, I was wondering if the connection between unembedding singular directions, high-frequency tokens, and anisotropy might be more closely related to existing findings than the paper seems to suggest. This reminds me of earlier work on logit spectroscopy / spectral filtering, as well as studies on how frequency or probability information can be reflected in embedding and output-embedding geometry. For example, Cancedda (2024) already discusses spectral filters over embedding/unembedding singular vectors and points out the relation between leading unembedding singular directions and token frequency.
It would be really helpful if the paper could more clearly separate what is new here from parts that mainly reinterpret or apply these known phenomena to sentence-embedding post-processing.
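For readers who want to run the kind of quick check described in the first question, a minimal sketch follows. It only shows how the Moore–Penrose pseudoinverse of an unembedding matrix could be compared row by row with an input embedding matrix; it takes no position on what the comparison yields for real models. The helper `mean_row_cosine` and the toy matrices are hypothetical.

```python
import numpy as np

def mean_row_cosine(a, b):
    """Average cosine similarity between corresponding rows of a and b."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(np.sum(a_n * b_n, axis=1)))

rng = np.random.default_rng(0)
vocab_size, hidden_dim = 20, 6

# Toy unembedding matrix with orthonormal columns, shape (vocab_size, hidden_dim).
# For such a matrix the pseudoinverse equals the transpose, so the
# comparison against itself should give a similarity of 1.
unembed, _ = np.linalg.qr(rng.normal(size=(vocab_size, hidden_dim)))
pinv_t = np.linalg.pinv(unembed).T  # back to shape (vocab_size, hidden_dim)

# Stand-in for an input embedding matrix (random here, purely illustrative).
embed_in = rng.normal(size=(vocab_size, hidden_dim))

print(mean_row_cosine(pinv_t, unembed))   # sanity check, close to 1.0
print(mean_row_cosine(pinv_t, embed_in))  # value depends on the matrices used
```

With real checkpoints, `unembed` and `embed_in` would be the model's output and input embedding weights; models that tie these weights would need to be treated separately.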
Regarding the first point, based on our humble knowledge of LLMs, we believe the statement “the Moore–Penrose pseudoinverse of the unembedding matrix seemed quite close to the input embedding matrix” is a misunderstanding. Secondly, we have conducted experiments using the embedding matrix (which were not presented in the paper), and the conclusion is that it does not work. If you have any experimental evidence, we would be glad to see it, as it would serve as an extension of our work and a contribution to the academic community.
As to Point 2, we have made it very clear in our paper that our method is built on logit spectroscopy. We believe that using established tools to explore new phenomena and applications—much like applying mathematical formulas to solve new problems—is by no means trivial. As to the mentioned footnote in Cancedda (2024), where the original text states that "the FIRST singular vector on UnEmbed Matrix shows that it is highly representative of token frequency," we trust that researchers who have carefully reviewed both works and noticed this footnote will be able to form their own judgment.
We are deeply grateful for the tools provided by Cancedda, which allowed us to reflect on domain-specific problems and gave us the opportunity to make a contribution. We welcome meaningful discussions, grounded in a solid understanding of LLMs and these tools, on how to effectively leverage these insights to guide better real-world language model training and applications. After all, in the eyes of some reviewers, FlashAttention is merely a simple combination of existing tiling techniques and online softmax. Yet, before Tri Dao, no one seemed to have noticed that such a bottleneck existed in attention mechanisms. Since Cancedda's work was published more than two years ago, little work has analyzed existing research in text embeddings from the perspective of parameter analysis. We hope our efforts serve as an exploration that provokes further thought within the scope of embedding training.
This is exactly what we are currently doing: based on our findings, we are already working on a method to improve text embeddings during the training phase, which we hope to release soon. Anyway, thank you for your questions and we have nothing further to add to this discussion.
Get this paper in your agent:
hf papers read 2606.07502
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash