arxiv:2603.21064

2Xplat: Two Experts Are Better Than One Generalist

Published on Mar 22 · Submitted by Lani Ko on Mar 25

Abstract

AI-generated summary: A two-expert architecture for pose-free 3D Gaussian Splatting separates geometry estimation from appearance synthesis, achieving superior performance compared to unified monolithic approaches.

Pose-free feed-forward 3D Gaussian Splatting (3DGS) has opened a new frontier for rapid 3D modeling, enabling high-quality Gaussian representations to be generated from uncalibrated multi-view images in a single forward pass. The dominant approach in this space adopts unified monolithic architectures, often built on geometry-centric 3D foundation models, to jointly estimate camera poses and synthesize 3DGS representations within a single network. While architecturally streamlined, such "all-in-one" designs may be suboptimal for high-fidelity 3DGS generation, as they entangle geometric reasoning and appearance modeling within a shared representation. In this work, we introduce 2Xplat, a pose-free feed-forward 3DGS framework based on a two-expert design that explicitly separates geometry estimation from Gaussian generation. A dedicated geometry expert first predicts camera poses, which are then explicitly passed to a powerful appearance expert that synthesizes 3D Gaussians. Although conceptually simple and largely underexplored in prior work, the proposed approach proves highly effective. In fewer than 5K training iterations, the proposed two-expert pipeline substantially outperforms prior pose-free feed-forward 3DGS approaches and achieves performance on par with state-of-the-art posed methods. These results challenge the prevailing unified paradigm and suggest the potential advantages of modular design principles for complex 3D geometric estimation and appearance synthesis tasks.
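The data flow described in the abstract (a geometry expert predicts camera poses, which are then handed explicitly to an appearance expert that produces Gaussians) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the `GeometryExpert` and `AppearanceExpert` interfaces, the `TwoExpertPipeline` wrapper, the toy stub experts, and the simplified `Pose` and `Gaussian` types are assumptions, not the paper's actual models or API.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

# Simplified placeholder types (assumptions for illustration only).
Image = List[List[float]]          # H x W grid of grayscale values
Pose = Tuple[float, float, float]  # camera position only, no rotation


@dataclass
class Gaussian:
    mean: Tuple[float, float, float]
    scale: float
    opacity: float
    color: Tuple[float, float, float]


class GeometryExpert(Protocol):
    """Hypothetical interface: uncalibrated images in, camera poses out."""
    def predict_poses(self, images: Sequence[Image]) -> List[Pose]: ...


class AppearanceExpert(Protocol):
    """Hypothetical interface: images plus given poses in, Gaussians out."""
    def generate(self, images: Sequence[Image],
                 poses: Sequence[Pose]) -> List[Gaussian]: ...


class StubGeometryExpert:
    """Toy stand-in: places camera i at x = i along a line."""
    def predict_poses(self, images: Sequence[Image]) -> List[Pose]:
        return [(float(i), 0.0, 0.0) for i in range(len(images))]


class StubAppearanceExpert:
    """Toy stand-in: one Gaussian per view, one unit in front of its camera."""
    def generate(self, images: Sequence[Image],
                 poses: Sequence[Pose]) -> List[Gaussian]:
        if len(images) != len(poses):
            raise ValueError("need exactly one pose per image")
        gaussians = []
        for image, (x, y, z) in zip(images, poses):
            pixels = [p for row in image for p in row]
            gray = sum(pixels) / len(pixels)  # mean intensity as a fake color
            gaussians.append(
                Gaussian(mean=(x, y, z + 1.0), scale=0.1,
                         opacity=1.0, color=(gray, gray, gray)))
        return gaussians


class TwoExpertPipeline:
    """Chains the two experts; poses are passed explicitly between them."""
    def __init__(self, geometry: GeometryExpert, appearance: AppearanceExpert):
        self.geometry = geometry
        self.appearance = appearance

    def __call__(self, images: Sequence[Image]):
        poses = self.geometry.predict_poses(images)          # stage 1: geometry
        gaussians = self.appearance.generate(images, poses)  # stage 2: appearance
        return poses, gaussians


if __name__ == "__main__":
    views = [[[0.0, 1.0]], [[0.5, 0.5]], [[1.0, 1.0]]]
    pipeline = TwoExpertPipeline(StubGeometryExpert(), StubAppearanceExpert())
    poses, gaussians = pipeline(views)
    print(len(poses), len(gaussians))
```

The point of the sketch is the interface boundary: the appearance expert receives poses as an explicit input rather than sharing an internal representation with the pose estimator, so either stage could in principle be swapped for a different pretrained model without touching the other.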

Community


Key Idea:

  • A pose-free feed-forward 3D Gaussian Splatting framework that decouples geometry estimation and appearance generation into two specialized experts, enabling higher-quality novel view synthesis than previous monolithic architectures.

Highlights:

  • The two-expert design separates pose estimation and 3D Gaussian generation, enabling specialized learning beyond monolithic architectures.
  • It achieves state-of-the-art pose-free performance and matches or outperforms pose-dependent methods.
  • The framework is efficient, converging in under 5K iterations using pretrained experts.
