3DTV: A Feedforward Interpolation Network for Real-Time View Synthesis
Abstract
3DTV combines lightweight geometry with learning for real-time sparse-view interpolation, achieving efficient and robust multi-view rendering without scene-specific optimization.
Real-time free-viewpoint rendering requires balancing multi-camera redundancy with the latency constraints of interactive applications. We address this challenge by combining lightweight geometry with learning and propose 3DTV, a feedforward network for real-time sparse-view interpolation. A Delaunay-based triplet selection ensures angular coverage for each target view. Building on this, we introduce a pose-aware depth module that estimates a coarse-to-fine depth pyramid, enabling efficient feature reprojection and occlusion-aware blending. Unlike methods that require scene-specific optimization, 3DTV runs feedforward without retraining, making it practical for AR/VR, telepresence, and interactive applications. Our experiments on challenging multi-view video datasets demonstrate that 3DTV consistently achieves a strong balance of quality and efficiency, outperforming recent real-time novel-view baselines. Crucially, 3DTV avoids explicit proxies, enabling robust rendering across diverse scenes. This makes it a practical solution for low-latency multi-view streaming and interactive rendering. Project Page: https://stefanmschulz.github.io/3DTV_webpage/
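The Delaunay-based triplet selection mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes each camera is described by a 2D coordinate (for example, its position on the plane of a camera array), builds a brute-force Delaunay triangulation using the empty-circumcircle test, and returns the triangle that encloses the target view along with barycentric weights. The function names (`delaunay_triangles`, `select_triplet`) and the use of barycentric weights for blending are hypothetical and chosen for illustration only.

```python
from itertools import combinations

EPS = 1e-12

def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of CCW triangle abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - by * cx)
           - (bx * bx + by * by) * (ax * cy - ay * cx)
           + (cx * cx + cy * cy) * (ax * by - ay * bx))
    return det > EPS

def delaunay_triangles(points):
    """Brute-force Delaunay: keep triples whose circumcircle is empty.

    O(n^4), which is only reasonable for a small number of cameras.
    Assumes points are in general position (cocircular sets are not resolved).
    """
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        o = orient(points[i], points[j], points[k])
        if abs(o) < EPS:
            continue  # skip collinear (degenerate) triples
        if o < 0:
            j, k = k, j  # reorder to counter-clockwise
        a, b, c = points[i], points[j], points[k]
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles

def select_triplet(cameras, target):
    """Return (camera indices, barycentric weights) of the enclosing triangle.

    Returns None if the target lies outside the convex hull of the cameras.
    """
    for i, j, k in delaunay_triangles(cameras):
        a, b, c = cameras[i], cameras[j], cameras[k]
        area = orient(a, b, c)
        weights = [orient(target, b, c) / area,
                   orient(a, target, c) / area,
                   orient(a, b, target) / area]
        if min(weights) >= -1e-9:  # target is inside or on the boundary
            return (i, j, k), weights
    return None

# Hypothetical camera layout: four corners plus one central camera.
cameras = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)]
result = select_triplet(cameras, (1.0, 0.5))
print(result)
```

Because the enclosing triangle surrounds the target on all sides, the three selected views bracket the target direction, which is one plausible reading of the "angular coverage" property the abstract describes.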
Community
Hey everyone,
This is Stefan, the first author of the 3DTV paper. I am currently setting up the webpage, which should be fully updated by this evening (GMT-1). The code is under review and being cleaned up. I expect to push it together with pre-trained models in early to mid-May, but it could take a little longer than that.
Let me know what you think and I'll happily discuss the paper with you or talk about collaborations for other projects in this area!
Best,
Stefan 🤗
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Real-Time Human Frontal View Synthesis from a Single Image (2026)
- LiveStre4m: Feed-Forward Live Streaming of Novel Views from Unposed Multi-View Video (2026)
- NavCrafter: Exploring 3D Scenes from a Single Image (2026)
- Real-Time Human Reconstruction and Animation using Feed-Forward Gaussian Splatting (2026)
- UniQueR: Unified Query-based Feedforward 3D Reconstruction (2026)
- M^3: Dense Matching Meets Multi-View Foundation Models for Monocular Gaussian Splatting SLAM (2026)
- DAGE: Dual-Stream Architecture for Efficient and Fine-Grained Geometry Estimation (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space.
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Get this paper in your agent:
hf papers read 2604.11211
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash