GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views

Tianyu Chen¹, Wei Xiang¹, Kang Han¹, Yu Lu¹, Di Wu¹, Gaowen Liu², Ramana Rao Kompella²
¹La Trobe University, ²Cisco Research
GIFSplat teaser figure

GIFSplat enables scene-adaptive refinement for sparse-view 3D Gaussian Splatting with forward-only iterative updates and lightweight generative prior guidance.

Abstract

Feed-forward 3D reconstruction offers substantial runtime advantages over per-scene optimization, which remains slow at inference and is often fragile under sparse views. However, existing feed-forward methods still leave room for improvement, especially on out-of-domain data, and they struggle to retain second-scale inference time once a generative prior is introduced. These limitations stem from the one-shot prediction paradigm of existing feed-forward pipelines: models are strictly bounded by capacity, lack inference-time refinement, and are ill-suited to continuously injecting generative priors. We introduce GIFSplat, a purely feed-forward iterative refinement framework for 3D Gaussian Splatting from sparse unposed views. A small number of forward-only residual updates progressively refine the current 3D scene using rendering evidence, achieving a favorable balance between efficiency and quality. Furthermore, we distill a frozen diffusion prior into Gaussian-level cues from enhanced novel renderings without gradient backpropagation or ever-increasing view-set expansion, thereby enabling per-scene adaptation with a generative prior while preserving feed-forward efficiency. Across DL3DV, RealEstate10K, and DTU, GIFSplat consistently outperforms state-of-the-art feed-forward baselines, improving PSNR by up to +2.1 dB, and it maintains second-scale inference time without requiring camera poses or any test-time gradient optimization.
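
To put the reported PSNR gain in perspective, the short sketch below uses the standard definition PSNR = 10 log10(MAX² / MSE) to convert a gain in decibels into the implied reduction in mean squared error. The function names are illustrative and do not come from the GIFSplat code.

```python
import math

def psnr(mse, max_val=1.0):
    """Standard PSNR in dB for a given mean squared error and peak value."""
    return 10.0 * math.log10(max_val ** 2 / mse)

def mse_ratio_for_psnr_gain(gain_db):
    """Ratio new_MSE / old_MSE implied by a PSNR gain of gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)

ratio = mse_ratio_for_psnr_gain(2.1)
print(f"MSE ratio for a +2.1 dB gain: {ratio:.3f}")  # about 0.617
```

Under this definition, a +2.1 dB gain corresponds to roughly a 38% lower mean squared error on the evaluated views.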

Iterative feed-forward refinement

A weight-shared residual update module progressively refines a fixed Gaussian set through multiple forward passes, improving scene fidelity without gradient-based test-time optimization.
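
The idea can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the authors' implementation: the single linear layer standing in for the update module, the `evidence_fn` callback, and all dimensions are hypothetical. What it does show is the structure described above: one set of weights, reused at every step, adding residuals to a Gaussian set whose size never changes, with no gradients involved.

```python
import numpy as np

def make_update_module(param_dim, evidence_dim, seed=0):
    """Hypothetical stand-in for the shared update module: one linear layer."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(param_dim + evidence_dim, param_dim))

    def update(gaussians, evidence):
        # Predict one residual per Gaussian from its state and its evidence.
        return np.concatenate([gaussians, evidence], axis=1) @ W

    return update

def refine(gaussians, evidence_fn, update, num_steps=3):
    """Apply the same update module num_steps times (forward passes only)."""
    for _ in range(num_steps):
        evidence = evidence_fn(gaussians)                    # e.g. rendering-based features
        gaussians = gaussians + update(gaussians, evidence)  # residual update
    return gaussians

# Toy usage: 5 Gaussians with 4 parameters each and 2 evidence features.
init = np.ones((5, 4))
update = make_update_module(param_dim=4, evidence_dim=2)
refined = refine(init, lambda g: np.ones((g.shape[0], 2)), update)
print(refined.shape)  # (5, 4): the Gaussian set keeps its size
```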

Generative prior fusion

A frozen diffusion enhancer provides lightweight discrepancy cues from enhanced renderings, bringing generative knowledge into the refinement loop without expensive iterative optimization.
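
One way to picture a discrepancy cue is as the difference between a rendering and its enhanced version, pooled back onto the Gaussians that produced each pixel. The sketch below assumes exactly that; the `enhancer` callable, the `pixel_to_gaussian` index map, and the mean pooling are hypothetical choices made for illustration, and the actual cue construction in GIFSplat may differ. The enhancer is only ever called in the forward direction.

```python
import numpy as np

def discrepancy_cues(rendering, enhancer, pixel_to_gaussian, num_gaussians):
    """Average (enhanced - rendered) pixel differences per Gaussian.

    rendering:         (H, W, C) float array rendered from the current scene
    enhancer:          callable standing in for the frozen diffusion enhancer
    pixel_to_gaussian: (H, W) int array, assumed dominant Gaussian per pixel
    """
    enhanced = enhancer(rendering)  # forward call only, no backpropagation
    diff = (enhanced - rendering).reshape(-1, rendering.shape[-1])
    ids = pixel_to_gaussian.reshape(-1)

    sums = np.zeros((num_gaussians, diff.shape[1]))
    np.add.at(sums, ids, diff)  # accumulate differences per Gaussian
    counts = np.bincount(ids, minlength=num_gaussians)[:, None]
    return sums / np.maximum(counts, 1)  # Gaussians with no pixels get zero cues

# Toy usage: a 2x2 image, an "enhancer" that brightens by 1, three Gaussians.
image = np.zeros((2, 2, 3))
owner = np.array([[0, 0], [1, 1]])
cues = discrepancy_cues(image, lambda x: x + 1.0, owner, num_gaussians=3)
print(cues.shape)  # (3, 3)
```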

Efficient and pose-free

The framework preserves feed-forward efficiency, works from sparse unposed views, and improves robustness across both in-domain and out-of-domain benchmarks.

Method Overview

Overview of GIFSplat

GIFSplat consists of a Gaussian initializer, an iterative Gaussian head, and a generative prior fusion module. Starting from sparse unposed inputs, the model predicts an initial 3D Gaussian scene and then refines it via forward-only residual updates guided by observation evidence and diffusion-derived cues.

Progressive Refinement

Iterative refinement visualization

Rather than predicting the final scene in a single pass, GIFSplat updates the Gaussian representation step by step. Each refinement round uses the current state, rendered evidence, and optional prior cues to predict residual corrections to geometry and appearance.

This design preserves the simplicity of feed-forward inference while giving the model a mechanism to correct residual errors, especially in sparse-view and under-constrained regions.
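
Written compactly, one refinement round can be read as a residual update of the following form. The notation is assumed here for illustration and is not taken from the paper: \mathcal{G}^{(t)} is the Gaussian set after round t (with \mathcal{G}^{(0)} coming from the initializer), e^{(t)} is the rendered evidence, c^{(t)} is the optional prior cue, and f_\theta is the update module whose weights \theta are shared across all T rounds.

```latex
\[
\mathcal{G}^{(t+1)} = \mathcal{G}^{(t)} + f_\theta\!\left(\mathcal{G}^{(t)},\, e^{(t)},\, c^{(t)}\right),
\qquad t = 0, \dots, T-1
\]
```

Because each round is a single forward evaluation of f_\theta, the total cost grows linearly with T, and no gradient with respect to the scene is computed at test time.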

Qualitative Results

RealEstate10K

RealEstate10K results

Sparse-view reconstruction examples on RealEstate10K highlighting sharper appearance and fewer artifacts after iterative refinement.

DL3DV

DL3DV results

In-domain results on DL3DV showing stronger detail recovery and improved rendering consistency under sparse observations.

Cross-domain

DTU results

Cross-domain evaluation demonstrating improved robustness when generalizing beyond the main training distribution.

Overall results grid

Side-by-side comparisons against feed-forward baselines.

Citation

@article{chen2026gifsplat,
  title={GIFSplat: Generative Prior-Guided Iterative Feed-Forward 3D Gaussian Splatting from Sparse Views},
  author={Chen, Tianyu and Xiang, Wei and Han, Kang and Lu, Yu and Wu, Di and Liu, Gaowen and Kompella, Ramana Rao},
  journal={arXiv preprint arXiv:2602.22571},
  year={2026}
}