4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video

CVPR 2025


Qiang Hu*, Zihan Zheng*, Houqiang Zhong, Sihua Fu, Li Song, Xiaoyun Zhang, Guangtao Zhai, Yanfeng Wang

* co-first authors    corresponding authors

Paper Code

3D Gaussian Splatting (3DGS) has substantial potential for enabling photorealistic Free-Viewpoint Video (FVV) experiences. However, the vast number of Gaussians and their associated attributes poses significant challenges for storage and transmission. Existing methods typically handle dynamic 3DGS representation and compression separately, neglecting motion information and the rate-distortion (RD) trade-off during training, leading to performance degradation and increased model redundancy. To address this gap, we propose 4DGC, a novel rate-aware 4D Gaussian compression framework that significantly reduces storage size while maintaining superior RD performance for FVV. Specifically, 4DGC introduces a motion-aware dynamic Gaussian representation that utilizes a compact motion grid combined with sparse compensated Gaussians to exploit inter-frame similarities. This representation effectively handles large motions, preserving quality and reducing temporal redundancy. Furthermore, we present an end-to-end compression scheme that employs differentiable quantization and a tiny implicit entropy model to compress the motion grid and compensated Gaussians efficiently. The entire framework is jointly optimized using a rate-distortion trade-off. Extensive experiments demonstrate that 4DGC supports variable bitrates and consistently outperforms existing methods in RD performance across multiple datasets.
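
As a rough illustration of the rate-aware training objective described above, the PyTorch sketch below pairs simulated (noise-based) quantization with a toy factorized entropy model and an L = D + λR loss. The Laplace prior, the names `TinyEntropyModel`, `simulated_quantize`, and `rd_loss`, and the λ value are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEntropyModel(nn.Module):
    """Toy factorized entropy model: a learned per-channel Laplace prior.
    Rate estimate: -log2( CDF(q + 0.5) - CDF(q - 0.5) ), summed over elements."""
    def __init__(self, channels: int):
        super().__init__()
        self.loc = nn.Parameter(torch.zeros(channels))
        self.log_scale = nn.Parameter(torch.zeros(channels))

    def rate_bits(self, q: torch.Tensor) -> torch.Tensor:
        prior = torch.distributions.Laplace(self.loc, self.log_scale.exp())
        pmf = prior.cdf(q + 0.5) - prior.cdf(q - 0.5)     # mass of each integer bin
        return (-torch.log2(pmf.clamp_min(1e-9))).sum()   # estimated bits for the tensor

def simulated_quantize(x: torch.Tensor) -> torch.Tensor:
    """Differentiable stand-in for rounding during training: additive uniform
    noise in [-0.5, 0.5), so gradients flow through the quantizer."""
    return x + (torch.rand_like(x) - 0.5)

def rd_loss(render: torch.Tensor, gt: torch.Tensor,
            rate_bits: torch.Tensor, lam: float = 1e-4) -> torch.Tensor:
    """Rate-distortion trade-off L = D + lambda * R (L1 distortion for brevity)."""
    return F.l1_loss(render, gt) + lam * rate_bits
```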

System Overview



Left: 4DGC results, showcasing flexible quality levels across various bitrates. Middle: Comparison of visual quality and bitrate with state-of-the-art methods. Right: The RD performance of our approach surpasses that of prior work (e.g., 3DGStream, ReRF, TeTriRF).

Framework



Illustration of the 4DGC Framework. The reconstructed Gaussians from the previous frame are retrieved from the reference buffer and combined with the input images of the current frame to facilitate learning of the motion grid and the compensated Gaussians through a two-stage training process. In the first stage, the motion grid and its associated entropy model are optimized. In the second stage, the compensated Gaussians are refined along with their corresponding entropy model. Both stages are supervised by a rate-distortion trade-off, employing simulated quantization and an entropy model to jointly optimize representation and compression.
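
The two-stage optimization in the caption can be summarized as the schematic loop below, reusing the RD helpers from the earlier sketch. `MotionGrid`, `render_gaussians`, `spawn_compensated_gaussians`, `merge`, and `frames.sample()` are hypothetical placeholders for the motion grid, the splatting renderer, and the sparse Gaussian spawning step; only the control flow is meant to mirror the figure, not the released code.

```python
def train_frame(prev_gaussians, frames, lam=1e-4, iters=(1500, 500)):
    """Schematic per-frame optimization: stage 1 fits the motion grid and its
    entropy model; stage 2 fits sparse compensated Gaussians and theirs.
    Both stages minimize the RD loss with simulated quantization."""
    grid = MotionGrid()                                 # hypothetical compact motion grid
    em_grid = TinyEntropyModel(grid.channels)
    opt = torch.optim.Adam([*grid.parameters(), *em_grid.parameters()], lr=2e-3)

    for _ in range(iters[0]):                           # Stage 1: motion grid
        cam, gt = frames.sample()                       # random training view of frame t
        q_feat = simulated_quantize(grid.features)
        warped = grid.warp(prev_gaussians, q_feat)      # motion-compensated Gaussians
        loss = rd_loss(render_gaussians(warped, cam), gt, em_grid.rate_bits(q_feat), lam)
        opt.zero_grad()
        loss.backward()
        opt.step()

    comp = spawn_compensated_gaussians(warped, frames)  # Stage 2: new/disoccluded content
    em_comp = TinyEntropyModel(comp.channels)
    opt = torch.optim.Adam([*comp.parameters(), *em_comp.parameters()], lr=2e-3)

    for _ in range(iters[1]):
        cam, gt = frames.sample()
        q_attr = simulated_quantize(comp.attributes)
        scene = merge(warped, comp, q_attr)             # warped + compensated Gaussians
        loss = rd_loss(render_gaussians(scene, cam), gt, em_comp.rate_bits(q_attr), lam)
        opt.zero_grad()
        loss.backward()
        opt.step()

    return scene                                        # pushed to the reference buffer
```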




Illustration of our motion-aware dynamic Gaussian modeling that utilizes a multi-resolution motion grid with sparse compensated Gaussians to exploit inter-frame similarities.
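
For concreteness, below is a minimal, self-contained sketch of how a multi-resolution motion grid could be queried at Gaussian centers via trilinear interpolation and decoded into per-Gaussian motion. The resolutions, feature dimension, and MLP decoder are illustrative assumptions; only the overall structure (multi-level grid features shared by nearby Gaussians, yielding a translation and rotation offset per Gaussian) reflects the caption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiResMotionGrid(nn.Module):
    """Illustrative multi-resolution motion grid: each level stores a dense 3D
    feature volume; features sampled at Gaussian centers are concatenated and
    decoded by a tiny MLP into per-Gaussian motion (translation + quaternion)."""
    def __init__(self, resolutions=(16, 32, 64), feat_dim=4, out_dim=7):
        super().__init__()
        self.volumes = nn.ParameterList(
            [nn.Parameter(0.01 * torch.randn(1, feat_dim, r, r, r)) for r in resolutions]
        )
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim * len(resolutions), 64), nn.ReLU(),
            nn.Linear(64, out_dim),   # 3 translation + 4 rotation components
        )

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) Gaussian centers normalized to [-1, 1]^3.
        coords = xyz.view(1, -1, 1, 1, 3)                     # grid_sample wants (B, D, H, W, 3)
        feats = [
            F.grid_sample(vol, coords, align_corners=True)     # (1, C, N, 1, 1)
             .squeeze(-1).squeeze(-1).squeeze(0).T              # -> (N, C)
            for vol in self.volumes
        ]
        return self.decoder(torch.cat(feats, dim=-1))           # (N, 7) motion per Gaussian

# Usage: deform the previous frame's Gaussian centers by the predicted translation.
grid = MultiResMotionGrid()
xyz_prev = torch.rand(1000, 3) * 2 - 1      # toy centers in [-1, 1]^3
motion = grid(xyz_prev)
xyz_curr = xyz_prev + motion[:, :3]
```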

Comparison



We present a qualitative comparison with ReRF, TeTriRF, and 3DGStream on the coffee martini sequence from the N3DV dataset and the trimming sequence from the MeetRoom dataset, as shown in the figure. Our approach matches the reconstruction quality of 3DGStream at a substantially lower bitrate, corresponding to a compression ratio exceeding 16×. Compared to ReRF and TeTriRF, our 4DGC more effectively preserves fine details, such as the head, window, bottles, and books in coffee martini, and the face, hand, plant, and scissors in trimming, which are lost in the reconstructions of these two methods. This demonstrates that 4DGC accurately captures dynamic scene elements and maintains high-quality detail on intricate objects while achieving a highly compact model size.

Bibtex


@InProceedings{Hu_2025_CVPR,
    author    = {Hu, Qiang and Zheng, Zihan and Zhong, Houqiang and Fu, Sihua and Song, Li and Zhang, Xiaoyun and Zhai, Guangtao and Wang, Yanfeng},
    title     = {4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {875-885}
}