Table 1. Quantitative comparison on the Neural 3D dataset. In the Colmap column, SA denotes 'Sparse point cloud for All frames' and D0 denotes 'Dense point cloud for the 0th frame'. The original STG paper trains six separate models, one per 50-frame segment; we therefore report STG results for both this multi-model setup (50×6 in the Frames column) and a single model trained on the full 300-frame sequence.
| Method | Colmap | Preproc. Time ↓ | PSNR ↑ | SSIM ↑ | LPIPS ↓ | Train Time ↓ | FPS ↑ | Storage ↓ | Frames |
|---|---|---|---|---|---|---|---|---|---|
| 4DGS | D0 | 6 mins | 28.72 | 0.9306 | 0.1528 | 33 mins | 98 | 40.3 | 300 |
| STG | SA | 25 mins | 31.75 | 0.9473 | 0.1423 | 2h 43mins | 683 | 127.5 | 50×6 |
| STG | SA | 25 mins | 31.46 | 0.9432 | 0.1474 | 29 mins | 532 | 54.0 | 300 |
| TaylorG | SA | 25 mins | 29.80 | 0.9558 | 0.1597 | 9h | 125 | 205.7 | 300 |
| Swift4D | D0 | 18 mins | 29.93 | 0.9383 | 0.1370 | 19 mins | 273 | 141.2 | 300 |
| DeGauss | D0 | 6 mins | 30.16 | 0.9357 | 0.1430 | 1h 27mins | 95 | 117.5 | 300 |
| OURS-35K | ✗ | 4 sec | 32.35 | 0.9480 | 0.1295 | 10 mins | 766 | 23.1 | 300 |
| OURS-45K | ✗ | 4 sec | 32.72 | 0.9502 | 0.1221 | 14 mins | 755 | 23.7 | 300 |