Marko Mihajlovic · Sergey Prokudin · Marc Pollefeys · Siyu Tang
ResField layers incorporate time-dependent weights into MLPs to effectively represent complex temporal signals.
Applications: 2D video approximation · temporal SDF capture · dynamic NeRFs from 4 RGB views · dynamic NeRFs from 3 RGB-D views
- [2023/10/01] Code released.
Our key idea is to substitute one or several MLP layers with time-dependent layers whose weights are modeled as trainable residual parameters added to the existing layer weights.
We propose to implement the residual parameters as a global low-rank spanning set combined with a set of time-dependent coefficients. This modeling enhances generalization and reduces the memory footprint incurred by the additional network parameters.
These residual weights are modeled as a learnable low-rank composition.
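In code, this amounts to swapping a standard linear layer for one that adds a per-frame low-rank residual to the shared weight. Below is a minimal PyTorch sketch of the idea; the class and argument names (`ResFieldLinear`, `num_frames`, `rank`) are illustrative and not the repository's actual API:

```python
import torch
import torch.nn as nn

class ResFieldLinear(nn.Module):
    """Sketch of a time-dependent linear layer. The base weight W is shared
    across time; a residual dW(t) is composed from R global spanning matrices
    and per-frame coefficients, so capacity grows without widening the MLP."""

    def __init__(self, in_features, out_features, num_frames, rank=10):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # Global low-rank spanning set: R matrices of shape (out, in).
        self.spanning_set = nn.Parameter(
            0.01 * torch.randn(rank, out_features, in_features))
        # Time-dependent coefficients: one R-dim vector per frame.
        self.coefficients = nn.Parameter(torch.zeros(num_frames, rank))

    def forward(self, x, frame_id):
        # Residual weight for this frame: dW(t) = sum_r v_r(t) * M_r.
        delta_w = torch.einsum('r,roi->oi',
                               self.coefficients[frame_id], self.spanning_set)
        return nn.functional.linear(x, self.base.weight + delta_w, self.base.bias)

# Example: query 1024 points at frame 42.
layer = ResFieldLinear(in_features=3, out_features=256, num_frames=100, rank=10)
y = layer(torch.randn(1024, 3), frame_id=42)  # -> (1024, 256)
```

Initializing the coefficients to zero makes the layer start out identical to a plain linear layer, so the time-dependent capacity is learned purely as a residual on top of the static MLP.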
Increasing the model capacity in this way offers three key advantages:
- Runtime: the underlying MLP does not increase in size and hence maintains the inference and training speed.
- Generalizability: retains the implicit regularization and generalization properties of MLPs.
- Universality: ResFields are versatile, easily extendable, and compatible with most MLP-based methods for spatiotemporal signals.
Please consider citing our work if you find it useful:
@inproceedings{mihajlovic2024ResFields,
title={{ResFields}: Residual Neural Fields for Spatiotemporal Signals},
author={Mihajlovic, Marko and Prokudin, Sergey and Pollefeys, Marc and Tang, Siyu},
booktitle={International Conference on Learning Representations (ICLR)},
year={2024}
}
- See installation to install all the required packages
- See data preparation to set up the datasets
- See benchmark on how to run various experiments and reproduce results from the paper
- Release RGB-D data
- Release data preprocessing code
We thank Hongrui Cai and Ruizhi Shao for providing additional details about the baseline methods, and Anpei Chen, Shaofei Wang, and Songyou Peng for proofreading the manuscript and providing useful suggestions.
Some great prior work we benefit from:
- Siren for the 2D video approximation task
- NeuS for data preprocessing and following their data format
- Owlii, DeformingThings4D, and ReSynth for datasets
- PyTorch3D for visualizing meshes and some evaluation scripts
- Instant NSR for inspiring the code structure
This project has been supported by the Innosuisse Flagship project PROFICIENCY.
The code and models are available for use without any restrictions. See the LICENSE file for details.
For any questions, please open a PR or contact Marko Mihajlovic. We greatly appreciate everyone's feedback and insights, so please do not hesitate to get in touch.