Robust Local Light Field Synthesis via Occlusion-aware Sampling and Deep Visual Feature Fusion