Practical Blind Image Denoising via Swin-Conv-UNet and Data Synthesis

Springer Science and Business Media LLC - Volume 20 - Pages 822-836 - 2023
Kai Zhang1, Yawei Li1, Jingyun Liang1, Jiezhang Cao1, Yulun Zhang1, Hao Tang1, Deng-Ping Fan1, Radu Timofte2, Luc Van Gool1,3
1Computer Vision Lab, ETH Zürich, Zürich, Switzerland
2Computer Vision Lab, University of Würzburg, Würzburg, Germany
3KU Leuven, Leuven, Belgium

Abstract

While recent years have witnessed a dramatic upsurge in the use of deep neural networks for image denoising, existing methods mostly rely on simple noise assumptions, such as additive white Gaussian noise (AWGN), JPEG compression noise and camera sensor noise, and a general-purpose blind denoising method for real images remains an open problem. In this paper, we attempt to solve this problem from the perspectives of network architecture design and training data synthesis. Specifically, for the network architecture design, we propose a swin-conv block that incorporates the local modeling ability of a residual convolutional layer and the non-local modeling ability of a swin transformer block, and plug it as the main building block into the widely used image-to-image translation UNet architecture. For the training data synthesis, we design a practical noise degradation model that takes into account different kinds of noise (including Gaussian, Poisson, speckle, JPEG compression and processed camera sensor noise) as well as resizing, and also involves a random shuffle strategy and a double degradation strategy. Extensive experiments on AWGN removal and real image denoising demonstrate that the new network architecture achieves state-of-the-art performance and that the new degradation model helps to significantly improve practicability. We believe our work can provide useful insights into current denoising research. The source code is available at https://github.com/cszn/SCUNet .
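The degradation model described above combines several noise sources applied in a randomized order. Below is a minimal, illustrative sketch of that idea in Python using NumPy and OpenCV; the function names, parameter ranges and the way degradations are sampled are assumptions made for illustration, not the authors' released pipeline (see the linked SCUNet repository for the actual implementation).

```python
# Illustrative sketch of randomly shuffled noise synthesis; not the authors' exact pipeline.
import random

import cv2
import numpy as np


def add_gaussian(img, sigma_range=(1, 50)):
    # Additive white Gaussian noise with a randomly chosen noise level.
    sigma = random.uniform(*sigma_range) / 255.0
    return img + np.random.normal(0.0, sigma, img.shape)


def add_poisson(img, scale_range=(50, 500)):
    # Signal-dependent Poisson (shot) noise.
    scale = random.uniform(*scale_range)
    return np.random.poisson(np.clip(img, 0, 1) * scale) / scale


def add_speckle(img, sigma_range=(1, 50)):
    # Multiplicative speckle noise.
    sigma = random.uniform(*sigma_range) / 255.0
    return img + img * np.random.normal(0.0, sigma, img.shape)


def add_jpeg(img, quality_range=(30, 95)):
    # JPEG compression artifacts via an encode/decode round trip.
    quality = random.randint(*quality_range)
    u8 = np.clip(img * 255.0, 0, 255).astype(np.uint8)
    _, buf = cv2.imencode(".jpg", u8, [int(cv2.IMWRITE_JPEG_QUALITY), quality])
    return cv2.imdecode(buf, cv2.IMREAD_UNCHANGED).astype(np.float32) / 255.0


def random_resize(img):
    # Down/up-scaling with a random factor and interpolation method, back to the original size.
    img = img.astype(np.float32)
    h, w = img.shape[:2]
    scale = random.uniform(0.5, 2.0)
    interp = random.choice([cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC])
    tmp = cv2.resize(img, (max(1, int(w * scale)), max(1, int(h * scale))), interpolation=interp)
    return cv2.resize(tmp, (w, h), interpolation=interp)


def synthesize_noisy(clean):
    """Degrade a clean image in [0, 1] with a randomly ordered subset of degradations."""
    ops = [add_gaussian, add_poisson, add_speckle, add_jpeg, random_resize]
    chosen = random.sample(ops, k=random.randint(1, len(ops)))  # random shuffle strategy
    img = clean.astype(np.float32)
    for op in chosen:
        img = op(img)
    return np.clip(img, 0.0, 1.0)
```

In training, such synthesized noisy/clean pairs would supervise the Swin-Conv-UNet denoiser; the double degradation strategy mentioned in the abstract could be approximated by running the shuffled sequence twice on the same image.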
