
Deformable Kernel Networks for Joint Image Filtering

Springer Science and Business Media LLC - Volume 129 - Pages 579-600 - 2020

Beomjun Kim¹, Jean Ponce², Bumsub Ham¹

¹School of Electrical and Electronic Engineering, Yonsei University, Seoul, Korea

²Inria and DI-ENS, Département d’Informatique de l’ENS, CNRS, PSL University, Paris, France

Abstract

Joint image filters are used to transfer structural details from a guidance image, used as a reference, to a target image, in tasks such as enhancing spatial resolution and suppressing noise. Previous methods based on convolutional neural networks (CNNs) combine nonlinear activations of spatially invariant kernels to estimate structural details and regress the filtering result. In this paper, we instead explicitly learn sparse and spatially variant kernels. We propose a CNN architecture and its efficient implementation, called the deformable kernel network (DKN), that outputs sets of neighbors and the corresponding weights adaptively for each pixel. The filtering result is then computed as a weighted average. We also propose a fast version of DKN that runs about seventeen times faster for an image of size $$640 \times 480$$. We demonstrate the effectiveness and flexibility of our models on the tasks of depth map upsampling, saliency map upsampling, cross-modality image restoration, texture removal, and semantic segmentation. In particular, we show that the weighted averaging process with sparsely sampled $$3 \times 3$$ kernels outperforms the state of the art by a significant margin in all cases.
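The weighted averaging step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `sparse_weighted_average`, the array layouts, the nearest-neighbor sampling with border clamping, and the normalization of positive weights to sum to one are all assumptions made for illustration. In DKN the offsets and weights would be predicted by the network for each pixel; here they are simply passed in as arrays.

```python
import numpy as np


def sparse_weighted_average(target, offsets, weights):
    """Filter `target` with per-pixel sparse kernels (illustrative sketch).

    target:  (H, W) array, the image to be filtered.
    offsets: (H, W, K, 2) array of (dy, dx) neighbor offsets per pixel.
    weights: (H, W, K) array of positive per-pixel kernel weights.
    """
    h, w = target.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Absolute neighbor coordinates, rounded and clamped to the image.
    # Nearest-neighbor sampling is a simplification assumed for this sketch.
    ny = np.clip(np.rint(ys[..., None] + offsets[..., 0]).astype(int), 0, h - 1)
    nx = np.clip(np.rint(xs[..., None] + offsets[..., 1]).astype(int), 0, w - 1)

    # Normalize so each pixel's K weights sum to one (assumed here).
    norm = weights / weights.sum(axis=-1, keepdims=True)

    # Gather the K neighbors of every pixel and take their weighted average.
    return (norm * target[ny, nx]).sum(axis=-1)


# Usage: a regular 3x3 grid (K = 9) with uniform weights acts as a box filter.
grid = np.array([(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)], dtype=float)
image = np.arange(25, dtype=float).reshape(5, 5)
offsets = np.broadcast_to(grid, (5, 5, 9, 2))
weights = np.ones((5, 5, 9))
result = sparse_weighted_average(image, offsets, weights)
```

With a fixed regular grid and uniform weights the sketch reduces to an ordinary box filter; the point of a spatially variant kernel is that both `offsets` and `weights` may differ at every pixel, so nine samples can reach well beyond a fixed $$3 \times 3$$ window.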

Keywords

Joint Image Filtering; Convolutional Neural Networks; Deformable Kernels; Resolution Enhancement; Image Restoration.

