Semi-supervised attention-based aggregation network with a hybrid dilated convolution module for few-shot HDR video reconstruction
Multimedia Tools and Applications - Pages 1-22 - 2023
Abstract
Deep-learning-based methods for high dynamic range (HDR) video reconstruction require large-scale HDR video datasets with ground truth, which are very time-consuming to collect. Recent training strategies following the few-shot learning paradigm, which aim to build an effective model from only a few labeled samples, have shown success in image classification and image segmentation. In this paper, a semi-supervised learning framework for few-shot HDR video reconstruction is proposed. An attention-based aggregation network with a hybrid dilated convolution module is used to recover missing content and remove undesirable artifacts. The hybrid dilated convolution module extracts complementary features from poorly exposed regions, and the attention module adjusts them to suppress harmful information. In the semi-supervised framework, loss functions designed for the supervised branch and the unsupervised branch are used to constrain the network during training under the few-shot scenario. Experimental results show that the proposed method, trained with only 5 labeled samples and 45 unlabeled samples, achieves a PSNR of 41.664 dB on the synthetic evaluation dataset, compared with 35.201 dB, the best score among supervised methods trained under the same few-shot conditions.
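To give a concrete feel for the hybrid dilated convolution idea the abstract describes, the sketch below stacks 1-D convolutions with increasing dilation rates so the receptive field grows quickly without the "gridding" gaps a single fixed rate would leave. This is a minimal, framework-free illustration; the kernel values, the rates (1, 2, 5), and the function names are illustrative assumptions, not the paper's actual module.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1-D convolution with the given dilation rate."""
    k = len(kernel)
    span = (k - 1) * dilation  # distance spanned by the dilated kernel
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(k))
        for i in range(len(signal) - span)
    ]

def hybrid_dilated_stack(signal, kernel, rates=(1, 2, 5)):
    """Apply dilated convolutions with a mix of rates in sequence.

    Using co-prime-ish rates (1, 2, 5) means successive layers fill in
    the sample positions a single large dilation rate would skip.
    """
    out = signal
    for r in rates:
        out = dilated_conv1d(out, kernel, r)
    return out

if __name__ == "__main__":
    x = [float(i) for i in range(20)]
    avg = [1 / 3, 1 / 3, 1 / 3]  # simple averaging kernel
    y = hybrid_dilated_stack(x, avg, rates=(1, 2, 5))
    # each pass shrinks a valid-mode output by (k-1)*rate samples:
    # 20 - 2*1 - 2*2 - 2*5 = 4 remaining samples
    print(len(y))
```

Note how each output sample aggregates a window of 17 input samples after only three 3-tap layers; this widened receptive field is what lets such a module gather context from large, poorly exposed regions.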
Keywords
#deep learning #HDR video reconstruction #semi-supervised learning #aggregation network #hybrid dilated convolution module #few-shot learning