Special perceptual parsing for Chinese landscape painting scene understanding: a semantic segmentation approach
Neural Computing and Applications - Pages 1-19 - 2023
Abstract
The automatic and precise perceptual parsing of Chinese landscape paintings (CLPs) significantly aids the digitization and recreation of artworks. Manually extracting and analyzing objects in CLPs is challenging, even for expert painters with professional knowledge and sharp discernment. Two main factors have restricted the development of CLP parsing: (1) a lack of pixel-level labeled data to supervise model training, and (2) the inherent complexity of CLP images compared with real scenes, characterized by varied scales, diverse textures, and intricate empty spaces. To address these challenges, we first construct a pixel-level annotated CLP segmentation dataset to advance perceptual parsing. We then design a novel CLP Perceptual Parsing (CLPPP) model that fully utilizes the intrinsic features of CLP images. To capture context information dynamically and adaptively, we introduce a set of learnable kernels into the CLPPP model based on the multiscale features of objects within CLPs. This enables the model to learn an appropriate receptive field for context information extraction. Additionally, a positional attention head is devised to suppress intergroup noise and help the kernels capture inter-object positional information. This iterative optimization process helps the model learn powerful feature representations for the different textures in CLPs. Experimental results demonstrate that the proposed CLPPP model outperforms state-of-the-art methods by a large margin on the CLP dataset under consistent conditions, achieving mIoU, aAcc, and mAcc scores of 55.45, 75.08, and 71.15, respectively.
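The abstract reports mIoU, aAcc, and mAcc without defining them. The sketch below shows how these three scores are conventionally computed from a confusion matrix in semantic segmentation. It assumes the standard definitions (overall pixel accuracy, mean per-class accuracy, mean per-class intersection over union); the paper's exact evaluation protocol is not given here, and the function names and toy label maps are hypothetical.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes):
    """Count pixels by (ground-truth class, predicted class)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Encode each (target, pred) pair as a single index, then histogram.
    idx = target * num_classes + pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def segmentation_scores(cm):
    """Return aAcc, mAcc, and mIoU as fractions in [0, 1]."""
    cm = cm.astype(np.float64)
    tp = np.diag(cm)            # correctly labeled pixels per class
    gt = cm.sum(axis=1)         # ground-truth pixels per class
    pr = cm.sum(axis=0)         # predicted pixels per class
    union = gt + pr - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        acc = tp / gt           # NaN for classes absent from the ground truth
        iou = tp / union        # NaN for classes absent from both maps
    return {
        "aAcc": tp.sum() / cm.sum(),   # overall pixel accuracy
        "mAcc": np.nanmean(acc),       # mean per-class accuracy
        "mIoU": np.nanmean(iou),       # mean intersection over union
    }


# Toy 2x3 label maps with three hypothetical classes.
target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
scores = segmentation_scores(confusion_matrix(pred, target, num_classes=3))
print(scores)
```

For this toy input, mIoU is 0.5 and both aAcc and mAcc are 2/3. Values such as 55.45 in the abstract are presumably these fractions expressed as percentages.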