Automated tongue segmentation using a deep encoder-decoder model

Multimedia Tools and Applications - Volume 82 - Pages 37661-37686 - 2023
Worapan Kusakunniran1, Punyanuch Borwarnginn1, Thanandon Imaromkul1, Kittinun Aukkapinyo1, Kittikhun Thongkanchorn1, Disathon Wattanadhirach1, Sophon Mongkolluksamee2, Ratchainant Thammasudjarit3, Panrasee Ritthipravat4, Pimchanok Tuakta5, Paitoon Benjapornlert5
1Faculty of Information and Communication Technology, Mahidol University, Nakhon Pathom, Thailand
2Department of Computer Science, Faculty of Science, Srinakharinwirot University, Bangkok, Thailand
3Department of Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
4Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, Nakhon Pathom, Thailand
5Department of Rehabilitation Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand

Abstract

This paper proposes a solution for segmenting the tongue in an image. The solution is based on a convolutional neural network, using a deep U-Net with deep layers of encoder-decoder modules. The model is trained with a starting resolution of 512 x 512 pixels. To improve the segmentation performance of the trained model across different recording environments, three main types of data augmentation are added to the training process: additive Gaussian noise, multiplying and adding to brightness, and changing the color temperature. These augmentations also help compensate for the insufficient number of samples in limited datasets. The proposed method is evaluated with four metrics: Dice coefficient, mean IoU, Jaccard distance, and accuracy. The model was trained successfully on public datasets and then transferred for testing on a self-collected dataset captured in a real-world environment.
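
The four evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the list-of-lists mask format, and the example masks are hypothetical. It also assumes that mean IoU averages the IoU of the tongue and background classes, that the Jaccard distance is one minus the tongue-class IoU, and that a ratio with an empty denominator counts as 1.0. The abstract does not specify any of these conventions.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical mask format: 2D binary mask, 1 = tongue pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) over all pixels of two equally sized masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _ratio(numerator: int, denominator: int) -> float:
    # Assumption: an empty denominator (class absent everywhere) scores 1.0.
    return numerator / denominator if denominator else 1.0


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Compute Dice, mean IoU, Jaccard distance, and pixel accuracy."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    iou_tongue = _ratio(tp, tp + fp + fn)
    iou_background = _ratio(tn, tn + fp + fn)
    return {
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
        # Assumption: mean IoU is averaged over the two classes.
        "mean_iou": (iou_tongue + iou_background) / 2,
        "jaccard_distance": 1.0 - iou_tongue,
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
    }


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    ground_truth = [[1, 0, 0], [0, 1, 1]]
    for name, value in segmentation_metrics(predicted, ground_truth).items():
        print(f"{name}: {value:.3f}")
```

Dice and IoU are computed from the same confusion counts, so they rank predictions identically. Accuracy also counts background pixels, so it can remain high even when the tongue region is poorly segmented.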

Keywords

#tongue segmentation #convolutional neural network #U-Net #data augmentation #performance measurement
