Active learning with point supervision for cost-effective panicle detection in cereal crops
Abstract
Panicle density of cereal crops such as wheat and sorghum is one of the main traits plant breeders and agronomists use to understand crop yield. To phenotype panicle density effectively, there is a significant need for computer vision-based object detection techniques. In recent years, deep learning-based object detection has shown promising results in various agricultural studies. However, training such systems typically requires large amounts of bounding-box-labeled data. Because crops vary with both environmental and genetic conditions, acquiring a large labeled image dataset for each crop is expensive and time-consuming. Thus, to catalyze the widespread use of automatic object detection for crop phenotyping, a cost-effective method for developing such automated systems is essential. We propose an active learning approach based on point supervision for panicle detection in cereal crops. In our approach, the model repeatedly interacts with a human annotator, iteratively querying labels for only the most informative images rather than for all images in a dataset. Our query method is specifically designed for cereal crops, whose panicles usually show low variance in appearance. The method reduces labeling costs by intelligently leveraging low-cost weak labels (object centers) to pick the most informative images, for which strong labels (bounding boxes) are then requested. We show promising results on two publicly available cereal crop datasets, Sorghum and Wheat. On Sorghum, six variants of our proposed method outperform the best baseline method with more than 55% savings in labeling time. Similarly, on Wheat, three variants of our proposed method outperform the best baseline method with more than 50% savings in labeling time. In summary, we provide a cost-effective method for training reliable panicle detectors for cereal crops. A low-cost panicle detection method is highly beneficial to both breeders and agronomists: plant breeders can obtain quick crop yield estimates to make important crop management decisions, and researchers gain real-time visual crop analysis for studying a crop's response to various experimental conditions.
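The selection step described above can be summarized as follows: images carrying only cheap point labels are scored by how strongly the current detector disagrees with those points, and only the top-scoring images are sent out for full bounding-box annotation. The sketch below is a minimal illustration of such a loop; the names (informativeness, query_batch) and the particular disagreement score (missed points plus spurious boxes) are our own placeholders for exposition, not the exact query criterion used in the paper.

```python
"""Minimal sketch of a point-supervised active learning query step.
All names and the scoring rule here are illustrative assumptions,
not the authors' implementation."""

from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]               # annotated panicle center (x, y)


def point_in_box(p: Point, b: Box) -> bool:
    """True if a weak point label falls inside a predicted box."""
    return b[0] <= p[0] <= b[2] and b[1] <= p[1] <= b[3]


def informativeness(pred_boxes: List[Box], points: List[Point]) -> float:
    """Score an image by detector/point-label disagreement:
    points no predicted box covers (missed panicles) plus
    predicted boxes covering no point (spurious detections)."""
    missed = sum(not any(point_in_box(p, b) for b in pred_boxes) for p in points)
    spurious = sum(not any(point_in_box(p, b) for p in points) for b in pred_boxes)
    return float(missed + spurious)


def query_batch(
    unlabeled: List[Tuple[str, List[Point]]],   # (image id, its point labels)
    detector: Callable[[str], List[Box]],        # current trained detector
    budget: int,                                 # images the annotator can box
) -> List[str]:
    """Return the `budget` most informative images to send for full
    bounding-box annotation; the rest keep only their point labels."""
    scored = [(informativeness(detector(img), pts), img) for img, pts in unlabeled]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [img for _, img in scored[:budget]]
```

In each active learning round, query_batch would be called on the remaining weakly labeled pool, the returned images would receive bounding boxes from the annotator, and the detector would be retrained on all strong labels collected so far before the next round.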