Detection of co-salient objects by looking deep and wide
Abstract
In this paper, we propose a unified co-salient object detection framework built on two novel insights: (1) looking deep, that is, transferring higher-level representations through a convolutional neural network with additional adaptive layers, can better reflect the semantic properties of the co-salient objects; (2) looking wide, that is, exploiting visually similar neighbors from other image groups, can effectively suppress the influence of common background regions. The wide and deep information is explored for the object proposal windows extracted from each image. Window-level co-saliency scores are computed by integrating intra-image contrast, intra-group consistency, and inter-group separability through a principled Bayesian formulation, and are then converted into superpixel-level co-saliency maps through a foreground region agreement strategy. Comprehensive experiments on two existing datasets and one newly established dataset demonstrate the consistent performance gain of the proposed approach.
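The abstract names the three cues and the two-stage scoring but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' formulation. It assumes each cue is already normalized to [0, 1], treats the cues as conditionally independent (a naive Bayes simplification), and replaces the foreground region agreement strategy with a plain average of the scores of the windows covering each pixel. The function names `window_posterior` and `superpixel_scores` are hypothetical.

```python
from math import prod

def window_posterior(cues, prior=0.5, eps=1e-6):
    """Posterior probability that a window is co-salient.

    ASSUMPTION: each cue c in [0, 1] is read as p(cue | co-salient) and
    1 - c as p(cue | background), with the cues conditionally independent.
    """
    clipped = [min(max(c, eps), 1.0 - eps) for c in cues]
    fg = prior * prod(clipped)
    bg = (1.0 - prior) * prod(1.0 - c for c in clipped)
    return fg / (fg + bg)

def superpixel_scores(labels, windows, scores):
    """Average window scores per pixel, then per superpixel.

    labels:  2D list of superpixel ids, one per pixel.
    windows: half-open boxes (x0, y0, x1, y1).
    ASSUMPTION: plain averaging stands in for the paper's foreground
    region agreement strategy, which the abstract does not specify.
    """
    totals, counts = {}, {}
    for y, row in enumerate(labels):
        for x, sp in enumerate(row):
            hits = [s for (x0, y0, x1, y1), s in zip(windows, scores)
                    if x0 <= x < x1 and y0 <= y < y1]
            pixel = sum(hits) / len(hits) if hits else 0.0
            totals[sp] = totals.get(sp, 0.0) + pixel
            counts[sp] = counts.get(sp, 0) + 1
    return {sp: totals[sp] / counts[sp] for sp in totals}

# Toy example: a 2x4 image with two superpixels and two proposal windows.
# Cue order per window: (contrast, consistency, separability).
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
windows = [(0, 0, 2, 2), (1, 0, 4, 2)]
scores = [window_posterior((0.9, 0.8, 0.7)),
          window_posterior((0.2, 0.3, 0.4))]
sp_scores = superpixel_scores(labels, windows, scores)
```

In this toy setup the first window scores high on all three cues, so superpixel 0, which it mostly covers, ends up with a much higher score than superpixel 1.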
Keywords
#object detection #co-saliency #convolutional neural network #foreground region #Bayesian algorithm
