Active and semi-supervised learning for object detection with imperfect data

Cognitive Systems Research - Volume 45 - Pages 109-123 - 2017

Phill Kyu Rhee¹, Enkhbayar Erdenee¹, Shin Dong Kyun¹, Minhaz Uddin Ahmed¹, Songguo Jin¹

¹Computer and Information Engineering Department, Inha University, 100 Inha-ro, Nam-gu 22212, Incheon, Republic of Korea


References

Angelova, 2015, Real-time pedestrian detection with deep network cascades, BMVC, 2015, 1

Benenson, 2015, Ten years of pedestrian detection, what have we learned?, Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics), 8926, 613

Bengio, 2013, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35, 1798, 10.1109/TPAMI.2013.50

Breiman, 1996, Bagging predictors, Machine Learning, 24, 123, 10.1007/BF00058655

Brinker, 2003, Incorporating diversity in active learning with support vector machines, Strategy, 20, 59

Chapelle. (2006). Semi-supervised learning. Interdisciplinary sciences computational life sciences (Vol. 1). doi:http://dx.doi.org/10.1007/s12539-009-0016-2.

Chetlur, S., & Woolley, C. (2014). cuDNN: Efficient primitives for deep learning. arXiv Preprint arXiv: …, 1-9. Retrieved <http://arxiv.org/abs/1410.0759>.

Culotta, 2005, Reducing labeling effort for structured prediction tasks, Proceedings of the National Conference on Artificial Intelligence, 20, 746

Dagan, 1995, Committee-based sampling for training probabilistic classifiers, Proceedings of the Twelfth International Conference on Machine Learning, 150

Demir, 2011, Batch-mode active-learning methods for the interactive classification of remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, 49, 1014, 10.1109/TGRS.2010.2072929

Deng, J, Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In CVPR09.

Dollar, 2011, Pedestrian detection: An evaluation of the state of the art, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1

Felzenszwalb, P.F., Girshick, R.B., Mcallester, D., & Ramanan, D. (2009). Object detection with discriminatively trained part based models, doi:http://dx.doi.org/10.1109/TPAMI.2009.167.

Fergus, R., Fei-Fei, L., Perona, P., & Zisserman, A. (2005). Learning object categories from Google’s image search. In Tenth IEEE international conference on computer vision (ICCV’05) volume 1 (Vol. 2, pp. 1816–1823). doi:10.1109/ICCV.2005.142.

Girshick, R. (2015). Fast R-CNN, doi:http://dx.doi.org/10.1109/ICCV.2015.169.

Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (pp. 580–587), doi:http://dx.doi.org/10.1109/CVPR.2014.81.

Hasan, 2015, A continuous learning framework for activity recognition using deep hybrid feature models, IEEE Transactions on Multimedia, 17, 1909, 10.1109/TMM.2015.2477242

He, 2015, Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 1904, 10.1109/TPAMI.2015.2389824

Heckman, 1979, Sample selection bias as a specification error, Econometrica, 47, 153, 10.2307/1912352

Hinton, 2006, Reducing the dimensionality of data with neural networks, Science, 313, 504, 10.1126/science.1127647

Hoiem, D., Chodpathumwan, Y., & Dai, Q. (2012). Diagnosing error in object detectors. In Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics) (Vol. 7574 LNCS(PART 3), pp. 340–353), doi:http://dx.doi.org/10.1007/978-3-642-33712-3_25.

Hosang, J., Omran, M., Benenson, R., & Schiele, B. (2015). Taking a deeper look at pedestrians. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (Vol. 07-12-June, pp. 4073–4082), doi:http://dx.doi.org/10.1109/CVPR.2015.7299034.

Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., …, Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the ACM international conference on multimedia (pp. 675–678), doi:http://dx.doi.org/10.1145/2647868.2654889.

Kapoor, A., Grauman, K., Urtasun, R., & Darrell, T. (2007). Active learning with gaussian processes for object categorization. In 2007 IEEE 11th international conference on computer vision, doi:http://dx.doi.org/10.1109/ICCV.2007.4408844.

Krizhevsky, 2012, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, 1–9

LeCun, 1989, Backpropagation applied to handwritten zip code recognition, Neural Computation, 10.1162/neco.1989.1.4.541

Lewis, D. D., & Catlett, J. (1994). Heterogeneous uncertainty sampling for supervised learning. In Proceedings of the 11th international conference on machine learning (ICML’94) (pp. 148–156) <http://www.cs.brynmawr.edu/cs372/LeC94.pdf>.

Lewis, D. D., & Gale, W. A. (1994). A sequential algorithm for training text classifiers. In Proceedings of the 17th international conference on research and development in information retrieval (SIGIR’94) (pp. 3–12), doi:http://dx.doi.org/10.1145/219587.219592.

Li, M., & Sethi, I. K. (2006). Confidence-based active learning, 28(8), 1251–1261.

Li, 2010, OPTIMOL: Automatic online picture collection via incremental model learning, International Journal of Computer Vision, 88, 147, 10.1007/s11263-009-0265-6

Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., …, Zitnick, C. L. (2014). Microsoft COCO: common objects in context. In Lecture notes in computer science (including subseries lecture notes in artificial intelligence and lecture notes in bioinformatics) (Vol. 8693 LNCS, pp. 740–755), doi:http://dx.doi.org/10.1007/978-3-319-10602-1_48.

Lindenbaum, 2004, Selective sampling for nearest neighbor classifiers, Machine Learning, 54, 125, 10.1023/B:MACH.0000011805.60520.fe

Lu, 2015, Active learning through adaptive heterogeneous ensembling, IEEE Transactions on Knowledge and Data Engineering, 27, 368, 10.1109/TKDE.2014.2304474

Makantasis, K., Doulamis, A., Doulamis, N., & Psychas, K. (2016). Deep learning based human behavior recognition in industrial workflows. In IEEE international conference on image processing (ICIP), 2016 (pp. 1609–1613). IEEE.

Mitchell, T., & Blum, A. (1998). Combining labeled and unlabeled data with co-training. In Proceedings of the eleventh annual conference on computational learning theory (pp. 92–100). doi:http://dx.doi.org/10.1145/279943.279962.

Paisitkriangkrai, S., Shen, C., & Hengel, A. Van Den. (2014). Pedestrian detection with spatially pooled features and structured ensemble learning, 1–19, doi:http://dx.doi.org/10.1109/TPAMI.2015.2474388, arXiv Preprint arXiv:1409.5209.

Persello, 2011, Active and semi-supervised learning for the classification of remote sensing images, IEEE Transactions on Geoscience and Remote Sensing, 52, 6937, 10.1109/TGRS.2014.2305805

Qi, G.-J., Hua, X.-S., Rui, Y., Tang, J., & Zhang, H.-J. (2008). Two-Dimensional Active Learning for image classification. In IEEE conference on computer vision and pattern recognition (pp. 1–8), doi:http://dx.doi.org/10.1109/CVPR.2008.4587383.

Ren, 2015, Faster R-CNN: Towards real-time object detection with region proposal networks, NIPS, 1–10

Roy, N., & McCallum, A. (2001). Toward optimal active learning through sampling estimation of error reduction. In Proceedings of the 18th international conference on machine learning (pp. 441–448). Retrieved from <http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.28.9963&rep=rep1&type=pdf>.

Russakovsky, 2015, ImageNet large scale visual recognition challenge, International Journal of Computer Vision, 115, 211, 10.1007/s11263-015-0816-y

Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., & LeCun, Y. (2013). OverFeat: Integrated recognition, localization and detection using convolutional networks. arXiv Preprint arXiv:1312.6229. Retrieved from <http://arxiv.org/abs/1312.6229>.

Settles, 2010, Active learning literature survey, Machine Learning, 15, 201

Seung, 1992, Query by committee, Proceedings of the fifth annual workshop on computational learning theory - COLT’92, 287, 10.1145/130385.130417

Siddiquie, B., & Gupta, A. (2010). Beyond active noun tagging: Modeling contextual interactions for multi-class active learning. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (pp. 2979–2986), doi:http://dx.doi.org/10.1109/CVPR.2010.5540044.

Simonyan, 2014, Very deep convolutional networks for large-scale image recognition, ImageNet Challenge, 1–10

Simonyan, 2015, Very deep convolutional networks for large-scale image recognition, ICLR, 1–14

Sorokin, A., & Forsyth, D. (2008). Utility data annotation with Amazon Mechanical Turk. In 2008 IEEE Computer society conference on computer vision and pattern recognition workshops, CVPR workshops, doi:http://dx.doi.org/10.1109/CVPRW.2008.4562953.

Su, H., Deng, J., & Fei-Fei, L. (2012). Crowdsourcing annotations for visual object detection. Human Computation AAAI Technical Report WS-12-08.

Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., …, Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition, 07-12-June (pp. 1–9), doi:http://dx.doi.org/10.1109/CVPR.2015.7298594.

Tian, Y., Luo, P., Wang, X., & Tang, X. (2015). Pedestrian detection aided by deep learning semantic tasks. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (Vol. 07-12-June, pp. 5079–5087). doi:http://dx.doi.org/10.1109/CVPR.2015.7299143.

Tong, 2001, Support vector machine active learning with applications to text classification, Journal of Machine Learning Research, 45–66

Uijlings, 2013, Selective search for object recognition, International Journal of Computer Vision, 104, 154, 10.1007/s11263-013-0620-5

Vijayanarasimhan, 2009, Multi-level active prediction of useful image annotations for recognition, Advances in Neural Information Processing Systems, 21, 1705

Welinder, P., & Perona, P. (2010). Online crowdsourcing: Rating annotators and obtaining cost-effective labels. In 2010 IEEE computer society conference on computer vision and pattern recognition - workshops, CVPRW 2010 (pp. 25–32). doi:http://dx.doi.org/10.1109/CVPRW.2010.5543189.

Xu, Z., & Akella, R. (2008). Active relevance feedback for difficult queries. In Proceedings of the 17th ACM conference on Information and knowledge management, Napa Valley, California, USA (pp. 459–468).

Xu, Z., Yu, K., Tresp, V., Xu, X., & Wang, J. (2003). Representative sampling for text classification using support vector machines. In Proceedings of ECIR-03, 25th European conference on information retrieval (pp. 393-407). Retrieved from <http://link.springer.de/link/service/series/0558/papers/2633/26330393.pdf>.

Xu, L., Li, B., & Chen, E. (2012). Ensemble pruning via constrained eigen-optimization. In Proceedings - IEEE international conference on data mining, ICDM (pp. 715–724). doi:http://dx.doi.org/10.1109/ICDM.2012.97.

Yang, B., Yan, J., Lei, Z., & Li, S. Z. (2015). Convolutional channel features. In ICCV, doi:http://dx.doi.org/10.1109/ICCV.2015.18.

Yang, Y., Wang, Z., & Wu, F. (2015). Exploring prior knowledge for pedestrian detection. In BMVC2015 (pp. 1–12).

Zadrozny, B. (2004). Learning and evaluating classifiers under sample selection bias. In Twenty-first international conference on machine learning - ICML ’04 (p. 114). doi:http://dx.doi.org/10.1145/1015330.1015425.

Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer vision–ECCV 2014 (Vol. 8689, pp. 818–833), doi:http://dx.doi.org/10.1007/978-3-319-10590-1_53, arXiv:1311.2901v3 [cs.CV] 28 Nov 2013.

Zhang, S., Benenson, R., & Schiele, B. (2015). Filtered channel features for pedestrian detection. In Proceedings of the IEEE computer society conference on computer vision and pattern recognition (Vol. 07–12-June, pp. 1751–1760). doi:http://dx.doi.org/10.1109/CVPR.2015.7298784.

Zhu, 2008, Semi-supervised learning literature survey contents, Sciences New York, 10, 10
