Active and semi-supervised learning for object detection with imperfect data