Neural Networks in Video-Based Age and Gender Recognition on Mobile Platforms
Abstract
This paper considers the use of convolutional neural networks for the simultaneous recognition of a person's gender and age from video recordings of the face. The emphasis is on integrating the approach into mobile video analytics systems. We investigate the fusion of the decisions obtained for each video frame, including a classifier committee based on Dempster-Shafer theory. We propose a novel age prediction method that estimates age as the expectation over the most probable ages. We compare existing neural network models with a specially trained modification of the MobileNet convolutional network with two outputs. Experimental results are reported for the Kinect, IJB-A, Indian Movie and EmotiW datasets. Compared with conventional methods, our approach increases age and gender recognition accuracy by 2–5% and 5–10%, respectively.
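The two ideas named in the abstract, frame-level decision fusion and age estimation as an expectation over the most probable ages, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the function names `fuse_frames` and `expected_age_top_k` are hypothetical, the fusion is a plain average of per-frame posteriors (a simple sum rule, not the Dempster-Shafer committee), and the cutoff `k` is an illustrative choice.

```python
import numpy as np

def fuse_frames(frame_probs):
    """Average per-frame posteriors (simple sum-rule fusion).

    Assumption: this stands in for the fusion step; it is NOT the
    Dempster-Shafer committee mentioned in the abstract.
    """
    probs = np.asarray(frame_probs, dtype=float)
    return probs.mean(axis=0)

def expected_age_top_k(age_probs, ages, k=3):
    """Expected age restricted to the k most probable ages.

    Assumption: the k largest posteriors are renormalised to sum to 1
    before taking the expectation; k is a hypothetical parameter.
    """
    age_probs = np.asarray(age_probs, dtype=float)
    ages = np.asarray(ages, dtype=float)
    top = np.argsort(age_probs)[-k:]            # indices of the k largest posteriors
    weights = age_probs[top] / age_probs[top].sum()
    return float(np.dot(weights, ages[top]))

# Toy example: two frames, five candidate ages (20..24).
ages = np.arange(20, 25)
frames = [[0.1, 0.2, 0.4, 0.2, 0.1],
          [0.0, 0.3, 0.5, 0.2, 0.0]]
fused = fuse_frames(frames)
estimate = expected_age_top_k(fused, ages, k=3)
print(round(estimate, 2))
```

Restricting the expectation to the top candidates keeps low-probability ages at the tails of the distribution from pulling the estimate, which is one plausible motivation for this kind of estimator.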