An effective few-shot learning approach via location-dependent partial differential equation

Knowledge and Information Systems - Volume 62 - Pages 1881-1901 - 2019
Haotian Wang (1), Zhenyu Zhao (2), Yuhua Tang (1)
(1) National University of Defense Technology, Changsha, China
(2) College of Liberal Arts and Sciences, National University of Defense Technology, Changsha, China

Abstract

Learning-based partial differential equations (L-PDE) have recently achieved success in few-shot learning, but their feature-weighting mechanism and recognition stability require further improvement. To address these issues, we propose a novel model called “location-dependent PDE” (LD-PDE), based on the Navier–Stokes equation and rotational invariants. To the best of our knowledge, LD-PDE is the first application of the Navier–Stokes equation to image recognition as a high-level vision task. Specifically, LD-PDE formulates the feature variation at each time step as a linear combination of rotational invariants. We also design a location-dependent mechanism that adaptively weights each invariant in an attention-based manner, which provides hierarchical discrimination in the spatial domain. Once the final feature is learned, we measure the model error with the cross-entropy loss and update the parameters with the coordinate descent algorithm. Experimental results on face recognition datasets show that LD-PDE outperforms state-of-the-art approaches when few training samples are available. Moreover, compared with L-PDE, LD-PDE achieves much more stable recognition with low sensitivity to its hyper-parameters.
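The feature evolution described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not list the invariants, so the six used here (the rotational invariants up to second order that are common in learning-based PDE work) are an assumption. The explicit update u <- u + dt * sum_i a_i(x, y, t) * inv_i(u), the function names `invariants`, `evolve` and `cross_entropy`, the step size `dt`, and the linear softmax classifier are likewise hypothetical. Navier–Stokes-specific terms and the coordinate descent training loop are omitted.

```python
import numpy as np


def invariants(u):
    """Stack six rotational invariants of a 2-D feature map u.

    Assumption: this set is illustrative; the abstract does not say
    which invariants LD-PDE actually uses.
    """
    uy, ux = np.gradient(u)        # first derivatives (axis 0 = y, axis 1 = x)
    uyy, uyx = np.gradient(uy)     # second derivatives of u_y
    uxy, uxx = np.gradient(ux)     # second derivatives of u_x
    uxy = 0.5 * (uxy + uyx)        # symmetrise the discrete mixed derivative
    return np.stack([
        np.ones_like(u),                                  # constant
        u,                                                # feature itself
        ux**2 + uy**2,                                    # |grad u|^2
        uxx + uyy,                                        # Laplacian, tr(H)
        ux * ux * uxx + 2 * ux * uy * uxy + uy * uy * uyy,  # grad^T H grad
        uxx**2 + 2 * uxy**2 + uyy**2,                     # tr(H^2)
    ])


def evolve(image, weights, dt=0.1):
    """Evolve a feature map with location-dependent weights.

    weights has shape (T, 6, H, W): one coefficient per time step,
    per invariant and per pixel (hypothetical parameterisation).
    """
    u = np.asarray(image, dtype=float)
    for a_t in weights:
        # Feature variation = per-pixel linear combination of invariants.
        u = u + dt * np.sum(a_t * invariants(u), axis=0)
    return u


def cross_entropy(feature, W, label):
    """Cross-entropy of a linear softmax classifier on the final feature."""
    logits = W @ feature.ravel()
    logits = logits - logits.max()                 # numerical stability
    log_probs = logits - np.log(np.sum(np.exp(logits)))
    return -log_probs[label]
```

In this sketch the weights carry a full spatial index, so each pixel can emphasise different invariants. A location-independent model would correspond to weights of shape (T, 6) broadcast over the image.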
