VEDesc: vertex-edge constraint on local learned descriptors

Signal, Image and Video Processing - Volume 17 - Pages 865-872 - 2021
Jianhua Yin¹, Longzhen Zhu¹, Yang Bai¹, Zhenyu He¹,²
¹ School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Shenzhen, China
² Peng Cheng Laboratory, Shenzhen, China

Abstract

To improve the performance of local learned descriptors, many researchers focus primarily on triplet loss networks, which achieve state-of-the-art performance on various datasets. However, because these networks do not consider the relationship between the two descriptors of a patch, the resulting descriptors suffer from an inconsistency problem, which in turn causes an irregular spatial distribution of the descriptors. In this paper, we propose a neat method to overcome this inconsistency problem. The core idea is to design a triplet loss function with a vertex-edge constraint (VEC), which takes the correlation between the two descriptors of a patch into account. Furthermore, to minimize the influence of non-matching descriptors, we propose an exponential algorithm that reduces the difference between the long and short sides. Competitive performance against state-of-the-art methods on various datasets demonstrates the effectiveness of the proposed method.
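For orientation, the standard triplet margin loss that the abstract builds on can be sketched as follows. This is only the generic baseline, not the VEC loss or the exponential algorithm, whose formulations are not given in this abstract. The function names, the margin value, and the toy descriptors are assumptions chosen for illustration.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, margin + d(a, p) - d(a, n)).

    The margin value is an illustrative assumption, not taken from the paper.
    """
    d_pos = l2_distance(anchor, positive)   # distance to the matching descriptor
    d_neg = l2_distance(anchor, negative)   # distance to the non-matching descriptor
    return max(0.0, margin + d_pos - d_neg)

# Toy 2-D descriptors (hypothetical values).
anchor = [1.0, 0.0]
positive = [0.8, 0.6]
negative = [0.0, 1.0]
loss = triplet_margin_loss(anchor, positive, negative)
```

The loss is zero once the non-matching descriptor is farther from the anchor than the matching one by at least the margin; it constrains only the two distances measured from the anchor.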
