Graph-based particular object discovery

Machine Vision and Applications - Volume 30 - Pages 243-254 - 2019
Oriane Siméoni1, Ahmet Iscen2, Giorgos Tolias2, Yannis Avrithis1, Ondřej Chum2
1Inria, Univ Rennes, CNRS, IRISA, Rennes, France
2VRG, FEE, CTU in Prague, Prague, Czech Republic

Abstract

Severe background clutter is a challenge in many computer vision tasks, including large-scale image retrieval. Global descriptors, which are popular for their memory and search efficiency, are especially prone to corruption by such clutter. Eliminating the impact of clutter on the image descriptor increases the chance of retrieving relevant images and, when query expansion is used, prevents topic drift caused by retrieving the clutter itself. In this work, we propose a novel salient region detection method. It captures, in an unsupervised manner, patterns that are both discriminative and common in the dataset. Saliency is based on a centrality measure of a nearest neighbor graph constructed from regional CNN representations of dataset images. The proposed method exploits recent CNN architectures trained for object retrieval to construct the image representation from the salient regions. We improve particular object retrieval on challenging datasets containing small objects.
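The idea of scoring regions by graph centrality can be illustrated with a minimal sketch. The abstract only says "a centrality measure", so the Katz-style centrality used here is an assumption, as are the function names `knn_graph` and `katz_centrality`, the parameters `k` and `damping`, and the toy vectors standing in for regional CNN descriptors. It is not the authors' implementation.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-NN affinity matrix from row descriptors X (cosine similarity)."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = X @ X.T
    np.fill_diagonal(S, -np.inf)                # never pick a node as its own neighbor
    n = S.shape[0]
    idx = np.argsort(-S, axis=1)[:, :k]         # k most similar nodes per row
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.clip(S[rows, cols], 0.0, None)  # keep non-negative weights only
    return np.maximum(A, A.T)                   # symmetrize

def katz_centrality(A, damping=0.5):
    """Katz-style centrality c = (I - alpha*A)^{-1} 1 with alpha = damping / lambda_max."""
    n = A.shape[0]
    lam = np.linalg.eigvalsh(A)[-1]             # largest eigenvalue of symmetric A
    alpha = damping / lam if lam > 0 else 0.0   # damping < 1 keeps the system well-posed
    return np.linalg.solve(np.eye(n) - alpha * A, np.ones(n))

# Toy data (hypothetical): four similar "object" regions and three unrelated "clutter" regions.
e = np.eye(8)
regions = np.vstack([e[0] + 0.1 * e[j] for j in range(1, 5)] + [e[5], e[6], e[7]])

scores = katz_centrality(knn_graph(regions, k=2))
salient = scores > scores.mean()                # simple illustrative threshold
```

In this toy setup the four mutually similar regions reinforce each other through the graph and receive higher scores than the isolated ones, which mirrors the notion of patterns that are common in the dataset. A real pipeline would build the graph over regional descriptors of all dataset images and would likely need approximate nearest neighbor search.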
