The Earth Mover's Distance as a Metric for Image Retrieval

International Journal of Computer Vision - Tập 40 Số 2 - Trang 99-121 - 2000
Rubner, Yossi1, Tomasi, Carlo1, Guibas, Leonidas J.1
1Computer Science Department, Stanford University, Stanford, USA

Tóm tắt

We investigate the properties of a metric between two distributions, the Earth Mover's Distance (EMD), for content-based image retrieval. The EMD is based on the minimal cost that must be paid to transform one distribution into the other, in a precise sense, and was first proposed for certain vision problems by Peleg, Werman, and Rom. For image retrieval, we combine this idea with a representation scheme for distributions that is based on vector quantization. This combination leads to an image comparison framework that often accounts for perceptual similarity better than other previously proposed methods. The EMD is based on a solution to the transportation problem from linear optimization, for which efficient algorithms are available, and also allows naturally for partial matching. It is more robust than histogram matching techniques, in that it can operate on variable-length representations of the distributions that avoid quantization and other binning problems typical of histograms. When used to compare distributions with the same overall mass, the EMD is a true metric. In this paper we focus on applications to color and texture, and we compare the retrieval performance of the EMD with that of other distances.

Từ khóa


Tài liệu tham khảo

citation_title=Network Flows; citation_publication_date=1993; citation_id=CR1; citation_author=R.K. Ahuja; citation_author=T.L. Magnanti; citation_author=J.B. Orlin; citation_publisher=Prentice Hall

Belongie, S., Carson, C., Greenspan, H., and Malik, J. 1998. Colorand texture-based image segmentation using EM and its application to content-based image retrieval. In IEEE International Conference on Computer Vision, Bombay, India. pp. 675–682.

citation_journal_title=Communications of the ACM; citation_title=Multidimensional binary search trees used for associative searching; citation_author=J.L. Bentley; citation_volume=18; citation_publication_date=1975; citation_pages=509-517; citation_id=CR3

citation_journal_title=IEEE Transactions on Pattern Analysis and Machine Intelligence; citation_title=N-folded symmetries by complex moments in Gabor space and their application to unsupervised texture segmentation; citation_author=J. Bigün, J.M. du Buf; citation_volume=16; citation_issue=1; citation_publication_date=1994; citation_pages=80-87; citation_id=CR4

citation_journal_title=IEEE Transactions on Pattern Analysis and Machine Intelligence; citation_title=Multichannel texture analysis using localized spatial filters; citation_author=A.C. Bovik, M. Clark, W.S. Geisler; citation_volume=12; citation_issue=12; citation_publication_date=1990; citation_pages=55-73; citation_id=CR5

citation_journal_title=SIGMOD Record (ACM Special Interest Group on Management of Data); citation_title=Distance-based indexing for high-dimensional metric spaces; citation_author=T. Bozkaya, M. Ozsoyoglu; citation_volume=26; citation_issue=2; citation_publication_date=1997; citation_pages=357-368; citation_id=CR6

citation_title=Textures: A Photographic Album for Artists and Designers; citation_publication_date=1966; citation_id=CR7; citation_author=P. Brodatz; citation_publisher=Dover

Clarkson, K.L. 1997. Nearest neighbor queries in metric spaces. In ACM Symposium on the Theory of Computing, pp. 609–617.

citation_title=Elements of Information Theory; citation_publication_date=1991; citation_id=CR9; citation_author=T.M. Cover; citation_author=J.A. Thomas; citation_publisher=John Wiley & Sons

Das, M., Riseman, E.M., and Draper, B.A. 1997. FOCUS: Searching for multi-colored objects in a diverse image database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 756–761.

citation_journal_title=IEEE Transactions on Acoustics, Speech, and Signal Processing; citation_title=Complete discrete 2-d Gabor transforms by neural networks for image analysis and compression; citation_author=J.D. Daugman; citation_volume=36; citation_publication_date=1998; citation_pages=1169-1179; citation_id=CR11

citation_title=Pattern Classification and Scene Analysis; citation_publication_date=1973; citation_id=CR12; citation_author=R.O. Duda; citation_author=P.E. Hart; citation_publisher=Wiley

Farrokhnia, F. and Jain, A.K. 1991. A multi-channel filtering approach to texture segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 364–370.

citation_journal_title=Journal of the Optical Society of America A; citation_title=Relations between the statistics of natural images and the response properties of cortical cells; citation_author=D.J. Field; citation_volume=4; citation_issue=12; citation_publication_date=1987; citation_pages=2379-2394; citation_id=CR14

citation_journal_title=The Journal of the Institute of Electrical Engineers, Part III; citation_title=Theory of communication; citation_author=D. Gabor; citation_volume=93; citation_issue=21; citation_publication_date=1946; citation_pages=429-457; citation_id=CR15

citation_journal_title=IEEE Transactions on Pattern Analysis and Machine Intelligence; citation_title=Efficient color histogram indexing for quadratic form distance functions; citation_author=J. Hafner, H.S. Sawhney, W. Equitz, M. Flickner, W. Niblack; citation_volume=17; citation_issue=7; citation_publication_date=1995; citation_pages=729-735; citation_id=CR16

citation_title=Introduction to Mathematical Programming; citation_publication_date=1990; citation_id=CR17; citation_author=F.S. Hillier; citation_author=G.J. Lieberman; citation_publisher=McGraw-Hill

citation_journal_title=J. Math. Phys.; citation_title=The distribution of a product from several sources to numerous localities; citation_author=F.L. Hitchcock; citation_volume=20; citation_publication_date=1941; citation_pages=224-230; citation_id=CR18

Karmarkar, N. 1984. A new polynomial-time algorithm for linear programming. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, Washington, DC. pp. 302–311.

citation_title=How good is the simplex algorithm; citation_inbook_title=Inequalities; citation_publication_date=1972; citation_pages=159-175; citation_id=CR20; citation_author=V. Klee; citation_author=G. Minty; citation_publisher=Academic Press

citation_title=Information Theory and Statistics; citation_publication_date=1968; citation_id=CR21; citation_author=S. Kullback; citation_publisher=Dover

citation_journal_title=IEEE Transactions on Pattern Analysis and Machine Intelligence; citation_title=Periodicity, directionality, and randomness: Wold features for image modeling and retrieval; citation_author=F. Liu, R.W. Picard; citation_volume=18; citation_issue=7; citation_publication_date=1996; citation_pages=722-733; citation_id=CR22

citation_journal_title=IEEE Transactions on Pattern Analysis and Machine Intelligence; citation_title=Texture features for browsing and retrieval of image data; citation_author=B.S. Manjunath, W.Y. Ma; citation_volume=18; citation_issue=8; citation_publication_date=1996; citation_pages=837-842; citation_id=CR23

citation_journal_title=IEEE Transactions on Communication; citation_title=Image coding using vector quantization: A review; citation_author=N.M. Nasrabad, R.A. King; citation_volume=36; citation_issue=8; citation_publication_date=1988; citation_pages=957-971; citation_id=CR24

citation_journal_title=SPIE Conference on Storage and Retrieval for Image and Video Databases; citation_title=Querying images by content, using color, texture, and shape; citation_author=W. Niblack, R. Barber, W. Equitz, M.D. Flickner, E.H. Glasman, D. Petkovic, P. Yanker, C. Faloutsos, G. Taubin, Y. Heights; citation_volume=1908; citation_publication_date=1993; citation_pages=173-187; citation_id=CR25

citation_journal_title=IEEE Transactions on Pattern Analysis and Machine Intelligence; citation_title=A unified approach to the change of resolution: Space and gray-level; citation_author=S. Peleg, M. Werman, H. Rom; citation_volume=11; citation_publication_date=1989; citation_pages=739-742; citation_id=CR26

citation_title=A Technical Introduction to Digital Video; citation_publication_date=1996; citation_id=CR27; citation_author=C. Poynton; citation_publisher=John Wiley and Sons

Puzicha, J., Hofmann, T., and Buhmann, J. 1997. Non-parametric similarity measures for unsupervised texture segmentation and image retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 267–272.

citation_journal_title=Theory of Probability and its Applications; citation_title=The Monge-Kantorovich mass transference problem and its stochastic applications; citation_author=S.T. Rachev; citation_volume=XXIX; citation_issue=4; citation_publication_date=1984; citation_pages=647-676; citation_id=CR29

citation_journal_title=Operations Research; citation_title=Extension of Dantzig's algorithm to finding an initial near-optimal basis for the transportation problem; citation_author=E.J. Russell; citation_volume=17; citation_publication_date=1969; citation_pages=187-191; citation_id=CR30

citation_journal_title=Computer, Vision, Graphics, and Image Processing; citation_title=Generalized texture representation and metric; citation_author=H.C. Shen, A.K.C. Wong; citation_volume=23; citation_publication_date=1983; citation_pages=187-206; citation_id=CR31

Smith, J.R. 1997. Integrated Spatial and Feature Image Systems: Retrieval, Analysis and Compression. Ph.D. Thesis, Columbia University.

Stolfi, J. 1994. Personal communication.

citation_journal_title=SPIE Conference on Storage and Retrieval for Image and Video Databases III; citation_title=Similarity of color images; citation_author=M. Stricker, M. Orengo; citation_volume=2420; citation_publication_date=1995; citation_pages=381-392; citation_id=CR34

citation_journal_title=International Journal of Computer Vision; citation_title=Color indexing; citation_author=M.J. Swain, D.H. Ballard; citation_volume=7; citation_issue=1; citation_publication_date=1991; citation_pages=11-32; citation_id=CR35

citation_journal_title=Psychological Review; citation_title=Features of similarity; citation_author=A. Tversky; citation_volume=84; citation_issue=4; citation_publication_date=1977; citation_pages=327-352; citation_id=CR36

citation_journal_title=Computer, Vision, Graphics, and Image Processing; citation_title=A distance metric for multi-dimensional histograms; citation_author=M. Werman, S. Peleg, A. Rosenfeld; citation_volume=32; citation_publication_date=1985; citation_pages=328-336; citation_id=CR37

citation_title=Color Science: Concepts and Methods, Quantitative Data and Formulae; citation_publication_date=1982; citation_id=CR38; citation_author=G. Wyszecki; citation_author=W.S. Stiles; citation_publisher=John Wiley and Sons

Zikan, K. 1990. The Theory and Applications of Algebraic Metric Spaces. Ph.D. Thesis, Stanford University.