Learning Low-Level Vision

Springer Science and Business Media LLC - Volume 40 - Pages 25-47 - 2000
William T. Freeman¹, Egon C. Pasztor², Owen T. Carmichael³
¹ Mitsubishi Electric Research Labs, Cambridge, USA
² MIT Media Laboratory, Cambridge, USA
³ 209 Smith Hall, Carnegie-Mellon University, Pittsburgh, USA

Abstract

We describe a learning-based method for low-level vision problems—estimating scenes from images. We generate a synthetic world of scenes and their corresponding rendered images, modeling their relationships with a Markov network. Bayesian belief propagation allows us to efficiently find a local maximum of the posterior probability for the scene, given an image. We call this approach VISTA—Vision by Image/Scene TrAining. We apply VISTA to the “super-resolution” problem (estimating high frequency details from a low-resolution image), showing good results. To illustrate the potential breadth of the technique, we also apply it in two other problem domains, both simplified. We learn to distinguish shading from reflectance variations in a single image under particular lighting conditions. For the motion estimation problem in a “blobs world”, we show figure/ground discrimination, solution of the aperture problem, and filling-in arising from application of the same probabilistic machinery.
