Sparse Coding and Decorrelation in Primary Visual Cortex During Natural Vision

American Association for the Advancement of Science (AAAS) - Vol. 287, No. 5456 - pp. 1273-1276 - 2000

W. Vinje¹,², Jack L. Gallant³,²

¹Department of Molecular and Cellular Biology, and

²Program in Neuroscience

³Department of Psychology, University of California at Berkeley, Berkeley, CA 94720–1650, USA.

Abstract

Theoretical studies suggest that primary visual cortex (area V1) uses a sparse code to efficiently represent natural scenes. This issue was investigated by recording from V1 neurons in awake behaving macaques during both free viewing of natural scenes and conditions simulating natural vision. Stimulation of the nonclassical receptive field increases the selectivity and sparseness of individual V1 neurons, increases the sparseness of the population response distribution, and strongly decorrelates the responses of neuron pairs. These effects are due to both excitatory and suppressive modulation of the classical receptive field by the nonclassical receptive field and do not depend critically on the spatiotemporal structure of the stimuli. During natural vision, the classical and nonclassical receptive fields function together to form a sparse representation of the visual world. This sparse code may be computationally efficient for both early vision and higher visual processing.

References

H. B. Barlow, in Sensory Communication, W. A. Rosenblith, Ed. (MIT Press, Cambridge, MA, 1961), pp. 217–234; Neural Comput. 1, 295 (1989);

Daugman J. G., IEEE Trans. Biomed. Eng. 36, 107 (1989);

Field D. J., J. Opt. Soc. Am. A 4, 2379 (1987); Neural Comput. 6, 559 (1994).

10.1038/381607a0; Vision Res. 23, 3311 (1997).

P. Foldiak and M. P. Young, in The Handbook of Brain Theory and Neural Networks, M. A. Arbib, Ed. (MIT Press, Cambridge, MA, 1995), pp. 895–898.

Levy W., Baxter R. A., Neural Comput. 8, 531 (1996);

10.1038/236

10.1126/science.1598577

Rolls E. T., Tovee M. J., J. Neurophysiol. 73, 713 (1995); R. Baddeley et al., Proc. R. Soc. London Ser. B 264, 1775 (1997).

E. P. Simoncelli and O. Schwartz, in Advances in Neural Information Processing Systems 11, M. I. Jordan, M. J. Kearns, S. A. Solla, Eds. (MIT Press, Cambridge, MA, 1998).

Allman J., Miezin F., McGuinness E., Annu. Rev. Neurosci. 8, 407 (1985).

pp. 153–154.

Gallant J. L., Connor C. E., Van Essen D. C., Neuroreport 9, 2153 (1998).

We generated saccadic eye scan paths by using a model of natural macaque eye movements. We acquired eye movement data from a scleral search coil during free-viewing experiments (8, 17). The model chose random saccade directions from a uniform distribution of angles. We chose saccade lengths randomly from a distribution based on a B-spline fit to the measured distribution of free-viewing saccade lengths. The eye velocity versus time profile for each saccade was obtained from a lookup table of B-spline fits to actual velocity/time profiles (as a function of saccade length). We chose fixation durations from a Gaussian distribution (mean 350 ms; standard deviation 50 ms).
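The sampling steps of this scan path model can be sketched roughly as follows. This is an illustration only, not the authors' code: the function name `simulate_scan_path` and the list `measured_lengths_deg` are hypothetical, saccade lengths are resampled from a few made-up values standing in for the B-spline fit to measured lengths, and saccades are treated as instantaneous, so the velocity lookup table is omitted.

```python
import math
import random


def simulate_scan_path(duration_ms, measured_lengths_deg, seed=0):
    """Return fixations as (onset_ms, x_deg, y_deg, fixation_ms) tuples."""
    rng = random.Random(seed)
    t, x, y = 0.0, 0.0, 0.0
    path = []
    while t < duration_ms:
        # Fixation duration: Gaussian, mean 350 ms, SD 50 ms (from the text).
        fixation = max(0.0, rng.gauss(350.0, 50.0))
        path.append((t, x, y, fixation))
        t += fixation
        # Saccade direction: uniform over all angles (from the text).
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # Saccade length: ASSUMPTION, resampled from example values that
        # stand in for the fitted distribution of measured lengths.
        length = rng.choice(measured_lengths_deg)
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return path


# Hypothetical example: a 5-s scan path with three candidate lengths.
example_path = simulate_scan_path(5000.0, [1.0, 2.5, 4.0])
```

Passing a fixed `seed` makes a simulated path reproducible, which is convenient when the same movie has to be regenerated for repeated trials.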

We extracted image patches from 1280 × 1024 pixel images obtained from a high-resolution commercial photo-CD library (Corel Inc.). Images included nature scenes, man-made objects, animals, and humans. To avoid aliasing artifacts that might result from displaying movies on a monitor with 72-Hz refresh, we used an antialiasing algorithm in which each 13.8-ms frame of a movie was constructed by averaging 14 images representing the position of the CRF at about 1-ms resolution.
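The frame-averaging step can be illustrated with a toy sketch. It assumes each sub-image is a small 2-D list of pixel intensities and that the 14 sub-images are weighted equally; the note does not state the weighting, and `average_subframes` is a hypothetical helper name.

```python
def average_subframes(subframes):
    """Average equally sized 2-D images (lists of rows) pixel by pixel."""
    n = len(subframes)
    rows, cols = len(subframes[0]), len(subframes[0][0])
    return [
        [sum(img[r][c] for img in subframes) / n for c in range(cols)]
        for r in range(rows)
    ]


# Toy example: a bright pixel that moves one column per sub-image.
SUBFRAMES_PER_FRAME = 14  # from the text: 14 images per video frame
subframes = [
    [[1.0 if c == k else 0.0 for c in range(SUBFRAMES_PER_FRAME)]]
    for k in range(SUBFRAMES_PER_FRAME)
]
# Averaging smears the moving pixel evenly across the row (motion blur).
blurred_frame = average_subframes(subframes)
```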

All animal procedures were approved by the University of California, Berkeley, Animal Care and Use Committee and conformed to or exceeded all relevant National Institutes of Health and U.S. Department of Agriculture standards. Single neuron recordings were made from two awake behaving macaque monkeys (Macaca mulatta) with extracellular electrodes. Additional details about recording and surgical procedures are given in [10.1523/JNEUROSCI.17-09-03201.1997]. All data reported here were taken under conditions of excellent single-unit isolation. Eye position was monitored with a scleral search coil, and trials were aborted if the eye deviated from fixation by more than 0.35°. Movie duration varied from 5 to 10 s. During recording sessions, each movie was divided into 5-s segments; segments were then shown in and around the CRF on successive trials while the animal performed a fixation task for a juice reward. Each trial consisted of a stimulus of a single size, with differently sized stimulus conditions randomly interleaved across trials.

A well-established and useful description of how sparsely a neuron responds across stimuli is given by its activity fraction, $A = \left(\sum_i r_i / n\right)^2 / \sum_i \left(r_i^2 / n\right)$. For further discussion see [Rolls E. T., Tovee M. J., J. Neurophysiol. 73, 713 (1995)]. Our sparseness statistic is a convenient rescaling of $A$ that ranges from 0% to 100%: $S = (1 - A)/(1 - A_{\min}) = (1 - A)/(1 - 1/n)$.
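A minimal sketch of the two statistics defined above, written directly from the formulas for A and S. The function names and the example rate lists are illustrative assumptions; `rates` is taken to hold the n responses r_i of one neuron.

```python
def activity_fraction(rates):
    """A = (sum(r_i) / n)^2 / (sum(r_i^2) / n)."""
    n = len(rates)
    mean_rate = sum(rates) / n
    mean_square = sum(r * r for r in rates) / n
    if mean_square == 0:
        raise ValueError("all responses are zero, so A is undefined")
    return mean_rate ** 2 / mean_square


def sparseness(rates):
    """S = (1 - A) / (1 - 1/n); multiply by 100 for a percentage."""
    n = len(rates)
    if n < 2:
        raise ValueError("need at least two responses")
    return (1.0 - activity_fraction(rates)) / (1.0 - 1.0 / n)


dense = sparseness([5.0, 5.0, 5.0, 5.0])   # equal response to everything
sparse = sparseness([0.0, 0.0, 0.0, 8.0])  # response to a single stimulus
```

A neuron that responds equally to every stimulus gives S = 0, and one that responds to exactly one of n stimuli gives S = 1, which is why the denominator uses A_min = 1/n.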

Throughout this report we measured significance with randomization tests using 1000 random permutations of the relevant data. For further discussion see [B. F. J. Manly, Randomization and Monte-Carlo Methods in Biology (Chapman & Hall, New York, 1991)].
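As a rough illustration of a randomization test with 1000 permutations, the sketch below compares two groups of values. The test statistic (absolute difference of group means), the two-group design, and the function name are assumptions made for the example; the note does not say which statistic was permuted in each analysis.

```python
import random


def permutation_test(x, y, n_perm=1000, seed=0):
    """Two-sided p-value for a difference in means, by label shuffling."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(mean(px) - mean(py)) >= observed:
            n_extreme += 1
    # Count the observed arrangement itself so that p is never exactly 0.
    return (n_extreme + 1) / (n_perm + 1)


# Hypothetical example values, not data from the study.
p_value = permutation_test([10, 11, 12, 13, 14, 15, 16, 17],
                           [0, 1, 2, 3, 4, 5, 6, 7])
```

With 1000 permutations the smallest attainable p-value under this convention is 1/1001, which is consistent with reporting results as P ≤ 0.001.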

If responses are averaged within a fixation, sparseness declines from 41 to 23%, 52 to 34%, 61 to 42%, and 62 to 45% for stimuli one, two, three, and four times the size of the CRF, respectively.

The boundaries of the CRF were estimated with bar and grating stimuli whose characteristics were controlled interactively. For 38 of 61 neurons we confirmed these manual estimates by reverse correlation on responses evoked by a dynamic sequence of small white squares distributed in and around the CRF (square positions were chosen randomly for each frame). Reliable CRF estimates were obtained with 150 to 300 s of data (30 to 60 behavioral fixation trials). Generally there is excellent agreement between the CRF profile estimates obtained with the two methods. Our CRF estimates ranged from about 20 to 50 min of arc, which is entirely consistent with the range of receptive field diameters obtained in awake behaving macaques by other researchers; for example, see [Snodderly D. M., Gur M., J. Neurophysiol. 74, 2100 (1995)].

10.1152/jn.1980.43.4.1133

10.1152/jn.1993.70.3.909

Animals viewed high-resolution natural images digitized on commercial photo-CDs (Corel Corp.) and shown at a resolution of 1280 × 1024 pixels. Images were shown for 10 s each. Neural responses and eye position were recorded continuously during this free viewing (8). Natural vision movies that simulated these specific free-viewing episodes were constructed by using the eye position records to determine the position of the recorded CRF during free viewing. In six cells the diameter of the reconstructed movies was four times the CRF and in 11 cells it was three times the CRF. These data have been combined in this report.

Each free-viewing episode produced a single spike train evoked by a unique pattern of exploratory eye movements. In contrast, natural vision movies were repeated many times. To obtain comparable sparseness estimates for these data, we separately analyzed the spike train evoked by each repetition of the natural vision movie. The average of this distribution of sparseness values was then compared with the single sparseness value obtained from the free-viewing data. To ensure matched stimulus conditions, we made all comparisons on a movie-by-movie basis. Note that sparseness values based on single spike trains are biased upward because of the discrete nature of spike generation.

The random sinusoidal grating sequence was similar to that used by D. L. Ringach, M. J. Hawken, and R. Shapley [Nature 387, 281 (1997)]. The orientation, spatial frequency, and phase of the grating were chosen randomly on each video frame (at 72 Hz). Gratings were shown at a Michelson contrast of 0.5. Before analysis, stimuli were binned into 10° orientation steps and 6 to 12 spatial frequency steps. Responses were analyzed by parametric reverse correlation on orientation and spatial frequency, averaging over phase. The mean responses across stimulus bins (at the peak response latency) were used to estimate the sparseness statistic.

Bell A. J., Sejnowski T. J., Vision Res. 37, 3327 (1997).

Several theoretical studies of sparse population coding have reported the kurtosis of the distribution of responses observed across a set of linear filters with respect to a particular stimulus ensemble (2, 20). This measure is not directly applicable to our data because the responses of area V1 neurons are asymmetric: cells typically exhibit low spontaneous rates, and appropriate stimuli elevate these rates. To estimate kurtosis, we converted each response distribution to a symmetric distribution by reflecting the data about the origin. The resulting symmetric distributions are unimodal with zero mean and decrease smoothly to zero. Our kurtosis statistic is well behaved and directly comparable to the results of theoretical studies.
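The reflection step can be sketched as follows. Two points are assumptions rather than details given in the note: kurtosis is computed as the fourth moment divided by the squared second moment, and 3 is subtracted so that a Gaussian scores zero. The function name and example values are illustrative.

```python
def reflected_kurtosis(responses):
    """Excess kurtosis of the responses after reflection about the origin."""
    # Mirror every non-negative response r to -r; the mean becomes zero.
    symmetric = list(responses) + [-r for r in responses]
    n = len(symmetric)
    second_moment = sum(v ** 2 for v in symmetric) / n
    fourth_moment = sum(v ** 4 for v in symmetric) / n
    if second_moment == 0:
        raise ValueError("all responses are zero, so kurtosis is undefined")
    # ASSUMPTION: report excess kurtosis (a Gaussian would give 0).
    return fourth_moment / second_moment ** 2 - 3.0


uniform_rates = reflected_kurtosis([1.0, 1.0, 1.0, 1.0])
sparse_rates = reflected_kurtosis([0.0] * 7 + [8.0])
```

Because the reflected distribution has zero mean by construction, the raw moments above are also the central moments, and a population in which most responses are near zero with a few large values yields a higher kurtosis.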

Let $P_1$ and $P_2$ be the PSTH response vectors for a pair of neurons. Then $\cos\theta = \dfrac{P_1 \cdot P_2}{\lVert P_1 \rVert \, \lVert P_2 \rVert}$, where $\lVert P_n \rVert$ is the norm of the appropriate vector. This measure is sensitive to changes across the basis dimensions of the movie time stream and is insensitive to differences in absolute rate.
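A short sketch of this similarity measure, assuming each PSTH is an equal-length list of firing rates per time bin. The function name and the example vectors are illustrative.

```python
import math


def cosine_similarity(p1, p2):
    """cos(theta) between two PSTH vectors of equal length."""
    if len(p1) != len(p2):
        raise ValueError("PSTH vectors must have the same number of bins")
    dot = sum(a * b for a, b in zip(p1, p2))
    norm1 = math.sqrt(sum(a * a for a in p1))
    norm2 = math.sqrt(sum(b * b for b in p2))
    if norm1 == 0 or norm2 == 0:
        raise ValueError("a zero vector has no defined direction")
    return dot / (norm1 * norm2)


# Same temporal profile at twice the rate: cos(theta) is 1.
scaled = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Responses in non-overlapping time bins: cos(theta) is 0.
disjoint = cosine_similarity([4.0, 0.0], [0.0, 9.0])
```

The first example shows the insensitivity to absolute rate noted above: scaling one PSTH by a constant leaves the angle unchanged.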

It is difficult to choose a scalar measure of response similarity appropriate for all situations; see [Di Lorenzo P., J. Neurophysiol. 62, 823 (1989)]. To validate our results, we performed two alternative versions of the population decorrelation analysis. For each neuron pair we also computed both the linear correlation coefficient and the neural discrimination index of Di Lorenzo. In both cases, nCRF stimulation leads to significant decorrelation (P ≤ 0.001). To ensure that the slightly different stimulus sizes do not influence our results, we also performed all similarity analyses on a data set restricted to neuron pairs with identical CRF sizes (and thus identical stimulation). Under these conditions, the decorrelating effect of the nCRF remains significant (P ≤ 0.001).

The compound grating stimulus consisted of a CRF conditioning grating and a probe grating. We set the conditioning grating's orientation and spatial frequency to the neuron's preferred values [as determined by reverse correlation on responses to a dynamic grating sequence (19) presented in the CRF]. The phase of the conditioning grating varied randomly with each video frame. Both gratings were presented at a Michelson contrast of 0.5 and their edges were blended into one another and into the background. We performed reverse correlation on the position of the probe grating within the nCRF annulus (collapsing over all other parameters). To measure baseline responses we presented interleaved trials containing only the conditioning grating.

G. A. Walker et al., J. Neurosci. 19, 10536 (1999).

Dan Y., Atick J. J., Reid R. C., J. Neurosci. 16, 3351 (1996).

M. S. Lewicki and B. A. Olshausen, in Advances in Neural Information Processing Systems 10, M. I. Jordan, M. J. Kearns, S. A. Solla, Eds. (MIT Press, Cambridge, MA, 1997), pp. 815–821.

Heeger D. J., Visual Neurosci. 9, 181 (1992);

10.1016/0896-6273(92)90215-Y

Sillito A. M., et al., Nature 378, 492 (1995);

Gilbert C. D., et al., Proc. Natl. Acad. Sci. U.S.A. 93, 615 (1996);

Carandini M., et al., J. Neurosci. 17, 8621 (1997);

Knierim J. J., Van Essen D. C., J. Neurophysiol. 67, 961 (1992).
