The primate visual and auditory systems routinely perform complex pattern recognition tasks which elude the most sophisticated computer algorithms. My research is motivated by the view that it is essential to understand the information-processing principles operating in biological sensory systems in order to improve the abilities of artificial sensory systems. Central to this goal is understanding the representation of sensory stimuli which allows the brain to extract useful information from a complex and noisy environment. Since sensory systems evolved to represent the natural environment, a working hypothesis which guides my research is that sensory representations should be optimized to efficiently represent natural stimuli. Understanding the details of how natural stimuli are efficiently represented at multiple hierarchical layers of sensory processing is the main goal of my work.
Current Projects
In this study, we consider the functional role of the complex-cell pathway of the mammalian visual system by comparing the ability of two generative models of natural scenes to reliably represent texture regions and boundaries. The first generative model (ICA or sparse coding) directly synthesizes image patches, with the latent variables encoding image structure in a manner similar to simple cells in V1. The second generative model is a distribution coding model (DC) whose inferred latent variables generate the Gaussian distribution from which an image patch was most likely to have arisen, and the latent variables in this model have been shown to encode image structure in a manner similar to complex cells. We hypothesize that the complex cells may play an important role in coding textural properties of natural stimuli, and that the population of complex cells reliably represents texture regions as well as texture boundaries which are important for object recognition. We test this hypothesis in a computational study comparing the efficacy of the ICA and DC models at representing texture regions and boundaries.
In this study, we will consider the problem of how the mammalian visual system may perform Bayesian inference of local stimulus parameters by making use of information from the global context and context-dependent prior probabilities derived from the statistics of natural stimuli. This is important because, quite often in natural perception, the information available locally to a single receptive field is ambiguous due to noise or interference, making it difficult to estimate the local parameters of a source (for instance, the orientation of an edge) or to detect a source in the presence of noise. In these cases, it may be useful to integrate global information in the image to detect or disambiguate the local signal, and this integration should reflect the statistical regularities of the visual environment.
Over the past 15 years, many studies have shown that training hierarchical generative models of natural images can result in learned basis functions resembling the receptive fields of sensory neurons. However, relatively few studies have considered models with multiple hierarchical layers, and to the best of our knowledge no work has been done on how higher-level representations can be learned to encode correlations between largely non-overlapping regions of space. In this study, we will investigate extending DC and ICA models to additional hierarchical layers which will represent patterns between disparate regions of input space. This will enable us to learn hierarchical generative models not only for small 20x20 image patches, but in principle for arbitrarily large image patches while keeping the number of model parameters to a minimum.
Any hierarchical generative model of natural scenes with a sparse Bayesian prior p(y) on the latent variables y automatically defines a measure of the "informativeness" or "surprise" of a sensory stimulus. The less probable the value of the inferred latent variables for a given stimulus x, the more unlikely a priori and hence more surprising x is. Following Shannon, we may quantify the statistical salience of an image patch by S = -ln p(y). Applying this measure to generative models of natural images like the DC and ICA models and comparing with human eye movement data allows us to quantify whether humans tend to look at image regions which are more salient by this measure, and to compare the efficacy of the measure applied to ICA and DC models for predicting human viewing preferences.
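As a minimal sketch of the salience measure S = -ln p(y), the snippet below assumes a factorial Laplace prior on the latent variables. This is one common sparse prior, chosen here purely for illustration; the function name `laplace_surprise`, the `scale` parameter, and the example latent vectors are hypothetical and not taken from the DC or ICA models themselves.

```python
import math

def laplace_surprise(y, scale=1.0):
    """Return S = -ln p(y) under an assumed factorial Laplace prior.

    p(y) = prod_i (1 / (2 * scale)) * exp(-|y_i| / scale), so
    S    = sum_i (|y_i| / scale + ln(2 * scale)).
    """
    return sum(abs(v) / scale + math.log(2.0 * scale) for v in y)

# Hypothetical inferred latent variables for two image patches.
typical_patch = [0.0, 0.1, -0.2, 0.0]   # mostly inactive units
unusual_patch = [0.0, 3.5, -0.2, 4.0]   # several strongly active units

# The patch with improbable latent values is the more "surprising" one.
print(laplace_surprise(typical_patch) < laplace_surprise(unusual_patch))
```

Under a sparse prior of this kind, patches that require large latent activations receive higher S, which is the quantity that would be compared against fixation locations.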
Past Work
Perhaps the most common procedure in neurophysiology and psychophysics is the measurement of neural tuning curves and psychometric functions. If the functional form of the model is known a priori, then it is possible to employ active model-based data collection methods in order to estimate the parameters of the function with fewer sensory stimuli. In this study we demonstrate that these methods can partially overcome the 'curse of dimensionality' and make it practical to measure tuning curves and psychometric functions in high-dimensional spaces. We consider an efficient particle filter implementation of optimal design, and introduce a fast approximate 'look-up table' approach to speed up the implementation. (preprint)
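The idea of active model-based data collection can be illustrated with a small sketch. Note that it does not use the particle filter or look-up table methods of the study: it swaps in a simple one-dimensional grid posterior, and it assumes a logistic psychometric function with a known slope. All names (`p_yes`, `choose_stimulus`, the grid of thresholds) are hypothetical. The next stimulus is the candidate that minimizes the expected posterior entropy over the unknown threshold.

```python
import math

def p_yes(x, threshold, slope=2.0):
    """Assumed logistic psychometric function: P(response = 1 | stimulus x)."""
    return 1.0 / (1.0 + math.exp(-slope * (x - threshold)))

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def update(prior, thresholds, x, response):
    """Bayes update of the grid posterior after a response (1 or 0) to stimulus x."""
    like = [p_yes(x, t) if response else 1.0 - p_yes(x, t) for t in thresholds]
    post = [l * p for l, p in zip(like, prior)]
    z = sum(post)
    return [q / z for q in post]

def expected_entropy(prior, thresholds, x):
    """Posterior entropy averaged over the two possible responses to x."""
    p1 = sum(p_yes(x, t) * p for t, p in zip(thresholds, prior))
    h1 = entropy(update(prior, thresholds, x, 1))
    h0 = entropy(update(prior, thresholds, x, 0))
    return p1 * h1 + (1.0 - p1) * h0

def choose_stimulus(prior, thresholds, candidates):
    """Pick the candidate stimulus with the lowest expected posterior entropy."""
    return min(candidates, key=lambda x: expected_entropy(prior, thresholds, x))

# Hypothetical setup: uniform prior over thresholds on a grid from -5 to 5.
thresholds = [i * 0.5 for i in range(-10, 11)]
prior = [1.0 / len(thresholds)] * len(thresholds)
candidates = thresholds

x_next = choose_stimulus(prior, thresholds, candidates)
print(x_next, expected_entropy(prior, thresholds, x_next), entropy(prior))
```

A grid of this kind scales poorly with the number of parameters, which is the regime where sampling-based representations of the posterior become attractive.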
Sensory neurophysiology experiments can potentially benefit from adaptively choosing stimuli on-line in order to maximize the information gained about the underlying system. It has been shown that this active data collection strategy can greatly reduce the number of trials needed to estimate the parameters of a sensory processing model. Here we demonstrate that for nonlinear neural network models active data collection is not simply useful for speeding up the convergence of estimates, but may in fact be necessary for accurate recovery of parameters. We also present one practical algorithm for combining the dual goals of model estimation and model comparison in on-line experiments and apply it to neural network models which extend and generalize popular linear receptive field models. (preprint)
What is the relationship between the values of the parameters (weights, thresholds, etc.) in a neural network and its input-output relationship? We derive a general mathematical condition for there to be a continuum of functionally equivalent neural network models. We find that in general this is only possible when the hidden unit gain functions are given by power, exponential or logarithmic forms. However, since the standard tanh and sigmoid gain functions commonly used in neural modeling may be well approximated by these forms over limited ranges of inputs, one may practically observe a continuum of parameters in these networks giving rise to identical functionality, making unique identification of the network parameters from finite noisy data impossible and suggesting a need for active learning methods when identifying nonlinear models. (pdf)
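A simple instance of such a continuum can be checked numerically. The sketch below is an illustration, not the derivation in the paper: it assumes a single hidden unit with power-law gain g(u) = u^alpha for u > 0 and output weight v. Scaling the input weights by any c > 0 while scaling v by c^(-alpha) leaves the output unchanged, since v c^(-alpha) (c w·x)^alpha = v (w·x)^alpha. The function names and numerical values are hypothetical.

```python
import math

def power_unit(x, w, v, alpha):
    """Output of one hidden unit with gain g(u) = u**alpha (u > 0) and output weight v."""
    u = sum(wi * xi for wi, xi in zip(w, x))
    return v * u ** alpha

def rescaled(w, v, alpha, c):
    """Move along the continuum: input weights times c, output weight times c**(-alpha)."""
    return [c * wi for wi in w], v * c ** (-alpha)

x = [0.3, 1.2, 0.7]      # illustrative positive stimulus
w = [0.5, 0.1, 0.9]      # illustrative positive weights, so that u > 0
v, alpha = 2.0, 1.5

reference = power_unit(x, w, v, alpha)
outputs = []
for c in (0.25, 1.0, 3.0, 10.0):
    w_c, v_c = rescaled(w, v, alpha, c)
    outputs.append(power_unit(x, w_c, v_c, alpha))

# Every rescaled parameter set reproduces the reference output up to rounding.
print(all(math.isclose(o, reference) for o in outputs))
```

Because every value of c gives the same input-output function, no amount of data from this unit can distinguish between the parameter sets.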
The exact relationship between a neural circuit's architecture and stimulus-response properties remains poorly understood. Here we obtain general theoretical results for quadratic analysis on multilayer feedforward neural networks and classify all possible quadratic behaviors in relation to the optimal stimulus and invariant stimuli. The optimal stimulus, or the stimulus that best drives a neuron, describes a neuron's selectivity, whereas stimulus invariance describes the generalization of a neuron's response to a continuum of equivalently effective stimuli. Despite the intuitive notion of the optimal stimulus as a peak in the stimulus-response relationship, diverse quadratic behaviors are possible and can occur in the same network. We demonstrate that invariant stimuli are ubiquitous in hierarchical networks, and identify two classes of invariant stimulus transformations which leave neural responses unchanged. (preprint)
Neurophysiologists have long been interested in finding the 'optimal stimulus' for sensory neurons, which is defined as the stimulus which produces the maximum firing rate response. However, it is not clear what constraints the architecture of the underlying neural system, which actually generates the responses to sensory stimuli, may place on the location of the 'optimal stimulus'. In this study, we analyze feed-forward and recurrent neural network models and discover that, for convergent networks whose connections between processing layers form a non-degenerate weight matrix, it is impossible for a strict firing rate maximum to exist in the interior of a compact stimulus space isomorphic to the peripheral representation. This result may explain why many sensory neurons exhibit their strongest responses to stimuli of maximum contrast. (pdf)
Most studies of species-specific vocalization coding have made use of pre-recorded token vocalization stimuli. The inability to systematically manipulate token stimuli makes it difficult to determine which of the many acoustical features present in the vocalization are driving neural responses. Other studies have made use of synthetic vocalizations, but these are typically based on a single exemplar and thus fail to capture the full range of acoustic variability. We utilized a database of thousands of marmoset vocalizations to develop synthetic "Virtual Vocalization" stimuli. These vocalizations can be systematically varied both within and outside of the range of natural acoustical variation along several parameter dimensions and provide a useful tool for the study of vocalization coding in the auditory cortex. (pdf)
It is known that the inhibitory projection from the MNTB to the LSO nucleus in the auditory brainstem exhibits an age-dependent form of long-lasting depression when activated at a low rate. However, since this synapse releases both glycine and GABA during maturation, the mechanism of this age-dependent synaptic depression is unclear. Synaptic depression was blocked by GABA(B) receptor antagonists, suggesting a role for GABA-ergic transmission in inhibitory synaptic depression. Using bath-applied BDNF and TRK receptor agonists, it was also found that neurotrophins may play a role in this age-dependent inhibitory synaptic depression. My role in this work was the creation of the SLICE software package used to collect the data, and helping to develop methods to analyze the synaptic data. (pdf)
