Our new study shows how the brain's predictive machinery syncs hearing with vision

A press release in French is available here.

We all learn that sound and light travel at very different speeds. If the brain did not account for this difference, it would be much harder for us to identify the source of the sounds we hear and how they are related to what we see. 

The brain helps us make better sense of our world by playing tricks on us, warping our sense of timing so that a sound and an image may be perceived as synchronous even though they reach the brain and are processed by neural circuits at different speeds. If this did not happen, we would be constantly annoyed by perceiving the world around us with “bad lip syncing…”


One of the brain’s tricks is temporal recalibration. Our new study finds that this form of temporal recalibration of the senses is based on brain signals that constantly adapt to context in order to sample, order and associate competing sensory inputs together.

Dr. Therese Lennert (CIHR Post-Doctoral Fellow in the lab) recruited volunteers to view short flashes of light paired with sounds, with a variety of delays between them. She asked the participants to report whether they thought the audio and visual stimuli happened at the same time, while a magnetoencephalography (MEG) scanner recorded and imaged their brain waves with millisecond precision. The audiovisual stimulus pairs changed on every trial, with sounds and visual objects presented closer together or farther apart in time, in random order.


Audiovisual stimulus pairs

SOA = Stimulus Onset Asynchrony (+/- 350 ms)

Time course of an example stimulus sequence used to test audiovisual synchrony judgments. The audiovisual stimulus pair was presented in one of three possible temporal configurations: visual stimulus (red) leading auditory (blue; t:VA, V < A), simultaneous presentation, and auditory leading visual (t:AV, A < V).

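The three temporal configurations above amount to a simple labelling rule on the signed SOA. A minimal sketch (the sign convention and function name are ours, not from the study):

```python
def label_trial(soa_ms: float) -> str:
    """Label an audiovisual pair by its signed stimulus onset asynchrony.

    Convention (an assumption for this sketch): soa_ms < 0 means the
    auditory stimulus led the visual one (t:AV), soa_ms > 0 means the
    visual stimulus led (t:VA), and 0 means simultaneous presentation.
    """
    if soa_ms < 0:
        return "t:AV"   # auditory leads visual (A < V)
    if soa_ms > 0:
        return "t:VA"   # visual leads auditory (V < A)
    return "simultaneous"

# SOAs spanned roughly +/- 350 ms around simultaneity:
print([label_trial(s) for s in (-350, 0, 120)])
```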

We found that the volunteers’ perception of simultaneity between the audio and visual stimuli in a pair was strongly affected by the perceived simultaneity of the stimulus pair presented immediately before. For example, if you are presented with a sound followed milliseconds later by an image and perceive the pair as asynchronous, you are much more likely to report the next audiovisual pair as synchronous, even when it is not. This form of constant temporal recalibration is one of the mechanisms the brain uses to avoid a distorted or disconnected perception of reality. It helps us establish causal relations between the images and sounds we perceive, despite their different physical velocities and neural processing speeds.

Behavioral performances: Rate of responses to audiovisual pairs perceived as synchronous, as a function of stimulus onset asynchrony. The psychometric curves show the percentages of synchronous responses as a function of SOA for trials with visual lead on the previous trial (t-1:V, red) and those with auditory lead (t-1:A, blue). Dots represent average behavioral reports across participants (n = 18). Inset: mean temporal recalibration (TR, about 35 ms) estimated across participants and evaluated against zero.

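The recalibration effect in this figure can be illustrated numerically: simulate Gaussian-shaped synchrony curves whose centers (the points of subjective simultaneity, PSS) differ by 35 ms depending on the previous trial's lead, then recover the shift from the curves. Everything here except the ~35 ms value reported above is made up for illustration, and the centroid estimator is a crude stand-in for a proper psychometric fit:

```python
import math

def synchrony_rate(soa, pss, sigma=120.0):
    """Gaussian-shaped probability of judging a pair 'synchronous' (toy model)."""
    return math.exp(-0.5 * ((soa - pss) / sigma) ** 2)

soas = list(range(-350, 351, 50))  # SOA grid in ms, visual lead positive

# Hypothetical PSS values whose difference equals the ~35 ms recalibration:
rates_prev_V = [synchrony_rate(s, pss=+20.0) for s in soas]  # t-1:V condition
rates_prev_A = [synchrony_rate(s, pss=-15.0) for s in soas]  # t-1:A condition

def estimate_pss(soas, rates):
    """Response-weighted centroid: a crude PSS estimator for a symmetric curve."""
    return sum(s * r for s, r in zip(soas, rates)) / sum(rates)

# Temporal recalibration = shift of the PSS between the two contexts:
tr = estimate_pss(soas, rates_prev_V) - estimate_pss(soas, rates_prev_A)
print(round(tr, 1))  # recovers roughly the 35 ms shift built into the curves
```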

The MEG signals revealed that this brain feat is enabled by a unique interaction between fast and slow brain waves in auditory and visual brain regions. Slower brain rhythms pace the temporal fluctuations of excitability in brain circuits: the higher the excitability, the more easily an external input is registered and processed by the receiving neural networks.

Based on this, we propose the dynamic-insertion model as a mechanistic principle of multimodal sensory perception. The dynamic-insertion model explains temporal recalibration with faster oscillations riding on top of slower fluctuations and creating discrete and ordered time slots, or windows of opportunity, to register the temporal order of sensory inputs.

For example, when an auditory signal lands in the first available time slot in the auditory cortex and a visual input lands in the corresponding slot in the visual cortex, the pair is perceived as simultaneous. For this to happen, the brain needs to position the visual time slots a bit later than the auditory ones, to account for the slower physiological transduction of visual signals.
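The slot-assignment idea can be sketched as follows, with a ~25 ms slot period (a fast, roughly 40 Hz rhythm) and a 15 ms auditory-to-visual slot offset; both numbers are illustrative, not fitted values from the study:

```python
import math

def first_slot(arrival_ms, slot_offset_ms, slot_period_ms=25.0):
    """Index of the first registration 'time slot' at or after an arrival time.

    Slots open at slot_offset_ms + k * slot_period_ms (k = 0, 1, 2, ...).
    The 25 ms period loosely corresponds to a fast ~40 Hz oscillation;
    all numbers here are illustrative.
    """
    return math.ceil((arrival_ms - slot_offset_ms) / slot_period_ms)

def perceived_simultaneous(audio_ms, visual_ms, visual_lag_ms=15.0):
    """Judge a pair simultaneous when both inputs land in same-index slots.

    Visual slots are positioned visual_lag_ms later than auditory ones,
    compensating for slower visual transduction (hypothetical value).
    """
    return first_slot(audio_ms, 0.0) == first_slot(visual_ms, visual_lag_ms)

# A visual input arriving slightly after the sound can still share its slot:
print(perceived_simultaneous(100.0, 112.0))  # -> True
print(perceived_simultaneous(100.0, 150.0))  # -> False
```

Shifting the visual slot offset relative to the auditory one is what, in this picture, would implement the context-dependent recalibration described below.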

We found that this relative delay between neural auditory and visual time slots is a dynamic process that constantly adapts to each participant’s recent exposure to audiovisual perception. 

Individual ratios of asynchronous-to-synchronous responses for all trial configurations—ratios of asynchronous to synchronous responses across participants for all t/t-1 trial combinations, i.e. auditory or visual lead on trial t (t:A or t:V) paired with visual or auditory lead on the previous trial (t-1:V and t-1:A).


Our data therefore confirmed the new dynamic-insertion model by showing that these subtle, tens-of-milliseconds delays of fast brain oscillations can be measured in every individual and explain each individual's simultaneity judgments.

Empirical evidence of phase-amplitude coupling in neurophysiological cortical traces (here, in left and right auditory cortices, LAC & RAC): time-frequency decompositions (top) of cortical activity and corresponding event-related averages (bottom) time-locked to the troughs of slow-frequency fP (10 Hz) cycles, in a representative participant.

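The coupling shown in this figure, fast bursts riding on a slow 10 Hz rhythm, can be illustrated with a toy signal in which the fast amplitude peaks at the slow-wave troughs. The 40 Hz burst frequency is our assumption; real analyses estimate this coupling from measured MEG traces rather than constructing it:

```python
import math

fs = 1000                        # samples per second
f_slow, f_fast = 10.0, 40.0      # fP (from the figure) and a hypothetical burst rate
t = [i / fs for i in range(fs)]  # one second of signal

slow = [math.cos(2 * math.pi * f_slow * ti) for ti in t]
# Envelope is largest where the slow wave is at its trough (cos = -1):
envelope = [0.5 * (1 - s) for s in slow]
fast = [e * math.cos(2 * math.pi * f_fast * ti) for e, ti in zip(envelope, t)]
signal = [s + f for s, f in zip(slow, fast)]  # combined phase-amplitude-coupled trace

# The fast component vanishes at slow-wave peaks and is strongest at troughs:
print(envelope[0], envelope[50])  # slow-wave peak (t=0) vs. first trough (t=50 ms)
```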

Individual temporal recalibration behavior is quantitatively related to the period of fast oscillatory bursts (fA) coupled to slower fluctuations (fP) in auditory cortex. Individual temporal recalibration estimates (n = 16) are plotted against the observed fA period durations measured over the pre-stimulus period in the auditory cortex for trials with visual (red) and auditory (blue) lead on the previous trial. The dots represent single-subject values. In each panel, the regression line of best fit is depicted in black.


Distributions of individual preferred phase angles (dots) and the group circular average of phase-amplitude coupling (PAC) preferred phase φ of fA oscillations along the fP cycle. Data from trials with visual lead on the previous trial (t-1:V condition) are shown in red and those with auditory lead (t-1:A conditions) are shown in blue, in RAC (top) and RVC (bottom). Rayleigh tests assessed uniformity of phase angle distributions across subjects around the unit circle. Note how similar the phase shifts are in auditory (RAC) and visual (RVC) cortices when changing from a context of auditory-lead (t-1:A) vs. visual-lead (t-1:V) trials.


These phase shifts measured between conditions in the auditory (RAC) and visual (RVC) cortex, once converted to milliseconds from phase angles along the fP cycles, were statistically identical on average to the empirical values of temporal recalibration behavior (TR). Gray dots represent single subject data (n = 16).

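The phase-to-milliseconds conversion used here is a direct proportion: one full cycle of the 10 Hz fP rhythm lasts 100 ms, so a phase difference maps linearly onto time. A one-line sketch:

```python
import math

def phase_shift_to_ms(delta_phase_rad: float, f_slow_hz: float = 10.0) -> float:
    """Convert a phase shift along the slow fP cycle into milliseconds.

    One full cycle (2*pi radians) of an f_slow_hz rhythm lasts 1000/f_slow_hz ms,
    so dt = dphi / (2*pi) * (1000 / f_slow_hz).
    """
    return delta_phase_rad / (2 * math.pi) * (1000.0 / f_slow_hz)

# A ~35 ms recalibration corresponds to about 126 degrees along a 10 Hz cycle:
print(round(phase_shift_to_ms(math.radians(126)), 1))  # -> 35.0
```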

Empirical & Dynamic-insertion model of point of subjective simultaneity (PSS): The PSS values derived from the Dynamic-insertion model predictions (black bars) were statistically identical to those observed empirically in participants (gray bars), in both conditions.


In autism and speech disorders, the processing of the senses, especially hearing, is altered. In schizophrenia as well, patients can be affected by perceived distortions of sensory inputs. The neurophysiological mechanisms of temporal recalibration described in this study may be altered in these disorders, inducing a sense of disconnection from others and the environment. The discovery of the dynamic-insertion mechanism may point to new research goals to better understand and alleviate these deficits.

“Overall, this study emphasizes that our brains constantly absorb and adapt in an active anticipatory fashion to the bombardment of sensory information from diverse sources,” says Sylvain Baillet, the study’s senior author. “To make sense of our complex environments, of our interactions with others, brain circuits actively make adjustments of subtle physiological mechanisms to better anticipate and predict the nature and timing of external stimulations. These processes may help us build a resilient and adaptive mental map of their representation.”

This study, published in open access by the journal Communications Biology on May 11, 2021, was funded by a Canadian Institutes of Health Research (CIHR) Post-doctoral Fellowship to first author Dr. Therese Lennert, and by grants to Dr. Sylvain Baillet from the National Institutes of Health (USA), the Natural Sciences and Engineering Research Council of Canada, the Canada Research Chair of Neural Dynamics of Brain Systems from CIHR, the Brain Canada Foundation with support from Health Canada, and the Innovative Ideas program from the Canada First Research Excellence Fund, awarded to McGill University for the Healthy Brains for Healthy Lives initiative.

We are grateful to Shawn Hayward (Communications Officer at The Neuro) for his contributions to this note.
