
Conclusions


Research on multisensory speech has flourished since 2005, spearheading a revolution in our understanding of the perceptual brain. The brain is now thought to be largely organized around multisensory input, with most major sensory areas showing crossmodal modulation. Behaviorally, research has shown that even our seemingly unimodal experiences are continuously influenced by crossmodal input, and that the senses show a surprising degree of parity and flexibility across a range of perceptual tasks. As we have argued, research on multisensory speech has provided seminal neurophysiological, behavioral, and phenomenological demonstrations of these principles.

Arguably, as this research has grown, it has continued to support the claims made in the first version of this chapter. There is now more evidence that multisensory speech perception is ubiquitous and (largely) automatic. This ubiquity is demonstrated by new research showing that tactile and kinesthetic speech information can be used and readily integrated with heard speech. Next, the majority of the new research continues to suggest that the sensory streams are integrated at the earliest stages of the speech function. Much of this evidence comes from neurophysiological work showing that auditory brainstem and even cochlear functioning is modulated by visual speech information. Finally, evidence continues to accumulate for the salience of a supramodal form of information. This evidence now includes findings that, like auditory speech, visual speech can induce an alignment response and can modulate motor‐cortex activity for that purpose. Additional support comes from findings that speech and talker experience gained through one modality can be shared with another, suggesting a mechanism sensitive to the supramodal articulatory dimensions of the stimulus: the supramodal learning hypothesis.

There is also recent evidence that can be interpreted as unsupportive of a supramodal approach. Because the supramodal approach claims that “integration” is a consequence of the informational form shared across modalities, the evidence should show that the function is early, impenetrable, and complete. As stated, however, some findings have been interpreted as showing that integration can be delayed until after some lexical analysis has been conducted on the unimodal input (e.g. Ostrand et al., 2016). Other evidence has been interpreted as showing that integration is not impenetrable but is susceptible to outside influences, including lexical status and attention (e.g. Brancazio, 2004; Alsius et al., 2005). Finally, there is evidence that has been interpreted to demonstrate that integration is not complete. For example, when subjects are asked to shadow a McGurk‐effect stimulus (e.g. responding ada when presented with audio /aba/ and visual /aga/), their shadowed ada response shows articulatory remnants of the individual audio (/aba/) and visual (/aga/) components (Gentilucci & Cattaneo, 2005).

In principle, all of these findings are inconsistent with a supramodal account. While we have offered alternative interpretations of these findings, both in the current chapter and elsewhere (e.g. Rosenblum, 2019), it is clear that more research is needed to test the viability of the supramodal account.
