The Handbook of Speech Perception

Models of feedback processing

Computational processing of feedback


In functional terms, there are a number of models that could account for these data but also some classes of model that are clearly inadequate. A servomechanism in which behavior is controlled directly by feedback is too slow to modify the rapid movements of articulation (Lashley, 1951) and such control systems are known to have stability issues. An opposite theoretical approach places more reliance on memory‐based movement control. The idea of a motor program (Schmidt, 1980) is that movements are driven by a detailed motor representation and unfold without reference to sensation. Such programs cannot account for the adaptive timing and the flexibility of movement, and thus are viewed as too rigid to account for skilled movement data. More recent models suggest a more intricate role for sensory feedback. In such frameworks, auditory feedback is used to establish auditory target regions and to learn and maintain “forward models” that predict the consequences of behavior (e.g. Kawato, 1990). In part, these more computational models were anticipated by earlier physiological ideas about efference copy and corollary discharge.
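The stability concern raised above for pure feedback servomechanisms can be illustrated with a minimal sketch. It assumes a one-dimensional articulator position, a proportional controller, and a sensory delay measured in time steps; the function name `run_servo` and all parameter values are hypothetical and chosen only for illustration, not taken from any of the cited models.

```python
def run_servo(delay_steps, gain=0.5, n_steps=100, target=1.0):
    """Simulate a servo that corrects its position using delayed feedback.

    Returns the absolute error |target - position| at every time step.
    All values are arbitrary illustration units (an assumption of this sketch).
    """
    positions = [0.0]  # start away from the target
    for t in range(n_steps):
        # The controller only sees the position from `delay_steps` ago;
        # before that much time has passed it sees the starting position.
        sensed = positions[max(0, t - delay_steps)]
        # Proportional correction computed from the (possibly stale) reading.
        positions.append(positions[-1] + gain * (target - sensed))
    return [abs(target - p) for p in positions]


no_delay = run_servo(delay_steps=0)
with_delay = run_servo(delay_steps=5)

print(f"final error without delay: {no_delay[-1]:.3g}")
print(f"peak error over the last 20 steps with delay: {max(with_delay[-20:]):.3g}")
```

With these settings the undelayed loop settles on the target, whereas the same gain applied to stale feedback produces growing oscillations. Lowering the gain restores stability in this toy system, but only at the cost of slower corrections.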

The term efference copy is a direct translation of the German Efferenzkopie, introduced by von Holst and Mittelstaedt in 1950 to explain how we might distinguish changes in visual sensations due to our own movement from changes due to movement of the world. Crapse and Sommer (2008) consider corollary discharge (coined by Sperry in the same year, 1950) to be the more general term. Corollary discharges are viewed as copies of motor commands sent to any sensory structures, while efference copies were thought to be sent only to early or primary sensory structures.

Two current types of neurocomputational models of speech production differentiate how such corollary discharges and sensory feedback could influence speech. The Directions into Velocities of Articulators (DIVA) model and its extension, the Gradient Order DIVA (GODIVA) model, use the comparison of overt auditory feedback to auditory target maps as the mechanism to control speech errors (Guenther & Hickok, 2015). The auditory target maps can be understood as the predictions of the sensory state following a motor program. These predictions are also the goals represented in the speech‐sound map, where a speech sound is defined as a phonetic segment with its own motor program. This model requires two sensory‐to‐movement mappings to be learned in development. The speech‐sound map must be mapped to appropriate movements in what is considered a forward model. When errors are detected by mismatches between feedback and predicted sensory information, a correction must be generated. The sensorimotor mapping responsible for such corrective movements is considered an inverse model.
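The feedback comparison described for DIVA-style models can be sketched in a few lines. This is a toy illustration under strong assumptions, not the published DIVA implementation: the auditory outcome is a single number (for example, a formant value in arbitrary units), the vocal tract is a linear gain (`PLANT_GAIN`), and the names `feedforward_command`, `corrective_command`, and `speak` are hypothetical.

```python
PLANT_GAIN = 2.0  # hypothetical motor-to-auditory gain of the toy vocal tract


def produce(motor_command, perturbation=0.0):
    """Return the auditory feedback the talker hears, optionally shifted."""
    return PLANT_GAIN * motor_command + perturbation


def feedforward_command(auditory_target):
    """Learned mapping from a speech-sound target to a motor command."""
    return auditory_target / PLANT_GAIN


def corrective_command(auditory_error, feedback_gain=0.5):
    """Learned mapping from an auditory error to a corrective movement."""
    return feedback_gain * auditory_error / PLANT_GAIN


def speak(auditory_target, perturbation, n_steps=20):
    """Compare overt feedback to the auditory target and correct the command."""
    command = feedforward_command(auditory_target)
    heard = []
    for _ in range(n_steps):
        feedback = produce(command, perturbation)
        heard.append(feedback)
        # Mismatch between the target and what was actually heard.
        error = auditory_target - feedback
        command += corrective_command(error)
    return heard


heard = speak(auditory_target=700.0, perturbation=100.0)
print(f"first production: {heard[0]:.1f}, last production: {heard[-1]:.1f}")
```

Here a constant shift applied to the feedback is detected only after the sound has been produced and heard, and the correction accumulates over successive steps. The complete compensation reached by this loop is a property of the sketch, not a claim about how talkers behave.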

In contrast, the state feedback control (SFC) model of speech production, or its extension, the hierarchical state feedback control (HSFC) model, assumes an additional internal feedback loop (Hickok, 2012; Houde & Nagarajan, 2011; Houde & Chang, 2015). Similar to the DIVA models, the SFC models incorporate a form of corollary discharge. One critical difference is that the corollary discharge in SFC models is checked against an internal target map rather than overt auditory feedback; that is, a prediction of speech errors is generated, which provides a mechanism to prevent such errors before they occur. Overt auditory feedback is included in the model through its influence on how the speech‐error predictions are converted into corrections (Houde & Nagarajan, 2011).
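The internal loop described for SFC-style models can be sketched in the same spirit. The code below is an illustrative assumption rather than the published SFC or HSFC architecture: the state is a single number, the internal prediction is a simple sum, and overt feedback acts only by updating the internal state estimate. The names `internal_prediction`, `plan_command`, `simulate`, and `estimate_gain` are hypothetical.

```python
def internal_prediction(state_estimate, command):
    """Predict the auditory outcome of a planned command (corollary discharge)."""
    return state_estimate + command


def plan_command(target, state_estimate, initial_command, n_internal=20, gain=0.5):
    """Internal loop: refine the command against the target before any sound is made."""
    command = initial_command
    for _ in range(n_internal):
        predicted_error = target - internal_prediction(state_estimate, command)
        command += gain * predicted_error
    return command


def simulate(target, true_offset, n_steps=20, estimate_gain=0.5):
    """Produce repeated utterances; overt feedback only updates the state estimate."""
    state_estimate = 0.0  # the internal model starts out unaware of the offset
    outcomes = []
    for _ in range(n_steps):
        command = plan_command(target, state_estimate, initial_command=0.0)
        predicted = internal_prediction(state_estimate, command)
        actual = true_offset + command  # overt auditory feedback
        outcomes.append(actual)
        # Overt feedback corrects the estimate that later predictions rely on.
        state_estimate += estimate_gain * (actual - predicted)
    return outcomes


# A badly chosen plan is repaired internally, before anything is produced.
planned = plan_command(target=700.0, state_estimate=0.0, initial_command=-300.0)
print(f"predicted outcome of repaired plan: {internal_prediction(0.0, planned):.2f}")

outcomes = simulate(target=700.0, true_offset=100.0)
print(f"first outcome: {outcomes[0]:.2f}, last outcome: {outcomes[-1]:.2f}")
```

The first call shows the predictive role of the internal loop: an error in the plan is removed without any overt feedback. The second shows overt feedback acting indirectly, by revising the state estimate that subsequent predictions and corrections depend on.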

Both of these models incorporate major but slightly different roles for auditory feedback in speech production. Such models play an important role in advancing our understanding of articulation but also have inherent problems that will make it difficult to unravel the exact form of the speech‐control system. Strengths include systematic frameworks for summarizing a large body of findings in the field and the ability to make novel predictions that lead to specific test experiments. However, all models make assumptions about the structure and processes involved in behavior. Auditory target maps and internal feedback loops, for example, are hypothetical constructs that are far from being rigorously supported. In addition, there is the challenging issue of how many levels of description (e.g. Marr’s computational, representational, and implementation levels) are needed and how the relationship between these levels can best be studied (see Peebles & Cooper, 2015, and other papers in the same issue). One reductionist approach is to look for neural evidence that might correspond with the computational architectures or constrain the behavior of the models. Both groups of modelers have pursued this approach.
