The Handbook of Speech Perception, by various authors

A constraint on normative descriptions of speech perception


The application of powerful statistical techniques to problems in cognitive psychology has engendered a variety of normative, incidence‐based accounts of perception. Since the 1980s, a technology of parallel computation based loosely on an idealization of the neuron has driven a proliferation of devices that perform intelligent acts. The exact modeling of neurophysiology is rare in this enterprise, though probabilistic models attired as neural nets enjoy a hopeful if unearned appearance of naturalness that older, algorithmic explanations of cognitive processes unquestionably lack. Taken as a theory of human cognitive function, deep learning implementations are more truthfully said to characterize the human actor as an office full of clerks at an insurance company, endlessly tallying the incidence of different states in one domain (perhaps age and zip code, or the bitmap of the momentary auditory effect of a noise burst in the spectrum) and associating them (perhaps in a nonlinear projection) with those in another domain (perhaps the risk of major surgery, or the place of articulation of a consonant).

In the perception of speech and language, the ability of perceivers to differentiate levels of linguistic structure has been attributed to a sensitivity to inhomogeneities in distributions of specific instances of sounds, words, and phrases. Although a dispute has taken shape about the exact dimensions of the domain within which sensitivity to distributions can be useful (e.g. Peña et al., 2002; contra Seidenberg, MacDonald, & Saffran, 2002), there is confident agreement that a distributional analysis of a stream of speech is performed in order to derive a linguistic phonetic segmental sequence. Indeed, this is claimed as one key component of language acquisition in early childhood (Saffran, Aslin, & Newport, 1996). The presumption of this assertion obliges a listener to establish and maintain in memory a distribution of auditory tokens projectable into phonetic types (Holt & Lotto, 2006). This is surely false. The rapid decay of an auditory trace of speech leaves it uniquely unfit for functions requiring memory lasting longer than 100 ms, and for this reason it is simply implausible that stable perceptual categories rest on durable representations of auditory exemplars of speech samples. Moreover, the notion of perceptual organization presented in this chapter argues that a speech stream is not usefully represented as a series of individual cues, whether for perceptual organization or for analysis. In fact, in order to determine that a particular acoustic moment is a cue, a perceptual function already sensitive to coordinate variation must apply. Whether or not a person other than a researcher compiling entries in the Dictionary of American Regional English can become sensitive to distributions of linguistic properties as such, it is exceedingly unlikely that the perceptual resolution of linguistic properties in utterances is much influenced by representations of the statistical properties of speech sounds. Indeed, the neural clerks would be free to tally what they will, but perception must act first to provide the instances.

