CHAPTER TWO
Models of Word Reading: What Have We Learned?
Mark S. Seidenberg, Molly Farry‐Thorn, and Jason D. Zevin
In reading, much depends on word recognition. Words are the elements out of which expressions are composed. They act as hubs that index the many types of information used in comprehending and producing language, whether written, spoken, or signed (Seidenberg, 2017). Words encode information about their meanings and senses, their internal structure (principally syllables, morphemes, phonemes), and their grammatical functions (e.g., noun, verb). They also carry information about the linguistic contexts in which they occur: for example, the verb put refers to a particular action that occurs with an agent, object, and location, whereas carry requires only an agent and object (MacDonald et al., 1994). Knowledge of a word also includes information about language use: how often words occur and co‐occur with other words, given what is in the world and what we choose to communicate about (Clark, 2015). This statistical information is encoded as people acquire and use language (Seidenberg & MacDonald, 2018). Gaining the ability to read and understand words quickly and accurately is the great leap into literacy, but one that is challenging for many children.
Visual word recognition has been the focus of an enormous amount of research because of its complexity and importance, and because most of what is involved would otherwise be hidden from awareness (for reviews, see Rastle, 2016; Cohen‐Shikora & Balota, 2016). The goal of this research is to develop theories that explain its many aspects: the knowledge and processes that underlie word recognition; the linguistic, cognitive, and perceptual capacities recruited for the purpose; how the skill develops, the bases of individual differences, and how the brain makes it all happen, among other topics. Theories are often expressed as “models” that provide detailed accounts of important components of the word recognition system. Although the use of such models dates from the nineteenth century, progress was greatly accelerated by two developments from the 1970s–1980s. The first was Marshall and Newcombe’s (1973) formulation of what came to be known as the “dual‐route” model of reading (Coltheart, 1978). The model was an account of impairments in reading aloud observed in patients following brain injury; Coltheart and colleagues later applied it to unimpaired reading and learning to read. Much of the subsequent research in this area can be seen as following from this pioneering work. The second was the creation of a “connectionist” computational model of reading, again focused on reading aloud, by Seidenberg and McClelland (1989; hereafter SM89). This work was important because it challenged the core assumptions underlying the dual‐route approach and introduced a new theoretical framework for visual word recognition and other types of lexical processing, based on the PDP framework developed by Rumelhart et al. (1986). Coltheart and colleagues subsequently developed several computational models of the dual‐route theory, collectively known as the dual‐route cascade (DRC) model (Coltheart et al., 1993; Coltheart et al., 2001).
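The logic of the dual‐route architecture can be stated procedurally: look the word up in a stored lexicon of pronunciations; if it is not there, assemble a pronunciation from grapheme‐phoneme correspondence (GPC) rules. The following minimal sketch illustrates that logic only. It is not the DRC model; the toy lexicon, the rule set, and the phoneme code (a capital letter marks a long vowel) are our own illustrative assumptions.

```python
# A minimal, illustrative sketch of the dual-route idea (not the DRC model
# itself). One route looks up whole words in a lexicon; the other assembles
# a pronunciation from grapheme-phoneme correspondence (GPC) rules.
# Lexicon, rules, and phoneme code (capital = long vowel) are invented.

LEXICON = {
    "have": "hav",   # exception: the rules below would produce "hAv"
    "pint": "pInt",  # exception: the rules below would produce "pint"
    "mint": "mint",  # regular: both routes agree
}

VOWELS = "aeiou"

# Toy GPC table. The identity mapping keeps the sketch transparent;
# a real table maps graphemes onto distinct phoneme symbols.
GPC_RULES = {letter: letter for letter in "abdehimnoprstuvAEIOU"}

def mark_magic_e(spelling: str) -> str:
    """Toy 'silent e' rule: vowel-consonant-final-'e' lengthens the vowel."""
    if (len(spelling) >= 3 and spelling[-1] == "e"
            and spelling[-2] not in VOWELS and spelling[-3] in VOWELS):
        return spelling[:-3] + spelling[-3].upper() + spelling[-2]
    return spelling

def nonlexical_route(spelling: str) -> str:
    """Assemble phonology by applying GPC rules letter by letter."""
    spelling = mark_magic_e(spelling)
    return "".join(GPC_RULES.get(ch, "") for ch in spelling)

def read_aloud(spelling: str) -> str:
    """Lexical lookup for known words; assembled phonology for the rest."""
    if spelling in LEXICON:            # lexical route: stored pronunciation
        return LEXICON[spelling]
    return nonlexical_route(spelling)  # nonlexical route: rules

for item in ("mint", "have", "pint", "mave"):  # "mave" is a nonword
    print(item, "->", read_aloud(item))
```

Run on these items, the sketch returns the stored pronunciations for MINT, HAVE, and PINT and a rule‐assembled "mAv" for the nonword MAVE, reproducing the division of labor the theory posits: a lexicon for exceptions, rules for everything else.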
An enormous amount has been learned since then. Visual word recognition is one of the great success stories in modern cognitive science and neuroscience. For much of this period, the existence of two competing theoretical approaches – dual‐route and connectionist – accelerated research progress. These theories provided frameworks for investigating numerous aspects of reading and greatly expanded the scope of research in English and other languages. The theories also stimulated the development of computational models of specific types of information (e.g., orthography, semantics) and related phenomena (e.g., morphology: Seidenberg & Gonnerman, 2000; Seidenberg & Plaut, 2014). Visual word recognition also became a domain in which to explore contrasting approaches to computational modeling of cognitive phenomena (Coltheart, 2005; Seidenberg & Plaut, 2006), and methods for studying brain structure and function (e.g., Cox et al., 2015; Woollams et al., this volume). Given the sustained interest in the topic over many years, visual word recognition represents an important case study illustrating what modern cognitive science and neuroscience have achieved.
The purpose of this chapter is to provide a critical perspective on this long endeavor, focusing on the role of computational modeling. Computational models of cognition serve two essential, interacting functions. One is methodological. Modeling requires theoretical claims to be specified at a level that allows them to be implemented as working simulations. A theory’s validity can then be assessed by determining if a model incorporating its main assumptions can reproduce the phenomena the theory is meant to explain. This method has been widely embraced as an advance over the informal models of the “box‐and‐arrow” era in which the dual‐route approach originated (Seidenberg, 1988).
The second function is theoretical. Models are implemented within theoretical frameworks such as production systems (Anderson, 1983), connectionist networks (Thomas & McClelland, 2008), and Bayesian approaches (Griffiths et al., 2010) that introduce novel ways to conceptualize behavior. Applying such frameworks to phenomena such as reading can yield theories that are genuine departures from previous thinking. Comparing a model’s behavior to people’s then leads to accepting, adjusting, or abandoning the theoretical account, and generates new questions to investigate. This feedback loop between model and theory, each grounded by empirical evidence, is a powerful approach to investigating complex phenomena (Figure 2.1).
Figure 2.1 Theory development and evaluation using computational models. Theoretical frameworks are used to develop theories of particular phenomena. Models that implement core parts of the theory are intended to simulate target data. Model performance feeds back on theory development and generates new hypotheses and empirical tests.
With the benefit of 30‐some years of hindsight we can ask: Did computational models of reading yield the expected benefits? Did they indeed provide a basis for assessing competing theories? Did they yield new theoretical insights? In short, given the promise of the approach and several decades of modeling research, what have we learned?
Like many others, we think that computational modeling proved to be an invaluable tool in both methodological and theoretical respects. As a method for testing theories, modeling revealed apparently intractable limitations of the dual‐route approach: researchers were unable to implement models that reproduced the basic behavioral phenomena concerning the pronunciation of regular words, irregular words, and nonwords that the dual‐route theory was developed to explain.
Models based on the connectionist framework reproduced these effects, as well as additional phenomena that the dual‐route theory did not predict and could not simulate correctly. The dual‐route approach is limited by its core assumption that pronunciations are either rule‐governed or exceptions. This dichotomy overlooks the fact that spelling‐sound correspondences exhibit varying degrees of consistency (Table 2.1). Regular (rule‐governed) words and exceptions occupy different points on this consistency continuum. Importantly, this account also predicts that words and nonwords can exhibit intermediate degrees of consistency. Consistency effects have been observed in numerous studies dating from Glushko (1979). Connectionist models could reproduce regularity, consistency, and other effects because they encode spelling‐sound correspondences as statistical dependencies rather than as rules and exceptions. The connectionist models also advanced theorizing by showing how concepts and computational mechanisms from the PDP framework could provide new insights about complex behavior.
Table 2.1 Regularity versus Consistency: What’s the Difference?

Categories of Words in Dual‐Route Theory
Regular/Rule‐governed: MUST CHAIR DIME BOAT
Irregular/Exception: HAVE DONE SAID PINT
Note: Exceptions = words whose pronunciations are not correctly generated by rules.

Glushko Inconsistent Words
Regular but Inconsistent: SAVE (have) BONE (done) PAID (said) MINT (pint)
Note: These words are rule‐governed according to the dual‐route model, but each has one or more irregular neighbors (shown in parentheses).

Connectionist/Statistical Learning Approach
Degrees of Spelling‐Sound Consistency (low ➔ high):
Strange words ➔ Exceptions ➔ Regular Inconsistent ➔ Regular
Note: Words and nonwords exhibit varying degrees of spelling‐sound consistency. Regular, exception, and inconsistent words occupy positions on this continuum, along with other intermediate cases. “Strange” words are oddballs like COLONEL and SPHINX. Locations on the continuum are approximate.
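The continuum in Table 2.1 can be quantified. In the spirit of Glushko (1979) and later work, a word’s consistency can be estimated from its body neighborhood: the proportion of words sharing its spelling body (e.g., ‑AVE) that also share its pronunciation. The sketch below computes this measure over a small invented lexicon with toy phoneme codes; real analyses use large corpora and typically weight neighbors by frequency.

```python
# A toy consistency measure in the spirit of Glushko (1979): how uniformly
# is a word's spelling "body" (vowel + final consonants, e.g., -AVE)
# pronounced across the lexicon? The mini-lexicon and phoneme codes are
# invented for illustration (capital = long vowel).

from collections import Counter

# word -> (body spelling, body pronunciation)
MINI_LEXICON = {
    "save": ("ave", "Av"), "gave": ("ave", "Av"), "cave": ("ave", "Av"),
    "wave": ("ave", "Av"), "have": ("ave", "av"),
    "mint": ("int", "int"), "hint": ("int", "int"), "tint": ("int", "int"),
    "pint": ("int", "Int"),
    "must": ("ust", "ust"), "just": ("ust", "ust"), "dust": ("ust", "ust"),
}

def consistency(word: str) -> float:
    """Share of body neighbors (including the word) with its pronunciation."""
    body, pron = MINI_LEXICON[word]
    neighbors = Counter(p for b, p in MINI_LEXICON.values() if b == body)
    return neighbors[pron] / sum(neighbors.values())

for word in ("must", "save", "have", "mint", "pint"):
    print(f"{word}: consistency = {consistency(word):.2f}")
```

On this toy lexicon MUST comes out fully consistent (1.00), SAVE and MINT regular but inconsistent (0.80 and 0.75), and HAVE and PINT low‐consistency exceptions (0.20 and 0.25), mirroring the continuum sketched in the table.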
It took many years of research within both approaches to arrive at these conclusions. Over time, successors to the DRC model discarded defining features of the approach in favor of networks built from distributed representations trained via weight‐adjusting learning procedures, that is, the connectionist approach (e.g., Perry et al., 2007; Ziegler et al., 2014). This development reflects broader trends in cognitive science and neuroscience. Core PDP/connectionist ideas about distributed representations, quasiregularity, statistical learning, constraint satisfaction processing, and division of labor between components of the language system have been widely absorbed and continue to inform research (e.g., Chang et al., 2020; Chen et al., 2017; Gordon & Dell, 2003; Hoffman et al., 2015; Smith et al., 2021). This framework has proved particularly relevant to understanding the brain bases of reading, language, and visual cognition because the grain of the models is well matched to the grain of the data obtained using current neuroimaging methods (Cox et al., 2015). The computational models retain their relevance to understanding cognition and its brain bases even though they are simpler than deep learning networks, which perform far more complex tasks but are much harder to analyze and less closely tied to human experience (Joanisse & McClelland, 2015).
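The contrast with rule‐plus‐lexicon architectures is visible even in a drastically simplified network. The sketch below trains a single‐layer network with the delta rule to map position‐specific letter units onto two vowel output units. It is far smaller than SM89 or its successors, and the coding scheme and training set are our own illustrative assumptions; the property it preserves is that one set of weights encodes regular words and exceptions alike, with no separate rule system or lexical lookup.

```python
# A minimal sketch of weight-adjusting learning over distributed
# representations: a single-layer network trained with the delta rule to
# map position-specific letter units onto two vowel units (long A vs.
# short a). Coding scheme and training items are invented for illustration.

import numpy as np

LETTERS = "abcdeghimnpstvw"

def encode(word: str) -> np.ndarray:
    """Position-specific letter units: 4 slots x len(LETTERS) features."""
    vec = np.zeros(4 * len(LETTERS))
    for pos, ch in enumerate(word[:4]):
        if ch in LETTERS:              # pad characters are simply skipped
            vec[pos * len(LETTERS) + LETTERS.index(ch)] = 1.0
    return vec

# Spelling -> vowel target as [long A, short a].
TRAINING = {
    "save": [1, 0], "gave": [1, 0], "cave": [1, 0], "wave": [1, 0],
    "have": [0, 1],                    # the lone exception in its neighborhood
    "mat ": [0, 1], "cat ": [0, 1], "hat ": [0, 1],
}

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(2, 4 * len(LETTERS)))

def forward(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-W @ x))   # logistic output units

# Delta-rule training: nudge all weights to reduce error on each item.
for _ in range(500):
    for word, target in TRAINING.items():
        x = encode(word)
        error = np.array(target) - forward(x)
        W += 0.1 * np.outer(error, x)

for word in ("save", "have", "mave"):      # "mave" is a novel nonword
    long_a, short_a = forward(encode(word))
    print(f"{word}: long-A = {long_a:.2f}, short-a = {short_a:.2f}")
```

After training, the same weight matrix produces SAVE with a long vowel and HAVE with a short one, and the nonword MAVE receives graded output reflecting its mixed neighborhood: long‐vowel support from SAVE, GAVE, CAVE, and WAVE, offset by short‐vowel pull from HAVE and MAT. Nothing in the network distinguishes “rule‐governed” from “exceptional” items; consistency falls out of the statistics encoded in the weights.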
This chapter begins by showing that simulations presented as supporting the DRC model differed from the corresponding behavioral studies. The implemented models also exhibited other anomalous behaviors that were overlooked. Connectionist networks reproduce the behavioral effects but can also explain why they occur (Seidenberg & Plaut, 2006). Our discussion focuses on the major features of reading aloud, leaving aside many other important issues (the bases of individual differences and dyslexia, cross‐linguistic comparisons, brain bases of reading, and others) because of space limitations.
We then examine “connectionist dual‐route models” (Perry et al., 2007, 2010; Ziegler et al., 2014). These hybrid models incorporate the major assumptions of the “triangle” framework but differ in one respect: They retain a second, lexical route. However, the phenomena this mechanism is intended to explain are already accounted for in connectionist models that incorporate additional parts of the orthography➔phonology➔semantics triangle. The “lexical route” allows the authors to claim a degree of continuity with dual‐route models, but it is not required to explain any data. We close by considering the relevance of computational modeling for understanding how children learn to read. The dual‐route theory remains influential in areas where computational modeling results are not well known. These include reading acquisition and instruction, where research and pedagogy still focus on learning pronunciation rules and adding sight words to the lexicon, and some areas of cognitive neuroscience (e.g., Bouhali et al., 2019). Modeling established the inadequacy of the dual‐route model, but because those results are not widely known, the approach retains its intuitive appeal. There are deep concerns about literacy levels in the United States, United Kingdom, and many other countries, and great interest in using the “science of reading” to improve instruction and outcomes (Seidenberg et al., 2020). A theory of visual word recognition could contribute to improved educational practices, but only if the theory is correct and speaks to relevant issues about how children learn.