A Companion to Chomsky
7.10 Thoughts about the Future
After this long odyssey through our studies of language acquisition, let me return to Chomsky's early proposal concerning “the poverty of the stimulus” (the series of insights about the “stimulus‐free” nature of language learning and use). Most generally, Chomsky has invited us to consider the human Mind in light of the fact that every normal child acquires any known natural language to an expert level in a relatively brief period of time in infancy and early childhood, based on the hearing (or gesturing, in the case of sign language) of an adventitious set of sentences in context.
In contrast, behaviorist psychology saw language learning as an instance of more general principles of operant conditioning, a relatively straightforward distillation and organization of experience. But the models presented, as Chomsky argued, never offered a plausible account of the dimensions of structured generalization.
Classical AI raised an analogous argument by analyzing intelligence as applied logic. But logic and formal language theory left language open to massive and pervasive ambiguity, even if the “grammar” could be learned perfectly. So the next generation re‐interpreted intelligence as applied probability. This worked much better, except when it didn't.
Useful examples come from the so‐called Winograd Schema Challenge (Morgenstern et al., 2016). Consider the following sentences:
The town councillors refused to give the angry demonstrators a permit because they feared violence.
Who feared violence?
Answer 1: the town councillors
Answer 2: the angry demonstrators
Here the special word is “feared” and its alternate is “advocated” as in the following:
The town councillors refused to give the angry demonstrators a permit because they advocated violence.
Who advocated violence?
Any average 10‐year‐old can resolve these ambiguities flawlessly. But after years and years of machine modeling, these devices still go 50–50 on the chosen resolutions. Even the newer “deep learning” networks, which require exponentially more training data than humans do, learn things that humans would never consider and lack the ability to integrate common‐sense reasoning.
Perhaps the core problem for the machine learner is that he never gets the joke. He mulishly acquires what is most probable and is stymied by the improbabilities of everyday life. Instructive for this problem are several newspaper headlines collected by Steven Pinker and republished in his (1994/2007) book The Language Instinct. One example is “Queen Mary has bottom scraped,” which makes every listener chuckle, but the machines have no useful chuckle routines. Another example from Pinker was the headline, “Man gets six months in violin case.” True, humans and not machines know how this choice should “usually” be made and therefore giggle, and so opt for the interpretation of “case” as litigation rather than container. But now consider the recent escape of the billionaire Carlos Ghosn from the Japanese legal system, which was accomplished indeed by his secreting himself inside a violin case (a double bass case, but who's counting). The point is that we humans can understand the probabilities in the world so as to interpret ambiguity and take this into account, but when the improbable becomes the actual, we continue to understand.
These are the kinds of formal and substantive properties of language and thought that Chomsky has invited us to consider from the earliest to the most recent of his writings. So extravagant did these claims seem when I first read them that they set the stage for my life's work in fleshing out the details with experiments relevant to this topic, namely, how radically experience could differ and still support language learning. This is why we considered the deprivation of experience in the blind and the isolated deaf.
Chomsky continues to frame and expatiate upon the central questions of language learning and knowledge, and I stand very much in his debt.