
3.2 Some Context: The Emerging Idea of Principles and Parameters


A core aspect of generative grammar in its early days was a computational system in the human mind that contained phrase structure rules for building hierarchical structures and more complex operations that were able to modify these phrase structures. The latter were known as transformations, and transformations crucially operated on structures, not, say, sentences as in Harris (1951). This gave rise to the name Transformational Grammar, which is synonymous with generative grammar. Transformations made the theory much more powerful, as they allowed an infinite number of grammars (cf. Lasnik 2000, p. 114; Lasnik and Lohndal 2013, pp. 27–28), raising serious questions regarding learnability: How can a child select the correct target grammar from all available grammars? In this section, we will summarize some of the most important context leading up to the Principles and Parameters approach, which we will present in the next section. For reasons of space, the present section will have to set aside many details, but see Freidin and Lasnik (2011) and Lasnik and Lohndal (2013) for a more detailed exposition.

The grammatical architecture proposed in Chomsky (1965) is shown in (2) (Chomsky 1965, pp. 135–136; cf. also Lasnik and Lohndal 2013, p. 36).

(2) [Diagram not reproduced: phrase structure rules generate Deep Structures; transformations map Deep Structures to Surface Structures, which morphophonological operations then prepare for phonetic interpretation.]

To give one example, we can consider simple questions. In Chomsky (1955, 1957), (3a) and (3b) had the same initial phrase structure (called a phrase marker at the time).

(3) a. Ellie will solve the problem.
    b. Will Ellie solve the problem?

Transformations take the structure of (3a) and transform it into the structure of (3b). The details are not important here; readers can consult Lasnik (2000) for a lucid exposition. A remarkable success of this approach, as Lasnik (2005, p. 69) emphasizes, is that it enabled a unified analysis of (3) and (4).

(4) a. Ellie solved the problem.
    b. Did Ellie solve the problem?

Native speakers can sense a relationship between (3b) and (4b), but prior to Chomsky's analysis, there was no account of this. In Chomsky (1965), the technicalities were different, but the intuitions were the same: a common underlying Deep Structure serves as the basis for both declaratives and interrogatives, transformations then alter the structure into a surface structure, and morphophonological operations finally provide the appropriate forms for phonetic interpretation.
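For readers who find it helpful to see the logic spelled out procedurally, here is a small sketch in Python (our own toy illustration, not the formalism of Chomsky 1955 or 1965): a single underlying phrase marker underlies both the declarative and the interrogative, and the inversion operation manipulates nodes of the structure rather than positions in the word string.

```python
# Toy illustration (not the actual 1955/1965 formalism): a single underlying
# structure for "Ellie will solve the problem" yields both the declarative
# and the interrogative surface string via a structure-dependent operation.

# A phrase marker as a nested tuple: (label, child, child, ...)
underlying = ("S",
              ("NP", "Ellie"),
              ("Aux", "will"),
              ("VP", ("V", "solve"), ("NP", "the problem")))

def leaves(node):
    """Collect the terminal strings of a phrase marker, left to right."""
    if isinstance(node, str):
        return [node]
    label, *children = node
    return [word for child in children for word in leaves(child)]

def subject_aux_inversion(s):
    """Transformation: front the Aux node over the subject NP.
    Crucially, it manipulates structure (nodes), not the word string."""
    label, np, aux, vp = s
    return (label, aux, np, vp)

print(" ".join(leaves(underlying)))                          # Ellie will solve the problem
print(" ".join(leaves(subject_aux_inversion(underlying))))   # will Ellie solve the problem
```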

Transformations are an essential and powerful part of this architecture. Because of this, work conducted in the late 1960s and 1970s proposed a range of constraints to limit the power of transformations and, consequently, the range of possible grammars. An example of this is the work by Ross (1967), which proposed constraints on long‐distance dependencies (the domains that block such dependencies were labeled islands by Ross; see Boeckx 2013, den Dikken and Lahne 2013, and Müller, Chapter 12 of this volume, for overviews). Nevertheless, as Chomsky and Lasnik (1977) point out, the quest for descriptive adequacy led to a tremendously rich theory. This can be seen quite clearly in Peters and Ritchie (1973), whose explicit formalization contains a range of mechanisms that were proposed at the time, such as global rules and transderivational constraints. Let us look at these mechanisms briefly (building on the discussion in Lasnik and Lohndal 2013).

Lakoff (1970, p. 628) defines a global rule as a rule that states conditions on “configurations of corresponding nodes in nonadjacent trees in a derivation.” In general, transformations have always been assumed to be Markovian, that is, to apply one step at a time, looking only at the current structure. Global rules, however, require a system whose power dramatically exceeds these Markovian properties. Ross (1969) famously provided an example of a global rule. In that paper, he extends results he obtained in Ross (1967) involving island constraints. One such island constraint, the Coordinate Structure Constraint, is illustrated in (5): it prevents extraction out of just one of the conjuncts. We have illustrated this by showing in (5) a copy of who in the position from which it has been extracted.

(5) *Irv and someone were dancing, but I don't know who Irv and who were dancing.

Notably, Ross (1969) showed that if the offending structure is not visible at the surface, the violation goes away. One way to make it disappear is to use ellipsis, as in (6).

(6) Irv and someone were dancing, but I don't know who.

In (6), the coordinate structure, the constituent that forms the island, has been elided and is not pronounced. That makes the example acceptable. More formally, Ross argued that for an island violation to occur, the constituent that forms the island needs to be present at surface structure. If a transformation deletes this constituent, the constraint no longer applies. This deletion became known as sluicing (see van Craenenbroeck and Merchant 2013). To capture the contrast between (5) and (6), the island constraint thus needs to mention both the surface structure and the point in the derivation at which the relevant constituent (who in (5)) moves out of the island (the coordinate structure in (5)). That the constraint needs to mention two non-adjacent stages of the derivation is what makes it a global rule.
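To make the notion of a global rule more concrete, the following schematic Python sketch (our own encoding, not Ross's formalism) shows why the constraint exceeds Markovian power: the well-formedness check has to consult two non-adjacent points of the derivation, the step at which who leaves the coordinate structure and the final surface structure, whereas a Markovian rule only ever sees the current step.

```python
# Schematic sketch (a hypothetical encoding, not Ross's formalism): a
# derivation is the ordered list of structures it passes through; each
# structure is reduced here to a set of descriptive flags.

derivation_5 = [
    {"coordinate_structure": True, "wh_inside_island": True},        # deep structure
    {"coordinate_structure": True, "wh_moved_out_of_island": True},  # after wh-movement
    {"coordinate_structure": True, "wh_moved_out_of_island": True},  # surface structure
]

derivation_6 = [
    {"coordinate_structure": True, "wh_inside_island": True},
    {"coordinate_structure": True, "wh_moved_out_of_island": True},
    {"coordinate_structure": False, "wh_moved_out_of_island": True}, # island elided (sluicing)
]

def island_violation(derivation):
    """Global rule: ill-formed only if SOME step moved a wh-phrase out of an
    island AND the island is still present at the final (surface) structure.
    A purely Markovian rule could not state this: it sees one step at a time."""
    moved_out = any(step.get("wh_moved_out_of_island") for step in derivation)
    island_at_surface = derivation[-1].get("coordinate_structure", False)
    return moved_out and island_at_surface

print(island_violation(derivation_5))  # True  -> (5) is out
print(island_violation(derivation_6))  # False -> (6) is fine
```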

As for transderivational constraints, these are constraints whose application depends on derivations other than the one under consideration. Hankamer (1973) provides arguments in favor of such constraints. One example involves the phenomenon known as gapping (see van Craenenbroeck and Merchant 2013). Among other examples, he uses the one in (7) (Hankamer 1973, pp. 26–27).

(7) Max wanted Ted to persuade Alex to get lost, and Walt, Ira.

The question is how such a string is derived, that is, what is the correct derivation underlying (7)? Possible candidates are (8a) and (8b).

(8) a. Max wanted Ted to persuade Alex to get lost, *and Walt [wanted] Ira [to persuade Alex to get lost].
    b. Max wanted Ted to persuade Alex to get lost, *and Walt [wanted Ted to persuade] Ira [to get lost].

Hankamer argued that both options in (8) are ruled out because (7) can also be derived from a different constituent structure that still yields the intended meaning, namely (9).

(9) Max wanted Ted to persuade Alex to get lost, and [Max wanted] Walt [to persuade] Ira [to get lost].

When the bracketed constituents are deleted, (9) becomes (7). Given this, the constraint would not just have to make reference to alternative derivations created from the same deep structure, but also to alternative derivations created from different deep structures. That raises nontrivial questions concerning the expressive power of such a computational system, and consequently also its learnability.6
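A purely schematic sketch may help convey why this is computationally costly (the predicate blocks below is a placeholder of our own, not Hankamer's actual condition): a transderivational constraint is not a predicate over a single derivation, but over a derivation together with the set of competing derivations of the same surface string, possibly built from different deep structures, and that competitor set must itself be computed.

```python
# Purely schematic sketch (our own toy encoding, not Hankamer's actual
# condition): a transderivational constraint evaluates a derivation against
# the set of competing derivations of the same surface string, which may be
# built from different deep structures.

def same_string(d1, d2):
    return d1["surface_string"] == d2["surface_string"]

def blocks(other, derivation):
    # Placeholder for the substantive condition; the point is only that the
    # check must look outside the single derivation being evaluated.
    return other is not derivation and same_string(other, derivation)

def transderivational_ok(derivation, competing_derivations):
    """A transformation or even a global rule inspects one derivation;
    this check additionally quantifies over the alternatives."""
    return not any(blocks(other, derivation) for other in competing_derivations)

# The gapping derivations sketched in (8a) and (9), both yielding the string in (7).
d8a = {"deep_structure": "(8a)", "surface_string": "(7)"}
d9  = {"deep_structure": "(9)",  "surface_string": "(7)"}

print(transderivational_ok(d8a, [d8a, d9]))  # False: blocked by the competitor built from (9)
```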

Any extension of the class of possible grammars requires significant empirical justification. Chomsky and Lasnik (1977) argued that this justification had not been provided in approaches that extended the original framework in Chomsky (1955/1975), Peters and Ritchie (1973), and comparable work (cf. Dougherty 1973, Chomsky 1973, and Brame 1976). Because of that, Chomsky and Lasnik proposed a new framework which restricted the number of possible grammars significantly. This was seen as a crucial step towards being able to explain the acquisition of grammatical competence, a central goal ever since Chomsky (1965).

The new framework departed from earlier frameworks in some crucial ways, not least in assuming that Universal Grammar is not an “undifferentiated” system. That is, it was argued that core grammar has highly restricted options, since it consists of universal principles and a few parameters that account for variation. In addition to the core, there is the periphery, consisting of “marked” phenomena, e.g. irregularities (such as irregular verbs) and exceptions more generally (e.g. English has prepositions, but also the marked exception ago, which follows its complement). In other words, the approach required something similar to a theory of markedness, with all its complications (see Haspelmath 2006 for a comprehensive discussion). As Chomsky and Lasnik (1977, p. 430) say:

Systems that fall within core grammar constitute “the unmarked case”; we may think of them as optimal in terms of the evaluation metric. An actual language is determined by fixing the parameters of core grammar and then adding rules or rule conditions, using much richer resources, perhaps resources as rich as those contemplated in the earlier theories of [transformational grammar]7

Research was generally devoted to the core phenomena: “A reasonable approach would be to focus attention on the core system, putting aside phenomena that result from historical accident, dialect mixture, personal idiosyncrasies, and the like” (Chomsky and Lasnik 1993, p. 510).

The name for constraints in Chomsky and Lasnik (1977) was “filters.” In their paper, the hypothesis was that surface filters can capture effects of ordering, obligatoriness, and contextual dependencies. Such surface filters would be universal; thus, we would not expect any variation between languages. This makes such filters different from parameters. Furthermore, a third component consisted of language‐specific filters. For example, to capture the ill‐formedness of (10a) in Standard English, the language‐specific filter in (10b) was proposed.

(10) a. *We want for to win.
     b. *[for‐to]

This filter deems any for‐to string illicit. Chomsky and Lasnik (1977, p. 442) claim that the rule in (10b) would be a “dialect” filter, since it was assumed to involve “a high degree of uncertainty and variation.” And, indeed, for to sequences are perfectly possible in, for example, Irish English dialects. In essence, then, a filter can either be outside of core grammar, like (10b), or part of core grammar, like the ban on stranding an affix (the Stranded Affix Filter, cf. Lasnik 1981).
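To illustrate what a surface filter amounts to, here is a minimal sketch in Python (a toy rendering of ours, with the surface structure reduced to a flat word string): the filter in (10b) simply rejects any surface string containing adjacent for and to, and a grammar of a dialect that allows for to sequences would simply lack this filter.

```python
# Minimal sketch of the surface filter in (10b), with the surface structure
# reduced to a flat list of words (a toy representation of our own).

def for_to_filter(words):
    """*[for-to]: reject any surface string containing adjacent 'for to'."""
    return all(not (a == "for" and b == "to") for a, b in zip(words, words[1:]))

print(for_to_filter("we want for to win".split()))  # False: filtered out in Standard English
print(for_to_filter("we want to win".split()))      # True: passes the filter
```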

Chomsky and Lasnik's (1977) paper prepared the ground for a major change in how to think about universality and variation. We turn to that in the next section.

