Читать книгу Formal Semantics in Modern Type Theories - Stergios Chatzikyriakidis - Страница 16
1.4.3. Merits of MTT-semantics
ОглавлениеWe contend that, as foundational semantic languages, MTTs are advantageous and MTT-semantics has unique merits, as compared with simple type theory and Montagovian semantics. Here, we outline some of its merits, organized below into two respects, which are further elaborated in the following chapters (and related papers). Please note that this is in no way to cover all, or even most, attractive aspects of MTT-semantics and it only serves the purpose of highlighting some notable features, hopefully making it easier to understand the following pages of the book.
Rich typing for semantic constructions. MTTs have a rich type structure: unlike simple type theory in which there are only the base types e and t and arrow types A → B (or A, B, in a traditional notation), there are many ways to form types including, for example, dependent types (e.g. Π-types and Σ-types), inductive types (e.g. types of numbers/lists/vectors/trees and disjoint union types), logical types for propositions (under the propositions-as-types principle) and type universes (types that contain types as objects).29 As we mentioned briefly earlier, the typing judgments of the form a : A, intuitively saying that a is an object of type A, are very different from the set-theoretical membership statements of the form s ∈ S: for example, the typing judgment a : A is mechanically decidable, while the membership relation s ∈ S (expressing that s satisfies predicate S) is a formula in the first-order logic and its truth is undecidable. Among other things, this decidability property is essential for any type theory to have a logic based on the propositions-as-types principle.
The rich typing disciplines also give us several advantageous points in semantic constructions. A simple advantage is that rich typing allows us to employ typing judgments, rather than membership statements, to deal with selectional restriction. In MTT-semantics, we can use the typing judgment j : Man to express that “John is a man” and this is different from Montague semantics where we use the formula man(j) to represent the meaning. Therefore, typing also offers us a means to deal with selectional restriction – to exclude meaningless sentences or phrases. This allows us to use typability as a formal criterion to judge whether a sentence is meaningful: for example, we would usually think (in a non-fictional world) that a sentence like “Tables talk” is meaningless with a category error – this is captured by means of typing in MTT-semantics: the MTT-interpretation of “Tables talk” is ill-typed (or an illegal expression). This is different from Montague semantics where the interpretation of “Tables talk” is just a false statement but it is a well-formed formula. (See section 3.1 for a further explanation.)
The rich type structure in MTTs enables us to use types to play the role of representing various collections,30 the role played by sets in the Montagovian set-theoretical semantics. Besides acting as logical propositions, types can also be used in various ways in semantic constructions. For example, this allows one to interpret common nouns as types rather than predicates (see the above, (Luo 2012a) and section 3.2.1); in section 3.3, we show how various adjectival modifications can be modeled by means of rich typing, with Σ-types for intersective modification, Π-types (and the associated polymorphism) for subsective modification, disjoint union types and Π-polymorphism for privative modification, and a logical modal collection operator for non-committal modification. We shall also show in Chapters 4 and 5 that the typing structure can be used to model more advanced linguistic features such as gradable adjectival modification and copredication.
As mentioned earlier, subtyping is not only essential for MTT-semantics, but also very useful in semantic constructions. For instance, coercive subtyping is employed in defining dot-types (Luo 2009c, 2012b) for giving adequate interpretations of sentences involving copredication, an example of which is the sentence in (1.16), where a delicious lunch refers to food and a lunch that took forever refers to a process. Such copredication phenomena have been studied by many researchers such as Pustejovsky (1995) and Asher (2012). For MTT-semantics, the second author has proposed to use dot-types to deal with copredication (Luo 2009c): for this example, lunch involves two aspects: that of being food and that being a process. This is formalized by means of the subtyping relationship (1.17). It is then straightforward to see that the formula in (1.18), which interprets (1.16), is well-typed because of the subtyping relation in (1.17), where l : Lunch is the interpretation of “the lunch”.
(1.16) The lunch was delicious but took forever.
(1.17) Lunch ≤ Food • Process
(1.18) delicious(l) ∧ take_forever(l)
When defining dot-types such as Food • Process, coercive subtyping is used: we have, for example, Food • Process ≤ Food and Food • Process ≤ Process and therefore (1.18) is well-typed: both predicates delicious and take_forever can be applied to l : Lunch because both domains Food and Process are supertypes of Lunch. (For more formal details on subtyping, see section 2.4 and section 3.2.2, and for more about copredication and dot-types, see Chapter 5.)
MTT-semantics is both model-theoretic and proof-theoretic. MTT-semantics has a unique important feature: it is both model-theoretic and proof-theoretic, as argued for in Luo (2014, 2019a), and this has made MTTs a particularly suitable framework for formal semantics. Being model-theoretic, MTT-semantics provides a wide coverage of various linguistic features and, being proof-theoretic, its foundational languages have proof-theoretic meaning theory based on inferential uses (appealing philosophically and theoretically) and it establishes a solid foundation for practical reasoning in natural languages on computers (appealing practically). Altogether, this strengthens the argument that MTT-semantics is a promising framework for formal semantics, both theoretically and practically.
MTTs are proof-theoretically specified.31 Usually, an MTT is presented as a natural deduction system where, in particular, every type constructor is specified by inference rules among which introduction rules are to declare how the type (formula) is inhabited (proved) and the elimination rules to specify what consequences can be derived under the assumption that the type (formula) is inhabited (proved). These rules are harmonious and, as an important consequence, MTTs themselves have proof-theoretic semantics, as studied by logicians such as Gentzen (1935), Prawitz (1974, 1973) and Martin-Löf (1984, 1996) and discussed by philosophers such as Dummett (1975, 1991) and Brandom (1994, 2000), among others.32 For example, Martin-Löf has studied and developed meaning theory and given a solid foundation for his type theory (Martin-Löf 1984). Therefore, MTT-semantics, the formal semantics in MTTs, is proof-theoretic in the sense that its foundational semantic languages have proof-theoretic meaning theory based on inferential uses.33
Furthermore, the fact that MTTs are proof systems with proof-theoretic semantics has another significant and practical consequence in natural language reasoning based on MTT-semantics. In particular, this makes it possible for MTTs to be implemented in proof assistants such as Agda (Agda 2008), Coq (Coq 2010) and Lego/Plastic (Luo and Pollack 1992; Callaghan and Luo 2001) – computer-assisted reasoning systems that computer scientists have developed and successfully used for the formalization of mathematics and verification of computer programs. Therefore, MTT-semantics can be directly implemented in proof assistants that implement MTTs: for example, the MTT-semantics in type theory UTT has been implemented in Coq (Chatzikyriakidis and Luo 2014; Luo 2011b) and Plastic (Xue and Luo 2012) and used for natural language reasoning (Chatzikyriakidis and Luo 2016) (see Chapter 6 for more details).
That MTTs are proof-theoretically presented has led to a widely held view that formal semantics based on MTTs is only proof-theoretic and, in particular, it is not model-theoretic.34 This is mistaken – MTT-semantics is also model-theoretic. First of all, it is necessary for us to make clear that, by MTT-semantics being model-theoretic, we do not mean that an MTT can be given a set-theoretical semantics (e.g. in some categorical framework); instead, we mean that an MTT itself can be employed as a meaning-carrying language to give model-theoretic semantics to natural language. That is, in MTT-semantics, an MTT plays the role of meaning-carrying language – in the traditional model-theoretic semantics, this is the role played by set theory. In other words, we argue that MTTs can well serve as a foundational semantic language to give meanings in themselves in a model-theoretic semantics.
We shall develop the theme that MTT-semantics is model-theoretic in two fronts: the first is to note that, like sets in Montague semantics, types in MTTs are employed to represent collections of objects (for example, semantic interpretations of common nouns), and the rich type structure in MTTs provides powerful means of representation for formal semantics. In other words, intuitively, types in MTTs are rich enough to play the role of representing collections, just as sets in Montague semantics.
As mentioned above, MTTs have a rich type structure and, as to be demonstrated in Chapters 3, 4 and 5, in MTT-semantics these types play an important role in representing different kinds of collections such as those given by CNs modified by various kinds of adjectives. Such representations by means of types are not available for simple type theory, which has only base types and arrow types. Types in MTTs are powerful and can provide various different representations in semantic constructions. This is the first facet to explicate that MTT-semantics is model-theoretic.
The second facet is that the contextual structures in MTTs, which are not available in traditional logical systems, provide very useful means in semantic constructions and, in particular, they offer effective support for model-theoretic descriptions of incomplete possible worlds. We shall describe the notion of signatures (Luo 2014) (see section 2.1) that allow MTTs to be more suitable to support model-theoretic descriptions.
The notion of signature in type theory, as far as we know, first appeared in Edinburgh Logical Framework (Harper et al. 1993), where signatures with membership entries are used to describe a logical system.35 We shall introduce signatures that do not just have the usual membership entries of the form x : A but also contain two new forms of entries, subtyping entries and manifest entries (see section 2.4), which strengthen the power of signatures in representing (incomplete) possible worlds, even in cases where, for example, situations are infinite or involve more sophisticated phenomena. As shown in Lungu and Luo (2018) and section 2.5, extending MTTs with signatures, which may contain entries of the above new forms as well as the membership entries, preserves their nice meta-theoretic properties such as strong normalization (and hence logical consistency).
It is worth pointing out that foundational semantic languages that are both model-theoretic and proof-theoretic were not available before the development of MTT-semantics or, at least, people have not recognized that there is such a possibility (in particular, set theory is not such a language since it is not proof-theoretic). MTT-semantics is both model-theoretic and proof-theoretic and sheds a new light on the division between model-theoretic semantics and proof-theoretic semantics and allows us to study formal semantics from a new perspective.
1 1 Attribution for the development of simple type theory should also go to the Polish logician Chwistek (see, for example, Linsky 2009).
2 2 The term “modern type theory” is adopted to distinguish MTTs, as used in MTT-semantics, from Church’s simple type theory, as used in Montague semantics, and it appeared for the first time in Luo (2009c).
3 3 In this respect, interested readers are referred to Logic-enriched Type Theories (LTTs), proposed by Aczel and Gambino in their study of type-theoretic interpretations of constructive set theory (Aczel and Gambino 2000; Gambino and Aczel 2006), and to the work by the second author and Adams on a type-theoretical framework of LTTs to support formal reasoning with different logical foundations, including classical inference as well as intuitionistic inference (Luo 2007; Adams and Luo 2010). Also related is the recent work on homotopy type theory where Martin-Löf’s type theory is employed in such a way that the logic can be either intuitionistic or classical (HoTT 2013).
4 4 Although the logic in IL can be used to describe a lot of linguistic features, there are many others still found very difficult, if not impossible, to be characterized in IL (or simple type theory); in the literature, these phenomena are rather studied in set theory directly – see such “direct interpretations” in, for example, Winter (2016). This is one of the reasons that, for Montague semantics, set theory itself is considered the foundational semantic language, rather than the intermediate language IL.
5 5 A remark may be necessary here: we are only emphasizing the similarities between type theory and set theory here. However, they are very different – for example, it is their difference that enables us to say that MTT-semantics is also proof-theoretic and can be successfully implemented on computers for natural language reasoning. For some readers, the difference will only be apparent later on, say, after reading Chapter 2 to become more familiar with MTTs.
6 6 Our description of simple type theory is formal and, in section 7.2, will be extended with dependent event types, one of the applications of dependent types in Davidsonian event semantics.
7 7 IL as described in Montague (1973) also contains a modal operator used to describe tense. As noted in Montague (1973), it can be considered as a special predicate symbol to be interpreted as intended. We omit it here.
8 8 For those who are unfamiliar with natural deduction rules, it may be worth remarking that the rules below define a natural deduction system whose sentences are called judgments. A judgment J is correct (formally called derivable) if there is a derivation of J, that is, a finite sequence of judgments J1, ..., Jn with Jn = J such that, for all 1 ≤ i ≤ n, Ji is the conclusion of an instance of some inference rule whose premises are in {Jk | k < i }.
9 9 Here, the names of the base types e and t follow the Montagovian tradition. In Church (1940), e and t are named ι and o, respectively.
10 10 As in λ-calculus, β-conversion holds: we have that (λx:A.b[x])(a) is convertible to b[a], notation (λx:A.b[x])(a) ≈ b[a], where b[a] is obtained from b[x] by substituting a for all free occurrences of x. In other words, these two expressions (λx:A.b[x])(a) and b[a] are computationally the same – the former computes to the latter, and the relation ≈ of β-conversion is the reflexive, symmetric and transitive closure of this basic computational relation.
11 11 This is slightly simpler than Montague’s treatment in IL (Montague 1973), although it gives the same power. For its justification, see Gallin’s work on TY2 (Gallin 1975).
12 12 We shall deal with aspects of intensionality when discussing non-committal adjectives in section 3.3.4 and intensional adverbs in section 4.5.4.
13 13 In predicative MTTs such as Martin-Löf’s type theory, one would use a predicative universe which is only a type of some propositions, not all of them. Although we may relate Prop to the type t of truth values in Church’s simple type theory, they have subtle differences. The differences between MTTs and Church’s simple type theory include whether the theory is classical or constructive and whether there are proof objects, among others. These are beyond the scope of our discussions here.
14 14 This constitutes the basis of dealing with selectional restriction by means of decidable typing, among many other things (more details later).
15 15 In MTT-semantics, we also study judgmental interpretations of sentences and how to turn them into propositional forms – see section 3.2.3 for an informal introduction and section 7.1 for a formal treatment.
16 16 In Martin-Löf’s type theory with propositions-as-types (the so-called PaT logic), existential quantification is represented by Σ, which is strong in the sense that an object of a Σ-type is a pair whose first component can be projected out as the witness of the truth of the existentially quantified sentence. For example, as shown in Sundholm (1986), the donkey sentence (1.11) can be interpreted as (1.12), where Farmer and Donkey are the types interpreting “farmer” and “donkey”, respectively:
17 (1.11) Every farmer who owns a donkey beats it.
18 (1.12) Πz : (Σx:FarmerΣy:Donkey. own(x, y)). beat(π1(z),π1(π2(z)))
19 Note that, in (1.12), πi’s are projection operators for Σ-types and, in particular, π1 projects out the witness of a proof of an existential formula represented by a Σ-type (see section 2.2.2 for more details). However, such a use of Σ-types as both existential quantifier and a structural mechanism causes problems such as counting – see the second last paragraph in this section and footnotes 25/26 on p.15/p.16 for an example and explanations.
20 17 Ranta (1994) discussed several possible solutions but failed to obtain a satisfactory solution and, in particular, he did not realize that subtyping should be employed to solve the problem. In fact, coercive subtyping (Luo 1999) which, unlike the ordinary subsumptive subtyping, is adequate for dependent type theories and can be used to rectify this – see section 3.3.1 and footnote 21 on p68 for more on this topic.
21 18 In this book, the phrases such as possible worlds and situations are used informally and, in particular, they do not refer to any of their formal meanings that have been used in the literature.
22 19 The type-theoretical notion of context is not quite proper to model situations since abstraction can always be done over entries in a context. A further development has been made in Luo (2014) as well as Lungu and Luo (2018) to introduce the notion of signature, which can be used to represent situations in a more adequate way. In this book, we shall use type theory with signatures – for more details, see Chapter 2, especially section 2.1, section 2.4 and section 2.5.
23 20 One may consider the point of view (and we agree with it) that an impredicative type theory with an internal totality Prop of logical propositions is better suited for formal semantics – see Luo (2019b) on proof irrelevance for one of the arguments for this.
24 21 In particular, it offers a satisfactory solution to Ranta’s multiple categorization problem as mentioned above and footnote 17 on p13, and hence overcomes the major obstacle for MTTs to be applied to formal semantics – for more on this, see section 3.2.2.
25 22 Formally, the type theories in this book are extended with signatures (Luo 2019a) – see Chapter 2.
26 23 For people familiar with type theory, it is worth mentioning that dot-types (Luo 2009c) are not ordinary inductive types, rather they are specially developed for dealing with copredication.
27 24 The idea of using coercive subtyping for formal semantics was first proposed by the second author in Luo and Callaghan (1998), but it was not until much later (a decade later) when Luo (2009c) was published where coercive subtyping was employed in semantic interpretations of modified CNs and constructing dot-types for copredication.
28 25 This problem with Σ playing a “double role” can be illustrated by considering (1.13), whose interpretation in Martin-Löf’s type theory would be (1.14), where “most” is interpreted by means of the quantifier Most defined in Sundholm (1989). (The second author thanks Justyna Grudziñska for a discussion about this example.)(1.13) Most farmers who own a donkey beat it.(1.14) Most z : (Σx:FarmerΣy:Donkey. own(x, y)). beat(π1(z),π1(π2(z)))
29 The interpretation (1.14) fails to respect correct counting because, with Σ, the proof objects of own(x, y) contribute to counting in a wrong and unintended way.
30 26 As pointed out by the second author in Luo (2019b), in a type theory such as UTT where we can enforce proof irrelevance (see section 3.3.1 and footnote 22 on p. 69), the donkey sentence (1.13) in the above footnote 25 can be given a satisfactory interpretation (1.15) by means of Σ and the weak propositional quantifier ∃, where the propositional quantifier Most is defined in UTT:
31 (1.15)
32 Note that ∃ and Σ are both traditional binding operators. Although we may not give an intended semantics to (1.13) using only either of them alone, it can be done if both are available. This implies that, in order to solve such a problem, we may not have to abandon traditional binding mechanisms to consider new mechanisms, a strategy adopted by proponents of dynamic semantics, such as Kamp (1981) and Groenendijk and Stokhof (1991). For example, dynamic predicate logic (Groenendijk and Stokhof 1991) is a non-standard logical system: among other things, it is non-monotonic and the notion of dynamic entailment fails to be reflexive or transitive – such a move seems to be paying too big a price that could have been avoided.
33 27 DTS has two versions as far as the underlying type theory is concerned: Martin-Löf’s type theory in Bekki (2014) and some impredicative type system in Bekki and Mineshima (2017). In the DTS approach, CNs are interpreted as predicates as in the traditional Montagovian semantics and, because of this, we would not need the rich type structures and DTS mainly uses Π/Σ-types as logical operators. This means that it does not have the advantages in the CNs-as-types paradigm (see section 3.2.1), either.
34 28 It may be worth clarifying that the system called Type Theory with Records (TTR) (Cooper 2005, 2017) is not a type theory, as the term is usually understood, but rather a set-theoretic notational framework, where a : T if, and only if, a is a member of the set whose name is T in set theory. Although it is set-theoretical, the development of TTR, especially in the early days, was influenced by the study of record types by Tasistro (1997) and Betarte (1998) in Martin-Löf’s logical framework.
35 29 See Chapter 2, and sections 2.2 and 2.3 in particular, for more details.
36 30 In this book, the word collection is used to refer to informal entities, rather than their formal representations such as sets or types.
37 31 For details, see Chapter 2 and related references for MTTs.
38 32 It is worth remarking that even if a logical system is specified by proof-theoretic rules, it can still fail to have a proper proof-theoretic semantics. For example, to have a proof-theoretic semantics, the introduction and elimination rules for any logical operator or type constructor must be in harmony, as discussed by Dummett (1991) and others.
39 33 Note that MTT-semantics itself is still denotational (and model-theoretic – see below) in the sense that the meaning of a natural language phrase/sentence is given by its denotation in an MTT, not directly by proof rules, although its foundational languages have a proof-theoretic meaning theory, as described above in the sense of “logical inferentialism” or “proof-theoretic semantics” (Kahle and Schroeder-Heister 2006). Can we extend logical inferentialism to natural language? For example, Francez and his colleagues (Francez 2015) have studied natural language semantics in a way that directly adopts the methodology of proof-theoretic semantics for logical systems and applies it to natural language. However, in order for such semantics to be adequate or viable for natural language, one would have to answer some difficult questions. For example, for a logical language, logicians have recognized that, to capture the meaning of a formula in the use theory, it is central for us to spell out two aspects of its use – how to prove it and how to derive consequences from it, as captured by introduction and elimination rules, respectively. Could this be generalized directly to natural language? This would be a big leap and, unfortunately, a convincing justification has not been forthcoming. In fact, philosophers such as Brandom who advocate “general inferentialism” do not restrict the rule patterns in this way as the following remark by Peregrin (2013) explains:... It is important to realize that this logical kind of inferentialism must be classified as a special case of general inferentialism not just because it is restricted to logical constants, but also because strict constraints are posed on the inferential patterns that can constitute the (meanings of the) logical constants. By contrast, general inferentialism only claims that the meaning of a word is its role vis-à-vis an inferential pattern; there is no claim that each word must have its own constitutive inferential pattern, let alone a claim that this pattern must be of a shape prescribed by Gentzen.In other words, the meaning of an NL phrase/sentence may not be captured by only introduction and elimination rules and there is a good reason for us to be pessimistic for the viability of such an approach.
40 34 An exception is (Ranta 2015, p. 346) where Ranta pointed out that it is a misunderstanding to think that formal semantics based on Martin-Löf’s type theory (an MTT) is not model-theoretic.
41 35 Historically, signatures have been used in describing algebraic structures and, for example, more recently they were used in describing many-sorted structures in the study of algebraic specifications (Goguen and Burstall 1983). However, it should be noted that the notion of signature in type theory is rather different from that in algebras, although they may be related informally.