Ontology Engineering, by Elisa F. Kendall

CHAPTER 1

Foundations

Ontologies have become increasingly important as the use of knowledge graphs, machine learning, and natural language processing (NLP) has grown, and as the amount of data generated on a daily basis has exploded. Many ontology projects have failed, however, due at least in part to a lack of discipline in the development process. This book is designed to address that gap by outlining a proven methodology for the work of ontology engineering based on the combined experience of the authors, our colleagues, and students. Our intent for this chapter is to provide a very basic introduction to knowledge representation and a bit of context for the material that follows in subsequent chapters.

1.1 BACKGROUND AND DEFINITIONS

Most mature engineering fields have some set of authoritative definitions that practitioners know and depend on. Having common definitions makes it much easier to talk about the discipline and allows us to communicate with one another precisely and reliably about our work. Knowing the language makes you part of the club.

We hear many overlapping and sometimes confusing definitions for “ontology,” partly because the knowledge representation (KR) field is still maturing from a commercial perspective, and partly because of its cross-disciplinary nature. Many professional ontologists have background and training in fields including formal philosophy, cognitive science, computational linguistics, data and information architecture, software engineering, artificial intelligence, or library science. As commercial awareness about linked data and ontologies has increased, people knowledgeable about a domain but not trained in any of these areas have started to build ontologies for use in applications as well. Typically, these individuals are domain experts looking for solutions to something their IT departments haven’t delivered, or they are enterprise architects who have run into brick walls attempting to use more traditional technologies to address tough problems. The result is that there is not as much consensus about what people mean when they talk about ontologies as one might think, and people often talk past one another without realizing that they are doing so.

There are a number of well-known definitions and quotes in the knowledge representation field that practitioners often cite, and we list a few here to provide common grounding:

“An ontology is a specification of a conceptualization.” (Gruber, 1993)

This definition is one of the earliest and most cited definitions for ontology with respect to artificial intelligence. While it may seem a bit academic, we believe that by the time you finish reading this book, you’ll understand what it means and how to use it. It is, in fact, the most terse and most precise definition of ontology that we have encountered. Having said this, some people may find a more operational definition helpful:

“An ontology is a formal, explicit description of concepts in a domain of discourse (classes (sometimes called concepts)), properties of each concept describing various features and attributes of the concept (slots (sometimes called roles or properties)), and restrictions on slots (facets (sometimes called role restrictions)).” (Noy and McGuinness, 2001)

The most common term for the discipline of ontology engineering is “knowledge engineering,” as defined by John Sowa years ago:

“Knowledge engineering is the application of logic and ontology to the task of building computable models of some domain for some purpose.” (Sowa, 1999)

Any knowledge engineering activity absolutely must be grounded in a domain and must be driven by requirements. We will repeat this theme throughout the book and hope that the “of some domain for some purpose” part of John’s definition will compel our readers to specify the context and use cases for every ontology project they undertake. Examples of what we mean by context and use cases will be scattered throughout the sections that follow, and will be covered in depth in Chapter 3.

Here are a few other classic definitions and quotes that may be useful as we consider how to model knowledge and then reason with that knowledge:

“Artificial Intelligence can be viewed as the study of intelligent behavior achieved through computational means. Knowledge Representation then is the part of AI that is concerned with how an agent uses what it knows in deciding what to do.” (Brachman and Levesque, 2004)

“Knowledge representation means that knowledge is formalized in a symbolic form, that is, to find a symbolic expression that can be interpreted.” (Klein and Methlie, 1995)

“The task of classifying all the words of language, or what’s the same thing, all the ideas that seek expression, is the most stupendous of logical tasks. Anybody but the most accomplished logician must break down in it utterly; and even for the strongest man, it is the severest possible tax on the logical equipment and faculty.” (Charles Sanders Peirce, in a letter to editor B. E. Smith of the Century Dictionary)

Our own definition of ontology is based on applied experience over the last 25–30 years of working in the field, and stems from a combination of cognitive science, computer science, enterprise architecture, and formal linguistics perspectives.

An ontology specifies a rich description of the

• terminology, concepts, nomenclature;

• relationships among and between concepts and individuals; and

• sentences distinguishing concepts, refining definitions and relationships (constraints, restrictions, regular expressions)

relevant to a particular domain or area of interest.


Figure 1.1: Ontology definition and expressivity spectrum.

Figure 1.1 provides an abbreviated view of what we, and many colleagues, call the “ontology spectrum”—the range of models of information that practitioners commonly refer to as ontologies. It covers models that may be as simple as an acronym list, index, catalog, or glossary, or as expressive as a set of micro theories supporting sophisticated analytics.

The spectrum was developed during preparation for a panel discussion in 1999 at an Association for the Advancement of Artificial Intelligence (AAAI) conference, where a number of well-known researchers in the field attempted to arrive at a consensus on a definition of ontology. This spectrum is described in detail in McGuinness, Ontologies Come of Age (2003). We believe that an ontology can add value when defined at any level along the spectrum, which is usually determined by business or application requirements. Most of the ontologies we have developed, whether conceptual or application oriented, include at least a formal “is-a” or subclass hierarchy, and often additional expressions, such as restrictions on the number or type of values for a property (i.e., they fall to the right of the red “squiggle” in the diagram).

Regardless of the level of expressivity and whether the ontology is conceptual in nature or application focused, we expect that an ontology will be: (1) encoded formally in a declarative knowledge representation language; (2) syntactically well-formed for the language, as verified by an appropriate syntax checker or parser; (3) logically consistent, as verified by a language-appropriate reasoner or theorem prover; and (4) demonstrated to meet business or application requirements through extensive testing. The process of evaluating and testing an ontology is both science and art, with increasingly sophisticated methods available in commercial tools, but because no “one size fits all,” we typically need multiple tools to fully vet most ontologies. We will discuss some of the more practical and more readily available approaches to ontology evaluation in later chapters of this book.

1.2 LOGIC AND ONTOLOGICAL COMMITMENT

The primary reason for developing an ontology is to make the meaning of a set of concepts, terms, and relationships explicit, so that both humans and machines can understand what those concepts mean. The level of precision, breadth, depth, and expressivity encoded in a given ontology depends on the application: search applications over linked data tend to require broader ontologies and tolerate less precision than those that support data interoperability; some machine learning and natural language processing applications require more depth than others. Ontologies that are intended to be used as business vocabularies or to support data governance and interoperability require more metadata than machine learning applications may need, including clearly stated definitions, provenance, and pedigree, as well as explanatory notes and other usage information. The foundation for the machine-interpretable aspects of knowledge representation lies in a combination of set theory and formal logic. The basis for the metadata stems from library science and terminology work, which we discuss in Chapter 4.

Most people who are interested in knowledge representation took a course in logic at some point, either from a philosophical, mathematical, or linguistics perspective. Many of us also have basic knowledge of set theory, and can draw Venn diagrams showing set intersection when needed, but a little refresher may be helpful.

Logic can be more difficult to read than English, but is clearly more precise:

(forall ((x FloweringPlant))
  (exists ((y Bloom) (z BloomColor))
    (and (hasPart x y) (hasCharacteristic y z))))

Translation: Every flowering plant has a bloom which is a part of it, and which has a characteristic bloom color.

Language: Common Logic, CLIF syntax (ISO/IEC 24707:2018, 2018)

Logic is a simple language with few basic symbols. The level of detail depends on the choice of predicates made by the ontologist (e.g., FloweringPlant, hasPart, hasCharacteristic, in the logic, above); these predicates represent an ontology of the relevant concepts in the domain.
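To make the axiom concrete, here is a small Python sketch (ours, not part of the book's formalism or the Common Logic standard) that checks whether a toy model satisfies it; the individuals and facts are invented for illustration.

```python
# Toy universe: the predicate names mirror the CLIF axiom above,
# and the individuals are invented for illustration.
flowering_plants = {"rose", "tulip"}
blooms = {"rose_bloom", "tulip_bloom"}
bloom_colors = {"red", "yellow"}
has_part = {("rose", "rose_bloom"), ("tulip", "tulip_bloom")}
has_characteristic = {("rose_bloom", "red"), ("tulip_bloom", "yellow")}

def axiom_holds():
    """Every FloweringPlant has a Bloom part with a BloomColor characteristic."""
    return all(
        any((x, y) in has_part and (y, z) in has_characteristic
            for y in blooms for z in bloom_colors)
        for x in flowering_plants
    )

print(axiom_holds())  # True: this toy model satisfies the axiom
```

Removing either hasPart pair makes axiom_holds() return False, which is the kind of consistency check a reasoner performs at much larger scale.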

1.3 ONTOLOGY-BASED CAPABILITIES

An ontology defines the vocabulary that may be used to specify queries and assertions for use by independently developed resources, processes, and applications. “Ontological commitments are agreements to use a shared vocabulary in a coherent and consistent manner.”1 Agreements can be specified as formal ontologies, or ontologies with additional rules, to enforce the policies stated in those agreements. The meaning of the concepts included in the agreements can be defined precisely and unambiguously, sufficient to support machine interpretation of the assertions. By composing or mapping the terms contained in the ontologies, independently developed systems can work together to share information and processes consistently and accurately.

Through precise definitions of terms, ontologies enable shared understanding in conversations among agents to collect, process, fuse, and exchange information. For example, ontologies can be used to improve search accuracy through query expansion to clarify the search context. Typically, search accuracy includes both precision and recall, meaning that correct query results are returned and relevant answers are not missing. Ontologies designed for information sharing may be used in a number of ways, including but not limited to:

• on their own as terminologies or common vocabularies to assist in communications within and across groups of people;

• to codify, extend, and improve flexibility in XML2 and/or RDF Schema-based3 agreements;

• for information organization, for example for websites that are designed to support search engine optimization (SEO) and/or those that use mark-up per schema.org;4 or

• to describe resources in a content management system, for example for archival, corporate website management, or for scientific experimentation and reuse.

Ontologies that describe information resources, processes, or applications are frequently designed to support question answering, either through traditional query languages such as SQL5 or SPARQL,6 or through business rules, including rule languages such as RuleML,7 Jess,8 Flora-2,9 and commercial production rule languages. They may also be designed to support more complex applications, including:

• recommender systems, for example, for garden planning, product selection, service provider selection, etc. as part of an event planning system;

• configuration systems such as product configurators or systems engineering design verification and validation;

• policy analysis and enforcement, such as for investment banking compliance and risk management;

• situational analysis systems, such as to understand anomalous behaviors for track and trace, fraud detection, or other business intelligence applications; and

• other complex analyses, such as those required for understanding drug formularies, disease characteristics, human genetics, and individual patient profiles to determine the best therapies for addressing certain diseases.

In other words, ontologies and the technologies that leverage them are well suited to solve problems that are cross-organizational, cross-domain, multi-disciplinary, or that span multiple systems. They are particularly useful in cases where traditional information technologies are insufficiently precise, where flexibility is needed, where there is uncertainty in the information, or where there are rich relationships across processes, systems, and/or services that can’t be addressed in other ways. Ontologies can connect silos of data, people, places, and things.

In the sections that follow, we will provide examples and modeling patterns that are commonly used to support both lightweight use cases that do not involve much reasoning, as well as richer applications such as recommender systems or systems for policy analysis and enforcement that depend on more representation and reasoning power.

1.4 KNOWLEDGE REPRESENTATION LANGUAGES

Today’s approaches to knowledge representation (KR) emerged from 1970s and 1980s research in artificial intelligence, including work in areas of semantic networks, question-answering, neural networks, formal linguistics and natural language processing, theorem proving, and expert systems.

The term knowledge representation is often used to talk about representation of information for consumption by machines, although “good” knowledge representations should also be readable by people. Every KR language has a number of features, most of which are common to software engineering, query, and other languages. They include: (1) a vocabulary, consisting of some set of logical symbols and reserved terms plus variables and constants; (2) a syntax that provides rules for combining the symbols into well-formed expressions; (3) a formal semantics, including a theory of reference that determines how the constants and variables are associated with things in the universe of discourse and a theory of truth that distinguishes true statements from false ones; and (4) rules of inference that determine how one pattern can be inferred from another. If the logic is sound, the rules of inference must preserve truth as determined by the semantics. It is this fourth element, the rules of inference and the ability to infer new information from what we already know, that distinguishes KR languages from others.

Many logic languages and their dialects have been used for KR purposes. They vary from classical first order logic (FOL) in terms of: (1) their syntax; (2) the subsets of FOL they implement (for example, propositional logic, which excludes quantifiers; Horn-clause logic, which excludes disjunctions in conclusions and underlies Prolog; and terminological or definitional logics, containing additional restrictions); (3) their proof theory, such as monotonic or non-monotonic logic (the latter allows defaults), modal logic, temporal logic, and so forth; and (4) their model theory, which, as we mentioned above, determines how expressions in the language are evaluated with respect to some model of the world.

Classical FOL is two-valued (Boolean); a three-valued logic introduces unknowns; four-valued logic introduces inconsistency. Fuzzy logic uses the same notation as FOL but with an infinite range of certainty factors (0.0–1.0). There are also differences in the built-in vocabularies of KR languages: basic Common Logic (ISO/IEC 24707:2018, 2018) is a tight, first-order language with little built-in terminology, whereas the Web Ontology Language (Bao et al., 2012) includes support for some aspects of set theory.10

1.4.1 DESCRIPTION LOGIC LANGUAGES

Description logics (DLs) are a family of logic-based formalisms that represent a subset of first order logic. They were designed to provide a “sweet spot” in that they have a reasonable degree of expressiveness on the ontology spectrum, while not having so much expressive power that it is difficult to build efficient reasoning engines for them. They enable specification of ontologies in terms of concepts (classes), roles (relationships), and individuals (instances).

Description logics are distinguished by (1) the fact that they have a formal semantics, representing decidable fragments of first order logic, and (2) their provisions for inference services, which include sound and complete decision procedures for key problems. By decidable, we mean that there are effective algorithms for determining the truth value of the expressions stated in the logic. Description logics are highly optimized to support specific kinds of reasoning for implementation in operational systems.11
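As a rough illustration of one such inference service, the Python sketch below (ours; the class names are invented) computes subsumption over an asserted subclass hierarchy as a transitive closure. This is only the simplest special case of the classification that DL reasoners perform, but it conveys the idea of deriving implicit is-a relationships.

```python
from collections import defaultdict

# Asserted subclass (is-a) axioms; class names are illustrative only.
subclass_of = {
    ("FloweringPlant", "Plant"),
    ("Rose", "FloweringPlant"),
    ("Plant", "LivingThing"),
}

def subsumers(cls):
    """Return every class that subsumes cls, by following asserted links."""
    graph = defaultdict(set)
    for sub, sup in subclass_of:
        graph[sub].add(sup)
    seen, stack = set(), [cls]
    while stack:
        for sup in graph[stack.pop()]:
            if sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen

print(sorted(subsumers("Rose")))  # ['FloweringPlant', 'LivingThing', 'Plant']
```

A real DL reasoner must also account for property restrictions, equivalences, and other axioms when classifying, which is why sound and complete decision procedures matter.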

Example types of applications of description logics include:

• configuration systems—product configurators, consistency checking, constraint propagation, etc., whose first significant industrial application was called PROSE (McGuinness and Wright, 1998) and used the CLASSIC knowledge representation system, a description logic, developed by AT&T Bell Laboratories in the late 1980s (Borgida et al., 1989);

• question answering and recommendation systems, for suggesting sets of responses or options depending on the nature of the queries; and

• model engineering applications, including those that involve analysis of the ontologies or other kinds of models (systems engineering models, business process models, and so forth) to determine whether or not they meet certain methodological or other design criteria.

1.5 KNOWLEDGE BASES, DATABASES, AND ONTOLOGY

An ontology is a conceptual model of some aspect of a particular universe of discourse (or of a domain of discourse). Typically, ontologies contain only “rarified” or “special” individuals, representing elemental concepts critical to the domain. In other words, they are composed primarily of concepts, relationships, and axiomatic expressions.

One of the questions that we are often asked is, “What is the difference between an ontology and a knowledge base?” Sometimes people refer to the knowledge base as excluding the ontology and only containing the information about individuals along with their metadata, for example, the triples in a triple store without a corresponding schema. In other words, the ontology is separately maintained. In other cases, a knowledge base is considered to include both the ontology and the individuals (i.e., the triples in the case of a Semantic Web-based store). The ontology provides the schema and rules for interpretation of the individuals, facts, and other rules comprising the domain knowledge.

A knowledge graph typically contains both the ontology and related data. In practice, we have found that it is important to keep the ontology and data as separate resources, especially during development. Maintaining them separately but combining them in knowledge graphs and/or applications makes both easier to manage. Once established, ontologies tend to evolve slowly, whereas the data on which applications depend may be highly volatile. Data for well-known code sets, which might change less frequently than some data sets, can be managed in the form of “OWL ontologies,” but, even in these cases, the individuals should be separate from the ontology defining them to aid in testing, debugging, and integration with other code sets. These data resources are not ontologies in their own right, although they might be identified with their own namespace, etc.

Most inference engines, including commercially available reasoners, require in-memory deductive databases for efficient reasoning. The knowledge base may be implemented in a physical, external database, such as a triple store, graph database, or relational database, but reasoning is typically done on a subset (partition) of that knowledge base in memory.

1.6 REASONING, TRUTH MAINTENANCE, AND NEGATION

Reasoning is the mechanism by which the logical assertions made in an ontology and related knowledge base are evaluated by an inference engine. For the purposes of this discussion, a logical assertion is simply an explicit statement that declares that a certain premise is true. A collection of logical assertions, taken together, form a logical theory. A consistent theory is one that does not contain any logical contradictions. This means that there is at least one interpretation of the theory in which all of the assertions are provably true. Reasoning is used to check for contradictions in a collection of assertions. It can also provide a way of finding information that is implicit in what has been stated. In classical logic, the validity of a particular conclusion is retained even if new information is asserted in the knowledge base. This may change if some of the prior knowledge, or preconditions, are actually hypothetical assumptions that are invalidated by the new information. The same idea applies for arbitrary actions—new information can make preconditions invalid.

Reasoners work by using the rules of inference to look for the “deductive closure” of the information they are given. They take the explicit statements and the rules of inference and apply those rules to the explicit statements until there are no more inferences they can make. In other words, they find any information that is implicit among the explicit statements. For example, from the following statement about flowering plants, if it has been asserted that x is a flowering plant, then a reasoner can infer that x has a bloom y, and that y has a characteristic which includes a bloom color z:

(forall ((x FloweringPlant))
  (exists ((y Bloom) (z BloomColor))
    (and (hasPart x y) (hasCharacteristic y z))))
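One way to picture this inference is as a naive materialization step in Python (our toy sketch, not how production reasoners are implemented): for the asserted fact that a rose is a flowering plant, the existentially quantified bloom and color are “skolemized” into fresh individuals.

```python
import itertools

# Generator of fresh individual names (skolem constants).
fresh = itertools.count(1)

# One asserted fact: rose is a FloweringPlant.
facts = {("FloweringPlant", "rose")}

# Apply the existential axiom: invent a bloom and a color for each plant.
derived = set()
for kind, x in list(facts):
    if kind == "FloweringPlant":
        n = next(fresh)
        y, z = f"bloom_{n}", f"color_{n}"
        derived |= {("Bloom", y), ("BloomColor", z),
                    ("hasPart", x, y), ("hasCharacteristic", y, z)}

print(sorted(derived))  # four new facts inferred from one assertion
```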

During the reasoning process, the reasoner looks for additional information that it can infer and checks to see if what it believes is consistent. Additionally, since it is generally applying rules of inference, it also checks to make sure it is not in an infinite loop. When some kind of logical inconsistency is uncovered, then the reasoner must determine, from a given invalid statement, whether or not others are also invalid. The process associated with tracking the threads that support determining which statements are invalid is called truth maintenance. Understanding the impact of how truth maintenance is handled is extremely important when evaluating the appropriateness of a particular inference engine for a given task.

If all new information asserted in a knowledge base is monotonic, then all prior conclusions will, by definition, remain valid. Complications can arise, however, if new information negates a prior statement. “Non-monotonic” logical systems are logics in which the introduction of new axioms can invalidate old theorems (McDermott and Doyle, 1980). What is important to understand when selecting an inference engine is whether or not you need to be able to invalidate previous assertions, and if so, how the conflict detection and resolution is handled. Some questions to consider include the following.

• What happens if conclusive information to prove the assumption is not available?

• What happens if the assumption cannot be proven?

• What happens if the assumption is not provable using certain methods?

• What happens if the assumption is not provable in a fixed amount of time?

The answers to these questions can result in different approaches to negation and differing interpretations by non-monotonic reasoners. Solutions include chronological and “intelligent” backtracking algorithms, heuristics, circumscription algorithms, and justification- or assumption-based retraction, depending on the reasoner and the methods used for truth maintenance.

Two of the most common reasoning methods are forward and backward chaining. Both leverage “if-then” rules, for example, “If it is raining, then the ground is wet.” In the forward chaining process, the reasoner attempts to match the “if” portion (or antecedent) of the rule and when a match is found, it asserts the “then” portion (or the consequent) of the rule. Thus, if the reasoner has found the statement “it is raining” in the knowledge base, it can apply the rule above to deduce that “The ground is wet.” Forward chaining is viewed as data driven and can be used to draw all of the conclusions one can deduce from an initial state and a set of inference rules if a reasoner executes all of the rules whose antecedents are matched in the knowledge base.
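A minimal forward chainer over such if-then rules can be sketched in a few lines of Python (ours; the facts and rules are invented for illustration):

```python
# Each rule pairs a set of antecedent facts with one consequent fact.
rules = [
    (frozenset({"it is raining"}), "the ground is wet"),
    (frozenset({"the ground is wet"}), "the ground is slippery"),
]

def forward_chain(facts):
    """Fire rules until no new facts can be derived (the deductive closure)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

print(sorted(forward_chain({"it is raining"})))
# ['it is raining', 'the ground is slippery', 'the ground is wet']
```

Note how the second rule fires only because the first has already added its consequent: this data-driven cascade is what makes forward chaining compute everything derivable from the initial state.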

Backward chaining works in the other direction. It is often viewed as goal directed. Suppose that the goal is to determine whether or not the ground is wet. A backward chaining approach would look to see if the statement, “the ground is wet,” matches any of the consequents of the rules, and if so, determine if the antecedent is in the knowledge base currently, or if there is a way to deduce the antecedent of the rule. Thus, if a backward reasoner was trying to determine if the ground was wet and it had the rule above, it would look to see if it had been told that it is raining or if it could infer (using other rules) that it is raining.
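The goal-directed direction can be sketched similarly (again an illustrative toy, not a production algorithm): start from the goal, find rules whose consequent matches it, and recursively try to establish their antecedents.

```python
# Each rule: list of antecedent facts and one consequent fact.
rules = [
    (["it is raining"], "the ground is wet"),
]
known_facts = {"it is raining"}

def prove(goal, depth=0):
    """Goal-directed search: is the goal a known fact, or derivable by a rule?"""
    if depth > 10:  # crude guard against runaway recursion
        return False
    if goal in known_facts:
        return True
    return any(
        consequent == goal and all(prove(a, depth + 1) for a in antecedents)
        for antecedents, consequent in rules
    )

print(prove("the ground is wet"))  # True: derived via the rule
```

Unlike the forward chainer, this procedure touches only the rules relevant to the query, which is why backward chaining suits question answering over large rule bases.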

Another type of reasoning, called tableau (sometimes tableaux) reasoning, is based on a technique that checks the satisfiability of a finite set of formulas. The semantic tableau method was introduced in the 1950s for classical logic and was adopted as the reasoning paradigm in description logics starting in the late 1990s. The tableau method is a formal proof procedure that uses a refutation approach: it begins from an opposing point of view. Thus, when the reasoner is trying to prove that something is true, it begins by asserting that it is false and attempts to establish whether this is satisfiable. In our running example, if it is trying to prove that the ground is wet, it will assert that it is NOT the case that the ground is wet, and then work to determine if there is an inconsistency. While this may seem counterintuitive, in that the reasoner proposes the opposite of what it is trying to prove, this method has proven to be very efficient for description logic processing in particular, and most description logic-based systems today use tableau reasoning.
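The refutation idea can be illustrated in Python for propositional logic. The sketch below (ours) uses brute-force model enumeration, which is far simpler than a real tableau procedure, but it shows the same strategy: assert the negation of the goal and look for inconsistency.

```python
from itertools import product

def models(formula, variables):
    """Return all truth assignments (as dicts) that satisfy the formula."""
    return [
        dict(zip(variables, values))
        for values in product([True, False], repeat=len(variables))
        if formula(dict(zip(variables, values)))
    ]

# Knowledge base: "it is raining", and "raining implies the ground is wet".
def kb(v):
    return v["raining"] and ((not v["raining"]) or v["wet"])

# Refutation: to prove "wet", show that KB plus NOT-wet has no model.
def negated_goal(v):
    return kb(v) and not v["wet"]

print(models(negated_goal, ["raining", "wet"]) == [])  # True: "wet" is proved
```

Tableau provers reach the same verdict without enumerating all assignments, by systematically decomposing formulas and closing branches that contain a contradiction.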

Yet another family of reasoning, called logic programming, begins with a set of sentences in a particular form. Rules are written as clauses of the form H :- B1, …, Bn. One reads this as: H, the “head” of the rule, is true if B1 through Bn are true. B1 through Bn together are called the body. There are a number of logic programming languages in use today, including Prolog, Answer Set Programming, and Datalog.

1.7 EXPLANATIONS AND PROOF

When a reasoner draws a particular conclusion, many users and applications want to understand why. Primary motivating factors for requiring support for explanations in the reasoners include interoperability, reuse, trust, and debugging in general. Understanding the provenance of the information (i.e., where it came from and when) and results (e.g., what sources were used to produce the result, what part of the ontology and rules were used) is crucial to analysis. It is essential to know which information sources contributed what to your results, particularly for reconciliation and understanding when there are multiple sources involved and those sources of information differ. Most large companies have multiple databases, for example, containing customer and account information. In some cases there will be a “master” or “golden” source, with other databases considered either derivative or “not as golden”—meaning, that the data in those source databases is not as reliable. If information comes from outside of an organization, reliability will depend on the publisher and the recency of the content, among other factors.

Some of the kinds of provenance information that have proven most important for interpreting and using the information inferred by the reasoner include:

• identifying the information sources that were used (source);

• understanding how recently they were updated (currency);

• having an idea regarding how reliable these sources are (authoritativeness); and

• knowing whether the information was directly available or derived, and if derived, how (method of reasoning).

The methods used to explain why a reasoner reached a particular conclusion include explanation generation and proof specification. We will provide guidance in some depth on metadata to support provenance, and on explanations in general in the chapters that follow.

1 http://www-ksl.stanford.edu/kst/what-is-an-ontology.html.

2 Extensible Markup Language (XML), see http://www.w3.org/standards/xml/core.

3 The Resource Description Framework (RDF) Vocabulary Description Language (RDF Schema), available at https://www.w3.org/RDF/.

4 See https://schema.org/ for more information.

5 Structured Query Language, see https://docs.microsoft.com/en-us/sql/odbc/reference/structured-query-language-sql?view=sql-server-2017.

6 SPARQL 1.1 Query Language, available at https://www.w3.org/TR/sparql11-overview/.

7 The Rule Mark-up Initiative, see http://wiki.ruleml.org/index.php/RuleML_Home.

8 Jess, the Java Expert System Shell and scripting language, see https://herzberg.ca.sandia.gov/docs/52/.

9 FLORA-2: Knowledge Representation and Reasoning with Objects, Actions, and Defaults, see http://flora.sourceforge.net/.

10 For more information on general first-order logics and their use in ontology development, see Sowa (1999) and ISO/IEC 24707:2018 (2018).

11 For more information on description logics, KR and reasoning, see Baader et al. (2003) and Brachman and Levesque (2004).
