Designing Geodatabases for Transportation - J. Allison Butler

Data modeling

Contents

Data types

Files

Tables

Relationships in relational databases

Object-relational databases

Relationships in object-relational databases

The data-modeling process

Conceptual data models

Logical data models

Physical data models

This chapter covers data modeling, the process of designing a dataset’s structure by adopting a set of abstractions representing the real world. A dataset is a collection of facts organized around entities. An entity is a group of similar things, each of which may be referred to as an instance or a member. For example, Road could be an entity representing all roads, with State Route 50, Interstate 10, Main Street, and Simpson Highway as members of that entity. You cannot store the real-world entity in the dataset, so you store a set of descriptive attributes that allow you to identify the entity and understand its characteristics. Attributes can be composed of text, numbers, geometry, images, and other forms of data. If Road is your entity, then facility ID, route number, street name, length, jurisdiction, and pavement condition could be useful attributes. When an attribute involves location, it is considered to be spatial in nature. GIS involves spatial data. Attributes, not entities, determine whether a dataset is spatial.
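As a minimal sketch of the entity and attribute idea, the Road entity can be written as a Python class whose fields are the attributes listed above. The field names, units, and sample values are illustrative assumptions, not a schema taken from this book. The optional centerline field is the only spatial attribute, so its presence is what makes a record spatial.

```python
from dataclasses import dataclass
from typing import Optional

# A coordinate pair; units and coordinate system are left unspecified (assumption).
Point = tuple[float, float]


@dataclass
class Road:
    """One member (instance) of the Road entity, described by its attributes."""
    facility_id: int
    route_number: Optional[str]
    street_name: str
    length_km: float            # unit chosen for illustration only
    jurisdiction: str
    pavement_condition: str
    centerline: Optional[list[Point]] = None  # the only spatial attribute


def is_spatial(road: Road) -> bool:
    """A record is spatial only if it carries a location attribute."""
    return road.centerline is not None


# Hypothetical members of the Road entity; all values are made up.
main_street = Road(1, None, "Main Street", 2.4, "City", "Good",
                   centerline=[(0.0, 0.0), (2.4, 0.0)])
simpson_highway = Road(2, None, "Simpson Highway", 18.0, "County", "Fair")
```

Both objects are members of the same entity, yet only main_street is spatial, because only it has a geometry attribute.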

A database is a dataset stored in an electronic medium. A geodatabase is a database that includes spatial data; it is a collection of geographic datasets. A user acts upon such a dataset through a database management system, which may also provide various security and data integrity services. The database management systems used for large workgroup and enterprise geodatabases are relational, which means they operate according to a set of rules, called relational algebra, that describe how to read and write information stored in the database. The language shared by relational database management system (RDBMS) products is SQL, which once stood for Structured Query Language. The RDBMS converts SQL statements entered by the user (or generated by a computer application) into relational algebra to perform operations on the data. You do not need to know about RDBMS products, relational algebra, or SQL to do data modeling. What you do need to know is included in this chapter.
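For readers who want to see what a SQL statement looks like, here is a small sketch using SQLite through Python's standard sqlite3 module. SQLite stands in for an enterprise RDBMS purely for illustration; the road table, its columns, and the sample rows are assumptions, not part of any geodatabase schema.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table holding nonspatial attributes of the Road entity.
conn.execute(
    """
    CREATE TABLE road (
        facility_id  INTEGER PRIMARY KEY,
        street_name  TEXT NOT NULL,
        jurisdiction TEXT,
        length_km    REAL
    )
    """
)

# Made-up sample rows.
conn.executemany(
    "INSERT INTO road VALUES (?, ?, ?, ?)",
    [
        (1, "State Route 50", "State", 120.0),
        (2, "Interstate 10", "State", 300.0),
        (3, "Main Street", "City", 2.4),
    ],
)

# The RDBMS decides how to execute this statement; the user only states
# which rows are wanted.
rows = conn.execute(
    "SELECT street_name FROM road WHERE jurisdiction = ? ORDER BY street_name",
    ("State",),
).fetchall()
state_roads = [name for (name,) in rows]
conn.close()
```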

Every database is a data model because a model is simply an abstract representation of the real world. A primary concern of data modeling is deciding which abstraction to use. For example, a spatial database may represent a linear transportation facility with a centerline, but the real-world facility is actually an area with one very long axis. We commonly use a centerline because it conveys the primary aspect of the facility: it has length and traverses a space. That centerline can be part of a geometric network for determining the best path between two points, or it can simply be a reference for locating other features on a map.

The information you need about the facility is determined by how the data will be used. The network pathfinding application will need information about connectivity, cost of traveling on a segment, and restrictions to travel. Any geometric representation you create for the network may be highly abstract, perhaps just a straight line between two points. In contrast, a mapping application needs just a line geometry representation, with perhaps some information for symbolizing the line. The scale of display will determine the degree of abstraction allowed for the geometry. Large-scale maps may need detailed road edgelines, while small-scale maps may need only a generalized centerline.
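The pathfinding case can be sketched with a network that has no geometry at all: each segment records only the two nodes it connects, a travel cost, and whether travel is restricted. The example below uses Dijkstra's algorithm as one common least-cost method; the node names, costs, and the closed flag are hypothetical, and this is not presented as how any particular GIS product solves the problem.

```python
import heapq

# Each segment: (from_node, to_node, travel_cost, closed_to_travel).
# All values are made up for illustration.
SEGMENTS = [
    ("A", "B", 4.0, False),
    ("B", "C", 3.0, False),
    ("A", "C", 5.0, True),   # shortest link, but restricted
    ("C", "D", 2.0, False),
]


def build_graph(segments):
    """Build a two-way adjacency list, skipping restricted segments."""
    graph = {}
    for a, b, cost, closed in segments:
        graph.setdefault(a, [])
        graph.setdefault(b, [])
        if closed:
            continue
        graph[a].append((b, cost))
        graph[b].append((a, cost))
    return graph


def least_cost(graph, start, goal):
    """Return the lowest total cost from start to goal, or None if unreachable."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None


cost_with_restriction = least_cost(build_graph(SEGMENTS), "A", "D")
```

With the A to C segment closed, the route must go through B, so the restriction attribute changes the answer even though no coordinates are stored.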

You can alternatively represent the linear facility as a surface, for example with a digital elevation model (DEM) or a triangulated irregular network (TIN), both of which represent surfaces for 3D applications, or as a set of pixels in a raster image. You might also store the linear facility as a set of address points. You can even store the facility as a set of nonspatial attributes, employing no geometry at all. Each of these abstractions has a place within a transport agency and its variety of spatial-data applications. However, this book will concentrate on vector data forms where lines represent linear facilities, as this is the most common abstraction. Several design proposals show how to accommodate multiple geometric representations for a single entity.

Your choice of which form of abstraction to use is determined by the data’s application. Since larger transportation organizations need many applications, it is likely they will need multiple abstractions. For example, a bridge might be a point feature to some, a linear feature to others, and a polygon feature to yet another group.
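One simple way to picture multiple abstractions of a single entity is a record that keeps its nonspatial attributes once and holds several geometries keyed by type. This is only an illustrative sketch; the class name, identifiers, and coordinates are assumptions, and it is not one of the design proposals mentioned in this book.

```python
from dataclasses import dataclass, field

Coordinate = tuple[float, float]


@dataclass
class Bridge:
    """One bridge with several geometric representations (hypothetical layout)."""
    bridge_id: str
    name: str
    # Keys name the abstraction; values are coordinate lists.
    geometries: dict[str, list[Coordinate]] = field(default_factory=dict)

    def geometry_for(self, kind: str) -> list[Coordinate]:
        """Return the representation a given application needs."""
        return self.geometries[kind]


# Made-up identifier, name, and coordinates.
bridge = Bridge(
    "B-001",
    "Example Bridge",
    {
        "point": [(10.0, 5.0)],
        "line": [(9.0, 5.0), (11.0, 5.0)],
        # A closed ring: the first and last coordinates match.
        "polygon": [(9.0, 4.9), (11.0, 4.9), (11.0, 5.1), (9.0, 5.1), (9.0, 4.9)],
    },
)
```

Each user group asks for the representation it needs, for example bridge.geometry_for("line"), while the identifier and name are stored only once.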

Data modeling is the structured process by which you examine the needs of your application and determine the most appropriate abstraction to use. It begins by understanding the application’s requirements for data, which will determine the appropriate level of abstraction, the structure to use in organizing the data, the entities to be created, and the attributes assigned to each entity. In the geodatabase, an entity eventually becomes a class, which is a discrete table or feature that you will define in terms of its properties, behaviors, and attributes. A geodatabase combines data with software in an object-oriented form that takes over much of the workload needed to use and manage the data. A geodatabase is an active part of the ArcGIS platform, not a passive holder of data.

Much more about these concepts is discussed later in this chapter. What you need to know for now is that data modeling, as presented here, is founded on the capabilities and constraints of the geodatabase. However, if you are like most transportation-data users, you already have data in a variety of nongeodatabase forms, so this chapter covers other fundamental data structures along with the basic concepts of database design and data models.
