Designing Geodatabases for Transportation - J. Allison Butler

Files


Relational databases were not the first kind of electronic data structure. The oldest form of database storage is the file, which consists of a block of data organized into logical groups called records. Each position in the record is called a column. A file looks like a table whose records (rows) are separated by a special character that marks the end of each logical group of data. Everything is text. There is no inherent requirement for all the records to have the same structure. For example, the first record, often called a header, could state the number of body records or describe the fields in those records. All the intelligence needed to understand the file’s content is in the application that reads and writes records.


Figure 2.2 Files. A fixed-length file uses column position to identify the specific data content that forms each attribute. A variable-length file uses the sequential order of fields separated by a predefined special character, one that cannot appear in the data. In both cases, the application using the data must know the specific location of each piece of information.

Files come in two basic forms: fixed-length and variable-length. A fixed-length file uses the position of each character in a record to interpret its meaning. Any leftover space not needed to store the data for that record is filled with spaces, either before or after the actual data in the field. Fields in each record are identified by position. For example, a file specification may declare that record characters (columns) 1 through 47 contain an employee’s name right-justified with leading spaces.
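As a minimal sketch of how an application might apply such a specification, the Python fragment below slices a fixed-length record by column position. Only the name field in columns 1 through 47 comes from the example above; the `employee_id` field, the `LAYOUT` table, and both function names are hypothetical and exist only for illustration.

```python
# Hypothetical fixed-length layout: (field name, first column, last column),
# with columns numbered from 1, as a file specification would state them.
LAYOUT = [
    ("name", 1, 47),          # from the example in the text: right-justified name
    ("employee_id", 48, 53),  # assumed field, for illustration only
]

def parse_fixed(record, layout=LAYOUT):
    """Slice one fixed-length record into named fields by column position."""
    fields = {}
    for name, first, last in layout:
        # Convert 1-based inclusive columns to a 0-based Python slice,
        # then drop the space padding on either side of the data.
        fields[name] = record[first - 1:last].strip()
    return fields

def format_fixed(name, employee_id):
    """Build a record: name right-justified in 47 columns, ID left-justified in 6."""
    return name.rjust(47) + employee_id.ljust(6)

record = format_fixed("Jane Doe", "A1024")
parsed = parse_fixed(record)
print(parsed)
```

Nothing in the record itself says where the name ends and the next field begins. That knowledge lives entirely in the layout table held by the application.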

A variable-length file uses the position of a field within the record to identify its content. Variable-length records avoid space filling by using special characters to say where one field stops and another begins. You may have come across this structure when using comma- and tab-delimited text files. The comma or tab characters separate each record into fields. Usually, there is also a special end-of-file character.
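The sketch below reads a delimited file with Python's standard `csv` module, treating the first record as a header that names the fields. The sample data, the field names, and the `read_delimited` function are invented for illustration.

```python
import csv
import io

# Invented sample data: a header record naming the fields, then two body
# records. The last record leaves its final field empty.
TAB_TEXT = "route\tbegin_mp\tend_mp\nSR 10\t0.0\t12.4\nUS 1\t3.5\t\n"

def read_delimited(text, delimiter):
    """Split each record into fields by their order within the record."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # the first record describes the fields
    return [dict(zip(header, row)) for row in reader]

rows = read_delimited(TAB_TEXT, "\t")

# The same content with commas as the separator parses to the same result,
# because none of the sample values contains a comma.
COMMA_TEXT = TAB_TEXT.replace("\t", ",")
comma_rows = read_delimited(COMMA_TEXT, ",")
print(rows)
```

Field values occupy only as many characters as they need, and an empty field costs a single delimiter. The application still has to know the order of the fields, or learn it from a header record.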

The most common ArcGIS file-based data structure is the shapefile. A shapefile is a kind of spatial database structure consisting of several files. There are more than a dozen recognized shapefile component types, each with its own file extension (the three characters after the dot in a typical file name). To copy a shapefile, you must copy all the component files. The minimum components are the geometry (.shp), the nonspatial attribute data (.dbf), and the geometry index (.shx). The structure of each component file is optimized for the information it contains. For example, the geometry file (.shp) contains a 100-byte fixed-length file header followed by variable-length records. The variable-length record is composed of an 8-byte, fixed-length record header followed by variable-length record contents. Each record defines a single geometry, with the length of the variable portion being determined by the number of vertices and whether measure (m) and elevation (z) coordinate values are included. The fixed-length record header portion provides a record number and the length of the variable portion.
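To make the .shp layout concrete, the following sketch builds a tiny point-only geometry file in memory and then walks it: the 100-byte file header first, then each 8-byte record header followed by its variable-length contents. The byte orders, the file code 9994, the version value 1000, and the counting of lengths in 16-bit words come from the published ESRI Shapefile Technical Description rather than from this chapter. The function names are hypothetical, and only point shapes are handled.

```python
import struct

def build_point_shp(points):
    """Build a tiny point-only .shp byte string in memory (sample data only)."""
    body = b""
    for number, (x, y) in enumerate(points, start=1):
        # Record contents: shape type 1 (point) plus x and y, little-endian.
        contents = struct.pack("<idd", 1, x, y)
        # 8-byte record header: record number and content length,
        # both big-endian, with the length counted in 16-bit words.
        body += struct.pack(">ii", number, len(contents) // 2) + contents
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # 100-byte file header: file code, 20 unused bytes, file length in words,
    # then version, shape type, and the bounding box (x, y, z, m ranges).
    header = struct.pack(">i20xi", 9994, (100 + len(body)) // 2)
    header += struct.pack("<ii", 1000, 1)
    header += struct.pack("<8d", min(xs), min(ys), max(xs), max(ys),
                          0.0, 0.0, 0.0, 0.0)
    return header + body

def read_shp(data):
    """Read the file header, then step through the variable-length records."""
    file_code, file_words = struct.unpack(">i20xi", data[:28])
    version, shape_type = struct.unpack("<ii", data[28:36])
    records = []
    offset = 100  # records start right after the fixed-length file header
    while offset < file_words * 2:
        number, content_words = struct.unpack(">ii", data[offset:offset + 8])
        start = offset + 8
        end = start + content_words * 2
        records.append((number, data[start:end]))
        offset = end  # the record header tells us where the next record begins
    return {"file_code": file_code, "version": version,
            "shape_type": shape_type, "records": records}

sample = build_point_shp([(1.5, 2.5), (-3.0, 4.0)])
shp = read_shp(sample)
points = [struct.unpack("<idd", contents)[1:] for _, contents in shp["records"]]
print(shp["file_code"], shp["shape_type"], points)
```

The reader never searches for a delimiter. It relies on each record header's length value to find the next record, which is how variable-length geometries can follow one another in a single file.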

Coverages, which were the original ESRI data structure, are also based on a database structure consisting of multiple files. Because coverages were designed to reduce the size of a spatial database, software manipulating coverage data must manage a number of composition relationships inherent in the file structure. A special data-exchange file type was developed to distribute coverages as a single file.

File data structures remain useful today and will continue to be part of GIS datasets long into the future. This book, however, will restrict itself to modeling geodatabases. What you put into and take out of a geodatabase may be a file, but the database to be modeled is a geodatabase.

