Spatial Analysis - Kanti V. Mardia

Preface

Spatial statistics is concerned with data collected at various spatial locations or sites, typically in a Euclidean space $\mathbb{R}^d$. The important cases in practice are $d = 1, 2, 3$, corresponding to the data on the line, in the plane, or in 3‐space, respectively. A common property of spatial data is “spatial continuity,” which means that measurements at nearby locations will tend to be more similar than measurements at distant locations. Spatial continuity can be modeled statistically using a covariance function of a stochastic process for which observations at nearby sites are more highly correlated than at distant sites. A stochastic process in space is also known as a random field.
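As a rough illustration of spatial continuity, the sketch below evaluates an exponential covariance function at a nearby and a distant pair of sites in the plane. The exponential form, the parameter names `variance` and `range_param`, and the site coordinates are all assumptions chosen for illustration; they are not taken from the book.

```python
import math


def exponential_covariance(h, variance=1.0, range_param=1.0):
    """Assumed model: C(h) = variance * exp(-h / range_param) for lag distance h >= 0."""
    return variance * math.exp(-abs(h) / range_param)


def distance(s, t):
    """Euclidean distance between two sites given as coordinate tuples."""
    return math.dist(s, t)


# Three hypothetical sites in the plane.
sites = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]

# Covariance between the first site and a nearby site (distance 0.5).
near = exponential_covariance(distance(sites[0], sites[1]))

# Covariance between the first site and a distant site (distance 5.0).
far = exponential_covariance(distance(sites[0], sites[2]))

print(f"near pair: {near:.4f}, distant pair: {far:.4f}")
```

Under this assumed model the covariance decays monotonically with distance, so the nearby pair is more highly correlated than the distant pair.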

One distinctive feature of spatial statistics, and related areas such as time series, is that there is typically just one realization of the stochastic process to analyze. Other branches of statistics often involve the analysis of independent replications of data.

The purpose of this book is to develop the statistical tools to analyze spatial data. The main emphasis in the book is on Gaussian processes. Here is a brief summary of the contents. A list of Notation and Terminology is given at the start for ease of reference. An introduction to the overall objectives of spatial analysis, together with some exploratory methods, is given in Chapter 1. Next is the specification of possible covariance functions (Chapter 2 for the stationary case and Chapter 3 for the intrinsic case). It is helpful to distinguish discretely indexed, or lattice, processes from continuously indexed processes. In particular, for lattice processes, it is possible to specify a covariance function through an autoregressive model (the SAR and CAR models of Chapter 4), with specialized estimation procedures (Chapter 6). Model fitting through maximum likelihood and related ideas for continuously indexed processes is covered in Chapter 5. An important use of spatial models is kriging, i.e. the prediction of the process at a collection of new sites, given the values of the process at a collection of training sites (Chapter 7), and in particular the links to machine learning are explained. Some additional topics, for which there was not space in the book, are summarized in Chapter 8. The technical mathematical tools have been collected in Appendix A for ease of reference. Appendix B contains a short historical review of the spatial linear model.
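To make the idea of kriging concrete, here is a minimal sketch of simple kriging on the line. It assumes a zero-mean process with a known exponential covariance function, which is a deliberate simplification: the book also treats drift terms and intrinsic covariance functions, which this sketch does not cover. The function names, sites, and values are hypothetical.

```python
import math


def cov(s, t, variance=1.0, range_param=1.0):
    """Exponential covariance between two sites on the line (an assumed model)."""
    return variance * math.exp(-abs(s - t) / range_param)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def simple_krige(train_sites, train_values, new_site):
    """Return the simple kriging prediction and kriging variance at new_site."""
    big_k = [[cov(s, t) for t in train_sites] for s in train_sites]
    k0 = [cov(s, new_site) for s in train_sites]
    weights = solve(big_k, k0)  # kriging weights K^{-1} k0
    prediction = sum(w * z for w, z in zip(weights, train_values))
    variance = cov(new_site, new_site) - sum(w * k for w, k in zip(weights, k0))
    return prediction, variance


# Hypothetical training sites and observed values.
train_sites = [0.0, 1.0, 2.5]
train_values = [0.3, -0.1, 0.8]

# Predicting at a training site reproduces the observed value (up to rounding).
pred_at_data, var_at_data = simple_krige(train_sites, train_values, 1.0)

# Predicting at a new site gives a weighted average with positive uncertainty.
pred_new, var_new = simple_krige(train_sites, train_values, 1.7)
print(pred_at_data, var_at_data, pred_new, var_new)
```

With no measurement error in the assumed model, the predictor interpolates the training data, and the kriging variance at a new site lies strictly between zero and the process variance.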

The development of statistical methodology for spatial data arose somewhat separately in several academic disciplines over the past century.

1 Agricultural field trials. An area of land is divided into long, thin plots, and a different crop is grown on each plot. Spatial correlation in the soil fertility can cause spatial correlation in the crop yields (Webster and Oliver, 2001).
2 Geostatistics. In mining applications, the concentration of a mineral of interest will often show spatial continuity in a body of ore. Two giants in the field of spatial analysis came out of this field. Krige (1951) set out the methodology for spatial prediction (now known as kriging) and Matheron (1963) developed a comprehensive theory for stationary and intrinsic random fields; see Appendix B.
3 Social and medical science. Spatial continuity is an important property when describing characteristics that vary across a region of space. One application is in geography and environmetrics and key names include Cliff and Ord (1981), Anselin (1988), Upton and Fingleton (1985, 1989), Wilson (2000), Lawson and Denison (2002), Kanevski and Maignan (2004), and Schabenberger and Gotway (2005). Another application is in public health and epidemiology, see, e.g., Diggle and Giorgi (2019).
4 Splines. A very different approach to spatial continuity has been pursued in the field of nonparametric statistics. Spatial continuity of an underlying smooth function is ensured by imposing a roughness penalty when fitting the function to data by least squares. It turns out that the fitted spline is identical to the kriging predictor under suitable assumptions on the underlying covariance function. Key names here include Wahba (1990) and Watson (1984). A modern treatment is given in Berlinet and Thomas‐Agnan (2004).
5 Mainstream statistics. From at least the 1950s, mainstream statisticians have been closely involved in the development of suitable spatial models and suitable fitting procedures. Highlights include the work by Whittle (1954), Matérn (1960, 1986), Besag (1974), Cressie (1993), and Diggle and Ribeiro (2007).
6 Probability theory and fractals. For the most part, statisticians interested in asymptotics have focused on “outfill” asymptotics – the data sites cover an increasing domain as the sample size increases. The other extreme is “infill asymptotics” in which the interest is in the local smoothness of realizations from the spatial process. This infill topic has long been of interest to probabilists (e.g. Adler, 1981). The smoothness properties of spatial processes underlie much of the theory of fractals (Mandelbrot, 1982).
7 Machine learning. Gaussian processes and splines have become a fundamental tool in machine learning. Key texts include Rasmussen and Williams (2006) and Hastie et al. (2009).
8 Morphometrics. Starting with Bookstein (1989), a pair of thin‐plate splines have been used for the construction of deformations of two‐dimensional images. The thin‐plate spline is just a special case of kriging.
9 Image analysis. Stationary random fields form a fundamental model for randomness in images, though typically the interest is in more substantive structures. Some books include Grenander and Miller (2007), Sonka et al. (2013), and Dryden and Mardia (2016). The two edited volumes Mardia and Kanji (1993) and Mardia (1994) are still relevant for the underlying statistical theory in image analysis; in particular, Mardia and Kanji (1993) contains a reproduction of some seminal papers in the area.

The book is designed to be used in teaching. The statistical models and methods are carefully explained, and there is an extensive set of exercises. At the same time the book is a research monograph, pulling together and unifying a wide variety of different ideas.

A key strength of the book is a careful description of the foundations of the subject for stationary and related random fields. Our view is that a clear understanding of the basics of the subject is needed before the methods can be used in more complicated situations. Subtleties are sometimes skimmed over in more applied texts (e.g. how to interpret the “covariance function” for an intrinsic process, especially of higher order, or a generalized process, and how to specify their spectral representations). The unity of the subject, ranging from continuously indexed to lattice processes, has been emphasized. The important special case of self‐similar intrinsic covariance functions is carefully explained. There are now a wide variety of estimation methods, mainly variants and approximations to maximum likelihood, and these are explored in detail.

There is a careful treatment of kriging, especially for intrinsic covariance functions where the importance of drift terms is emphasized. The link to splines is explained in detail. Examples based on real data, especially from geostatistics, are used to illustrate the key ideas.

The book aims at a balance between theory and illustrative applications, while remaining accessible to a wide audience. Although there is now a wide variety of books available on the subject of spatial analysis, none of them has quite the same perspective. There have been many books published on spatial analysis, and here we just highlight a few. Ripley (1988) was one of the first monographs in the mainstream Statistics literature. Some key books that complement the material in this book, especially for applications, include Cressie (1993), Diggle and Ribeiro (2007), Diggle and Giorgi (2019), Gelfand et al. (2010), Chilés and Delfiner (2012), Banerjee et al. (2015), van Lieshout (2019), and Rasmussen and Williams (2006).

What background does a reader need? The book assumes a knowledge of the ideas covered by intermediate courses in mathematical statistics and linear algebra. In addition, some familiarity with multivariate statistics will be helpful. Otherwise, the book is largely self‐contained. In particular, no prior knowledge of stochastic processes is assumed. All the necessary matrix algebra is included in Appendix A. Some knowledge of time series is not necessary, but will help to set some of the ideas into context.

There is now a wide selection of software packages to carry out spatial analysis, especially in R, and it is not the purpose in this book to compare them. We have largely used the package geoR (Ribeiro Jr and Diggle, 2001) and the program of Pardo‐Igúzquiza et al. (2008), with additional routines written where necessary. The data sets are available from a public repository at https://github.com/jtkent1/spatial-analysis-datasets.

Several themes receive little or no coverage in the book. These include point processes, discretely valued processes (e.g. binary processes), and spatial–temporal processes. There is little emphasis on a full Bayesian analysis when the covariance parameters need to be estimated. The main focus is on methods related to maximum likelihood.

The book has had a long gestation period. When we started writing the book in the 1980s, the literature was much sparser. As the writing of the book progressed, the subject has evolved at an increasing rate, and more sections and chapters have been added. As a result the coverage of the subject feels more complete. At last, this first edition is finished (though the subject continues to advance).

A series of workshops at Leeds University (the Leeds Annual Statistics Research [LASR] workshops), starting from 1979, helped to develop the cross‐disciplinary fertilization of ideas between Statistics and other disciplines. Some leading researchers who presented their work at these meetings include Julian Besag, Fred Bookstein, David Cox, Xavier Guyon, John Haslett, Chris Jennison, Hans Künsch, Alain Marechal, Richard Martin, Brian Ripley, and Tata Subba‐Rao.

We are extremely grateful to Wiley for their patience and help during the writing of the book, especially Helen Ramsey, Sharon Clutton, Rob Calver, Richard Davies, Kathryn Sharples, Liz Wingett, Kelvin Matthews, Alison Oliver, Viktoria Hartl‐Vida, Ashley Alliano, Kimberly Monroe‐Hill, and Paul Sayer. Secretarial help at Leeds during the initial development was given by Margaret Richardson, Christine Rutherford, and Catherine Dobson.

We have had helpful discussions with many participants at the LASR workshops and with colleagues and students about the material in the book. These include Robert Adler, Francisco Alonso, Jose Angulo, Robert Aykroyd, Andrew Baczkowski, Noel Cressie, Sourish Das, Pierre Delfiner, Peter Diggle, Peter Dowd, Ian Dryden, Alan Gelfand, Christine Gill, Chris Glasbey, Arnaldo Goitía, Colin Goodall, Peter Green, Ulf Grenander, Luigi Ippoliti, Anil Jain, Giovanna Jona Lasinio, André Journel, Freddie Kalaitzis, David Kendall, Danie Krige, Neil Lawrence, Toby Lewis, John Little, Roger Marshall, Georges Matheron, Lutz Mattner, Charles Meyer, Michael Miller, Mohsen Mohammadzadeh, Debashis Mondal, Richard Morris, Ali Mosammam, Nitis Mukhopadhyay, Keith Ord, E Pardo‐Igúzquiza, Anna Persson, Sophia Rabe, Ed Redfern, Allen Royale, Sujit Sahu, Paul Sampson, Bernard Silverman, Nozer Singpurwalla, Paul Switzer, Charles Taylor, D. Vere‐Jones, Alan Watkins, Geof Watson, Chris Wikle, Alan Wilson, and Jim Zidek.

John is grateful to his wife Sue for her support in the writing of this book, especially with the challenges of the Covid pandemic. Kanti would like to thank the Leverhulme Trust for an Emeritus Fellowship and Anna Grundy of the Trust for simplifying the administration process. Finally, he would like to express his sincere gratitude to his wife and his family for continuous love, support and compassion during his research writings such as this monograph.

We would be pleased to hear about any typographical or other errors in the text.

30 June 2021

John T. Kent

Kanti V. Mardia
