Using Cell, Neighborhood, and Zonal Statistics

You need statistics to describe your data, to add validity to your research, and to make sound decisions. Traditionally, statistics are used on a random but representative subset and the results are extrapolated to the larger group. In other words, you can ask a question of a subset of the population and make inferences about the entire population from the subset's answers. This subset of the population is called a sample.

Inferential statistics, however, don't always work as well with geographic data. When this is the case, descriptive statistics are applied.

The methods of inferential statistics don't transfer easily to geographic data for two main reasons. First, inferential statistics assume that you want to estimate the characteristics of a population from a sample. With geographic data, however, you often have the entire population to work with, so you use descriptive statistics rather than inferential statistics.

Second, inferential statistics do not include tools for representing geographic data.

ArcGIS™ Spatial Analyst provides a set of statistical functions that make descriptive statistics part of your geographic analysis. For example, you can compare values over time, cell by cell, or you can construct a statistical filter to weed out unwanted values. You can also assess past trends or the current status of features, or reveal the underlying structure of the data.

Comparing raster datasets using cell statistics

Statistics are useful for describing certain tendencies in your data. You may want to know the average value, the highest value, or how many different types of values exist in the dataset.

For a single raster dataset, statistics are automatically generated. The minimum, maximum, and mean values, as well as the standard deviation, are presented in the layer’s properties.

You can also use statistics to create new raster datasets. While the statistical functions are divided into three basic groups (cell statistics, neighborhood statistics, and zonal statistics), each group utilizes the same statistical methods.

Cell statistics allow you to compare two or more raster datasets on a cell-by-cell basis. In other words, cells occupying the same location but belonging to different rasters can be evaluated together using basic descriptive statistics. This is especially useful when comparing time-series data, such as annual changes in land use.
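The cell-by-cell comparison can be sketched in plain Python. This is only an illustration of the idea, not the Spatial Analyst implementation: the function name cell_statistics, the nested-list raster representation, and the sample land-use codes are all assumptions made for the example.

```python
from statistics import mean


def cell_statistics(rasters, stat=mean):
    """Apply `stat` to the values that share each cell location across rasters.

    Hypothetical helper: rasters are plain nested lists, not ArcGIS datasets.
    """
    rows, cols = len(rasters[0]), len(rasters[0][0])
    # Every input must have the same shape so cells line up one-to-one.
    if any(len(r) != rows or any(len(row) != cols for row in r) for r in rasters):
        raise ValueError("all rasters must have the same dimensions")
    return [
        [stat([r[i][j] for r in rasters]) for j in range(cols)]
        for i in range(rows)
    ]


# Made-up land-use codes for the same 2 x 2 area in three different years.
landuse_2000 = [[1, 1], [2, 3]]
landuse_2005 = [[1, 2], [2, 3]]
landuse_2010 = [[1, 2], [3, 3]]
years = [landuse_2000, landuse_2005, landuse_2010]

# Number of distinct values per cell: 1 means the cell never changed.
variety = cell_statistics(years, stat=lambda values: len(set(values)))
# Highest code recorded at each cell over the period.
maximum = cell_statistics(years, stat=max)
```

In this sketch, a variety of 1 marks cells whose land use stayed the same in every year, and larger values mark cells that changed at least once.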

Describing raster datasets using neighborhood and zonal statistics

While you can use statistics to compare corresponding cells from different raster datasets, you can also evaluate a single raster dataset based on neighborhoods or zones.

The Neighborhood Statistics function considers the values of cells within a specified neighborhood around the processing cell. Neighborhoods are sections of the raster that can be defined in almost any way you want. Neighborhood statistics are output as new raster layers.
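As a rough sketch of the neighborhood idea, the following Python computes a statistic over a square window centered on each processing cell and writes the result to a new raster. The function name, the square neighborhood shape, and the choice to clip the window at the raster edge are assumptions for illustration; Spatial Analyst supports other neighborhood shapes and may treat edges differently.

```python
from statistics import mean


def neighborhood_statistics(raster, radius=1, stat=mean):
    """Return a new raster where each cell holds `stat` of its square neighborhood.

    Hypothetical helper: radius=1 gives a 3 x 3 window. Cells near the edge
    use only the part of the window that falls inside the raster.
    """
    rows, cols = len(raster), len(raster[0])
    output = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            # Collect every in-bounds value within `radius` cells of (i, j).
            window = [
                raster[y][x]
                for y in range(max(0, i - radius), min(rows, i + radius + 1))
                for x in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            out_row.append(stat(window))
        output.append(out_row)
    return output


# Made-up values with a single spike in the center cell.
elevation = [
    [1, 1, 1],
    [1, 10, 1],
    [1, 1, 1],
]

smoothed = neighborhood_statistics(elevation)          # mean filter
highest = neighborhood_statistics(elevation, stat=max)  # maximum filter
```

The mean filter spreads the central spike across its neighbors, which is the kind of statistical filtering mentioned earlier for weeding out unwanted values.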

The Zonal Statistics function considers the values of cells based on groups of like cells, or zones, in another dataset. Zonal statistics are output as tables.
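The zonal approach can be sketched the same way: one raster supplies the zone of each cell, a second raster supplies the values, and the result is a table with one row per zone. The function name, the dictionary used as the table, and the sample zone and slope values are assumptions for illustration rather than the Spatial Analyst output format.

```python
from collections import defaultdict
from statistics import mean


def zonal_statistics(zones, values):
    """Summarize `values` for each zone found in the `zones` raster.

    Hypothetical helper: both rasters are nested lists of the same shape.
    Returns a dict that maps each zone to a row of summary statistics.
    """
    grouped = defaultdict(list)
    # Pair each value cell with the zone cell at the same location.
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            grouped[zone].append(value)
    return {
        zone: {"count": len(v), "min": min(v), "max": max(v), "mean": mean(v)}
        for zone, v in sorted(grouped.items())
    }


# Made-up zone raster (for example, two parcels) and value raster (slope).
zones = [
    [1, 1, 2],
    [1, 2, 2],
]
slope = [
    [4, 6, 10],
    [8, 20, 30],
]

table = zonal_statistics(zones, slope)
for zone, row in table.items():
    print(zone, row)
```

Each row of the resulting table describes all cells that belong to one zone, regardless of where those cells sit in the raster.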