Working with Point-Count like data

R Skill Level: Intermediate - this activity assumes you have a working knowledge of R

Objectives & Goals

Upon completion of this activity, you will:
  • Generate random points within a polygon
  • Create a buffer around the points
  • Extract raster data to points
  • predict occupancy and generate a surface using raster calculations

Required packages

To complete the following activity you will need the following packages installed: raster sp rgeos

Installing the packages

If installing these packages for the first time consider adding dependencies=TRUE
  • install.packages("raster",dependencies = TRUE)
  • install.packages("rgdal",dependencies = TRUE)
  • install.packages("rgeos",dependencies = TRUE)
  • install.packages("sp",dependencies = TRUE)

  • Download R script Last modified: 2019-09-20 18:26:28

    Point-count like data

    For this activity, let’s assume you’re getting ready to start a new project. In this project the primary form of data collection is through the use of point counts. Let’s assume you already know you want to survey birds within the Coronado National Forest the system in which you want to survey.


    Read in a shapefile that contains information about the national forest boundaries in the United States. The shapefile was downloaded from here

    # Read in Administrative forest boundaries 
    # This shapefile has boundary information
    # for forests the U.S. federal govt is
    # responsible for - i.e., National Forests,
    # National Monuments, etc. 
    NatForest <- raster::shapefile("../Spatial_Layers/S_USA.AdministrativeForest.shp")
    # Take a glance at the file
    ## class       : SpatialPolygonsDataFrame 
    ## features    : 112 
    ## extent      : -150.0079, -65.69967, 18.03373, 61.51899  (xmin, xmax, ymin, ymax)
    ## crs         : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0 
    ## variables   : 8
    ## names       :                           ADMINFORES, REGION, FORESTNUMB, FORESTORGC,                 FORESTNAME,     GIS_ACRES,      SHAPE_AREA,      SHAPE_LEN 
    ## min values  :                       99010200010343,     01,         01,       0102,  Allegheny National Forest,     27566.061, 0.0120072002574, 0.754287568301 
    ## max values  : e23b8205-a516-4aa3-945e-5dd93dbfe227,     10,         60,       1005, Willamette National Forest, 17684857.9312,   10.5984390953,  378.929375922

    Let’s pull out just the Coronado National Forest. Here we use the grep() function to search for ‘Coronado’ within the FORESTNAME field.

    CNF <- NatForest[grep(x=NatForest$FORESTNAME,pattern="Coronado"),]

    We’re going to need a Digital Elevation Model (DEM) for our analysis. For convience we’ll grab a DEM using the ‘alt’ (elevation) data set available with the raster package. Note - when using the raster to download elevation in the United States - a list is returned. The first element of the list is the elevation (in meters) of the contiguous states. The other elements include Alaska, and Hawaii.

    # Get elevation data using the raster package
    DEM <- raster::getData(name = "alt", country = "United States")
    ## returning a list of RasterLayer objects
    # Save only the elevation in the lower 48
    DEM <- DEM[[1]]

    plot of chunk plot-CNF

    Back to top

    Generate point count locations

    The sp package has a function to generate random points. It has a few options but here we’ll use type = "regular" to generate random points that are systematically aligned. The n is the approximate sample size. We’ll also set the seed for the random number generator so you should get the same locations.

    surveyPts <- sp::spsample(x = CNF, n = 100, type = "regular")

    Let’s take a look at how many sites were generated and where they ended up.

    plot(surveyPts, add = TRUE, pch = 19)

    plot of chunk unnamed-chunk-5

    Back to top

    Generate buffer around points

    Now that we have generate survey points - let’s make a 50m radius around each point.

    Before we go any further we need to know a few things. We need to know if the coordinate reference system (crs) is defined for our survey points and we also need to what the crs is. To make our lives easier if would be good to have the surveyPoints projected into an Equal Area projection in meters. The reason for this is two-fold. First, an equal area projection will conserve area so our buffers should be accurate and second, if it’s in a meters we can specify the width parameter in meters.

    ## class       : SpatialPoints 
    ## features    : 98 
    ## extent      : -111.3911, -108.9843, 31.40503, 32.98191  (xmin, xmax, ymin, ymax)
    ## crs         : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0

    Project the points into an equal area projection. While we’re at it - let’s project all our layers (DEM, forest boundaries, and survey points).

    # Define the projection in proj4 format
    EqArea <- "+proj=aea 
               +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 
               +units=m +no_defs"
    # project data using the sp package
    surveyPts_m <- sp::spTransform(surveyPts, sp::CRS(EqArea))
    # project forest boundary
    CNF_m <- sp::spTransform(CNF, sp::CRS(EqArea))

    Now for the DEM - it’s quite large and we only need a small portion. Let’s crop the DEM to include only the area around Coronado National Forest.

    DEM <- crop(DEM,CNF)
    # take a look
    ## class      : RasterLayer 
    ## dimensions : 201, 304, 61104  (nrow, ncol, ncell)
    ## resolution : 0.008333333, 0.008333333  (x, y)
    ## extent     : -111.45, -108.9167, 31.33333, 33.00833  (xmin, xmax, ymin, ymax)
    ## crs        : +proj=longlat +ellps=WGS84 
    ## source     : memory
    ## names      : USA1_msk_alt 
    ## values     : 447, 3172  (min, max)
    DEM_m <- projectRaster(DEM, crs = EqArea)

    Now that we’ve projected the data we can generate the 50m radius around each point. To do this make sure to use byid = TRUE.

    surveyCircle <- gBuffer(surveyPts_m, width = 50, byid = TRUE)

    Extract elevation to the survey locations

    pt_elev <- extract(DEM_m, surveyCircle, fun = mean, na.rm = TRUE)

    plot of chunk unnamed-chunk-12

    Back to top

    Simulate count data

    We’ll simulate occupancy data using a function from Kery & Royle 2016 Applied Hierarchical Modeling in Ecology. Here we use the simOcc function saving most of the preset values. We just want the count data so I save only the y ouput.

    # Simulate occupancy data
    # M = number of sites 
    # J = number of occassions
    simCount <- AHMbook::simOcc(M = length(surveyPts), 
                                J = 3, 
                                mean.occupancy = 0.6, 
                                mean.detection = 0.7,
                                show.plot = FALSE)$y
    # see first few rows
    ##      [,1] [,2] [,3]
    ## [1,]    1    1    1
    ## [2,]    0    0    0
    ## [3,]    1    0    0
    ## [4,]    0    1    1
    ## [5,]    1    1    1
    ## [6,]    1    1    1

    Using these simulated data, we’ll determine the relationship between elevation and occupancy while accounting for imperfect detection using the unmarked package.

    # Make unmarked frame for occupancy data
    umf <- unmarked::unmarkedFrameOccu(y = simCount, 
    # run the occupancy model
    occ <- unmarked::occu(~1~pt_elev,umf)

    Back to top

    Map occupancy probability

    Let’s create a map of the predicted occupancy probabilty throughout the Coronado National Forest. We’ll use the parameter estimates from the occupancy model we ran above.

    occMap <- exp(unmarked::coef(occ)[1]+unmarked::coef(occ)[2]*DEM_m)/
    plot(mask(occMap, CNF_m),col = bpy.colors(30))

    plot of chunk unnamed-chunk-16

    Back to top