R function to extract raster data

DATE: 2024-06-20

AUTHOR: John L. Godlee

I have to extract lots of point estimates from raster layers to use as explanatory variables of biomass dynamics measured in vegetation monitoring plots across tropical savannas and dry forests. Some of these raster layers are quite coarse and some of the plot locations are close to the coast, meaning that occasionally a plot will not coincide with a non-NA pixel in a particular raster layer. Here is an example, showing the raster layer in blue and the plots as black dots.

A map illustrating the coarse pixel issue.

I wrote an R function (nearExtract()) that sequentially increases the size of a circular buffer around the point until it overlaps a raster value. The function takes a SpatRaster object (x), an sf points object (y), a function used to aggregate cell values within the buffer (fun), optionally a buffer radius in metres (b), and optionally an increment by which to increase the buffer radius also in metres (bstep). Additional arguments can be passed to terra::extract() using ....

If b is not supplied, the initial buffer radius is set to the mean cell width of the raster layer x.

If bstep is not supplied, the buffer radius increment is set to the size of the initial buffer radius.

If b is a vector with more that one element, only these buffer radii are used and bstep is ignored.

#' Add a circular buffer to a point until a valid raster value is extracted
#'
#' @param x raster layer
#' @param y sf points object 
#' @param b optional, inital buffer radius, or vector of buffer radii in metres
#' @param bstep optional, incremental buffer radius increase in metres
#' @param ... additional arguments passed to `terra::extract()`
#'
#' @return A dataframe as returned by `terra::extract()` with an additional
#'     column `b`, which specifies the radius of the buffer necessary to 
#'     intersect a valid raster pixel.
#' 
nearExtract <- function(x, y, fun = mean, b = NULL, bstep = NULL, ...) { 

  # Function must be specified
  if (!is.function(fun)) { 
    stop("fun must be a valid function")
  }

  # bstep ignored if length(b) > 1
  if (length(b) > 1 & !is.null(bstep)) { 
    warning("length(b) > 1, bstep ignored")
  }

  # If b is a vector
  if (length(b) > 1) {
    # If buffer is a vector
    b <- sort(b)
    b1 <- b[1]
    bstep <- NULL
  } else if (is.null(b)) { 
    # If b is not specified, start with radius equal to raster mean cell size
    b1 <- unlist(sqrt(global(cellSize(x, unit = "km"), "mean") / pi))
  } else { 
    b1 <- b
  }

  # If bstep not specified, use the raster mean cell size
  if (is.null(bstep)) { 
    bstep <- b1
  }

  # Attempt to extract
  val <- terra::extract(x, y, fun = fun, na.rm = TRUE, ...)
  val$b <- NA_real_

  # If extraction failed for any individual
  if (any(is.na(val[,-ncol(val)]))) {
    # Set buffer to initial value
    bi <- b1

    # Sequentially increase buffer diameter until all NA filled
    while (any(is.na(val[,-ncol(val)])) & 
      (if (length(b) > 1) { bi != max(b) } else { TRUE } )) {
      # See progress
      message(bi)

      # Find missing values
      val_fill <- which(apply(val[,-ncol(val)], 1, function(i) { any(is.na(i)) }))

      # Apply buffer to values
      yb <- st_buffer(y[val_fill,], bi)

      # Attempt to extract
      val_new <- terra::extract(x, yb, fun = fun, na.rm = TRUE, ...)

      # Add buffer size to successfully filled values
      val_new$b <- bi
      val_new$b[is.na(val_new[,2])] <- NA_real_

      # Fill missing values 
      val[val_fill,] <- val_new

      # Increase buffer size
      if (length(b) > 1) { 
        # If b is a vector,
        # move to the next largest buffer size
        bi <- b[which(b == bi) + 1]
      } else {
        # Otherwise add bstep
        bi <- bi + bstep
      }
    }
  }

  # Return
  return(val)
}