This function evaluates the variability of a spatial and temporal interpolation of a variable (e.g., air temperature) using local climate zones (LCZ) as background. It supports both LCZ-based and conventional interpolation methods. The function allows for flexible time period selection, cross-validation, and station splitting for training and testing.

Usage

lcz_interp_eval(
  x,
  data_frame = "",
  var = "",
  station_id = "",
  ...,
  extract.method = "simple",
  LOOCV = TRUE,
  split.ratio = 0.8,
  sp.res = 100,
  tp.res = "hour",
  vg.model = "Sph",
  by = NULL,
  Anomaly = FALSE,
  impute = NULL,
  isave = FALSE,
  LCZinterp = TRUE
)

Arguments

x

A SpatRaster object containing the LCZ map. The LCZ map can be obtained using the lcz_get_map() family of functions (e.g., lcz_get_map_generator()).

data_frame

A data frame containing air temperature measurements and station IDs. The data frame must include a date field in hourly or higher resolution format.

var

A character string specifying the name of the variable to interpolate (e.g., "airT" for air temperature).

station_id

A character string specifying the name of the station ID variable in the data frame.

...

Additional arguments for the selectByDate function from the openair package. These arguments allow for flexible selection of specific time periods (year, month, day, hour). Examples include:

  • Year(s): Numeric value(s) specifying the year(s) to select. For example, year = 1998:2004 selects all years between 1998 and 2004 (inclusive), while year = c(1998, 2004) selects only the years 1998 and 2004.

  • Month(s): Numeric or character value(s) specifying the months to select. Numeric examples: month = 1:6 (January to June), or character examples: month = c("January", "December").

  • Day(s): Numeric value(s) specifying the days to select. For instance, day = 1:30 selects days from 1 to 30, or day = 15 selects only the 15th day of the month.

  • Hour(s): Numeric value(s) specifying the hours to select. For example, hour = 0:23 selects all hours in a day, while hour = 9 selects only the 9th hour.

  • Start date: A string passed as start, in either "DD/MM/YYYY" format (e.g., start = "1/2/1999") or "YYYY-mm-dd" format (e.g., start = "1999-02-01").

  • End date: A string passed as end, in either "DD/MM/YYYY" format (e.g., end = "1/2/1999") or "YYYY-mm-dd" format (e.g., end = "1999-02-01").
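The difference between a range of years and a vector of years can be illustrated with a small base-R sketch. It does not call openair; the helper select_period() and the toy data are hypothetical and only mimic the selection semantics described above.

```r
# Hypothetical helper mimicking year/month/day/hour filtering in base R.
# This is NOT openair::selectByDate, only an illustration of the semantics.
select_period <- function(df, year = NULL, month = NULL, day = NULL, hour = NULL) {
  lt <- as.POSIXlt(df$date, tz = "UTC")
  keep <- rep(TRUE, nrow(df))
  if (!is.null(year))  keep <- keep & (lt$year + 1900) %in% year
  if (!is.null(month)) keep <- keep & (lt$mon + 1) %in% month
  if (!is.null(day))   keep <- keep & lt$mday %in% day
  if (!is.null(hour))  keep <- keep & lt$hour %in% hour
  df[keep, , drop = FALSE]
}

# Toy data: one observation at noon on 1 January of each year, 1998 to 2004.
toy <- data.frame(
  date = as.POSIXct(paste0(1998:2004, "-01-01 12:00:00"), tz = "UTC"),
  airT = seq(10, 16)
)

nrow(select_period(toy, year = 1998:2004))      # every year in the range
nrow(select_period(toy, year = c(1998, 2004)))  # only the two listed years
```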

extract.method

A character string specifying the method used to assign the LCZ class to each station point. The default is "simple". Available methods are:

  • simple: Assigns the LCZ class based on the value of the raster cell in which the point falls. It is often used with low-density observational networks.

  • two.step: Assigns LCZs to stations while filtering out those located in heterogeneous LCZ areas. This method requires that at least 80% of the pixels within a 5 × 5 kernel match the LCZ of the center pixel (Daniel et al., 2017). Note that this method reduces the number of stations. It is often used with high-density and ultra-high-density observational networks, especially when LCZ classes contain multiple stations.

  • bilinear: Interpolates the LCZ class values from the four nearest raster cells surrounding the point.
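The homogeneity filter behind two.step can be sketched on a plain matrix. This is a minimal illustration, not the package's implementation: the function is_homogeneous() is hypothetical, the 80% share is assumed to be computed over all 25 cells of the window (center included), and windows crossing the map edge are assumed to fail the test.

```r
# Assumption: the share is computed over all cells of the size x size window,
# center included; windows that would cross the map edge return FALSE.
is_homogeneous <- function(lcz, row, col, size = 5, threshold = 0.8) {
  half <- (size - 1) / 2
  rows <- (row - half):(row + half)
  cols <- (col - half):(col + half)
  if (min(rows) < 1 || max(rows) > nrow(lcz) ||
      min(cols) < 1 || max(cols) > ncol(lcz)) {
    return(FALSE)
  }
  window <- lcz[rows, cols]
  mean(window == lcz[row, col]) >= threshold
}

# Toy 7 x 7 LCZ grid: class 2 everywhere except class 6 in the last two columns.
lcz <- matrix(2L, nrow = 7, ncol = 7)
lcz[, 6:7] <- 6L

is_homogeneous(lcz, row = 4, col = 3)  # window columns 1-5: all 25 cells match
is_homogeneous(lcz, row = 4, col = 5)  # window columns 3-7: 15 of 25 cells match
```

A station falling in the second cell would be dropped under this rule, which is why the method reduces the number of stations.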

LOOCV

A logical value. If TRUE (default), leave-one-out cross-validation (LOOCV) is used for kriging. If FALSE, the stations are split into training and testing sets according to split.ratio.
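The leave-one-out loop can be sketched in base R as follows. To keep the example dependency-free, inverse-distance weighting stands in for kriging; that substitution, the station table, and the residual sign convention (observed minus predicted) are all assumptions made for illustration.

```r
# Inverse-distance weighting stands in for kriging here (an assumption made
# to keep the sketch dependency-free); the LOOCV loop itself is the point.
idw_predict <- function(x, y, value, x0, y0, power = 2) {
  d <- sqrt((x - x0)^2 + (y - y0)^2)
  w <- 1 / d^power
  sum(w * value) / sum(w)
}

# Hypothetical station table: coordinates in meters, one temperature each.
stations <- data.frame(
  id   = c("A", "B", "C", "D", "E"),
  x    = c(0, 1000, 2000, 0, 2000),
  y    = c(0, 0, 0, 1500, 1500),
  airT = c(18.2, 19.1, 20.4, 17.8, 19.9)
)

# Leave each station out in turn and predict it from the remaining ones.
loocv <- do.call(rbind, lapply(seq_len(nrow(stations)), function(i) {
  train <- stations[-i, ]
  pred <- idw_predict(train$x, train$y, train$airT, stations$x[i], stations$y[i])
  data.frame(id = stations$id[i], observed = stations$airT[i], predicted = pred)
}))
loocv$residual <- loocv$observed - loocv$predicted
loocv
```

Every station is used for testing exactly once, which is useful when the network is too small to set aside a separate test set.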

split.ratio

A numeric value representing the proportion of meteorological stations to be used for training (interpolation). The remaining stations will be used for testing (evaluation). For example, the default 0.8 indicates that 80% of the stations will be used for training and 20% for testing.
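A station split of this kind might look like the sketch below. The station IDs are hypothetical, and rounding the training count down with floor() is an assumption; the package may round differently.

```r
set.seed(42)  # fixed seed so the split is reproducible
station_ids <- paste0("station_", 1:10)  # hypothetical station IDs
split_ratio <- 0.8

# Assumption: the number of training stations is rounded down.
n_train <- floor(split_ratio * length(station_ids))
train_ids <- sample(station_ids, n_train)
test_ids <- setdiff(station_ids, train_ids)

length(train_ids)  # stations used for interpolation
length(test_ids)   # stations held back for evaluation
```

Splitting by station rather than by observation keeps all measurements of a test station out of the training set.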

sp.res

A numeric value specifying the spatial resolution in meters for interpolation. The default is 100.

tp.res

A character string specifying the temporal resolution for averaging. The default is "hour". Other options include "day", "week", "month", "year", "season", "seasonyear", "monthyear", "weekday", "weekend", or custom intervals like "2 day", "2 week", "3 month", etc.
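As an illustration of temporal averaging, the base-R sketch below reduces an hourly series to daily means, comparable in spirit to tp.res = "day". The data are hypothetical and the function's own averaging routine is not reproduced here.

```r
# Hypothetical hourly series for one station over two days.
hourly <- data.frame(
  date = seq(as.POSIXct("2019-09-05 00:00", tz = "UTC"), by = "hour", length.out = 48),
  airT = c(rep(15, 24), rep(17, 24))
)

# Derive the calendar day of each timestamp, then average within each day.
hourly$day <- as.Date(hourly$date, tz = "UTC")
daily <- aggregate(airT ~ day, data = hourly, FUN = mean)
daily
```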

vg.model

A character string specifying the variogram model for kriging. The default is "Sph". Available models are:

  • Sph: Spherical model.

  • Exp: Exponential model.

  • Gau: Gaussian model.

  • Ste: Matérn model with M. Stein's parameterization.
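The shapes of the first three models can be compared with a short sketch. The functions below use common textbook forms with a partial sill psill and a range parameter a, without a nugget; the function names are hypothetical, and the exact parameterization used by gstat should be checked against its documentation.

```r
# Semivariance as a function of lag distance h, partial sill `psill`,
# and range parameter `a`. No nugget term is included in this sketch.
vg_sph <- function(h, psill, a) {
  ifelse(h < a, psill * (1.5 * h / a - 0.5 * (h / a)^3), psill)
}
vg_exp <- function(h, psill, a) psill * (1 - exp(-h / a))
vg_gau <- function(h, psill, a) psill * (1 - exp(-(h / a)^2))

h <- c(0, 250, 500, 1000)
round(vg_sph(h, psill = 1, a = 500), 3)
round(vg_exp(h, psill = 1, a = 500), 3)
round(vg_gau(h, psill = 1, a = 500), 3)
```

The spherical model reaches the sill exactly at h = a, while the exponential and Gaussian forms only approach it asymptotically.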

by

A character string specifying how to split the time series in the data frame. Options include "year", "season", "seasonyear", "month", "monthyear", "weekday", "weekend", "site", or "daylight" (daytime and nighttime). See the type argument in the openair package for more details: https://bookdown.org/david_carslaw/openair/sections/intro/openair-package.html#the-type-option.

Anomaly

A logical value. If TRUE, the anomalies are calculated. If FALSE (default), the raw air temperatures are used.

impute

A character string specifying the method to impute missing values. Options include "mean", "median", "knn", or "bag". The default is NULL.
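The two simplest options can be sketched in base R. The helper impute_simple() and the series are hypothetical; the "knn" and "bag" methods are model-based and are not shown.

```r
airT <- c(18.2, NA, 19.4, 20.1, NA, 17.9)  # hypothetical series with gaps

# Replace every missing value with the mean or median of the observed values.
impute_simple <- function(x, method = c("mean", "median")) {
  method <- match.arg(method)
  fill <- if (method == "mean") mean(x, na.rm = TRUE) else median(x, na.rm = TRUE)
  x[is.na(x)] <- fill
  x
}

impute_simple(airT, "mean")
impute_simple(airT, "median")
```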

isave

A logical value. If TRUE, the plot is saved to the working directory. The default is FALSE.

LCZinterp

A logical value. If TRUE (default), the LCZ interpolation approach is used. If FALSE, conventional interpolation without LCZ is used.

Value

A summary table in CSV format containing time series of observed values, predicted values, and residuals. It also returns a GeoPackage file (.gpkg) containing metadata of the interpolation.
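A table of observed and predicted values is typically reduced to a few error metrics. The sketch below assumes hypothetical column names (observed, predicted) and a residual defined as observed minus predicted; the columns in the actual output may be named differently.

```r
# Hypothetical evaluation table; the real column names may differ.
eval_tbl <- data.frame(
  station   = c("A", "B", "C", "D"),
  observed  = c(18.0, 19.5, 21.0, 17.0),
  predicted = c(18.5, 19.0, 20.0, 17.0)
)
eval_tbl$residual <- eval_tbl$observed - eval_tbl$predicted

rmse <- sqrt(mean(eval_tbl$residual^2))               # root mean square error
mae  <- mean(abs(eval_tbl$residual))                  # mean absolute error
bias <- mean(eval_tbl$predicted - eval_tbl$observed)  # mean signed error
c(rmse = rmse, mae = mae, bias = bias)
```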

References

Anjos, M., Targino, A. C., Krecl, P., Oukawa, G. Y. & Braga, R. F. Analysis of the urban heat island under different synoptic patterns using local climate zones. Build. Environ. 185, 107268 (2020).

Fenner, D., Meier, F., Bechtel, B., Otto, M. & Scherer, D. Intra and inter ‘local climate zone’ variability of air temperature as observed by crowdsourced citizen weather stations in Berlin, Germany. Meteorol. Z. 26, 525–547 (2017).

gstat: http://www.gstat.org/

Author

Max Anjos (https://github.com/ByMaxAnjos)

Examples

if (FALSE) { # \dontrun{
# Evaluate air temperature interpolation values using Berlin data and LOOCV
data("lcz_data")
lcz_map <- lcz_get_map_generator(ID = "8576bde60bfe774e335190f2e8fdd125dd9f4299")
lcz_plot_map(lcz_map)
my_interp <- lcz_interp_eval(
  lcz_map, data_frame = lcz_data, var = "airT",
  station_id = "station", year = 2019, month = 9, day = 5,
  sp.res = 100, tp.res = "hour", LOOCV = TRUE,
  vg.model = "Sph", LCZinterp = TRUE, isave = TRUE
)

# Evaluate LCZ-based interpolation using split station (80% training, 20% testing)
my_interp <- lcz_interp_eval(
  lcz_map, data_frame = lcz_data, var = "airT",
  station_id = "station", tp.res = "hour", sp.res = 100,
  year = 2019, month = 9, day = 5, LOOCV = FALSE,
  split.ratio = 0.8, vg.model = "Sph", LCZinterp = TRUE, isave = TRUE
)
} # }