
sus_climate_aggregate() joins meteorological data to DATASUS health records using a selectable temporal matching strategy. The function links each health record to the nearest climate station (by Euclidean distance) and then applies the requested temporal window.
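As a rough illustration of the station-linking step, the sketch below picks the nearest station for each record by Euclidean distance on longitude/latitude coordinates. The data frames `records` and `stations` and the helper `nearest_station()` are hypothetical and only illustrate the idea; the package's internal implementation may differ.

```r
# Hypothetical inputs: record locations and station coordinates (decimal degrees)
records <- data.frame(
  id  = 1:3,
  lon = c(-60.0, -59.0, -67.8),
  lat = c(-3.1, -2.6, -9.9)
)
stations <- data.frame(
  station_code = c("S1", "S2", "S3"),
  longitude    = c(-60.02, -58.4, -67.9),
  latitude     = c(-3.10, -3.1, -10.0),
  stringsAsFactors = FALSE
)

# Return the code of the station with the smallest Euclidean distance
nearest_station <- function(lon, lat, stations) {
  d <- sqrt((stations$longitude - lon)^2 + (stations$latitude - lat)^2)
  stations$station_code[which.min(d)]
}

records$station_code <- mapply(
  nearest_station, records$lon, records$lat,
  MoreArgs = list(stations = stations)
)
```

Euclidean distance in degrees ignores the Earth's curvature, so it is an approximation that works best when candidate stations are relatively close to the record.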

Usage

sus_climate_aggregate(
  health_data,
  climate_data,
  climate_var = "all",
  time_unit = "day",
  temporal_strategy = "exact",
  climate_region = "auto",
  window_days = NULL,
  lag_days = NULL,
  offset_days = NULL,
  temp_base = NULL,
  gdd_temp_var = "tair_dry_bulb_c",
  min_obs = 0.7,
  threshold_value = NULL,
  threshold_direction = "above",
  weights = NULL,
  lang = "pt",
  verbose = TRUE
)

Arguments

health_data

A climasus_df object produced by sus_spatial_join(). Must contain the columns date (Date) and code_muni (character), plus a geometry column named geom.

climate_data

A climasus_df object produced by sus_climate_fill_inmet(). Must contain date (Date or POSIXct), station_code, latitude, longitude, and climate variables.

climate_var

Character vector with climate variables to aggregate. Use "all" (default) to include all available variables.

Available variables are grouped as follows:

Atmospheric pressure:

  • "patm_mb", "patm_max_mb", "patm_min_mb"

Air temperature:

  • "tair_dry_bulb_c", "tair_max_c", "tair_min_c"

Dew point temperature:

  • "dew_tmean_c", "dew_tmax_c", "dew_tmin_c"

Relative humidity:

  • "rh_mean_porc", "rh_max_porc", "rh_min_porc"

Precipitation:

  • "rainfall_mm"

Wind:

  • "ws_2_m_s" (mean wind speed),

  • "ws_gust_m_s" (wind gust),

  • "wd_degrees" (wind direction)

Solar radiation:

  • "sr_kj_m2"

Biometeorological indices:

  • "wbgt_c" (Wet Bulb Globe Temperature)

  • "hi_c" (Heat Index)

  • "thi_c" (Temperature-Humidity Index)

  • "wcet_c" (Wind Chill Equivalent Temperature)

  • "wct_c" (Wind Chill Temperature)

  • "et_c" (Effective Temperature)

  • "utci_c" (Universal Thermal Climate Index)

  • "pet_c" (Physiological Equivalent Temperature)

Thermal indices (degree-days):

  • "cdd_c" (Cooling Degree Days)

  • "hdd_c" (Heating Degree Days)

  • "gdd_c" (Growing Degree Days)

Derived variables:

  • "diurnal_range_c" (daily temperature range)

  • "vapor_pressure_kpa" (saturation vapor pressure)

Notes:

  • Not all variables may be available in climate_data. Use names(climate_data) to inspect available columns.

  • When "all" is used, only existing variables in the dataset are selected.

  • Derived and index variables depend on prior processing with sus_climate_fill_inmet() or feature engineering steps.

time_unit

Temporal aggregation unit applied to the raw climate data before the join. Options: "day" (default), "week", "month", "quarter", "year", "season". Relevant only when the input data have hourly resolution.
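To make the pre-join aggregation concrete, here is a minimal base R sketch that collapses an hourly series to daily values. The `hourly` data frame is hypothetical, and the choice of mean for temperature and sum for rainfall is an assumed convention, not a documented behaviour of the function.

```r
# Hypothetical hourly series for one station over two days
hourly <- data.frame(
  datetime        = as.POSIXct("2023-01-01 00:00", tz = "UTC") + 3600 * (0:47),
  tair_dry_bulb_c = c(rep(25, 24), rep(27, 24)),
  rainfall_mm     = c(rep(0, 24), rep(0.5, 24))
)
hourly$date <- as.Date(hourly$datetime, tz = "UTC")

# Assumed convention: average temperature, accumulate rainfall
daily_temp <- aggregate(tair_dry_bulb_c ~ date, data = hourly, FUN = mean)
daily_rain <- aggregate(rainfall_mm ~ date, data = hourly, FUN = sum)
daily <- merge(daily_temp, daily_rain, by = "date")
```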

temporal_strategy

Temporal matching strategy. Options:

"exact"

Exact date match (same-day temperature for acute heat-related mortality).

"discrete_lag"

Climate value exactly L days before event.

"moving_window"

Mean or sum over a sliding window from t-W to t. Recommended for cumulative exposure.

"offset_window"

Aggregates historical interval (t-W2, t-W1), ignoring recent days.

"distributed_lag"

Creates lag matrix 0 to L for DLNM modeling.

"degree_days"

Calculates Growing Degree Days (GDD) for thermal stress.

"seasonal"

Aggregates by climate season (DJF, MAM, JJA, SON).

"threshold_exceedance"

Counts days exceeding threshold (e.g., heatwaves).

"cold_wave_exceedance"

Counts days below threshold (e.g., cold extremes in southern regions).

"weighted_window"

Weighted mean with decay function.

climate_region

Character. Climate classification for parameter adaptation. Options: "auto" (default, auto-detect by latitude), "tropical" (North, Northeast), "subtropical" (Center-West, Southeast), "temperate" (South).

window_days

Integer. Number of days in window for moving_window, offset_window, degree_days, threshold_exceedance, cold_wave_exceedance, or weighted_window. Required for these strategies.

lag_days

Integer vector. Specific lags for discrete_lag or maximum lag for distributed_lag. Required for these strategies.
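The difference between the two lag strategies can be sketched in base R as follows. The series `temp` and the helper `lag_vec()` are hypothetical; the sketch only shows the shape of the output, not the package's internals.

```r
# Hypothetical daily temperature series, ordered by date
temp <- c(24, 25, 27, 30, 28, 26, 25)

# Shift a series back by k days; the first k positions have no history
lag_vec <- function(x, k) {
  if (k == 0) return(x)
  c(rep(NA, k), head(x, -k))
}

# discrete_lag: the value exactly 2 days before each date
temp_lag2 <- lag_vec(temp, 2)

# distributed_lag: one column per lag from 0 to max_lag (here 3)
max_lag <- 3
lag_matrix <- sapply(0:max_lag, function(k) lag_vec(temp, k))
colnames(lag_matrix) <- paste0("lag", 0:max_lag)
```

Each row of `lag_matrix` holds the exposure history for one date, with lag 0 in the first column.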

offset_days

Integer vector of length 2 for offset_window. Defines historical interval: c(W1, W2) aggregates from t-W2 to t-W1.
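A minimal sketch of how c(W1, W2) maps to an index range, assuming a daily series ordered by date. The helper `offset_mean()` is hypothetical and uses the mean as the aggregation.

```r
temp <- c(20, 21, 22, 23, 24, 25, 26, 27, 28, 29)  # hypothetical days 1..10
offset_days <- c(2, 4)                             # c(W1, W2): from t-4 to t-2

offset_mean <- function(x, t, offset_days) {
  w1 <- offset_days[1]
  w2 <- offset_days[2]
  idx <- (t - w2):(t - w1)
  if (min(idx) < 1) return(NA_real_)  # not enough history before day t
  mean(x[idx])
}

offset_mean(temp, t = 10, offset_days)  # averages days 6, 7 and 8
```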

temp_base

Numeric. Base temperature for the degree_days calculation. If NULL (default), uses a region-specific default: 20 °C (tropical), 18 °C (subtropical), 15 °C (temperate). Use 11 °C for Aedes aegypti development, 10 °C for Plasmodium.
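As an illustration of what the base temperature does, the sketch below accumulates degree days with the simple daily-mean method. This is one common formulation and is assumed here; the function may compute degree days differently. The vector `daily_temp` is hypothetical.

```r
daily_temp <- c(19, 21, 24, 18, 26)  # hypothetical daily mean temperatures (°C)
temp_base  <- 20                     # tropical default listed above

# Each day contributes its excess above the base, floored at zero
daily_gdd <- pmax(daily_temp - temp_base, 0)
gdd_total <- sum(daily_gdd)
```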

gdd_temp_var

Character. Temperature column for degree_days. Default: "tair_dry_bulb_c".

min_obs

Numeric (0 to 1). Minimum proportion of valid observations required within window. Default: 0.7 (70%).
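The effect of min_obs can be sketched as a guard applied before aggregating a window. The helper `window_mean()` is hypothetical, and it assumes the window runs from t-W to t inclusive (W+1 daily values).

```r
temp <- c(25, NA, 27, 28, NA, NA, 30, 29)  # hypothetical daily series with gaps
window_days <- 3
min_obs <- 0.7

window_mean <- function(x, t, window_days, min_obs) {
  start <- t - window_days
  if (start < 1) return(NA_real_)                      # not enough history
  vals <- x[start:t]
  if (mean(!is.na(vals)) < min_obs) return(NA_real_)   # too many gaps
  mean(vals, na.rm = TRUE)
}

window_mean(temp, 4, window_days, min_obs)  # 3 of 4 values valid (0.75): mean is computed
window_mean(temp, 7, window_days, min_obs)  # 2 of 4 values valid (0.50): returns NA
```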

threshold_value

Numeric. Threshold for threshold_exceedance or cold_wave_exceedance. If NULL, uses region-specific default.

threshold_direction

Character: "above" (default, > threshold) or "below" (< threshold). For cold_wave_exceedance, automatically set to "below".
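A small sketch of the counting logic implied by the two directions, using strict inequalities as described above. The helper `count_exceedance()` and the input vectors are hypothetical.

```r
count_exceedance <- function(x, threshold, direction = c("above", "below")) {
  direction <- match.arg(direction)
  if (direction == "above") {
    sum(x > threshold, na.rm = TRUE)
  } else {
    sum(x < threshold, na.rm = TRUE)
  }
}

tmax <- c(33, 36, 37, 34, 35, 38, 31)  # hypothetical 7-day window of daily maxima
tmin <- c(6, 4, 3, 7, 5, 2, 8)         # hypothetical 7-day window of daily minima

hot_days  <- count_exceedance(tmax, 35, "above")
cold_days <- count_exceedance(tmin, 5, "below")
```

Because the comparison is strict, a day exactly at the threshold (35 in `tmax`, 5 in `tmin`) is not counted.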

weights

Numeric vector (optional) for weighted_window. Defines weight for each day within window, from most recent (position 1) to oldest (position W+1). If NULL, linear decreasing weights are generated automatically.
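The sketch below shows one plausible form of linearly decreasing weights and how they would be applied to a window ordered from most recent to oldest. The exact weights the function generates when weights = NULL are not documented here, so this scheme is an assumption.

```r
window_days <- 3
vals <- c(30, 28, 26, 24)  # hypothetical values: position 1 = day t, position W+1 = day t-W

# Assumed linear scheme: W+1, W, ..., 1, normalised to sum to 1
w <- rev(seq_len(window_days + 1))
w <- w / sum(w)

weighted_value <- sum(w * vals)
```

The same result can be obtained with stats::weighted.mean(vals, w).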

lang

Language for messages: "pt" (Portuguese), "en" (English), or "es" (Spanish). Default: "pt".

verbose

Logical. If TRUE, displays progress messages. Default: TRUE.

Value

A climasus_df (tibble) with original health data and integrated climate variables as new columns. Geometry is preserved if input is sf.

Details

Epidemiological Foundations of the Temporal Strategies

The choice of strategy should reflect the hypothesized biological mechanism:

  • exact: Immediate effects (heat stroke, hemorrhagic stroke)

  • discrete_lag: Known delayed effect (e.g., temperature 7 days before influences dengue)

  • moving_window: Cumulative exposure without specific lag.

  • offset_window: Defined incubation period (e.g., temperature 7 to 14 days before death)

  • distributed_lag: Distributed lag analysis (DLNM); generates exposure matrix for dlnm::crossbasis()

  • degree_days: Thermal threshold for vector development. GDD above temperature base accumulates over window_days.

  • seasonal: Seasonal climate patterns for ecological or long-term studies

  • threshold_exceedance: Counts days exceeding threshold (ideal for heatwaves)

  • cold_wave_exceedance: Counts days below threshold (ideal for cold extremes in southern regions)

  • weighted_window: Weighted mean modeling biological decay of exposure
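For the seasonal strategy in the list above, the grouping into DJF, MAM, JJA and SON can be sketched as a month lookup. The `dates` vector is hypothetical, and how the function treats a December that belongs to the following year's DJF season is not specified here.

```r
dates <- as.Date(c("2023-01-15", "2023-04-10", "2023-07-01",
                   "2023-10-20", "2023-12-05"))

# Lookup indexed by month number (1 = January, ..., 12 = December)
season_lookup <- c("DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
                   "JJA", "JJA", "SON", "SON", "SON", "DJF")

month_num <- as.integer(format(dates, "%m"))
season <- season_lookup[month_num]
```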

Regional Climate Considerations

The function automatically adapts to Brazil's diverse climates:

Tropical (North, Northeast): Mean temperature 24-28 °C, recommended temp_base 20 °C (human health), heatwave threshold Tmax > 35 °C, high autocorrelation.

Subtropical (Center-West, Southeast): Mean temperature 18-26 °C, recommended temp_base 18 °C, heatwave threshold Tmax > 32 °C, moderate autocorrelation.

Temperate (South): Mean temperature 12-24 °C, recommended temp_base 15 °C, heatwave threshold Tmax > 30 °C, coldwave threshold Tmin < 5 °C (critical), low autocorrelation.
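The regional defaults listed above can be collected into a lookup table, which may be handy when passing temp_base or threshold_value explicitly. The object `region_defaults` and the helper `get_defaults()` are hypothetical conveniences, not part of the package.

```r
# Values transcribed from the regional descriptions above (°C)
region_defaults <- data.frame(
  climate_region  = c("tropical", "subtropical", "temperate"),
  temp_base_c     = c(20, 18, 15),
  heatwave_tmax_c = c(35, 32, 30),
  coldwave_tmin_c = c(NA, NA, 5),   # only stated for the temperate region
  stringsAsFactors = FALSE
)

get_defaults <- function(region) {
  region_defaults[region_defaults$climate_region == region, ]
}

get_defaults("temperate")
```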

Causal Inference & Look-Ahead Bias

This function uses backward-looking windows (t-W, t), never symmetric windows (t-W, t+W). The climate of day t+7 cannot cause a health event on day t. This design prevents look-ahead bias, a common methodological error in environmental epidemiology.
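A toy example shows how a symmetric window can contaminate the exposure estimate with future values. The series `temp` is hypothetical: a heat episode begins on day 5, after the health event on day 4.

```r
temp <- c(22, 23, 24, 25, 40, 41, 42)  # hypothetical daily series; heat arrives on day 5
t <- 4                                 # day of the health event
W <- 2

backward  <- mean(temp[(t - W):t])        # days 2..4, past and present only
symmetric <- mean(temp[(t - W):(t + W)])  # days 2..6, includes days after the event
```

Here `backward` is 24 while `symmetric` is 30.6, so the symmetric window would attribute heat to an event that preceded it.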

References

Gasparrini, A. (2011). Distributed lag linear and non-linear models in R: the package dlnm. Journal of Statistical Software, 43(8), 1-20.

Silveira, I. H., et al. (2023). Heat waves and mortality in the Brazilian Amazon. International Journal of Hygiene and Environmental Health, 248, 114109.

Marengo, J. A., et al. (2023). A cold wave of winter 2021 in central South America. Climate Dynamics, 61, 3-4.

Examples

if (FALSE) { # \dontrun{
# Prepare data
df_health <- sus_data_import() |>
  sus_data_clean_encoding() |>
  sus_data_standardize() |>
  sus_spatial_join(level = "munic")

df_climate <- sus_climate_inmet(years = 2023, uf = "AM") |>
  sus_climate_fill_inmet(target_var = "all", parallel = TRUE)

# Example 1: Exact match (same-day temperature)
df_exact <- sus_climate_aggregate(
  health_data = df_health,
  climate_data = df_climate,
  climate_var = "tair_dry_bulb_c",
  temporal_strategy = "exact"
)

# Example 2: Moving window (cumulative exposure, RECOMMENDED)
df_moving <- sus_climate_aggregate(
  health_data = df_health,
  climate_data = df_climate,
  temporal_strategy = "moving_window",
  window_days = 14
)

# Example 3: Regional adaptation (auto-detect)
df_regional <- sus_climate_aggregate(
  health_data = df_health,
  climate_data = df_climate,
  climate_region = "auto",
  temporal_strategy = "threshold_exceedance",
  window_days = 7
)

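# Example 4: Cold waves (southern regions)
# Assumes df_health_south and df_climate_south were prepared in the same way
# as df_health and df_climate above, restricted to southern states.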
df_coldwave <- sus_climate_aggregate(
  health_data = df_health_south,
  climate_data = df_climate_south,
  climate_region = "temperate",
  temporal_strategy = "cold_wave_exceedance",
  window_days = 7,
  threshold_value = 5
)
} # }