Performs a spatial join between a data.frame containing SUS data and official Brazilian geographic boundaries (state, municipality, census tract) or geocodes postal codes (CEP). This function is the core spatial engine for the entire climasus4r package and serves as the foundation for all spatial analyses.

Usage

sus_join_spatial(
  df,
  level = "munic",
  join_col = NULL,
  lang = "pt",
  use_cache = TRUE,
  cache_dir = "~/.climasus4r_cache/spatial",
  verbose = TRUE
)

Arguments

df

A data.frame containing health data with a geographic identifier column. The data frame should have a system column created by detect_health_system().

level

Character string specifying the geographic aggregation level. Default is "munic". Options:

  • Administrative: "munic" (municipality)

  • Health: "health_region", "health_facilities", "school"

  • Environmental: "amazon", "biomes", "conservation_units", "disaster_risk_area", "semiarid", "indigenous_land"

  • Urban: "neighborhood", "urban_area", "metro_area", "urban_concentrations", "pop_arrangements"

  • Miscellaneous: "cep" (postal code)

Note: "cep" level is only available for SIH and CNES systems.
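The restriction above can be checked before calling the function. The following is only a sketch: cep_level_allowed() is a hypothetical helper, not part of climasus4r, and it assumes the system column holds labels that start with "SIH" or "CNES" (for example "SIH-RD"), which may not match what detect_health_system() actually writes.

```r
# Hypothetical pre-check, not part of climasus4r.
# Assumption: df$system holds labels such as "SIH-RD" or "CNES-ST".
cep_level_allowed <- function(df) {
  systems <- unique(as.character(df$system))
  # TRUE only when a system column exists and every value is SIH or CNES
  length(systems) > 0 && all(grepl("^(SIH|CNES)", systems))
}

df_demo <- data.frame(system = c("SIH-RD", "SIH-RD"))
if (cep_level_allowed(df_demo)) {
  message("level = \"cep\" can be requested for this data")
}
```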

join_col

Character string with the name of the column in df containing the geographic identifier. If NULL (default), the function will automatically detect the appropriate column based on the level and common SUS column patterns.

lang

Character string specifying the language for messages. Options: "pt" (Portuguese, default), "en" (English), "es" (Spanish).

use_cache

Logical. If TRUE (default), uses cached spatial data to avoid re-downloads and improve performance.

cache_dir

Character string specifying the directory to store cached files. Default is "~/.climasus4r_cache/spatial".

verbose

Logical. If TRUE (default), prints detailed progress information including cache status, download progress, and join statistics.

Value

An sf object (Simple Features data.frame) with all original columns from df plus a geometry column containing the spatial geometries:

  • munic / neighborhood: POLYGON or MULTIPOLYGON geometries

  • cnes / cep: POINT geometries (geocoded locations)

Details

Caching Strategy: Spatial data is cached as Parquet files in cache_dir (default ~/.climasus4r_cache/spatial) to:

  • Avoid repeated downloads

  • Improve performance (10-100x faster)

  • Enable offline reuse
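The cache-then-download pattern described above can be sketched in base R as follows. This is an illustration, not the package's implementation: cached_fetch() and its arguments are hypothetical, and it stores RDS files instead of Parquet so that the example needs no extra packages.

```r
# Hypothetical helper illustrating a file-based cache.
# Assumption: RDS is used here in place of Parquet to avoid dependencies.
cached_fetch <- function(key, fetch_fn, cache_dir = tempdir(), use_cache = TRUE) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(key, ".rds"))

  # Cache hit: read the stored object and skip the download
  if (use_cache && file.exists(path)) {
    return(readRDS(path))
  }

  # Cache miss: fetch the data, then store it for later calls
  obj <- fetch_fn()
  if (use_cache) saveRDS(obj, path)
  obj
}

# Usage: the second call reads from disk and does not call the fetcher again
demo_dir <- file.path(tempdir(), "climasus_cache_demo")
unlink(demo_dir, recursive = TRUE)
fetch_calls <- 0
fake_download <- function() {
  fetch_calls <<- fetch_calls + 1
  data.frame(code_muni = "355030", stringsAsFactors = FALSE)
}
first  <- cached_fetch("munic", fake_download, demo_dir)
second <- cached_fetch("munic", fake_download, demo_dir)
```

Setting use_cache = FALSE in this sketch bypasses both the read and the write, so every call goes back to the fetcher.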

Code Normalization:

  • Municipality codes: Accepts 6 or 7 digits; for 7-digit codes the trailing check digit is removed automatically, so all codes are joined on 6 digits

  • CEP: Always zero-padded to 8 characters

  • All join columns are coerced to character type
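The normalization rules above can be approximated with a few lines of base R. This is a minimal sketch under stated assumptions: normalize_munic_code() and normalize_cep() are hypothetical names, not functions exported by climasus4r, and the package's internal handling may differ (for instance in how it treats hyphens or missing values).

```r
# Hypothetical helpers, not part of climasus4r.

# Drop the trailing check digit from 7-digit municipality codes
normalize_munic_code <- function(x) {
  x <- trimws(as.character(x))
  ifelse(nchar(x) == 7, substr(x, 1, 6), x)
}

# Left-pad CEP values with zeros to 8 characters
# Assumption: non-digit characters such as "-" are stripped first
normalize_cep <- function(x) {
  x <- gsub("[^0-9]", "", as.character(x))
  padded <- paste0(strrep("0", pmax(8L - nchar(x), 0L)), x)
  padded[is.na(x)] <- NA_character_  # keep missing values missing
  padded
}

normalize_munic_code(c("3550308", "355030"))
normalize_cep(c("1310100", "20040-020"))
```

Passing codes as character (or integer) avoids surprises from as.character() on large doubles, which can switch to scientific notation.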

References

Pereira, R.H.M.; Goncalves, C.N.; et al. (2019) geobr: Loads Shapefiles of Official Spatial Data Sets of Brazil. GitHub repository - https://github.com/ipeaGIT/geobr.

Pereira, Rafael H. M.; Barbosa, Rogerio J. (2023) censobr: Download Data from Brazil’s Population Census. R package version v0.4.0, https://CRAN.R-project.org/package=censobr. DOI: 10.32614/CRAN.package.censobr.

Examples

if (FALSE) { # \dontrun{
library(climasus4r)

# Example 1: Link mortality data to municipalities
df_sim <- sus_data_import(uf = "SP", year = 2023, system = "SIM-DO") %>%
  sus_data_standardize(lang = "pt")

sf_sim <- sus_join_spatial(
  df = df_sim,
  level = "munic",
  lang = "pt"
)

# Example 2: Geocode CEP for hospitalization data
# (the "cep" level is only available for SIH and CNES)
df_sih <- sus_data_import(uf = "RJ", year = 2023, system = "SIH-RD") %>%
  sus_data_standardize(lang = "pt")

sf_cep <- sus_join_spatial(
  df = df_sih,
  level = "cep",
  lang = "pt"
)

# Example 3: Join mortality data at the school level
sf_school <- sus_join_spatial(
  df = df_sim,
  level = "school",
  lang = "pt"
)
} # }