Spatially Link SUS Data to Brazilian Geographic Units
Source: R/sus_join_spatial.R
Performs a spatial join between a data.frame containing SUS data and official Brazilian geographic boundaries (state, municipality, census tract) or geocodes postal codes (CEP). This function is the core spatial engine for the entire climasus4r package and serves as the foundation for all spatial analyses.
Usage
sus_join_spatial(
df,
level = "munic",
join_col = NULL,
lang = "pt",
use_cache = TRUE,
cache_dir = "~/.climasus4r_cache/spatial",
verbose = TRUE
)
Arguments
- df
A data.frame containing health data with a geographic identifier column. The data frame should have a system column created by detect_health_system().
- level
Character string specifying the geographic aggregation level. Default is "munic". Options:
Administrative: "munic" (municipality)
Health: "health_region", "health_facilities", "school"
Environmental: "amazon", "biomes", "conservation_units", "disaster_risk_area", "semiarid", "indigenous_land"
Urban: "neighborhood", "urban_area", "metro_area", "urban_concentrations", "pop_arrangements"
Miscellaneous: "cep" (postal code)
Note: the "cep" level is only available for the SIH and CNES systems.
- join_col
Character string with the name of the column in df containing the geographic identifier. If NULL (default), the function automatically detects the appropriate column based on the level and common SUS column patterns.
- lang
Character string specifying the language for messages. Options: "pt" (Portuguese, default), "en" (English), "es" (Spanish).
- use_cache
Logical. If TRUE (default), uses cached spatial data to avoid re-downloads and improve performance.
- cache_dir
Character string specifying the directory to store cached files. Default is "~/.climasus4r_cache/spatial".
- verbose
Logical. If TRUE (default), prints detailed progress information including cache status, download progress, and join statistics.
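The automatic detection described for join_col can be pictured as a lookup over candidate column names. The sketch below is only an illustration under stated assumptions: detect_join_col() is a hypothetical helper, and the candidate names are examples of raw DATASUS municipality columns, not the list the package actually uses.

```r
# Hypothetical helper, not part of the climasus4r API.
# Returns the first candidate name that exists as a column of `df`.
detect_join_col <- function(df, candidates) {
  hits <- intersect(candidates, names(df))  # keeps the order of `candidates`
  if (length(hits) == 0) {
    stop("No geographic identifier column found; pass join_col explicitly.")
  }
  hits[[1]]
}

# Assumed candidate names for a municipality-level join (illustrative only).
munic_candidates <- c("MUNIC_RES", "CODMUNRES", "CODUFMUN")

# Toy data frame with one matching column.
toy <- data.frame(CODMUNRES = "355030", IDADE = 40)
detected <- detect_join_col(toy, munic_candidates)
```

When detection is ambiguous or fails, passing the column name through join_col avoids relying on any such heuristic.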
Value
An sf object (Simple Features data.frame) with all original columns from df plus a geometry column containing the spatial geometries:
munic/neighborhood: POLYGON or MULTIPOLYGON geometries
cnes/cep: POINT geometries (geocoded locations)
Details
Caching Strategy:
Spatial data is cached as Parquet files in ~/.climasus4r_cache/spatial to:
Avoid repeated downloads
Improve performance (10-100x faster)
Enable offline reuse
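The caching behaviour listed above follows a common read-through pattern: look for a file in the cache directory and call the downloader only when it is missing. The sketch below illustrates that pattern under stated assumptions. cached_read() and fake_loader() are hypothetical names, and base R RDS files stand in for Parquet so the example runs without extra packages; it does not reproduce the package's internal code.

```r
# Hypothetical helper, not part of the climasus4r API.
# `loader` stands in for whatever function downloads the spatial data.
cached_read <- function(key, loader, cache_dir = tempdir(), use_cache = TRUE) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(key, ".rds"))  # RDS used here instead of Parquet
  if (use_cache && file.exists(path)) {
    return(readRDS(path))            # cache hit: no download needed
  }
  obj <- loader()                    # cache miss: fetch the data
  if (use_cache) saveRDS(obj, path)  # store for later calls
  obj
}

# Demonstration with a fake loader that counts how often it is called.
calls <- 0
fake_loader <- function() {
  calls <<- calls + 1
  data.frame(code_muni = "355030", stringsAsFactors = FALSE)
}

demo_dir <- file.path(tempdir(), "climasus4r_cache_demo")
unlink(demo_dir, recursive = TRUE)   # start from an empty cache

first  <- cached_read("munic", fake_loader, cache_dir = demo_dir)
second <- cached_read("munic", fake_loader, cache_dir = demo_dir)
# The second call is served from disk, so the loader runs only once.
```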
Code Normalization:
Municipality codes: Accepts 6 or 7 digits; 7-digit codes have the trailing check digit removed automatically
CEP: Always zero-padded to 8 characters
All join columns are coerced to character type
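The normalization rules above can be sketched in base R as follows. This is a minimal illustration, assuming the helper names normalize_munic_code() and normalize_cep(), which are hypothetical and not exported by climasus4r; the package's own implementation may differ, for example in how it treats malformed input.

```r
# Hypothetical helpers, not part of the climasus4r API.

# Municipality codes: coerce to character and drop the 7th (check) digit.
normalize_munic_code <- function(x) {
  x <- trimws(as.character(x))
  ifelse(!is.na(x) & nchar(x) == 7, substr(x, 1, 6), x)
}

# CEP: keep digits only, then left-pad with zeros to 8 characters.
normalize_cep <- function(x) {
  x <- gsub("[^0-9]", "", as.character(x))      # strip hyphens and spaces
  pad <- strrep("0", pmax(0, 8 - nchar(x)))     # zeros still needed per element
  ifelse(is.na(x), NA_character_, paste0(pad, x))
}

munic <- normalize_munic_code(c("3550308", "355030", "3304557"))
cep   <- normalize_cep(c("01310-100", "1310100", "20040002"))
```

Reading identifier columns as character from the start avoids losing leading zeros, which is the usual reason a CEP arrives with fewer than 8 characters.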
References
Pereira, R.H.M.; Goncalves, C.N.; et al. (2019) geobr: Loads Shapefiles of Official Spatial Data Sets of Brazil. GitHub repository, https://github.com/ipeaGIT/geobr.
Pereira, Rafael H. M.; Barbosa, Rogerio J. (2023) censobr: Download Data from Brazil’s Population Census. R package version v0.4.0, https://CRAN.R-project.org/package=censobr. DOI: 10.32614/CRAN.package.censobr.
Examples
if (FALSE) { # \dontrun{
library(climasus4r)

# Example 1: Link mortality data to municipalities
df_sim <- sus_data_import(uf = "SP", year = 2023, system = "SIM-DO") %>%
  sus_data_standardize(lang = "pt")

sf_sim <- sus_join_spatial(
  df = df_sim,
  level = "munic",
  lang = "pt"
)

# Example 2: Geocode CEP for hospitalization data
# (the "cep" level requires SIH or CNES data)
df_sih <- sus_data_import(uf = "RJ", year = 2023, system = "SIH-RD") %>%
  sus_data_standardize(lang = "pt")

sf_cep <- sus_join_spatial(
  df = df_sih,
  level = "cep",
  lang = "pt"
)

# Example 3: School-level analysis
sf_school <- sus_join_spatial(
  df = df_sim,
  level = "school",
  lang = "pt"
)
} # }