sus_climate_inmet() downloads, imports, standardizes, and quality-controls
Brazilian meteorological data from the National Institute of Meteorology (INMET).
The function implements a comprehensive processing pipeline:
Download: Multi-year data with automatic retry and backoff
Parsing: Handles INMET's CSV format with metadata extraction
Standardization: Renames columns to canonical names (see Variables)
Quality Control: Physical consistency checks (see QC Details)
Caching: Two-level (memory + disk) with Parquet format
Parallel Processing: Both between and within years
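The retry-and-backoff behaviour of the download step can be pictured with a minimal sketch. The helper name `retry_with_backoff()`, its arguments, and the doubling delay schedule are assumptions for illustration only; the package's internal retry logic is not documented here and may differ.

```r
# Hypothetical helper (not part of the package): call fn() until it succeeds,
# waiting base_delay, 2 * base_delay, 4 * base_delay, ... seconds in between.
retry_with_backoff <- function(fn, max_attempts = 3, base_delay = 1) {
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt < max_attempts) {
      Sys.sleep(base_delay * 2^(attempt - 1))
    }
  }
  stop("All ", max_attempts, " attempts failed: ", conditionMessage(result))
}

# Stand-in for a download that fails twice and then succeeds
calls <- 0
flaky_download <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary network error")
  "ok"
}

out <- retry_with_backoff(flaky_download, max_attempts = 5, base_delay = 0.01)
```

Here `out` holds the result of the third call; with `max_attempts = 2` the same stand-in would end in an error instead.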
Usage
sus_climate_inmet(
years = NULL,
uf = NULL,
use_cache = TRUE,
cache_dir = "~/.climasus4r_cache/climate",
parallel = TRUE,
workers = 4,
lang = "pt",
verbose = TRUE
)
Arguments
- years
Numeric vector of year(s) to import. Examples: 2020, 2020:2024, c(2019, 2021, 2023). Must be between 2000 and the current year. If NULL, imports the last 2 years.
- uf
Character vector of Brazilian state codes (e.g., "AM", c("RJ", "MG")). Case insensitive. If NULL (default), imports all 27 states (may be slow; see Performance).
- use_cache
Logical. If TRUE (default), implements two-level caching:
Session cache: In-memory cache (MD5 hash of parameters)
Disk cache: Parquet files with Zstandard compression
Use unlink(cache_dir, recursive = TRUE) to clear all caches.
- cache_dir
Character. Directory path for disk cache. Default: "~/.climasus4r_cache/climate". Created automatically.
- parallel
Logical. If TRUE (default), enables two levels of parallelism:
Between years: Multiple years downloaded simultaneously
Within year: CSV files for each year parsed in parallel
For single-year imports, only within-year parallelization applies.
- workers
Integer. Number of parallel workers. Default: 4. Ignored if parallel = FALSE. Uses the future::multisession backend. Automatically capped at availableCores() - 1.
- lang
Character. Message language. One of:
"pt": Portuguese (default)
"en": English
"es": Spanish
- verbose
Logical. If TRUE (default), prints detailed progress including:
Cache hits/misses
Download progress with retry attempts
QC modifications (rows corrected/removed)
Final statistics
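The argument rules above (year bounds, case-insensitive state codes, worker cap) can be summarized in a short sketch. `normalize_args()` is a hypothetical helper written for this illustration, not a function exported by the package. It also assumes that "last 2 years" means the previous and the current year, and it uses `parallel::detectCores()` as a base-R stand-in for `availableCores()`.

```r
# Hypothetical helper (not part of the package) mirroring the documented rules.
normalize_args <- function(years = NULL, uf = NULL, workers = 4) {
  current_year <- as.integer(format(Sys.Date(), "%Y"))

  if (is.null(years)) {
    # Assumption: "last 2 years" = previous year and current year
    years <- (current_year - 1L):current_year
  }
  years <- sort(unique(as.integer(years)))
  if (any(years < 2000L | years > current_year)) {
    stop("`years` must be between 2000 and ", current_year)
  }

  # State codes are case insensitive, so store them in upper case
  if (!is.null(uf)) uf <- toupper(uf)

  # Cap workers at (cores - 1), but never below 1
  cores <- parallel::detectCores()
  if (is.na(cores)) cores <- 1L
  max_workers <- max(1L, as.integer(cores) - 1L)

  list(years = years, uf = uf, workers = min(as.integer(workers), max_workers))
}

args <- normalize_args(years = c(2021, 2019), uf = c("rj", "mg"), workers = 1000)
```

In this sketch `args$uf` comes back as `c("RJ", "MG")` and `args$workers` is reduced to what the machine allows.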
Value
A climasus_df object (subclass of tibble) with class hierarchy:
climasus_df > tbl_df > tbl > data.frame
Data Columns:
- station_code
Character. INMET station identifier (e.g., "A001")
- station_name
Character. Full station name
- region
Character. Brazilian region (Norte, Nordeste, etc.)
- UF
Character. State abbreviation
- latitude, longitude, altitude
Numeric. Station coordinates (WGS84)
- date
POSIXct. Observation timestamp in UTC (always hourly)
- year
Integer. Year extracted from date
- Climate variables
Numeric. Standardized names (see Variables)
Metadata (accessible via attr(x, "climasus_meta")):
- version
Package version used
- timestamp
Import date/time
- source
"INMET"
- years
Years imported
- ufs
States imported
- cache_used
Whether cache was used
- qc_stats
List with quality control statistics
- n_stations
Number of unique stations
- n_observations
Total rows
- temporal_coverage
List with start and end dates
- history
Character vector of processing steps
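Reading the metadata is a plain attribute lookup. The sketch below builds a toy data frame with a hand-written "climasus_meta" attribute so that it runs without downloading anything; the field values are made up for illustration, and a real result of sus_climate_inmet() would carry the full list described above.

```r
# Toy stand-in for a returned object; the values here are illustrative only.
x <- data.frame(station_code = "A001", year = 2023L)
attr(x, "climasus_meta") <- list(
  source         = "INMET",
  years          = 2023L,
  ufs            = "AM",
  cache_used     = FALSE,
  n_stations     = 1L,
  n_observations = 1L
)

# Access the metadata list, then individual fields by name
meta <- attr(x, "climasus_meta")
meta$source
meta$years
names(meta)
```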
Note
Data frequency: Always hourly. Use sus_climate_aggregate() for daily/weekly.
Timezone: All timestamps are UTC. Convert if needed.
Missing data: Represented as NA. Use sus_climate_fill_gaps() for imputation.
Encoding: All strings converted to UTF-8.
Decimals: Converted from comma (,) to point (.).
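Two of these notes are easy to illustrate in base R: showing a UTC timestamp in a local zone, and parsing a decimal-comma string. This is a sketch of the general technique, not the package's internal code; the zone America/Manaus (UTC-4) and the sample values are chosen only for the example.

```r
# Changing the "tzone" attribute changes how the instant is displayed,
# not the instant itself.
ts_utc <- as.POSIXct("2023-01-15 15:00:00", tz = "UTC")
ts_local <- ts_utc
attr(ts_local, "tzone") <- "America/Manaus"
local_hour <- as.integer(format(ts_local, "%H"))

# Decimal commas must become points before the text can be parsed as numbers.
raw_values <- c("23,5", "1013,2")
parsed <- as.numeric(sub(",", ".", raw_values, fixed = TRUE))
```

Both `ts_utc` and `ts_local` refer to the same moment, so comparisons and differences between them are unaffected by the conversion.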
Standardized Meteorological Variables
INMET raw column names vary by year. This function automatically detects and renames them to the following canonical names:
| Canonical Name | Description | Unit | Physical Range |
|---|---|---|---|
| rainfall_mm | Precipitation total | mm | 0 to 500 |
| patm_mb | Mean atmospheric pressure | mb | 700 to 1100 |
| patm_max_mb | Max atmospheric pressure | mb | 700 to 1100 |
| patm_min_mb | Min atmospheric pressure | mb | 700 to 1100 |
| sr_kj_m2 | Solar radiation | kJ/m² | 0 to 40000 |
| tair_dry_bulb_c | Mean air temperature | °C | -90 to 60 |
| tair_max_c | Max air temperature | °C | -90 to 60 |
| tair_min_c | Min air temperature | °C | -90 to 60 |
| dew_tmean_c | Mean dew point | °C | -90 to 60 |
| dew_tmax_c | Max dew point | °C | -90 to 60 |
| dew_tmin_c | Min dew point | °C | -90 to 60 |
| rh_mean_porc | Mean relative humidity | % | 0 to 100 |
| rh_max_porc | Max relative humidity | % | 0 to 100 |
| rh_min_porc | Min relative humidity | % | 0 to 100 |
| ws_2_m_s | Wind speed at 2 m | m/s | 0 to 100 |
| ws_gust_m_s | Wind gust | m/s | 0 to 100 |
| wd_degrees | Wind direction | degrees | 0 to 360 |
Quality Control Details
The function applies automatic physical consistency checks:
1. Physical Range Validation:
Values outside physically possible ranges are set to NA:
Temperature: -90°C to 60°C
Pressure: 700 mb to 1100 mb
Humidity: 0% to 100%
Precipitation: 0 mm to 500 mm
Solar radiation: 0 to 40000 kJ/m²
Wind speed: 0 to 100 m/s
Wind direction: 0° to 360°
2. Dew Point Consistency: Calculates the theoretical dew point from air temperature (T) and relative humidity (RH) using the Magnus formula. If |observed - calculated| > 3°C, the observed value is set to NA.
3. Solar Radiation: Nighttime values (18h-6h) are set to 0 for physical consistency.
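The first two checks can be sketched in a few lines of base R. The helper names are hypothetical, and the Magnus coefficients used here (b = 17.62, c = 243.12 °C) are one common parameterization; the text above does not state which coefficients the package uses, so treat them as an assumption.

```r
# Check 1 (sketch): values outside [lower, upper] become NA
set_out_of_range_na <- function(x, lower, upper) {
  x[which(x < lower | x > upper)] <- NA
  x
}

# Magnus approximation of dew point (°C) from temperature (°C) and RH (%)
# Assumption: coefficients b = 17.62 and c = 243.12
magnus_dew_point <- function(temp_c, rh_pct, b = 17.62, c = 243.12) {
  gamma <- log(rh_pct / 100) + (b * temp_c) / (c + temp_c)
  (c * gamma) / (b - gamma)
}

# Check 2 (sketch): drop observed dew points that disagree by more than 3 °C
check_dew_point <- function(dew_obs, temp_c, rh_pct, tolerance = 3) {
  dew_calc <- magnus_dew_point(temp_c, rh_pct)
  dew_obs[which(abs(dew_obs - dew_calc) > tolerance)] <- NA
  dew_obs
}

temps   <- set_out_of_range_na(c(-95, 20, 61, NA), lower = -90, upper = 60)
dew_ref <- magnus_dew_point(25, 60)
dew_qc  <- check_dew_point(c(16.5, 25), temp_c = c(25, 25), rh_pct = c(60, 60))
```

At 25 °C and 60% relative humidity the formula gives a dew point of roughly 16.7 °C, so an observed 16.5 °C is kept while an observed 25 °C is replaced by NA.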
Caching System Details
Disk Cache:
Format: Apache Parquet with Zstandard compression (level 6)
Partitioning: By year and UF for fast filtering
Backup: Compressed CSV as fallback if Parquet corrupted
Location: ~/.climasus4r_cache/climate/inmet_parquet/
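The two-level lookup (session cache first, then disk, then download) can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: RDS files stand in for Parquet so the example needs no extra packages, the key is an MD5 of the deparsed parameters, and `session_cache`, `cache_key()`, and `cached_fetch()` are hypothetical names.

```r
# In-memory level: an environment that lives for the R session
session_cache <- new.env(parent = emptyenv())

# Key: MD5 of the deparsed parameter list (computed via a temporary file)
cache_key <- function(params) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeLines(deparse(params), tmp)
  unname(tools::md5sum(tmp))
}

# Look in memory, then on disk, and only then call fetch()
cached_fetch <- function(params, cache_dir, fetch) {
  key <- cache_key(params)
  if (exists(key, envir = session_cache, inherits = FALSE)) {
    return(list(data = get(key, envir = session_cache), origin = "memory"))
  }
  path <- file.path(cache_dir, paste0(key, ".rds"))  # RDS stands in for Parquet
  if (file.exists(path)) {
    data <- readRDS(path)
    origin <- "disk"
  } else {
    data <- fetch(params)
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    saveRDS(data, path)
    origin <- "download"
  }
  assign(key, data, envir = session_cache)
  list(data = data, origin = origin)
}

demo_dir <- tempfile("cache_sketch")
params <- list(years = 2023L, uf = "AM")
fake_fetch <- function(p) data.frame(year = p$years, UF = p$uf)

first  <- cached_fetch(params, demo_dir, fake_fetch)   # nothing cached yet
second <- cached_fetch(params, demo_dir, fake_fetch)   # served from memory
rm(list = ls(session_cache), envir = session_cache)    # simulate a new session
third  <- cached_fetch(params, demo_dir, fake_fetch)   # served from disk
```

The `origin` field is there only to make the lookup order visible: the three calls report "download", "memory", and "disk" in turn.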
See also
sus_climate_fill_gaps() for ML-based gap filling
sus_join_spatial() for preparing municipality data
Examples
if (FALSE) { # \dontrun{
# Basic import - single state, single year
climate_am <- sus_climate_inmet(
years = 2023,
uf = "AM"
)
# Multi-year import with parallel processing
climate_sp <- sus_climate_inmet(
years = 2020:2024,
uf = "SP",
parallel = TRUE,
workers = 4
)
# Import all Southeast states
climate_se <- sus_climate_inmet(
years = 2023,
uf = c("SP", "RJ", "MG", "ES"),
verbose = TRUE
)
# Inspect available stations
climate_se |>
dplyr::distinct(station_code, station_name, latitude, longitude)
} # }