sus_climate_inmet() downloads, imports, standardizes, and quality-controls Brazilian meteorological data from the National Institute of Meteorology (INMET).

The function implements a comprehensive processing pipeline:

  1. Download: Multi-year data with automatic retry and backoff

  2. Parsing: Handles INMET's CSV format with metadata extraction

  3. Standardization: Renames columns to canonical names (see Variables)

  4. Quality Control: Physical consistency checks (see QC Details)

  5. Caching: Two-level (memory + disk) with Parquet format

  6. Parallel Processing: Both between and within years

Usage

sus_climate_inmet(
  years = NULL,
  uf = NULL,
  use_cache = TRUE,
  cache_dir = "~/.climasus4r_cache/climate",
  parallel = TRUE,
  workers = 4,
  lang = "pt",
  verbose = TRUE
)

Arguments

years

Numeric vector of year(s) to import. Examples: 2020, 2020:2024, c(2019, 2021, 2023). Must be between 2000 and the current year. If NULL, imports the last 2 years.

uf

Character vector of Brazilian state codes (e.g., "AM", c("RJ", "MG")). Case insensitive. If NULL (default), imports all 27 federative units (26 states and the Federal District), which may be slow (see Performance).

use_cache

Logical. If TRUE (default), implements two-level caching:

  • Session cache: In-memory cache keyed by an MD5 hash of the call parameters

  • Disk cache: Parquet files with Zstandard compression

Use unlink(cache_dir, recursive = TRUE) to clear all caches.
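
The session cache described above can be pictured with a minimal base-R sketch. The names session_cache, make_cache_key() and cached_fetch() are hypothetical and chosen for illustration only; the package's internal key construction may differ (for example, it may hash more parameters than years and uf).

```r
# Hypothetical sketch: an in-memory cache keyed by an MD5 hash of the call
# parameters. These names are illustrative, not the package's internals.
session_cache <- new.env(parent = emptyenv())

make_cache_key <- function(years, uf) {
  # Normalize so that equivalent requests (e.g. "am" vs "AM") share one key
  params <- list(
    years = sort(unique(as.integer(years))),
    uf    = sort(unique(toupper(uf)))
  )
  tmp <- tempfile()
  on.exit(unlink(tmp))
  dput(params, file = tmp)        # write a text representation to disk
  unname(tools::md5sum(tmp))      # hash the file with base R
}

cached_fetch <- function(years, uf, fetch) {
  key <- make_cache_key(years, uf)
  if (!exists(key, envir = session_cache, inherits = FALSE)) {
    # Cache miss: run the (expensive) fetch once and store the result
    assign(key, fetch(years, uf), envir = session_cache)
  }
  get(key, envir = session_cache, inherits = FALSE)
}
```

With this layout, a second call with the same normalized parameters returns the stored object without invoking fetch again.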

cache_dir

Character. Directory path for disk cache. Default: "~/.climasus4r_cache/climate". Created automatically.

parallel

Logical. If TRUE (default), enables two levels of parallelism:

  • Between years: Multiple years downloaded simultaneously

  • Within year: CSV files for each year parsed in parallel

For single-year imports, only within-year parallelization applies.

workers

Integer. Number of parallel workers. Default: 4. Ignored if parallel = FALSE. Uses future::multisession backend. Automatically capped at availableCores() - 1.

lang

Character. Message language. One of:

  • "pt": Portuguese (default)

  • "en": English

  • "es": Spanish

verbose

Logical. If TRUE (default), prints detailed progress including:

  • Cache hits/misses

  • Download progress with retry attempts

  • QC modifications (rows corrected/removed)

  • Final statistics

Value

A climasus_df object (subclass of tibble) with class hierarchy: climasus_df > tbl_df > tbl > data.frame

Data Columns:

station_code

Character. INMET station identifier (e.g., "A001")

station_name

Character. Full station name

region

Character. Brazilian region (Norte, Nordeste, etc.)

UF

Character. State abbreviation

latitude, longitude, altitude

Numeric. Station latitude and longitude (WGS84) and altitude

date

POSIXct. Observation timestamp in UTC (always hourly)

year

Integer. Year extracted from date

Climate variables

Numeric. Standardized names (see Variables)

Metadata (accessible via attr(x, "climasus_meta")):

version

Package version used

timestamp

Import date/time

source

"INMET"

years

Years imported

ufs

States imported

cache_used

Whether cache was used

qc_stats

List with quality control statistics

n_stations

Number of unique stations

n_observations

Total rows

temporal_coverage

List with start and end dates

history

Character vector of processing steps
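
As a brief illustration of reading this metadata, the sketch below builds a stand-in object so that it runs offline. In practice x would be the value returned by sus_climate_inmet(); every value shown here is a placeholder, and only a few of the fields listed above are mocked.

```r
# Stand-in for a result of sus_climate_inmet(); all values are placeholders.
x <- data.frame(station_code = "A001", year = 2023L)
attr(x, "climasus_meta") <- list(
  source     = "INMET",
  years      = 2023L,
  n_stations = 1L,
  history    = c("download", "quality control")
)

# Retrieve the metadata list and inspect individual fields
meta <- attr(x, "climasus_meta")
meta$source
meta$years
names(meta)
```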

Note

  • Data frequency: Always hourly. Use sus_climate_aggregate() for daily/weekly.

  • Timezone: All timestamps are UTC. Convert if needed.

  • Missing data: Represented as NA. Use sus_climate_fill_gaps() for imputation.

  • Encoding: All strings converted to UTF-8.

  • Decimals: Converted from comma (,) to point (.).
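
For the timezone note, a minimal base-R sketch of converting UTC timestamps for display in a local zone follows. The timestamps are placeholders and "America/Manaus" is an assumed target zone for the example; Brazil spans several time zones, so the appropriate zone depends on the station.

```r
# Placeholder hourly timestamps in UTC, like those in the `date` column
date_utc <- as.POSIXct(c("2023-01-01 15:00:00", "2023-01-01 16:00:00"),
                       tz = "UTC")

# Changing the "tzone" attribute alters only how the instant is displayed;
# the underlying point in time is unchanged.
date_local <- date_utc
attr(date_local, "tzone") <- "America/Manaus"  # assumed zone for this example

format(date_local, "%Y-%m-%d %H:%M")
```

Here 15:00 UTC is displayed as 11:00, since Manaus is four hours behind UTC.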

Standardized Meteorological Variables

INMET raw column names vary by year. This function automatically detects and renames them to the following canonical names:

| Canonical Name | Description | Unit | Physical Range |
|---|---|---|---|
| rainfall_mm | Precipitation total | mm | 0 to 500 |
| patm_mb | Mean atmospheric pressure | mb | 700 to 1100 |
| patm_max_mb | Max atmospheric pressure | mb | 700 to 1100 |
| patm_min_mb | Min atmospheric pressure | mb | 700 to 1100 |
| sr_kj_m2 | Solar radiation | kJ/m² | 0 to 40000 |
| tair_dry_bulb_c | Mean air temperature | °C | -90 to 60 |
| tair_max_c | Max air temperature | °C | -90 to 60 |
| tair_min_c | Min air temperature | °C | -90 to 60 |
| dew_tmean_c | Mean dew point | °C | -90 to 60 |
| dew_tmax_c | Max dew point | °C | -90 to 60 |
| dew_tmin_c | Min dew point | °C | -90 to 60 |
| rh_mean_porc | Mean relative humidity | % | 0 to 100 |
| rh_max_porc | Max relative humidity | % | 0 to 100 |
| rh_min_porc | Min relative humidity | % | 0 to 100 |
| ws_2_m_s | Wind speed at 2 m | m/s | 0 to 100 |
| ws_gust_m_s | Wind gust | m/s | 0 to 100 |
| wd_degrees | Wind direction | degrees | 0 to 360 |

Quality Control Details

The function applies automatic physical consistency checks:

1. Physical Range Validation: Values outside physically possible ranges are set to NA:

  • Temperature: -90°C to 60°C

  • Pressure: 700 mb to 1100 mb

  • Humidity: 0% to 100%

  • Precipitation: 0 mm to 500 mm

  • Solar radiation: 0 to 40000 kJ/m²

  • Wind speed: 0 to 100 m/s

  • Wind direction: 0° to 360°

2. Dew Point Consistency: Calculates the theoretical dew point from air temperature (T) and relative humidity (RH) using the Magnus formula. If |observed - calculated| > 3°C, the observed value is set to NA.

3. Solar Radiation: Nighttime values (18h-6h) are set to 0 for physical consistency.
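
The three checks above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the helper names are hypothetical, the Magnus coefficients (a = 17.625, b = 243.04) are one commonly used parameterization that may differ from the one the package uses, and the text does not say whether the 18h-6h window is evaluated in UTC or local time, so the zone is left as a parameter.

```r
# Hypothetical helpers sketching the three QC rules; not the package internals.

# 1. Range validation: values outside [lower, upper] become NA
range_to_na <- function(x, lower, upper) {
  x[!is.na(x) & (x < lower | x > upper)] <- NA
  x
}

# 2. Dew point from air temperature (°C) and relative humidity (%) via the
#    Magnus formula. Coefficients a and b are assumed values.
magnus_dew_point <- function(t_c, rh_pct, a = 17.625, b = 243.04) {
  gamma <- log(rh_pct / 100) + a * t_c / (b + t_c)
  b * gamma / (a - gamma)
}

check_dew_point <- function(dew_obs, t_c, rh_pct, tol = 3) {
  delta <- abs(dew_obs - magnus_dew_point(t_c, rh_pct))
  dew_obs[!is.na(delta) & delta > tol] <- NA  # inconsistent readings -> NA
  dew_obs
}

# 3. Nighttime radiation: hours 18..23 and 0..5 (in zone `tz`) are set to 0.
#    Missing values are left as NA (an assumption of this sketch).
zero_night_radiation <- function(sr, time, tz = "UTC") {
  hour <- as.integer(format(time, "%H", tz = tz))
  night <- hour >= 18 | hour < 6
  sr[night & !is.na(sr)] <- 0
  sr
}

# Placeholder observations to exercise the helpers
obs <- data.frame(
  date = as.POSIXct(c("2023-01-01 03:00:00", "2023-01-01 12:00:00",
                      "2023-01-01 20:00:00"), tz = "UTC"),
  tair_dry_bulb_c = c(22.0, 75.0, 25.0),
  rh_mean_porc    = c(100, 60, 50),
  dew_tmean_c     = c(22.0, 15.0, 20.0),
  sr_kj_m2        = c(5, 1800, NA)
)

obs$tair_dry_bulb_c <- range_to_na(obs$tair_dry_bulb_c, -90, 60)
obs$dew_tmean_c <- check_dew_point(obs$dew_tmean_c, obs$tair_dry_bulb_c,
                                   obs$rh_mean_porc)
obs$sr_kj_m2 <- zero_night_radiation(obs$sr_kj_m2, obs$date)
```

In this toy data, the 75°C reading is removed by the range check, the third dew point is removed because it differs from the Magnus estimate by more than 3°C, and the 03:00 radiation value is set to 0.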

Caching System Details

Disk Cache:

  • Format: Apache Parquet with Zstandard compression (level 6)

  • Partitioning: By year and UF for fast filtering

  • Backup: Compressed CSV as fallback if Parquet corrupted

  • Location: ~/.climasus4r_cache/climate/inmet_parquet/

Examples

if (FALSE) { # \dontrun{
# Basic import - single state, single year
climate_am <- sus_climate_inmet(
  years = 2023,
  uf = "AM"
)


# Multi-year import with parallel processing
climate_sp <- sus_climate_inmet(
  years = 2020:2024,
  uf = "SP",
  parallel = TRUE,
  workers = 4
)

# Import all Southeast states
climate_se <- sus_climate_inmet(
  years = 2023,
  uf = c("SP", "RJ", "MG", "ES"),
  verbose = TRUE
)


# Inspect available stations
climate_se |>
  dplyr::distinct(station_code, station_name, latitude, longitude)
} # }