Generates a comprehensive data quality report for health data, including summaries of missing values, data distributions, date validations, and ICD code frequencies. This function helps identify potential data quality issues before analysis.

Usage

sus_data_quality_report(
  df,
  output_format = "console",
  output_file = NULL,
  check_dates = TRUE,
  check_icd = TRUE,
  top_n = 10,
  lang = "pt"
)

Arguments

df

A data frame containing health data.

output_format

Character string specifying the output format. Options: "console" (default, prints to console), "html" (saves HTML report), "markdown" (saves Markdown report).

output_file

Character string with the path where the report file is saved. Used only when output_format is not "console". If NULL (default), a default filename based on the current timestamp is used.

check_dates

Logical. If TRUE (default), performs date validation checks (e.g., future dates, dates before birth).

check_icd

Logical. If TRUE (default), summarizes ICD code distributions.

top_n

Integer. Number of top categories to show in frequency tables. Default is 10.

lang

Character string specifying the language for the report. Options: "en" (English), "pt" (Portuguese, default), "es" (Spanish).

Value

Invisibly returns a list containing the quality metrics. If output_format = "console", prints the report to the console. Otherwise, saves the report to a file.

Details

The data quality report includes:

  • Dataset Overview: Dimensions, column types

  • Missing Values: Count and percentage of NAs by column

  • Demographic Variables: Frequency tables for sex, race, age groups

  • Date Validation: Checks for invalid dates (future, before 1900, etc.)

  • ICD Codes: Most frequent diagnosis codes (top 10 with the default top_n = 10)

  • Geographic Distribution: Top municipalities
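
The checks listed above can be approximated in base R. The following is a minimal sketch under stated assumptions, not the package's actual implementation: the toy data frame and its column names (sex, icd, event_date) are hypothetical, and the 1900 cutoff and the top_n value simply mirror the description in this section.

```r
# Toy data standing in for a health dataset (hypothetical column names).
toy <- data.frame(
  sex        = c("F", "M", NA, "F", "M"),
  icd        = c("J18", "A90", "J18", NA, "J18"),
  event_date = as.Date(c("2020-01-15", "1899-12-31", "2999-01-01",
                         NA, "2021-06-30")),
  stringsAsFactors = FALSE
)

# Missing values: count and percentage of NAs by column.
na_count <- colSums(is.na(toy))
missing_summary <- data.frame(
  column      = names(na_count),
  n_missing   = as.integer(na_count),
  pct_missing = round(100 * as.numeric(na_count) / nrow(toy), 1)
)

# Date validation: count future dates and dates before 1900 (NAs ignored).
today <- Sys.Date()
date_flags <- c(
  future      = sum(toy$event_date > today, na.rm = TRUE),
  before_1900 = sum(toy$event_date < as.Date("1900-01-01"), na.rm = TRUE)
)

# ICD codes: the top_n most frequent codes; table() drops NAs by default.
top_n <- 2
icd_freq <- sort(table(toy$icd), decreasing = TRUE)
top_icd <- head(icd_freq, top_n)

print(missing_summary)
print(date_flags)
print(top_icd)
```

Running comparable checks by hand on a small sample can help interpret the report, for instance to confirm which columns drive a high missing-value percentage.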

Examples

if (FALSE) { # \dontrun{
library(climasus4r)

# Print report to console (df is a data frame of health data loaded beforehand)
sus_data_quality_report(df, lang = "pt")

# Save HTML report
sus_data_quality_report(
  df,
  output_format = "html",
  output_file = "reports/dq_report.html",
  lang = "en"
)

# Save Markdown report
sus_data_quality_report(
  df,
  output_format = "markdown",
  output_file = "reports/dq_report.md"
)
} # }