Master function that orchestrates the complete AMR data preprocessing pipeline. Applies standardization, enrichment, and derivation in sequence.

Usage

amr_preprocess(
  data,
  config = NULL,
  phases = "all",
  verbose = TRUE,
  validate = TRUE,
  generate_report = TRUE
)

Arguments

data

Data frame. Raw AMR dataset.

config

Configuration object from amr_config(). If NULL, uses defaults.

phases

Character vector. Which phases to run: "standardize", "enrich", "derive", or "all" (default). Allows partial pipeline execution.

verbose

Logical. Print detailed progress messages. Default TRUE.

validate

Logical. Run validation checks before and after preprocessing. Default TRUE.

generate_report

Logical. Generate preprocessing report. Default TRUE.

Value

List with class "amr_result":

- data: Preprocessed data frame
- config: Configuration used
- log: List of processing logs and summaries
- report: Preprocessing report (if generate_report = TRUE)
- metadata: Pipeline execution metadata
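To make the structure described above concrete, the sketch below builds a stand-in object by hand, using the same class and element names, and reads its elements back. Every value is a placeholder chosen for illustration (including the `patient_id` and `age` columns). It is not real output from amr_preprocess(), and the actual element types may differ.

```r
# Hand-built stand-in for an "amr_result" object (placeholder values only;
# this is NOT produced by amr_preprocess()).
mock_result <- structure(
  list(
    data     = data.frame(patient_id = c("P1", "P2"), age = c(54, 71)),
    config   = list(),         # placeholder for the configuration used
    log      = list(),         # placeholder for processing logs and summaries
    report   = "placeholder",  # placeholder for the preprocessing report
    metadata = list()          # placeholder for pipeline execution metadata
  ),
  class = "amr_result"
)

inherits(mock_result, "amr_result")  # check the class before using the object
names(mock_result)                   # element names as listed under Value
clean_data <- mock_result$data       # extract the preprocessed data frame
nrow(clean_data)
```

Checking the class with inherits() before accessing elements is a common defensive pattern for S3 objects like this one.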

Details

Pipeline phases:

1. **Standardization**: Column mapping, value normalization, date parsing
2. **Enrichment**: Derive missing optional variables (Age, LOS, infection type)
3. **Derivation**: Create analytical variables (event IDs, MDR/XDR, weights)
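One way a `phases` argument like this could be expanded into an ordered set of steps is sketched below. The helper `resolve_phases()` is hypothetical and is not taken from the package. It assumes that "all" expands to the three phases in the order listed above, and that a partial selection runs in pipeline order regardless of the order supplied.

```r
# Hypothetical helper (an assumption, not the package's implementation):
# expand a `phases` argument into an ordered vector of phase names.
resolve_phases <- function(phases = "all") {
  known <- c("standardize", "enrich", "derive")  # pipeline order from Details
  if (identical(phases, "all")) {
    return(known)
  }
  unknown <- setdiff(phases, known)
  if (length(unknown) > 0) {
    stop("Unknown phase(s): ", paste(unknown, collapse = ", "))
  }
  # Keep pipeline order regardless of the order the caller supplied.
  known[known %in% phases]
}

resolve_phases("all")
resolve_phases(c("derive", "standardize"))
```

Rejecting unknown names early, rather than silently ignoring them, keeps a typo such as "standardise" from producing a run that skips a phase without warning.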

Examples

if (FALSE) { # \dontrun{
# Full pipeline with defaults
result <- amr_preprocess(raw_data)
clean_data <- result$data

# Custom configuration
config <- amr_config(
  hai_cutoff = 3,
  mdr_definition = "CDC",
  fuzzy_match = TRUE
)
result <- amr_preprocess(raw_data, config = config)

# Run only standardization phase
result <- amr_preprocess(raw_data, phases = "standardize")

# Summary
summary(result)
} # }