Checks that all required columns are present and contain valid data. Returns validation results and optionally stops on failure.
Usage
validate_required_fields(
data,
required_cols,
stop_on_failure = TRUE,
allow_na = FALSE,
min_completeness = 0.8
)
Arguments
- data
Data frame to validate.
- required_cols
Character vector. Required column names.
- stop_on_failure
Logical. If TRUE, stops execution on validation failure. If FALSE, returns validation report. Default TRUE.
- allow_na
Logical. If TRUE, allows NA values in required columns. Default FALSE.
- min_completeness
Numeric. Minimum proportion of non-NA values required (0-1). Default 0.8 (80 percent completeness).
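To illustrate what the min_completeness threshold measures, the proportion of non-NA values per column can be computed in base R roughly as follows. This is a sketch only: the toy data frame `df` and the variables `completeness` and `below` are assumptions made for illustration, and this is not necessarily how validate_required_fields() computes completeness internally.

```r
# Toy data: illustrative only, not part of the package
df <- data.frame(
  patient_id = c("P1", "P2", "P3", "P4", "P5"),
  organism_normalized = c("E. coli", NA, "S. aureus", NA, "E. coli"),
  stringsAsFactors = FALSE
)

required_cols <- c("patient_id", "organism_normalized")
min_completeness <- 0.8

# Proportion of non-NA values in each required column
completeness <- colMeans(!is.na(df[required_cols]))

# Names of columns whose completeness falls below the threshold
below <- names(completeness)[completeness < min_completeness]
```

Here organism_normalized has 3 non-NA values out of 5 (a completeness of 0.6), so it falls below the default threshold of 0.8, while patient_id is fully populated.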
Value
List with validation results:
- valid: Logical. Overall validation status
- missing_cols: Character vector of missing columns
- incomplete_cols: Data frame of columns below min_completeness
- messages: Character vector of validation messages
Examples
if (FALSE) { # \dontrun{
# Stop on failure (default)
validate_required_fields(
data,
required_cols = c("patient_id", "organism_normalized", "antibiotic_normalized")
)
# Get validation report without stopping
validation <- validate_required_fields(
data,
required_cols = c("patient_id", "organism_normalized"),
stop_on_failure = FALSE
)
} # }
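When stop_on_failure = FALSE, the returned list can be inspected before deciding how to proceed. The sketch below uses a hand-built stand-in for the report so that it runs without the package. The field names follow the Value section, but the contents (including the message text) and the `summary_lines` variable are invented for illustration.

```r
# Stand-in for a report returned with stop_on_failure = FALSE.
# Field names follow the Value section; the values are made up.
validation <- list(
  valid = FALSE,
  missing_cols = c("antibiotic_normalized"),
  incomplete_cols = data.frame(),
  messages = c("Hypothetical message: a required column is missing")
)

summary_lines <- character(0)

if (!validation$valid) {
  # Report any required columns that were absent
  if (length(validation$missing_cols) > 0) {
    summary_lines <- c(
      summary_lines,
      paste("Missing columns:", paste(validation$missing_cols, collapse = ", "))
    )
  }
  # Report how many columns fell below the completeness threshold
  if (nrow(validation$incomplete_cols) > 0) {
    summary_lines <- c(
      summary_lines,
      paste(nrow(validation$incomplete_cols), "column(s) below min_completeness")
    )
  }
}

# Print the collected summary, one item per line
writeLines(summary_lines)
```

Collecting the findings into a character vector first makes it easy to log them, attach them to a QC report, or pass them to stop() later.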