CLiREN-LMS
Data Cleaning and Preparation in R

Recoding Categorical Variables

Learning Outcomes

30-45 minutes | Applied | Step 2 of 10

  • Explain the purpose of data cleaning and preparation in clinical research data management.
  • Distinguish between raw data, cleaned data, analysis-ready data, derived variables, and query outputs.
  • Describe a reproducible R workflow for importing REDCap exports and preparing datasets for review.
  • Explain the role of the REDCap API in automated data export and why API use must be governed carefully.
  • Identify and classify missing data using study-specific definitions and documentation.
  • Recode categorical variables transparently while preserving traceability to original values.
  • Create derived variables in R using protocol-defined rules.
  • Write cleaning scripts that are readable, rerunnable, and suitable for review by another data manager.
  • Produce simple cleaning logs and outputs that support query management, monitoring, and analysis preparation.