CLiREN-LMS
Data Cleaning and Preparation in R

The Purpose of Data Cleaning and Preparation

Table 1

30-45 minutes Applied Step 13 of 23

| Term | Meaning | Example | Governance implication |
| --- | --- | --- | --- |
| Raw data | Data exported or received from the source system without manual alteration | REDCap CSV export saved in `data_raw` | Preserve as received and protect from accidental editing |
| Cleaned data | Dataset after documented checks, corrections, and transformations | Dataset with corrected values after database queries are resolved and re-exported | Must be traceable to source data and cleaning decisions |
| Analysis-ready data | Dataset structured for statistical analysis | One row per participant with derived endpoint variables | Requires statistician and protocol alignment |
| Derived variable | Variable calculated from one or more source variables | Age at enrollment, length of stay, visit window flag | Rule must be documented and reproducible |
| Query output | Listing of records requiring site or investigator review | Missing consent dates or impossible date sequences | Should feed into approved query workflow |
| Cleaning log | Record of checks, findings, decisions, and outputs | Table of rule ID, affected records, action taken, status | Supports transparency and review |
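
To make the terms in the table concrete, here is a minimal base R sketch of how a derived variable, a query output, and a cleaning log might relate to one another. Everything in it is an assumption for illustration: the column names, the rule IDs `R001` and `R002`, the helper `make_queries()`, and the small inline dataset (which stands in for an export that would normally be read from `data_raw`). The age calculation uses an approximate 365.25-day year; a real study would follow the rule defined in its protocol.

```r
# Hypothetical stand-in for a raw export; the raw object is never modified below.
raw <- data.frame(
  participant_id = c("P001", "P002", "P003", "P004"),
  birth_date     = as.Date(c("1980-05-14", "1975-11-02", "1990-01-30", "1968-07-21")),
  consent_date   = as.Date(c("2024-01-09", NA, "2024-02-04", "2024-02-28")),
  enroll_date    = as.Date(c("2024-01-10", "2024-01-18", "2024-02-05", "2024-03-01")),
  admit_date     = as.Date(c("2024-01-12", "2024-01-20", "2024-02-06", "2024-03-04")),
  discharge_date = as.Date(c("2024-01-15", "2024-01-25", "2024-02-03", "2024-03-09")),
  stringsAsFactors = FALSE
)

# Derived variables: calculated on a copy so the raw data stay as received.
analysis <- raw
analysis$age_at_enrollment <- floor(
  as.numeric(difftime(analysis$enroll_date, analysis$birth_date, units = "days")) / 365.25
)
analysis$length_of_stay_days <- as.numeric(
  difftime(analysis$discharge_date, analysis$admit_date, units = "days")
)

# Checks: each returns a logical vector with no NA values.
missing_consent <- is.na(raw$consent_date)
bad_sequence <- !is.na(raw$admit_date) & !is.na(raw$discharge_date) &
  raw$discharge_date < raw$admit_date

# Hypothetical helper: turn flagged rows into a query listing for review.
make_queries <- function(data, flagged, rule_id, finding) {
  n <- sum(flagged)
  data.frame(
    rule_id        = rep(rule_id, n),
    participant_id = data$participant_id[flagged],
    finding        = rep(finding, n),
    stringsAsFactors = FALSE
  )
}

# Query output: records that need site or investigator review.
# Values are flagged here, not corrected; corrections happen at the source.
query_output <- rbind(
  make_queries(raw, missing_consent, "R001", "Consent date is missing"),
  make_queries(raw, bad_sequence, "R002", "Discharge date precedes admission date")
)

# Cleaning log: one row per rule, with the fields named in the table.
cleaning_log <- data.frame(
  rule_id          = c("R001", "R002"),
  check            = c("Missing consent date", "Impossible date sequence"),
  affected_records = c(sum(missing_consent), sum(bad_sequence)),
  action_taken     = "Query raised for site review",
  status           = "Open",
  stringsAsFactors = FALSE
)

print(query_output)
print(cleaning_log)
```

Note that the sketch only flags the problem records and leaves their values alone, in keeping with the table's description of cleaned data as the result of resolved queries and a fresh export.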