The Purpose of Data Cleaning and Preparation
Why Cleaning and Preparation Are Different
Cleaning identifies and resolves data quality problems
Cleaning focuses on missing values, invalid codes, impossible dates, duplicates, inconsistencies, and other findings that may require review or correction.
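As a rough illustration of the checks listed above, the sketch below scans a small export and reports findings without changing any values. The field names, the allowed code set, and the sample rows are all hypothetical, and the duplicate check looks only at repeated subject identifiers; a real study would take these rules from its data validation plan.

```python
from collections import Counter
from datetime import date

# Hypothetical export rows; field names and values are assumptions for illustration.
rows = [
    {"subject_id": "001", "sex": "1", "consent_date": "2023-03-14"},
    {"subject_id": "002", "sex": "9", "consent_date": "2023-02-30"},  # invalid code, impossible date
    {"subject_id": "003", "sex": "", "consent_date": "2023-04-02"},   # missing value
    {"subject_id": "001", "sex": "1", "consent_date": "2023-03-14"},  # repeated subject_id
]
VALID_SEX_CODES = {"1", "2"}  # assumed code list


def find_issues(rows):
    """Return (subject_id, field, problem) findings. The input rows are not modified."""
    findings = []

    # Duplicates: any subject_id that appears more than once.
    id_counts = Counter(r["subject_id"] for r in rows)
    for sid, n in id_counts.items():
        if n > 1:
            findings.append((sid, "subject_id", "duplicate"))

    for r in rows:
        sid = r["subject_id"]

        # Missing values and invalid codes.
        if r["sex"] == "":
            findings.append((sid, "sex", "missing"))
        elif r["sex"] not in VALID_SEX_CODES:
            findings.append((sid, "sex", "invalid code"))

        # Impossible dates: fromisoformat rejects dates such as 30 February.
        try:
            date.fromisoformat(r["consent_date"])
        except ValueError:
            findings.append((sid, "consent_date", "impossible date"))

    return findings


findings = find_issues(rows)
for finding in findings:
    print(finding)
```

The function only lists what needs review. Deciding whether a finding is a true error, and correcting it, is a separate step.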
Preparation makes data usable for a purpose
Preparation may include selecting variables, standardizing names, converting dates, reshaping repeated measures, joining datasets, creating derived variables, and producing summaries.
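Several of the preparation steps above can be sketched in a few lines. The example below selects variables, standardizes names, converts dates, and creates one derived variable. The column names, the sample rows, and the derived variable days_since_consent are assumptions made for illustration, not part of any particular study.

```python
from datetime import date

# Hypothetical raw export; column names and values are assumptions.
raw = [
    {"Subject ID": "001", "Consent Date": "2023-03-14", "Visit Date": "2023-03-28", "Site Notes": "n/a"},
    {"Subject ID": "002", "Consent Date": "2023-04-02", "Visit Date": "2023-04-09", "Site Notes": ""},
]

KEEP = ["Subject ID", "Consent Date", "Visit Date"]  # variables selected for analysis


def standardize(name):
    """Turn 'Consent Date' into 'consent_date'."""
    return name.strip().lower().replace(" ", "_")


def prepare(rows):
    """Build a new analysis-ready list of rows; the raw rows are left unchanged."""
    prepared = []
    for row in rows:
        # Select variables and standardize their names.
        out = {standardize(k): row[k] for k in KEEP}

        # Convert ISO date strings to date objects.
        for col in ("consent_date", "visit_date"):
            out[col] = date.fromisoformat(out[col])

        # Derived variable: days from consent to visit.
        out["days_since_consent"] = (out["visit_date"] - out["consent_date"]).days
        prepared.append(out)
    return prepared


prepared = prepare(raw)
for row in prepared:
    print(row["subject_id"], row["days_since_consent"])
```

None of these steps changes what was recorded. Each one changes only the form in which the data are presented for analysis.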
Not every transformation is a correction
Recoding 1 and 0 into Yes and No is a preparation step. Changing an incorrect consent date after source review is a correction. The workflow should document which is which.
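One way to document which is which is to keep a simple step log that labels every change as either a preparation step or a correction. The sketch below is a minimal version of that idea. The log structure, the field name smoker, and the sample entries are hypothetical.

```python
YES_NO = {1: "Yes", 0: "No"}
ALLOWED_KINDS = {"preparation", "correction"}


def log_step(log, description, kind):
    """Append a labeled entry to the log; reject labels other than the two allowed kinds."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"kind must be one of {sorted(ALLOWED_KINDS)}")
    log.append({"description": description, "kind": kind})


# Hypothetical rows and field name.
rows = [
    {"subject_id": "001", "smoker": 1},
    {"subject_id": "002", "smoker": 0},
]
log = []

# Preparation: the recorded value is unchanged in meaning, only relabeled.
recoded = [{**r, "smoker": YES_NO[r["smoker"]]} for r in rows]
log_step(log, "Recoded smoker from 1/0 to Yes/No", "preparation")

# Correction: the recorded value itself changed after source review.
# The script does not edit the date; it only records that a corrected export was used.
log_step(log, "Consent date for subject 002 corrected after source review", "correction")

for entry in log:
    print(entry["kind"], "-", entry["description"])
```

Keeping the label next to each step lets a reviewer see at a glance which changes altered recorded values and which only changed their presentation.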
The source database remains authoritative
When clinical data need correction, the correction should normally occur in the validated source database through the query and audit trail process, then be reflected in a new export.