The Purpose of Data Cleaning and Preparation
Key Idea
Data cleaning is the organized process of identifying, investigating, documenting, and resolving problems in research data. It is not a casual activity performed after data collection is complete; it is part of a study's quality system. In clinical research, cleaning begins before the first participant is enrolled, because the protocol, case report forms (CRFs), database design, validation rules, completion guidelines, and monitoring plan all determine which kinds of errors are likely to occur and how they will be handled. R becomes useful once those expectations can be translated into transparent, repeatable checks and preparation steps.
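As a rough illustration of what "transparent, repeatable checks" might look like in R, the sketch below writes a few validation rules as named functions and applies them to a small data frame. Everything in it is hypothetical: the data, the column names (`participant_id`, `age_years`, `visit_date`, `consent_date`), the minimum age of 18, and the helper `run_checks()` are assumptions made for the example, not taken from any particular protocol or package.

```r
# Hypothetical visit-level data; every name and value is invented for illustration.
visits <- data.frame(
  participant_id = c("P001", "P002", "P003", "P004"),
  age_years      = c(34, 17, 52, NA),
  visit_date     = as.Date(c("2024-03-01", "2024-03-04", "2024-02-20", "2024-03-10")),
  consent_date   = as.Date(c("2024-02-27", "2024-03-01", "2024-02-25", "2024-03-08")),
  stringsAsFactors = FALSE
)

# Each rule returns TRUE for the records that FAIL the check.
# The minimum age of 18 is an assumed protocol requirement.
rules <- list(
  age_missing          = function(d) is.na(d$age_years),
  age_below_minimum    = function(d) !is.na(d$age_years) & d$age_years < 18,
  visit_before_consent = function(d) d$visit_date < d$consent_date
)

# Apply every rule and collect the failures in one log.
# Records are flagged for review, never silently changed.
run_checks <- function(data, rules) {
  findings <- lapply(names(rules), function(rule_name) {
    failed <- which(rules[[rule_name]](data))
    data.frame(
      participant_id = data$participant_id[failed],
      rule           = rep(rule_name, length(failed)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, findings)
}

query_log <- run_checks(visits, rules)
print(query_log)
```

With this toy data the log should hold three findings: P004 (missing age), P002 (below the assumed minimum age), and P003 (visit dated before consent). Keeping the rules in a named list means the same script can be rerun unchanged whenever new data arrive, and the rule names give each finding a documented reason.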