CLiREN-LMS
Data Cleaning and Preparation in R

Common Cleaning Risks and How to Avoid Them

30-45 minutes | Applied | Step 6 of 9

Table 1

| Risk | Example | Prevention |
| --- | --- | --- |
| Overwriting raw variables | Replacing numeric code with label in same column | Create new derived or labeled variable |
| Silent row loss | Filtering without counting exclusions | Record before and after counts |
| Incorrect dates | Treating text dates as real dates | Parse dates explicitly and inspect results |
| Wrong recoding | Reversing `1 = Female`, `2 = Male` | Validate against data dictionary |
| Uncontrolled missing codes | Treating `999` as a real value | Define and convert special missing codes carefully |
| Unreviewed automation | Sending query lists without human review | Require data manager review before action |
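
The first five preventions in the table can be sketched in base R as follows. This is a minimal illustration, not a prescribed workflow: the data frame `raw`, its column names, the derived column names, and the date format `%Y-%m-%d` are all hypothetical, and the mapping `1 = Female`, `2 = Male` and the missing code `999` are taken from the table's examples and should be confirmed against your own data dictionary. The last row (human review) is a process control and is not shown in code.

```r
# Hypothetical raw data; names, values and codes are illustrative assumptions.
raw <- data.frame(
  id         = 1:5,
  sex        = c(1, 2, 2, 1, 999),
  age        = c(34, 999, 51, 28, 45),
  visit_date = c("2024-01-15", "2024-02-30", "2024-03-01",
                 "15/04/2024", "2024-05-20"),
  stringsAsFactors = FALSE
)

# Work on a copy so the raw object is never modified.
clean <- raw

# Uncontrolled missing codes: convert 999 to NA in NEW columns,
# leaving the original `age` and `sex` columns untouched.
clean$age_clean <- replace(clean$age, clean$age == 999, NA)
clean$sex_code  <- replace(clean$sex, clean$sex == 999, NA)

# Overwriting raw variables / wrong recoding: add a labeled column next to
# the numeric code. The mapping is assumed to come from the data dictionary.
clean$sex_label <- factor(clean$sex_code,
                          levels = c(1, 2),
                          labels = c("Female", "Male"))

# Validate the recoding by cross-tabulating code against label.
print(table(code = clean$sex_code, label = clean$sex_label, useNA = "ifany"))

# Incorrect dates: parse with an explicit format into a new column.
# Values that do not match the format, or are not real dates, become NA.
clean$visit_date_parsed <- as.Date(clean$visit_date, format = "%Y-%m-%d")

# Inspect the values that failed to parse instead of ignoring them.
date_failures <- clean[is.na(clean$visit_date_parsed), c("id", "visit_date")]
print(date_failures)

# Silent row loss: record row counts before and after every filter.
n_before <- nrow(clean)
analysis <- clean[!is.na(clean$age_clean) & !is.na(clean$visit_date_parsed), ]
n_after  <- nrow(analysis)
message(sprintf("Rows before: %d, after: %d, excluded: %d",
                n_before, n_after, n_before - n_after))
```

Keeping derived values in separate columns means every cleaning decision can be checked against the original value in the same row, and the logged counts give a record of how many rows each filter excluded.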