CLiREN-LMS
Data Quality Management and Query Resolution

5.2 Dimensions of Data Quality

30-45 minutes Foundational Step 3 of 7
Reading 1

Clinical research data quality is multidimensional. Different frameworks use different terminology, but several dimensions appear consistently in the clinical data management and health data quality literature. Understanding these dimensions helps data managers design checks that go beyond simple missing-value counts.

Accuracy refers to whether data correctly represent the source observation or real-world event. If a participant's laboratory report shows hemoglobin of 8.5 g/dL but the database records 85 g/dL, the value is inaccurate.

Completeness refers to whether expected data are present. If the primary outcome is missing for many participants, the dataset may be incomplete even if other variables are well populated.

Validity refers to whether data conform to defined formats, ranges, and allowable values. A visit date entered as free text rather than in a date format may be invalid.

Consistency refers to whether related values agree logically. A discharge date before the admission date is inconsistent.

Timeliness refers to whether data are entered, reviewed, corrected, and available within required timeframes. Timely data are particularly important for safety monitoring, adaptive study decisions, and operational dashboards.

Uniqueness refers to the absence of unintended duplicates. Duplicate participant records can distort enrollment counts and analyses.

Integrity refers to protection from unauthorized, undocumented, or inappropriate change. Integrity depends on audit trails, access control, source traceability, and controlled correction procedures.

Another important dimension is interpretability. Data are not useful if future users cannot understand variable definitions, coding, units, collection methods, or missing value meanings. For example, a variable coded as `1`, `2`, and `3` is not interpretable unless the data dictionary explains what those codes mean. Interpretability connects data quality to metadata, FAIR principles, and reproducibility.
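
The interpretability example can be made concrete with a small sketch. It assumes a hypothetical data dictionary for a hypothetical variable `smoking_status`; the variable name, the labels, and the stray code `9` are illustrative and do not come from any real study. The idea is to decode values through the dictionary and to flag any code that appears in the data but is not documented.

```python
# Hypothetical data dictionary: variable name -> {code: label}.
# The variable and its labels are assumptions made for illustration.
DATA_DICTIONARY = {
    "smoking_status": {1: "Never", 2: "Former", 3: "Current"},
}


def decode(variable, value, dictionary=DATA_DICTIONARY):
    """Return the documented label for a coded value, or None if undocumented."""
    return dictionary.get(variable, {}).get(value)


def undocumented_codes(variable, values, dictionary=DATA_DICTIONARY):
    """Return the sorted codes found in the data that the dictionary does not define."""
    known = dictionary.get(variable, {})
    return sorted({v for v in values if v not in known})


# Illustrative observed values; 9 is not defined in the dictionary.
observed = [1, 2, 3, 9, 2, 9]
labels = [decode("smoking_status", v) for v in observed]
unknown = undocumented_codes("smoking_status", observed)

print(labels)
print(unknown)
```

A code flagged this way is not necessarily an error. It may be a legitimate missing-value code that was never documented, in which case the appropriate fix is to update the data dictionary rather than the data.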
No single quality dimension is sufficient. A dataset may be complete but invalid if values are outside plausible ranges. It may be valid but inaccurate if values were copied from the wrong source document. It may be accurate and complete but not timely enough for interim safety review. Data managers should therefore design a portfolio of checks that assess multiple dimensions.

**Table 5.1: Data Quality Dimensions in Clinical Research**

| Dimension | Meaning | Example quality check |
| --- | --- | --- |
| Accuracy | Values reflect source documents or true observations | Compare database value with laboratory report |
| Completeness | Expected data are present or missingness is explained | List participants missing primary outcome |
| Validity | Values conform to allowed formats, ranges, and codes | Temperature must be 30-45 degrees Celsius |
| Consistency | Related values agree logically | Follow-up date must occur after enrollment date |
| Timeliness | Data are available within expected timeframes | Forms entered within 72 hours of visit |
| Uniqueness | Records are not unintentionally duplicated | Identify duplicate participant IDs |
| Integrity | Data are protected from unauthorized change | Review audit trail for critical edits |
| Interpretability | Data can be understood by users and analysts | Verify data dictionary and coding definitions |
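
Several of the example checks in Table 5.1 can be automated. The following is a minimal sketch of such a portfolio, not a production implementation. The record layout, field names, and sample values are hypothetical; only the rules themselves (temperature of 30-45 degrees Celsius, follow-up after enrollment, entry within 72 hours, duplicate participant IDs, missing primary outcome) are taken from the table.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# Hypothetical records; field names and values are illustrative only.
records = [
    {"participant_id": "P001", "enrollment_date": date(2024, 1, 10),
     "followup_date": date(2024, 2, 10), "temperature_c": 36.8,
     "primary_outcome": 1,
     "visit_datetime": datetime(2024, 2, 10, 9, 0),
     "entered_datetime": datetime(2024, 2, 11, 9, 0)},
    {"participant_id": "P002", "enrollment_date": date(2024, 1, 12),
     "followup_date": date(2024, 2, 12), "temperature_c": 368.0,
     "primary_outcome": None,
     "visit_datetime": datetime(2024, 2, 12, 10, 0),
     "entered_datetime": datetime(2024, 2, 13, 10, 0)},
    {"participant_id": "P003", "enrollment_date": date(2024, 1, 20),
     "followup_date": date(2024, 1, 15), "temperature_c": 37.1,
     "primary_outcome": 0,
     "visit_datetime": datetime(2024, 1, 15, 9, 0),
     "entered_datetime": datetime(2024, 1, 20, 9, 0)},
    {"participant_id": "P001", "enrollment_date": date(2024, 1, 10),
     "followup_date": date(2024, 2, 10), "temperature_c": 36.8,
     "primary_outcome": 1,
     "visit_datetime": datetime(2024, 2, 10, 9, 0),
     "entered_datetime": datetime(2024, 2, 11, 9, 0)},
]


def is_complete(rec):
    """Completeness: the primary outcome is present."""
    return rec["primary_outcome"] is not None


def is_valid(rec):
    """Validity: temperature lies within 30-45 degrees Celsius."""
    t = rec["temperature_c"]
    return t is not None and 30 <= t <= 45


def is_consistent(rec):
    """Consistency: follow-up date occurs after enrollment date."""
    return rec["followup_date"] > rec["enrollment_date"]


def is_timely(rec):
    """Timeliness: the form was entered within 72 hours of the visit."""
    return rec["entered_datetime"] - rec["visit_datetime"] <= timedelta(hours=72)


# Record-level checks, keyed by the dimension each one assesses.
CHECKS = {
    "completeness": is_complete,
    "validity": is_valid,
    "consistency": is_consistent,
    "timeliness": is_timely,
}


def run_checks(recs):
    """Return (row index, participant ID, failed dimension) for every failure."""
    return [(i, rec["participant_id"], name)
            for i, rec in enumerate(recs)
            for name, check in CHECKS.items()
            if not check(rec)]


def duplicate_ids(recs):
    """Uniqueness: participant IDs that appear more than once."""
    counts = Counter(rec["participant_id"] for rec in recs)
    return sorted(pid for pid, n in counts.items() if n > 1)


findings = run_checks(records)
duplicates = duplicate_ids(records)
for finding in findings:
    print(finding)
print("Duplicate IDs:", duplicates)
```

Accuracy and integrity are deliberately left out of this sketch. As the table indicates, they require comparison with source documents and review of the audit trail, which cannot be derived from the dataset alone.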