Codebooks and Data Dictionaries
A data dictionary is usually a structured table that defines the variables in a database or dataset, one row per variable. Typical columns include variable name, form, field label, field type, choices, validation, branching logic, and required status. In REDCap, the data dictionary is also a build artifact, because uploading it can define the structure of the database itself.
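As a rough illustration of that structure, the sketch below builds a tiny data dictionary as an R data frame and unpacks one choice list into a code-to-label lookup. The three fields, the simplified column names, and the helper `parse_choices()` are hypothetical; the `code, label | code, label` layout follows the convention REDCap uses for choice fields, but a real REDCap export has longer column headers and more columns.

```r
# Hypothetical three-field data dictionary; column names are simplified for illustration.
dict <- data.frame(
  variable   = c("age", "sex", "pregnant"),
  form       = c("demographics", "demographics", "demographics"),
  label      = c("Age (years)", "Sex", "Currently pregnant?"),
  field_type = c("text", "radio", "yesno"),
  choices    = c(NA, "1, Female | 2, Male", NA),
  validation = c("integer", NA, NA),
  branching  = c(NA, NA, "[sex] = '1'"),
  required   = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Split a "code, label | code, label" choice string into a lookup table.
parse_choices <- function(x) {
  parts <- trimws(strsplit(x, "|", fixed = TRUE)[[1]])
  data.frame(
    code  = trimws(sub(",.*$", "", parts)),     # text before the first comma
    label = trimws(sub("^[^,]*,", "", parts)),  # text after the first comma
    stringsAsFactors = FALSE
  )
}

sex_choices <- parse_choices(dict$choices[dict$variable == "sex"])
print(sex_choices)
```

Keeping choices in a parsed lookup like this makes it easier to carry the same coding scheme into a codebook later.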
A codebook is often more explanatory. It helps humans understand a dataset. It may include variable definitions, coding schemes, units, missingness notes, derivation rules, and usage guidance. A codebook for an analysis dataset may explain how `day28_outcome` was derived, how deaths were coded, which participants were excluded, and how repeated visits were summarized.
R can help create a simple codebook automatically from the dataset itself.
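One way this might look is sketched below in base R. The function name `make_codebook()` and the example data frame `trial` are assumptions made for illustration, and the data values are invented; only `day28_outcome` is a variable name taken from the text above.

```r
# Build a one-row-per-variable summary of a data frame.
make_codebook <- function(df) {
  data.frame(
    variable   = names(df),
    class      = vapply(df, function(x) class(x)[1], character(1)),
    n_missing  = vapply(df, function(x) sum(is.na(x)), integer(1)),
    n_distinct = vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1)),
    example    = vapply(df, function(x) {
      v <- x[!is.na(x)]
      # First non-missing value as text, or NA if the column is entirely missing.
      if (length(v) == 0) NA_character_ else as.character(v[1])
    }, character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up data for illustration only.
trial <- data.frame(
  participant_id = c("P001", "P002", "P003", "P004"),
  age            = c(54L, 61L, NA, 47L),
  day28_outcome  = factor(c("alive", "dead", "alive", NA)),
  stringsAsFactors = FALSE
)

codebook <- make_codebook(trial)
print(codebook)
```

The result describes only what can be read off the data: names, storage classes, missing counts, and distinct values.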
This automated codebook is only a starting point. It does not know clinical definitions or derivation rules unless those are supplied. The data manager should enrich the codebook with labels, definitions, units, and source information.
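Enrichment can be as simple as joining the automated summary to a hand-maintained table of labels, units, and sources. The sketch below assumes two hypothetical data frames, `auto` and `meta`; the labels and sources shown are placeholders, not real study definitions.

```r
# Automated part: structure only (typed by hand here to keep the sketch self-contained).
auto <- data.frame(
  variable = c("participant_id", "age", "day28_outcome"),
  class    = c("character", "integer", "factor"),
  stringsAsFactors = FALSE
)

# Hand-maintained part; the entries below are placeholders for illustration.
meta <- data.frame(
  variable = c("participant_id", "age"),
  label    = c("Participant identifier", "Age at enrollment"),
  units    = c(NA, "years"),
  source   = c("enrollment form", "enrollment form"),
  stringsAsFactors = FALSE
)

# A left join keeps every variable, so undocumented ones show up with NA labels.
enriched <- merge(auto, meta, by = "variable", all.x = TRUE, sort = FALSE)
undocumented <- enriched$variable[is.na(enriched$label)]
print(undocumented)
```

Listing the variables that still lack a label gives the data manager a concrete to-do list before the codebook is released.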