Clinical Research Data Management

A clinical research data management course based on, and kept in sync with, the completed course manual.

12 chapters / 12-week blended course

Course Path

133 lessons across 12 modules.

Objectives

Apply clinical research data management principles across the study lifecycle.
Design REDCap databases and CRFs from protocol requirements.
Use R for data cleaning, descriptive analysis, visualization, and reporting.
Document, validate, govern, and prepare clinical research data for reuse.

Module 1

Foundations of Clinical Research Data Management

Clinical research is one of the major routes through which societies generate evidence about health, disease, prevention, diagnosis, treatment, and care delivery. It may involve clinical trials of new medicines or vaccines, observational studies of disease progression, surveillance systems that monitor public health events, diagnostic evaluations, implementation studies, registries, or operational research designed to improve health services. Although these studies differ in design and scale, they share a common dependency: their conclusions are only as reliable as the data on which they are based.

Module 2

Protocol Translation and CRF Design

The study protocol is the central scientific, operational, and regulatory document for a clinical research study. It describes why the study is being conducted, what question it intends to answer, who will be included, what procedures will be performed, how participant safety will be protected, what outcomes will be measured, and how the data will be analyzed. For the clinical data manager, the protocol is more than a narrative description of a study. It is the blueprint from which data requirements, case report forms, database structures, validation checks, monitoring reports, and analysis datasets are derived.

Module 3

Database Design in REDCap

REDCap, which stands for Research Electronic Data Capture, is a secure, web-based application widely used for building and managing research databases and online surveys. It is particularly well suited to clinical and translational research because it allows study teams to design data collection instruments, manage records, define user permissions, create reports, export datasets, and preserve audit trails without requiring every data manager to become a software developer. In many research institutions, REDCap has become a standard platform for electronic case report forms, registries, clinical trial databases, surveillance systems, and operational research projects.

Module 4

Data Entry, Validation, and Access Control

Data entry is the process through which clinical research observations, measurements, assessments, and documents become structured data in a research database. Although the phrase may sound simple, data entry is a critical stage in the research lifecycle. It is the point at which protocol-defined information is transferred from clinical practice, laboratory work, participant interviews, field activities, or source records into a system that will eventually support monitoring, analysis, reporting, and archival.

Module 5

Data Quality Management and Query Resolution

Data quality is the degree to which data are fit for their intended use. In clinical research, intended use includes participant safety oversight, protocol compliance, statistical analysis, regulatory reporting, publication, data sharing, and long-term archival. Data quality is therefore not a single property and cannot be judged only by whether a dataset contains values. A dataset may be complete but inaccurate, accurate but late, valid but poorly documented, or internally consistent but not suitable for answering the study question.
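The point that quality is several properties rather than one can be made concrete with a small sketch. Everything in it is hypothetical and chosen only for illustration: the records, the field names (`weight_kg`, `visit_date`, `entered_date`), the plausible weight range, and the 7-day entry window. A real study would take such rules from its protocol and data management plan. The sketch uses Python for brevity; the same checks could be written in any scripting language.

```python
from datetime import date

# Hypothetical records; field names and values are illustrative only.
records = [
    {"id": "P001", "weight_kg": 72.5,
     "visit_date": date(2024, 3, 1), "entered_date": date(2024, 3, 2)},
    {"id": "P002", "weight_kg": None,
     "visit_date": date(2024, 3, 1), "entered_date": date(2024, 3, 3)},
    {"id": "P003", "weight_kg": 725.0,
     "visit_date": date(2024, 3, 4), "entered_date": date(2024, 3, 5)},
    {"id": "P004", "weight_kg": 64.0,
     "visit_date": date(2024, 3, 4), "entered_date": date(2024, 3, 20)},
]

WEIGHT_RANGE = (30.0, 250.0)  # assumed plausible range in kg
MAX_ENTRY_DELAY_DAYS = 7      # assumed timeliness target


def quality_flags(record):
    """Return a list of quality flags, one per dimension that fails."""
    flags = []

    # Completeness and plausibility are checked separately:
    # a value can be present and still be implausible.
    weight = record["weight_kg"]
    if weight is None:
        flags.append("missing")
    elif not (WEIGHT_RANGE[0] <= weight <= WEIGHT_RANGE[1]):
        flags.append("out_of_range")

    # Timeliness: days between the visit and its entry in the database.
    delay = (record["entered_date"] - record["visit_date"]).days
    if delay > MAX_ENTRY_DELAY_DAYS:
        flags.append("late_entry")

    return flags


report = {r["id"]: quality_flags(r) for r in records}
for participant_id, flags in report.items():
    print(participant_id, flags or "no flags")
```

In this sketch, P003 is complete but implausible and P004 is plausible but late, which is why a single "has a value" check would not be enough.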

Module 6

Introduction to R for Clinical Data Management

Clinical data management increasingly depends on the ability to move between data collection systems, statistical software, reporting tools, and documentation workflows. In many studies, the primary database may be implemented in REDCap, OpenClinica, Medidata Rave, Castor, or another electronic data capture system. However, data managers often need to perform tasks that go beyond the point-and-click interface of the database. They may need to compare exports, check data consistency across instruments, generate query lists, reconcile laboratory files, prepare monitoring reports, summarize missingness, inspect patterns across sites, or document a cleaning decision in a way that can be repeated later. R is valuable because it allows these tasks to be written as reusable scripts rather than performed manually each time [@rcore2024r; @wickham2023r4ds].

Module 7

Data Cleaning and Preparation in R

Data cleaning is the organized process of identifying, investigating, documenting, and resolving problems in research data. It is not a casual activity performed after data collection is complete. It is part of the quality system of a study. In clinical research, cleaning begins before the first participant is enrolled, because the protocol, CRFs, database design, validation rules, completion guidelines, and monitoring plan all determine what kinds of errors are likely to occur and how they will be handled. R becomes useful when those expectations can be translated into transparent, repeatable checks and preparation steps.

Module 8

Data Analysis in R

Descriptive analysis is the process of summarizing data so that a study team can understand what has been collected. In clinical research, it is often associated with final reports, manuscripts, or statistical analysis plans. However, descriptive analysis is also central to data management. A data manager needs to know how many participants have been enrolled, how many records are incomplete, whether follow-up outcomes are missing, whether sites have similar patterns of data entry, whether adverse events are being reported consistently, and whether numeric values fall within plausible clinical ranges. These questions are descriptive before they are inferential.

Module 9

Data Visualization and Dashboards

Tables are essential in clinical research, but they are not always the best way to detect patterns. Visualization allows a study team to see distributions, trends, outliers, site differences, missingness patterns, and operational bottlenecks. A good graph can show that one site has delayed follow-up entry, that a laboratory value has an implausible cluster, that enrollment slowed after a protocol amendment, or that query resolution improved after retraining. Visualization is therefore not merely a presentation tool; it is a data management and monitoring tool.

Module 10

Reporting and Reproducibility

Clinical research reports are often updated repeatedly. A weekly data quality report may be generated every Friday. An enrollment report may be reviewed at every trial management meeting. A query summary may be shared with sites monthly. A manuscript table may be updated whenever the database changes. If these reports are produced manually, the risk of inconsistency is high. Reproducible reporting reduces that risk by connecting the report directly to the code and data used to generate it.

Module 11

Data Documentation and Metadata

A dataset without documentation is fragile. Even if the data are accurate, future users may not understand what variables mean, how values were coded, what population is represented, which records were excluded, which dates were derived, or what missing codes mean. Clinical research data are often used long after collection ends. They may support manuscripts, audits, secondary analyses, data sharing, regulatory review, or future pooled analyses. Documentation makes these uses possible.
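As a rough illustration of what documentation buys a future user, the sketch below decodes one raw record with a tiny data dictionary. The variables, labels, codes, and the use of 99 as a missing code are all hypothetical and not taken from the course manual. It is written in Python only to keep the example short.

```python
# Hypothetical data dictionary: variable -> label and value codes.
data_dictionary = {
    "sex": {
        "label": "Sex at birth",
        "codes": {1: "Female", 2: "Male", 99: "Unknown"},
    },
    "smoker": {
        "label": "Current smoker",
        "codes": {0: "No", 1: "Yes", 99: "Unknown"},
    },
}

MISSING_CODES = {99}  # assumed convention for "missing/unknown"


def describe(variable, value):
    """Translate a raw coded value into a documented, readable form."""
    entry = data_dictionary.get(variable)
    if entry is None:
        # Without a dictionary entry the value cannot be interpreted.
        return f"{variable}={value} (undocumented variable)"
    meaning = entry["codes"].get(value, "undocumented code")
    note = " [missing code]" if value in MISSING_CODES else ""
    return f"{entry['label']}: {meaning}{note}"


# A raw record as it might appear in an export; "arm" has no entry.
row = {"sex": 2, "smoker": 99, "arm": 1}
decoded = [describe(name, value) for name, value in row.items()]
for line in decoded:
    print(line)
```

The `arm` variable shows the fragile case: the value is present and may be accurate, but without documentation a later user cannot tell what it means.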

Module 12

Final Project Preparation, Presentation, and Course Integration

The final project is the capstone of the course. Its purpose is to bring together the major skills developed across the previous chapters: protocol interpretation, CRF design, REDCap database development, validation rules, data entry workflow, data quality management, R-based cleaning, descriptive analysis, visualization, reporting, documentation, and governance. The project should demonstrate not only that the learner can use tools, but that they can use them responsibly within a clinical research data management workflow.