8.8 Interpreting Descriptive Outputs
Interpretation is a professional responsibility. R can calculate a percentage, but it cannot know whether the percentage is operationally concerning, clinically meaningful, or expected. Data managers should interpret descriptive outputs by asking what the table represents, what denominator was used, what data were missing, what timing applies, and whether the result matches expectations.
Suppose a table shows that Site A has 20 percent missing day 28 outcomes and Site B has 5 percent missing day 28 outcomes. This may suggest that Site A needs follow-up. But the interpretation changes if most Site A participants were enrolled recently and their outcomes are not yet due. It changes again if Site B has fewer participants, so that each missing outcome shifts its percentage more, or if Site A serves a population with more transfers and deaths. A good summary table should help the team ask better questions, not replace judgment.
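One way to account for timing is to restrict the denominator to participants whose day 28 outcome is already due. The sketch below is illustrative only: the data frame `example_participants`, its columns, and `data_cut_date` are hypothetical, the values are invented, and it assumes an outcome becomes due 28 days after enrolment.

```r
library(dplyr)

# Hypothetical example data; names and values are illustrative only
example_participants <- data.frame(
  site = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  enrol_date = as.Date(c(
    "2024-01-05", "2024-01-10", "2024-02-01", "2024-03-20", "2024-03-25",
    "2024-01-03", "2024-01-15", "2024-02-10", "2024-02-12"
  )),
  day28_outcome = c("Alive", "Alive", NA, NA, NA, "Alive", "Died", "Alive", NA)
)

data_cut_date <- as.Date("2024-04-01")  # assumed date of the data export

missing_by_site <- example_participants |>
  mutate(
    # Assumption: the day 28 outcome becomes due 28 days after enrolment
    outcome_due = enrol_date + 28 <= data_cut_date,
    outcome_missing = is.na(day28_outcome)
  ) |>
  group_by(site) |>
  summarise(
    n_enrolled = n(),
    n_due = sum(outcome_due),
    n_missing_all = sum(outcome_missing),
    n_missing_due = sum(outcome_missing & outcome_due),
    # Two denominators: everyone enrolled versus only those whose outcome is due
    pct_missing_all = round(100 * n_missing_all / n_enrolled, 1),
    pct_missing_due = round(100 * n_missing_due / n_due, 1)
  )

missing_by_site
```

In this invented data, two of Site A's three missing outcomes belong to participants enrolled too recently to have a day 28 outcome, so the site looks less concerning once the denominator is limited to outcomes that are due.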
Descriptive summaries should also be compared with external expectations. If a neonatal sepsis study reports no deaths after enrolling hundreds of critically ill neonates, the result may be possible but should be reviewed. If an adult hypertension study reports a median systolic blood pressure of 20 mmHg, the issue is almost certainly a unit, entry, or import problem. If a site reports no adverse events while similar sites report many, under-reporting may be possible. These checks combine statistical awareness with clinical and operational understanding.
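Checks like these can be partly automated, although the limits themselves require clinical input. The following is a minimal sketch under stated assumptions: `site_summary` and its columns are hypothetical, the values are invented, and the blood pressure limits are placeholders rather than recommended thresholds.

```r
library(dplyr)

# Hypothetical site-level summary; values are invented for illustration
site_summary <- data.frame(
  site = c("A", "B", "C"),
  n_participants = c(120, 95, 110),
  median_sbp_mmhg = c(142, 20, 138),
  n_adverse_events = c(34, 29, 0)
)

# Assumed plausibility limits; real limits should come from the protocol
# or from clinical input, not from this sketch
sbp_plausible_min <- 60
sbp_plausible_max <- 260

site_flags <- site_summary |>
  mutate(
    # Median systolic blood pressure outside the assumed plausible range
    flag_sbp = median_sbp_mmhg < sbp_plausible_min |
      median_sbp_mmhg > sbp_plausible_max,
    # A site with zero events while other sites report some is worth a query
    flag_no_ae = n_adverse_events == 0 & sum(n_adverse_events) > 0
  ) |>
  filter(flag_sbp | flag_no_ae)

site_flags
```

A flag here is a prompt for review, not a conclusion: the team still has to decide whether the value reflects a unit problem, under-reporting, or a genuine finding.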
Another interpretation issue is missing data. A table of percentages can be misleading if missing values are excluded silently. For example, if 80 participants have known outcome status and 20 are missing, reporting that 75 percent survived among known outcomes may obscure the fact that 20 percent of outcomes are missing and that only 60 of the 100 participants are known to have survived. Depending on the context, the table should show both outcome distribution and missingness.
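```r
library(dplyr)
library(tidyr)

# Show missing day 28 outcomes as their own category, then report each
# category as a percentage of all participants
outcome_interpretation_table <- prepared_data |>
  mutate(
    # Assumes day28_outcome is a character column
    outcome_display = replace_na(day28_outcome, "Missing")
  ) |>
  count(outcome_display, name = "n") |>
  mutate(
    percent_of_all = round(100 * n / sum(n), 1)
  )

outcome_interpretation_table
```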
This table uses all participants as the denominator and displays missing outcomes explicitly. It may be more appropriate for a data management review than a table that excludes missing outcomes.
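The example of 80 known and 20 missing outcomes can also be used to show both denominators side by side. This is a sketch with hypothetical counts: it assumes the 25 percent of known outcomes that are not survivors are deaths, and the object and column names are illustrative.

```r
library(dplyr)

# Hypothetical counts: 80 known outcomes (60 survived, 20 died), 20 missing
outcome_counts <- data.frame(
  outcome_display = c("Survived", "Died", "Missing"),
  n = c(60, 20, 20)
)

outcome_both_denominators <- outcome_counts |>
  mutate(
    # Denominator: all participants, including those with missing outcomes
    percent_of_all = round(100 * n / sum(n), 1),
    # Denominator: participants with a known outcome only
    percent_of_known = if_else(
      outcome_display == "Missing",
      NA_real_,
      round(100 * n / sum(n[outcome_display != "Missing"]), 1)
    )
  )

outcome_both_denominators
```

Survival is 75 percent of known outcomes but 60 percent of all participants. Showing both columns makes the choice of denominator visible to the reader.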
Descriptive outputs should be checked against prior reports. Sudden changes may indicate real study progress, but they may also indicate export problems, coding changes, script changes, or data corrections. Version control and report dates help explain such changes.
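A comparison with the previous report can be as simple as joining two summary tables and flagging changes that need a note. The sketch below uses hypothetical objects (`prior_report`, `current_report`) with invented counts, and it assumes that enrolment counts should never decrease between reports.

```r
library(dplyr)

# Hypothetical summaries from two report dates
prior_report <- data.frame(
  site = c("A", "B", "C"),
  n_enrolled = c(100, 80, 60)
)
current_report <- data.frame(
  site = c("A", "B", "C", "D"),
  n_enrolled = c(112, 55, 63, 10)
)

report_comparison <- full_join(
  prior_report, current_report,
  by = "site", suffix = c("_prior", "_current")
) |>
  mutate(
    change = n_enrolled_current - n_enrolled_prior,
    # Assumption: counts should not fall between reports, and a site that
    # appears in only one report deserves a note
    needs_explanation = is.na(change) | change < 0
  )

report_comparison
```

Here Site B's count falls and Site D is new, so both rows are flagged. The explanation might be a data correction or a newly activated site, but it should be recorded alongside the report.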
| Interpretation question | Why it matters |
|---|---|
| What is the denominator? | Percentages are meaningless without it |
| Are missing values shown? | Missingness affects trust and interpretation |
| What is the unit of observation? | Rows may represent participants, visits, events, or tests |
| Is the timing appropriate? | Outcomes may not yet be due |
| Are categories expected? | Unexpected levels may indicate coding problems |
| Are values clinically plausible? | Extreme values may indicate errors or important cases |
| Does the result match prior reports? | Sudden changes require explanation |
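Two of these questions, the unit of observation and unexpected categories, lend themselves to quick checks before a table is read. The following sketch is illustrative: `example_visits`, its columns, and `expected_sex_levels` are hypothetical stand-ins for a real dataset and codebook.

```r
library(dplyr)

# Hypothetical visit-level data: one row per participant visit
example_visits <- data.frame(
  participant_id = c("P01", "P01", "P02", "P03", "P03", "P03"),
  visit = c("Day 0", "Day 28", "Day 0", "Day 0", "Day 7", "Day 28"),
  sex = c("Female", "Female", "Male", "F", "F", "F")
)

expected_sex_levels <- c("Female", "Male")  # assumed codebook values

# Unit of observation: compare the row count with distinct participants
n_rows <- nrow(example_visits)
n_participants <- n_distinct(example_visits$participant_id)

# Unexpected categories: non-missing values not listed in the codebook
unexpected_sex <- example_visits |>
  filter(!is.na(sex), !sex %in% expected_sex_levels) |>
  count(sex, name = "n_records")

n_rows
n_participants
unexpected_sex
```

When the row count exceeds the number of participants, a percentage computed over rows describes visits rather than people. The unexpected level `"F"` suggests inconsistent coding that should be resolved before sex is tabulated.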
Good interpretation is cautious, contextual, and documented. It avoids both underreaction and overreaction.