Data Analysis in R

Interpreting Descriptive Outputs

8.8 Interpreting Descriptive Outputs

30-45 minutes Applied Step 3 of 8

Reading 1

Interpretation is a professional responsibility. R can calculate a percentage, but it cannot know whether the percentage is operationally concerning, clinically meaningful, or expected. Data managers should interpret descriptive outputs by asking what the table represents, what denominator was used, what data were missing, what timing applies, and whether the result matches expectations.

Suppose a table shows that Site A has 20 percent missing day 28 outcomes and Site B has 5 percent missing day 28 outcomes. This may suggest that Site A needs follow-up. But the interpretation changes if most Site A participants were enrolled recently and their outcomes are not yet due. It changes again if Site B has fewer participants, or if Site A serves a population with more transfers and deaths. A good summary table should help the team ask better questions, not replace judgment.

Descriptive summaries should also be compared with external expectations. If a neonatal sepsis study reports no deaths after enrolling hundreds of critically ill neonates, the result may be possible but should be reviewed. If an adult hypertension study reports a median systolic blood pressure of 20 mmHg, the issue is almost certainly a unit, entry, or import problem. If a site reports no adverse events while similar sites report many, under-reporting may be possible. These checks combine statistical awareness with clinical and operational understanding.

Another interpretation issue is missing data. A table of percentages can be misleading if missing values are excluded silently. For example, if 80 participants have known outcome status and 20 are missing, reporting that 75 percent survived among known outcomes may obscure the fact that 20 percent of outcomes are missing. Depending on the context, the table should show both outcome distribution and missingness.
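To make the arithmetic in the 80/20 example concrete, the following is a minimal sketch that puts the two denominators side by side. The data frame `example_outcomes` is hypothetical, and it assumes the only known outcomes are "Survived" and "Died" (so 60 survived and 20 died among the 80 known). The names `n_all`, `n_known`, and `denominator_comparison` are illustrative only.

```r
library(dplyr)

# Hypothetical data matching the example in the text:
# 80 known outcomes (assumed 60 survived, 20 died) and 20 missing.
example_outcomes <- data.frame(
  day28_outcome = c(
    rep("Survived", 60),
    rep("Died", 20),
    rep(NA_character_, 20)
  )
)

n_all <- nrow(example_outcomes)                              # all participants
n_known <- sum(!is.na(example_outcomes$day28_outcome))       # known outcomes only

denominator_comparison <- example_outcomes |>
  count(day28_outcome, name = "n") |>
  mutate(
    # Denominator 1: every participant, including missing outcomes
    percent_of_all = round(100 * n / n_all, 1),
    # Denominator 2: participants with a known outcome (NA for the missing row)
    percent_of_known = if_else(
      is.na(day28_outcome),
      NA_real_,
      round(100 * n / n_known, 1)
    )
  )

denominator_comparison
```

Under these assumptions, survival is 75 percent of known outcomes but 60 percent of all participants, and the missing row accounts for 20 percent of all participants. Neither figure is wrong; they answer different questions, so a table should state which denominator it uses.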
```r
library(dplyr)  # mutate(), count()
library(tidyr)  # replace_na()

# prepared_data is the cleaned participant-level dataset.
# This assumes day28_outcome is a character column.
outcome_interpretation_table <- prepared_data |>
  mutate(
    # Show missing outcomes as their own category instead of dropping them
    outcome_display = replace_na(day28_outcome, "Missing")
  ) |>
  count(outcome_display, name = "n") |>
  mutate(
    # Denominator is all participants, including those with missing outcomes
    percent_of_all = round(100 * n / sum(n), 1)
  )

outcome_interpretation_table
```

This table uses all participants as the denominator and displays missing outcomes explicitly. It may be more appropriate for a data management review than a table that excludes missing outcomes.

Descriptive outputs should be checked against prior reports. Sudden changes may indicate real study progress, but they may also indicate export problems, coding changes, script changes, or data corrections. Version control and report dates help explain such changes.
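One way to check a summary against a prior report is to join the two tables and look at the change in each category. The sketch below uses two hypothetical summary tables, `previous_report` and `current_report`, with invented counts. The review rule is an illustrative assumption, not a standard: it flags categories that are new and non-missing categories whose count has fallen, since recorded outcomes would not normally disappear without a data correction.

```r
library(dplyr)

# Hypothetical outcome summaries from two report dates.
previous_report <- data.frame(
  outcome_display = c("Died", "Missing", "Survived"),
  n = c(18, 25, 57)
)
current_report <- data.frame(
  outcome_display = c("Died", "Missing", "Survived", "Unknown"),
  n = c(17, 10, 91, 2)
)

report_comparison <- full_join(
  previous_report,
  current_report,
  by = "outcome_display",
  suffix = c("_previous", "_current")
) |>
  mutate(
    # A category absent from one report is treated as a count of zero
    n_previous = coalesce(n_previous, 0),
    n_current = coalesce(n_current, 0),
    change = n_current - n_previous,
    # Illustrative rule: flag new categories, and recorded outcomes that decreased
    needs_review = n_previous == 0 |
      (outcome_display != "Missing" & change < 0)
  )

report_comparison
```

In this invented example, the fall in "Missing" is consistent with outcomes being completed and is not flagged, while the drop in "Died" and the new "Unknown" level are. A flag only identifies a change that needs an explanation; whether it reflects a correction, a coding change, or an export problem still has to be established by the team.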

Interpretation question	Why it matters
What is the denominator?	Percentages are meaningless without it
Are missing values shown?	Missingness affects trust and interpretation
What is the unit of observation?	Rows may represent participants, visits, events, or tests
Is the timing appropriate?	Outcomes may not yet be due
Are categories expected?	Unexpected levels may indicate coding problems
Are values clinically plausible?	Extreme values may indicate errors or important cases
Does the result match prior reports?	Sudden changes require explanation
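The timing question can be illustrated with a short sketch that separates outcomes that are missing from outcomes that are missing and already due. The data frame `participants`, its column names (`site`, `enrolment_date`, `day28_outcome`), and `report_date` are all hypothetical. The rule that an outcome is due once 28 days have passed since enrolment is a simplifying assumption; a real study would use its protocol-defined visit window.

```r
library(dplyr)

# Hypothetical report date and participant-level data.
report_date <- as.Date("2024-06-30")

participants <- data.frame(
  site = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  enrolment_date = as.Date(c(
    "2024-03-01", "2024-04-10", "2024-06-15", "2024-06-20", "2024-06-25",
    "2024-02-01", "2024-03-15", "2024-04-01", "2024-05-01"
  )),
  day28_outcome = c(
    "Survived", "Died", NA, NA, NA,
    "Survived", "Survived", NA, "Survived"
  )
)

missing_by_site <- participants |>
  mutate(
    # Assumption: the outcome is due 28 days after enrolment
    outcome_due = enrolment_date + 28 <= report_date,
    outcome_missing = is.na(day28_outcome)
  ) |>
  group_by(site) |>
  summarise(
    n_enrolled = n(),
    n_due = sum(outcome_due),
    n_missing_all = sum(outcome_missing),
    n_missing_due = sum(outcome_missing & outcome_due),
    # Missing among everyone enrolled
    percent_missing_all = round(100 * n_missing_all / n_enrolled, 1),
    # Missing among those whose outcome is already due
    percent_missing_due = round(100 * n_missing_due / n_due, 1)
  )

missing_by_site
```

With these invented values, Site A looks worse on the all-participant denominator, but none of its due outcomes are missing, whereas Site B has one overdue outcome. Showing the counts next to the percentages also makes small denominators visible. A site with no due outcomes would produce a division by zero in `percent_missing_due`, which would need separate handling.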

Good interpretation is cautious, contextual, and documented. It avoids both underreaction and overreaction.