Longitudinal, PRO, and Repeated-Measures Methods

Definition

Longitudinal and repeated-measures methods estimate treatment effects on outcomes measured multiple times per subject across scheduled visits (symptom scores, QoL scales, biomarkers, tumor burden). In oncology, these analyses most commonly target patient-reported outcome (PRO) endpoints and exploratory biomarker trajectories.

Per ICH E9(R1) (Final, 2019), the estimand framework requires that longitudinal analyses make explicit how intercurrent events (ICEs) such as treatment discontinuation, rescue therapy, progression, or death are handled, since these events give rise to missing data that "needs to be addressed as a missing data problem in the statistical analysis" once the estimand is fixed.

Per the FDA Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics guidance (Final, December 2018), symptom endpoints may include "specific symptom endpoints" or "composite symptom endpoints, such as the myelofibrosis symptom assessment form," and time-to-event symptom analyses.

PRO Endpoints in Oncology: Instrument Selection and Labeling Claims

PRO instruments must be fit-for-purpose, validated in the target tumor population, and pre-specified with a clear conceptual framework per the FDA Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims guidance (Final, 2009) and the FDA Core Patient-Reported Outcomes in Cancer Clinical Trials guidance (Draft, 2021 / Final, 2024). Core oncology instruments and their typical claim targets:

Instrument | Domain | Typical oncology use | Labeling claim precedent
EORTC QLQ-C30 + disease modules (LC13, BR23, CR29, OV28, MY20) | HRQoL, functioning, symptoms | Global QoL and functional scales in Phase 3 metastatic trials | Supportive labeling (e.g., enzalutamide mCRPC QoL maintenance)
FACT-G / FACT-L / FACT-P / FACT-B | HRQoL, disease-specific | U.S. pivotal trials; physical/functional/social/emotional | Supportive labeling in prostate, lung, breast
MDASI (core + modules) | Symptom burden | Symptom improvement endpoints | Limited stand-alone; supportive
MF-SAF / MPN-SAF TSS | Myelofibrosis symptoms | TSS50 response at Week 24 | Efficacy labeling (ruxolitinib, COMFORT-I, where TSS50 was a key secondary endpoint)
BPI (Brief Pain Inventory) | Pain severity / interference | Bone-pain endpoints in mCRPC, bone metastases | Pain-response labeling (radium-223, abiraterone)
PRO-CTCAE | Symptomatic AE | Tolerability / Project Optimus dose-finding | Supportive tolerability characterization
EQ-5D-5L | Health utility | HEOR and reimbursement; not efficacy labeling | Not an efficacy claim driver

Core principles for instrument selection:

  • Concept of interest must match the claim (symptom improvement ≠ HRQoL maintenance ≠ functional benefit).
  • Content validity demonstrated in the specific tumor population and line of therapy.
  • Recall period and administration frequency aligned with the estimand's time horizon.
  • Psychometric evidence (reliability, construct validity, responsiveness) pre-submitted; meaningful-change threshold (MID) anchored in that population.
  • Missing data strategy is part of fit-for-purpose evaluation: an instrument with persistently high attrition is unlikely to support a labeling claim even when its psychometric properties are sound.

Labeling-claim archetypes:

  1. Symptom improvement claim — primary or co-primary symptom endpoint (e.g., MF-SAF TSS50); requires blinded design or strong sensitivity analyses, pre-specified responder definition anchored to MID, and reference-based/tipping-point sensitivity for MNAR.
  2. Symptom delay / time-to-deterioration claim — composite TTDD with death as event; stratified log-rank, competing-risk sensitivity.
  3. HRQoL maintenance claim — MMRM on global QoL across visits with pre-specified primary visit; typically supports descriptive labeling rather than efficacy.
  4. Tolerability characterization (non-claim) — PRO-CTCAE bother/interference descriptives; informs prescribing information safety section.

Regulatory Position

  • ICH E9(R1) (Final, 2019): the main estimator must be aligned with the estimand, and "to explore the robustness of inferences from the main estimator to deviations from its underlying assumptions, a sensitivity analysis should be conducted." When the primary analysis invokes MAR, missing-data sensitivity analyses (e.g., reference-based imputation, delta adjustment) are the usual way to meet this expectation.
  • ICH E8(R1) (Final, 2021): patient-centered quality factors and pre-specified statistical analysis plans are "critical to quality" in confirmatory oncology studies, including PRO endpoints.
  • FDA Cancer Endpoints Guidance (Final, 2018): supports symptom/PRO endpoints for regular approval when clinically meaningful, well-defined, and protected from bias (blinding, complete capture, pre-specified analysis). PROs rarely support accelerated approval alone; they most often support labeling claims or serve as key secondary endpoints alongside OS/PFS.

When to Use

  • Symptom-directed endpoints — myelofibrosis (MF-SAF Total Symptom Score 50% response; Jakafi), CRPC bone pain, myelodysplastic syndrome fatigue, cachexia interventions.
  • QoL/functioning — EORTC QLQ-C30, FACT-G, and disease-specific modules (QLQ-LC13 NSCLC, QLQ-BR23 breast, FACT-L, FACT-P) as secondary endpoints in metastatic Phase 3 trials.
  • Biomarkers — longitudinal ctDNA, PSA kinetics (PCWG3), tumor size (sum of diameters, RECIST 1.1) for exposure-response and dose-optimization (Project Optimus).
  • Settings — primarily metastatic/advanced disease; maintenance and supportive-care trials; neoadjuvant symptom burden; post-transplant GVHD symptom tracking.

Design Considerations

Model choice

Scenario | Preferred model | Rationale
Two timepoints (baseline + one post-baseline) | ANCOVA adjusting for baseline | Most efficient; no covariance to model
≥3 visits, continuous outcome, MAR plausible | MMRM with visit as categorical fixed effect | Uses all partial data; valid under MAR; no explicit imputation
Binary/count repeated outcomes | GLMM or GEE | Logit/log link; robust SE for GEE
Trajectory shape of interest | Random-slope LMM | Models subject-specific growth

MMRM specification (primary template)

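Change from baseline ~ treatment + visit + treatment×visit + baseline + baseline×visit + stratification covariates, fitted by REML with an unstructured (UN) within-subject covariance and Kenward–Roger denominator degrees of freedom. R: mmrm::mmrm(), or nlme::gls() with corSymm plus varIdent weights by visit (corSymm alone models only the correlations, and gls does not provide Kenward–Roger degrees of freedom). SAS: PROC MIXED with REPEATED / TYPE=UN and DDFM=KR on the MODEL statement.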
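Written out as an equation, the template above corresponds to the following model. The notation (Y, T, X, z, and the Greek coefficients) is introduced here for illustration only and is not taken from any cited guidance.

```latex
% Y_{ij}: change from baseline for subject i at scheduled visit j = 1,...,J
% T_i: treatment indicator; X_i: baseline score; z_i: stratification covariates
Y_{ij} = \mu_j + \tau_j\,T_i + \beta_j\,X_i + \boldsymbol{\gamma}^{\top}\mathbf{z}_i + \varepsilon_{ij},
\qquad
\boldsymbol{\varepsilon}_i = (\varepsilon_{i1},\dots,\varepsilon_{iJ})^{\top} \sim N(\mathbf{0},\boldsymbol{\Sigma})
```

Here the visit-specific intercepts \mu_j absorb the visit effect, \tau_j absorbs treatment plus treatment×visit, \beta_j absorbs baseline plus baseline×visit, and \Sigma is an unstructured J×J matrix. Under this parameterization the between-treatment difference at the primary visit j* is simply \tau_{j*}.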

MMRM assumptions

  • MAR conditional on observed outcomes, baseline, and covariates in the model
  • Covariance structure: UN preferred when visits ≤ ~6; fallback Toeplitz or AR(1) for many visits or convergence failure
  • Visits treated as categorical; schedule aligned across arms (analysis windows pre-specified)
  • Estimand-linked: when post-ICE data are excluded, MMRM targets the hypothetical effect had subjects remained on treatment and been assessed per schedule; the ICE strategy must therefore be declared explicitly

Assessment schedule

  • Pre-specify PRO completion windows (e.g., ±7 days of imaging visit), order of administration (PRO before clinical contact to avoid bias), and analysis visits.
  • Compliance thresholds (commonly ≥70% baseline, ≥60% at analysis visit) should be monitored; deviations trigger sensitivity analyses.
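The windowing and compliance monitoring described above can be sketched in a few lines. This is a minimal illustration, not a validated derivation: the schedule in TARGET_DAYS, the ±7-day window, and the function names are hypothetical, and the "expected" denominator (subjects still expected to complete the instrument at each visit) must be defined in the SAP.

```python
from collections import defaultdict

# Hypothetical schedule: target study day per analysis visit, with a +/-7 day window.
TARGET_DAYS = {"Week 8": 56, "Week 16": 112, "Week 24": 168}
WINDOW_DAYS = 7


def assign_visit(study_day):
    """Return the analysis visit whose window contains study_day, or None."""
    for visit, target in TARGET_DAYS.items():
        if abs(study_day - target) <= WINDOW_DAYS:
            return visit
    return None


def completion_rates(assessments, expected):
    """Completion rate per visit.

    assessments: iterable of (subject_id, study_day) for completed PRO forms.
    expected: dict mapping visit -> set of subject ids expected at that visit.
    """
    completed = defaultdict(set)
    for subject, day in assessments:
        visit = assign_visit(day)
        if visit is not None:
            completed[visit].add(subject)
    return {
        visit: len(completed[visit] & subjects) / len(subjects)
        for visit, subjects in expected.items()
        if subjects
    }


assessments = [("A", 55), ("B", 170), ("A", 166), ("C", 90), ("C", 112)]
expected = {"Week 8": {"A", "B", "C"}, "Week 16": {"A", "C"}, "Week 24": {"A", "B"}}
rates = completion_rates(assessments, expected)
# Flag visits that fall below an illustrative 60% threshold.
low_visits = [visit for visit, rate in rates.items() if rate < 0.60]
```

Assessments outside every window (subject C on day 90 in the example) are not mapped to any analysis visit, which is why the handling of out-of-window data needs to be pre-specified rather than decided after unblinding.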

Alpha allocation & multiplicity

  • PRO endpoints are typically secondary: alpha recycled via graphical testing (Bretz/Maurer) or fixed-sequence gatekeeping after primary PFS/OS success.
  • Across visits: one designated primary visit (e.g., Week 24) for hypothesis testing; other visits descriptive. Avoid "significance at any visit" without adjustment.
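As a sketch of the fixed-sequence option mentioned above, the function below tests hypotheses in their pre-specified order at the full alpha and stops at the first non-rejection. The endpoint names and p-values are made up for illustration, and graphical (Bretz/Maurer) procedures with alpha recycling are more general than what is shown here.

```python
def fixed_sequence(ordered_p_values, alpha=0.05):
    """Fixed-sequence gatekeeping.

    ordered_p_values: list of (endpoint_name, p_value) in pre-specified order.
    Each hypothesis is tested at the full alpha, but only while every
    earlier hypothesis in the sequence has been rejected.
    """
    results = []
    gate_open = True
    for name, p_value in ordered_p_values:
        rejected = gate_open and p_value <= alpha
        results.append((name, rejected))
        gate_open = rejected  # a single failure closes the gate for all later tests
    return results


# Illustrative p-values only.
all_pass = fixed_sequence([("PFS", 0.001), ("OS", 0.03), ("PRO Week 24", 0.04)])
blocked = fixed_sequence([("PFS", 0.001), ("OS", 0.08), ("PRO Week 24", 0.04)])
```

In the second call the PRO endpoint is not rejected even though its p-value is below 0.05, because the preceding OS test failed; this is the behavior that protects the familywise error rate.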

Intercurrent Events

For PRO/repeated-measures endpoints in oncology, the dominant ICEs are:

  1. Disease progression with treatment discontinuation

    • Strategy: Hypothetical (what would scores be if subjects remained on treatment) for symptom-improvement claims; Treatment policy (follow-up regardless) for disease-related symptom endpoints.
    • Statistical consequence: Hypothetical → MMRM under MAR + reference-based/delta sensitivity; Treatment policy → requires continued PRO collection post-progression.
    • SAP template: "For patients who discontinue treatment due to progression, the hypothetical strategy is applied; data collected after discontinuation are set to missing and handled implicitly under MAR by the MMRM likelihood, with reference-based (J2R) imputation as sensitivity."
  2. Death

    • Strategy: Composite (deterioration or death = failure) for time-to-deterioration PROs; While-on-treatment for symptom scores among survivors.
    • Statistical consequence: Composite → survival-style analysis (Kaplan–Meier, Cox); while-on-treatment → conditional on survival and biased toward healthier patients.
    • SAP template: "Time to definitive deterioration is analyzed as a composite endpoint where death without prior deterioration is treated as an event."
  3. Rescue/subsequent anticancer therapy

    • Strategy: Treatment policy for health-utility/QoL; Hypothetical for symptom-pharmacology claims.
    • Statistical consequence: Treatment policy requires continued data collection post-switch; hypothetical requires excluding post-switch data, with a MAR-based primary analysis and MNAR sensitivity analyses.
    • SAP template: "Data following initiation of subsequent systemic anticancer therapy are retained and included under the treatment-policy strategy for QoL endpoints; under the hypothetical strategy for targeted symptom endpoints, post-switch data are excluded and imputed."

Missing Data Strategy by Estimand

Estimand strategy | Primary analysis | Sensitivity analyses
Treatment policy | Direct-likelihood MMRM or MI using all observed data incl. post-ICE | Tipping-point analysis; pattern-mixture with varying MAR extrapolation
Hypothetical | MMRM under MAR (discard post-ICE data) | Reference-based (J2R, CIR, CR); delta adjustment
Composite (time-to-deterioration) | KM + stratified log-rank, Cox | Competing-risk (Gray's test) treating death as competing
While-on-treatment | MMRM on observed windows | Restrict to subjects with ≥X visits; selection-model sensitivity

Multiple Imputation under MAR

  1. Impute missing outcomes from posterior predictive distribution conditional on observed data, treatment, baseline, covariates (M ≥ 50 imputations).
  2. Analyze each completed dataset via ANCOVA/MMRM.
  3. Pool using Rubin's rules. R: mice, mi, Hmisc::aregImpute.
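The pooling step can be illustrated with a small standard-library function. This is a sketch of Rubin's rules for a scalar estimate (for example, the Week-24 treatment difference from each completed dataset); the function name and the toy inputs are hypothetical, and the degrees of freedom use Rubin's original large-sample formula without small-sample adjustment.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M completed-data estimates and their squared standard errors.

    Returns (pooled_estimate, pooled_standard_error, degrees_of_freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    if between == 0:
        df = math.inf
    else:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    return q_bar, math.sqrt(total), df


# Toy inputs: three imputations with estimates 1, 2, 3 and variance 0.5 each.
pooled, se, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what distinguishes multiple imputation from single imputation: it carries the uncertainty about the missing values into the pooled standard error.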

Reference-Based Imputation under MNAR

Core idea (Carpenter/Roger): after ICE in the active arm, impute as if the subject followed the control-arm trajectory. Controlled, conservative departures from MAR aligned with the hypothetical estimand.

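  • J2R (Jump to Reference): post-ICE mean jumps to the reference (control) mean immediately.
  • CIR (Copy Increments in Reference): preserves the pre-ICE level; subsequent increments follow the reference arm.
  • CR (Copy Reference): the subject is modeled as if in the reference arm throughout, so post-ICE values are imputed conditional on the observed pre-ICE data using the reference arm's mean profile and covariance.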

R: mimix, RefBasedMI, rbmi (Roche). SAS: PROC MI with MNAR statement.
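To make the difference between the three variants concrete, the sketch below builds the marginal mean profile that each one assigns to an active-arm subject. It covers only the mean construction: a full implementation (as in the packages listed above) then draws post-ICE values conditional on the subject's observed data using these means together with a covariance matrix. The function name, the index convention (index 0 is baseline), and the example means are assumptions made for illustration.

```python
def imputation_means(active, reference, last_on_trt, method):
    """Marginal mean profile used to impute an active-arm subject.

    active, reference: arm-level mean profiles, index 0 = baseline.
    last_on_trt: index of the last visit before the intercurrent event.
    method: "MAR", "J2R", "CIR", or "CR".
    """
    if len(active) != len(reference):
        raise ValueError("mean profiles must cover the same visits")
    k = last_on_trt
    if method == "MAR":
        return list(active)      # own-arm means throughout
    if method == "CR":
        return list(reference)   # reference means throughout
    if method == "J2R":
        # Own-arm means up to the ICE, reference means afterwards.
        return list(active[: k + 1]) + list(reference[k + 1:])
    if method == "CIR":
        # Keep the own-arm level at the ICE, then add reference-arm increments.
        return list(active[: k + 1]) + [
            active[k] + (ref - reference[k]) for ref in reference[k + 1:]
        ]
    raise ValueError(f"unknown method: {method}")


# Hypothetical mean changes in a symptom score (lower is better).
active_means = [0, -4, -8, -12]
reference_means = [0, -1, -2, -3]
j2r = imputation_means(active_means, reference_means, 1, "J2R")
cir = imputation_means(active_means, reference_means, 1, "CIR")
cr = imputation_means(active_means, reference_means, 1, "CR")
```

With these example means, J2R drops the subject straight onto the reference trajectory after the ICE, while CIR retains the benefit accrued up to the ICE and only forgoes further gains relative to reference.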

Delta-Adjusted (Tipping-Point) Sensitivity

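Imputed values in the experimental arm are shifted by δ in the unfavorable direction (added when higher scores are worse, subtracted when higher scores are better). δ is increased until the treatment effect loses statistical significance; this "tipping point" is then compared with clinically plausible magnitudes such as the MID. Results are reported as the range of δ over which significance is sustained.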
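The search over δ can be sketched as follows. This is a deliberately simplified illustration: it uses a single completed dataset, an unadjusted difference in means, and a normal-approximation 95% CI, whereas a real analysis would apply the shift within each of the M imputed datasets, refit the ANCOVA/MMRM, and pool with Rubin's rules. The function names, the worse_direction argument, and the data are hypothetical.

```python
import math
from statistics import mean, variance


def diff_ci(experimental, control, z=1.96):
    """Difference in means (experimental minus control) with a normal-approximation CI."""
    diff = mean(experimental) - mean(control)
    se = math.sqrt(
        variance(experimental) / len(experimental) + variance(control) / len(control)
    )
    return diff, diff - z * se, diff + z * se


def tipping_point(exp_values, exp_imputed, ctl_values, deltas, worse_direction=+1):
    """Smallest delta in `deltas` at which the 95% CI includes zero, else None.

    exp_imputed: booleans flagging which experimental-arm values were imputed.
    worse_direction: +1 if higher scores are worse, -1 if higher scores are better.
    """
    for delta in deltas:
        shifted = [
            value + worse_direction * delta if imputed else value
            for value, imputed in zip(exp_values, exp_imputed)
        ]
        _, lower, upper = diff_ci(shifted, ctl_values)
        if lower <= 0 <= upper:
            return delta
    return None


# Hypothetical Week-24 changes in a symptom score (higher is worse).
exp_change = [-10, -12, -8, -11, -9, -10, -12, -8]
exp_flags = [False, False, False, False, True, True, True, True]
ctl_change = [-2, -4, 0, -3, -1, -2, -4, 0]
tip = tipping_point(exp_change, exp_flags, ctl_change, range(0, 22, 2))
```

The worse_direction argument matters in practice: on a scale where higher scores are better, adding a positive δ would make the imputed experimental-arm values more favorable and the analysis would never tip.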

Composite and While-on-Treatment Approaches for PRO Deterioration

  • Time to Deterioration (TTD) — first ≥MID worsening from baseline (e.g., ≥10-point on QLQ-C30 functional scale). Death without prior deterioration counted as event (composite).
  • Time to Definitive Deterioration (TTDD) — requires confirmed worsening at two consecutive visits without subsequent recovery.
  • Responder analyses — proportion achieving ≥MID improvement sustained at primary timepoint.
  • While-on-treatment means — restrict analysis window to on-treatment visits; interpret cautiously (selection bias toward responders).
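The composite time-to-deterioration derivation can be sketched per subject as below. The rules encoded here are assumptions for illustration: first worsening of at least the MID from baseline is the event, death without prior deterioration counts as an event at the day of death, and subjects with neither are censored at their last PRO assessment. Confirmation rules (for TTDD) and any limit on how long after the last assessment a death still counts would need to be added per the SAP.

```python
def time_to_deterioration(baseline, visits, mid=10.0, death_day=None,
                          higher_is_better=True):
    """Return (study_day, event_flag) for one subject.

    visits: list of (study_day, score) post-baseline assessments.
    event_flag: 1 = deterioration or death (composite), 0 = censored.
    """
    ordered = sorted(visits)
    for day, score in ordered:
        change = score - baseline
        worsening = -change if higher_is_better else change
        if worsening >= mid:
            return day, 1  # first worsening of at least the MID
    if death_day is not None:
        return death_day, 1  # death without prior deterioration counts as an event
    last_day = ordered[-1][0] if ordered else 0
    return last_day, 0  # censored at last PRO assessment (day 0 if none)


# Hypothetical QLQ-C30 functional-scale scores (higher is better), baseline 70.
deteriorated = time_to_deterioration(70, [(28, 66), (56, 58), (84, 50)])
died = time_to_deterioration(70, [(28, 68), (56, 65)], death_day=100)
censored = time_to_deterioration(70, [(28, 68), (56, 65)])
```

The resulting (day, event_flag) pairs are the inputs a Kaplan–Meier or Cox analysis would use.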

Practical Issues

  • Multiplicity across visits: prespecify one primary analysis visit; use hierarchical or graphical alpha for multi-visit claims.
  • Change-from-baseline interpretability: report least-squares mean difference with 95% CI and anchor to minimal important difference (MID); avoid over-interpreting transient between-visit differences.
  • Death/progression handling: never "carry forward" observed scores from before death (LOCF is discouraged by ICH E9(R1) and FDA); use composite endpoints or explicit estimand-aligned imputation.
  • Open-label bias: PROs in open-label trials are susceptible to response bias; supportive blinded endpoints and sensitivity analyses required.
  • Baseline balance: strong baseline-outcome correlation → always include baseline and baseline×visit.
  • Convergence: UN covariance with many visits/small N may fail; use Toeplitz fallback and document in SAP.

SAP Template — Primary MMRM with Sensitivity Hierarchy

Primary analysis (hypothetical estimand). Change from baseline in [QLQ-C30 Global Health Status] at Week 24 is analyzed using a mixed model for repeated measures including fixed effects for treatment, scheduled visit (categorical), treatment-by-visit interaction, baseline score, baseline-by-visit interaction, and stratification factors [...]. An unstructured covariance matrix models within-subject errors with Kenward–Roger denominator degrees of freedom. The between-treatment LS-mean difference at Week 24 with two-sided 95% CI is the primary estimate. Data after treatment discontinuation or initiation of subsequent anticancer therapy are set to missing (hypothetical strategy).

Sensitivity 1: Covariance robustness. Repeat MMRM with Toeplitz covariance.

Sensitivity 2: MAR via MI. Multiple imputation (M=100) under MAR using treatment, baseline, covariates, and observed post-baseline values; analyze each completed dataset by ANCOVA at Week 24; pool via Rubin's rules.

Sensitivity 3: MNAR reference-based. Jump-to-Reference (J2R) imputation of post-ICE values in the experimental arm using the control arm as reference (rbmi, M=100).

Sensitivity 4: Delta-adjusted tipping point. Impute under MAR, then shift imputed values in the experimental arm by δ ∈ {0, 2, 4, …, 20} in the unfavorable direction (for QLQ-C30 Global Health Status, where higher is better, subtract δ); report the smallest δ at which the two-sided 95% CI for the Week-24 LS-mean difference includes zero, and interpret it against MID=10.

Sensitivity 5: Composite supportive. Time to definitive deterioration (confirmed ≥10-point worsening) with death as event; Kaplan–Meier and stratified log-rank.

Supportive: Treatment policy. Repeat primary MMRM retaining all observed data regardless of ICE occurrence.

Decision rule: primary claim is based on Primary analysis; claim is considered robust if sensitivities 1–4 agree in direction with Primary, and the tipping-point δ exceeds the MID.

Regulatory Precedent

Only one endpoint-specific precedent is documented here, the example cited in the FDA 2018 Cancer Endpoints guidance:

NCT# | Trial | Drug | Indication | Endpoint | Outcome
NCT00952289 | COMFORT-I | Ruxolitinib | Myelofibrosis | MF-SAF Total Symptom Score ≥50% reduction at Week 24 (composite symptom endpoint, MMRM supportive) | Supported regular approval (2011); cited in FDA 2018 endpoint guidance as exemplar of composite symptom endpoint

Additional context-supported references (from ICH E9(R1) and estimand literature): RECORD-1 everolimus (crossover-adjusted OS via RPSFT) illustrates estimand-aligned sensitivity for related analyses but is not a PRO precedent.

Limitations and Pitfalls

  • MAR is unverifiable — informative dropout (progression, death, toxicity) is the norm in oncology, so reference-based and delta sensitivity analyses are effectively mandatory, not optional.
  • LOCF is deprecated — biases results and violates ICH E9(R1) alignment principles; persists only as historical reference.
  • Over-reliance on while-on-treatment estimates — creates survivorship bias; should never stand alone for symptom-benefit claims.
  • PRO open-label bias — unblinded designs cannot distinguish pharmacologic effect from expectation; strong regulatory skepticism for labeling claims.
  • Ceiling/floor effects in QoL instruments limit MMRM validity; consider rank-based or mixture approaches.
  • Multiplicity abuse — "significance at any visit" claims without pre-specified primary visit will not survive review.
  • Schedule misalignment — differential assessment frequency between arms corrupts visit-by-visit inference.

Source: ICH E9(R1) Addendum on Estimands and Sensitivity Analysis (Final, 2019); ICH E8(R1) General Considerations for Clinical Studies (Final, 2021); FDA Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics (Final, December 2018); MMRM, reference-based imputation, and estimand-oncology literature summaries.
Status: all three regulatory sources are final guidance. Compiled from FDA guidance excerpts and literature summaries.