Response, Binary, and Disease-Control Endpoint Methods
Definition
Binary response endpoints in oncology summarize each patient's treatment outcome as a dichotomous (yes/no) indicator, typically derived from pre-specified tumor-assessment criteria applied over a defined evaluation window.
The FDA's Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics (2018 final guidance) defines the principal families:
"ORR is defined as the proportion of patients with tumor size reduction of a predefined amount and for a minimum time period. Response duration usually is measured from the time of initial response until documented tumor progression."
"Generally, the FDA has defined ORR as the sum of partial responses plus CRs. When defined in this manner, ORR is a direct measure of a drug antitumor activity, which can be evaluated in a single-arm trial."
"CR is defined as no detectable evidence of tumor. CR is generally measured through imaging studies (e.g., CT scans) or through histopathologic assessment (e.g., bone marrow)."
Related binary endpoints used in oncology trials include:
- Disease Control Rate (DCR): proportion achieving CR + PR + stable disease (SD) for a minimum duration; typically exploratory and not accepted on its own as a basis for approval.
- Pathological Complete Response (pCR): absence of residual invasive disease in the surgical specimen (in breast cancer, the resected breast and all sampled regional lymph nodes, ypT0/Tis ypN0; in bladder cancer, the cystectomy specimen); used as an endpoint in neoadjuvant settings (see FDA 2020 pCR guidance).
- MRD Negativity: absence of detectable residual clonal disease at a defined assay sensitivity threshold (e.g., 10⁻⁵, 10⁻⁶) by NGS, flow cytometry, or ctDNA. Per FDA's November 2024 final guidance, MRD negativity in multiple myeloma is acceptable as an intermediate endpoint reasonably likely to predict clinical benefit for accelerated approval (Source: ctdna_mrd_endpoints_summary, 2024–2025).
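The relationship between ORR and DCR can be illustrated with a minimal Python sketch. It assumes each patient's best overall response has already been derived upstream (including any minimum-duration rule for SD); the category codes and the `response_rates` helper are illustrative names, not part of any cited guidance.

```python
from collections import Counter


def response_rates(best_responses):
    """Return (ORR, DCR) from a list of best overall responses.

    Expected categories: "CR", "PR", "SD", "PD", "NE" (not evaluable).
    Every patient stays in the denominator, so "NE" counts against both rates.
    """
    n = len(best_responses)
    if n == 0:
        raise ValueError("at least one patient is required")
    counts = Counter(best_responses)
    responders = counts["CR"] + counts["PR"]
    orr = responders / n
    dcr = (responders + counts["SD"]) / n
    return orr, dcr


# Hypothetical cohort of 10 patients.
bor = ["CR", "PR", "PR", "SD", "SD", "SD", "PD", "PD", "NE", "PD"]
orr, dcr = response_rates(bor)
print(f"ORR = {orr:.2f}, DCR = {dcr:.2f}")
```

Because DCR adds SD to the numerator, it is always at least as large as ORR in the same population.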
Regulatory Position
Approval pathways supported:
- Accelerated approval: ORR (with durability) is "the most commonly used surrogate endpoint in support of accelerated approval" (FDA 2018 Final).
- Regular (traditional) approval: Durable ORR has supported traditional approval when effect size is large and the setting is refractory single-arm (e.g., crizotinib for ROS1+ NSCLC; hormonal breast cancer agents).
- Neoadjuvant settings: pCR supports accelerated approval in high-risk early breast cancer (FDA 2020 pCR guidance, Final).
- MRD (hematologic): Accepted as intermediate surrogate for accelerated approval in multiple myeloma (FDA 2024 Final guidance); under evaluation for CLL and ALL.
Key quotes:
"A large improvement in progression-free survival (PFS) or high, substantiated durable ORR has been used to support traditional approval in select malignancies, but magnitude of effect, relief of tumor-related symptoms, and duration of effect are factors." (FDA 2018, Final)
"Treatment effect measured by ORR can be a surrogate endpoint to support accelerated approval, a surrogate endpoint to support traditional approval, or it can represent direct clinical benefit." (FDA 2018, Final)
"ORR has been used as the basis only for accelerated approval for NSCLC." (FDA NSCLC 2020, Final)
"When the primary study endpoint is based on tumor measurements (e.g., PFS or ORR), tumor assessments generally should be verified by central reviewers blinded to study treatments." (FDA 2018, Final)
When to Use
- Single-arm registrational studies in refractory settings, where spontaneous regressions do not occur and large ORR can be attributed to the drug (e.g., crizotinib in ROS1+ metastatic NSCLC; targeted therapies in rare driver-defined subsets).
- Hormonal breast cancer — ORR supported accelerated approval historically (FDA 2018 explicit example).
- Relapsed/refractory hematologic malignancies — CR and MRD-negative CR in multiple myeloma, DLBCL, FL, CLL, ALL; response criteria per IMWG, Lugano, iwCLL, NCCN.
- Neoadjuvant breast/bladder cancer — pCR as primary endpoint supporting accelerated approval (KEYNOTE-522 pattern).
- Phase 2 go/no-go decisions — ORR or DCR for proof of activity across all solid tumors before Phase 3.
- Not appropriate as sole primary endpoint in adjuvant settings (no measurable disease) or maintenance (response already achieved).
Design Considerations
Assessment criteria and schedule
- Solid tumors: RECIST 1.1 as the primary criterion; iRECIST as a supplementary criterion for immune checkpoint inhibitors, to account for pseudoprogression.
- Lymphoma: Lugano 2014 (PET-based for FDG-avid histologies).
- Multiple myeloma: IMWG 2016 uniform response criteria; MRD assessed via next-generation flow/sequencing at 10⁻⁵ minimum.
- Prostate: PCWG3.
- CNS: RANO / iRANO.
- Schedule: typically every 6–12 weeks while on treatment; maintain the same schedule across arms to prevent assessment-time bias.
IRC (Independent Review Committee) requirements
Per FDA 2018 (Final): IRC blinded central review is recommended when ORR or PFS is the primary endpoint, particularly in open-label trials or diseases with imprecise tumor margins (pleural, peritoneal, bone). IRC adjudicates:
- Response category (CR/PR/SD/PD)
- Date of first response
- Confirmation of response at the next scheduled scan
- Progression date (for DOR)
Confirmation rules
- RECIST 1.1: in non-randomized (single-arm) trials where ORR is the primary endpoint, CR or PR must be confirmed by a repeat scan ≥4 weeks after the response criteria are first met.
- Lugano: confirmation typically via repeat PET/CT.
- In randomized trials, unconfirmed response may serve as the primary endpoint; pre-specify this choice in the protocol and SAP.
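A simplified sketch of the ≥4-week confirmation rule follows. It assumes one response category per visit, treats any later CR or PR at least 28 days after the initial response as confirming, and stops looking once PD is recorded; full RECIST 1.1 handling (for example intervening SD or missed visits) is more detailed. The function name and data layout are hypothetical.

```python
from datetime import date

RESPONSE = {"CR", "PR"}


def confirmed_response_date(assessments, min_gap_days=28):
    """Return the date of the first response that is later confirmed, else None.

    assessments: list of (date, category) tuples sorted by date.
    A response is confirmed by a subsequent CR/PR at least `min_gap_days`
    later, with no PD recorded in between.
    """
    for i, (first_date, category) in enumerate(assessments):
        if category not in RESPONSE:
            continue
        for later_date, later_category in assessments[i + 1:]:
            if later_category == "PD":
                break  # progression before confirmation
            if later_category in RESPONSE and (later_date - first_date).days >= min_gap_days:
                return first_date
    return None


confirmed_pt = [
    (date(2024, 1, 10), "SD"),
    (date(2024, 2, 21), "PR"),
    (date(2024, 4, 3), "PR"),   # 42 days after the first PR
    (date(2024, 5, 15), "PD"),
]
unconfirmed_pt = [
    (date(2024, 2, 21), "PR"),
    (date(2024, 4, 3), "PD"),
]

print(confirmed_response_date(confirmed_pt))
print(confirmed_response_date(unconfirmed_pt))
```

The returned date is the first-response date, which is the usual origin for duration of response.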
Pre-specified handling rules (estimand attributes)
- Evaluation window: specify the window during which best overall response (BOR) is counted (e.g., "from first dose through 24 weeks" or "until PD/new therapy").
- Non-responder imputation (NRI): patients who discontinue before an adequate assessment are counted as non-responders (primary analysis in most ORR SAPs).
- Unevaluable (UE) response: missing post-baseline scan → non-responder by default.
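The handling rules above can be sketched as a small derivation: best overall response is taken only from assessments inside the evaluation window, assessments after the first PD are ignored, and patients with no usable post-baseline assessment fall through to non-responder. The 24-week window, the category ranking, and the function names are assumptions for illustration.

```python
from datetime import date, timedelta

RANK = {"CR": 4, "PR": 3, "SD": 2, "PD": 1}


def best_overall_response(first_dose, assessments, window_weeks=24):
    """Best response within the evaluation window; "NE" if none is usable."""
    cutoff = first_dose + timedelta(weeks=window_weeks)
    best = "NE"
    for visit_date, category in sorted(assessments):
        if visit_date <= first_dose or visit_date > cutoff or category not in RANK:
            continue
        if best == "NE" or RANK[category] > RANK[best]:
            best = category
        if category == "PD":
            break  # nothing after progression counts toward BOR
    return best


def is_responder(bor):
    """Non-responder imputation: anything other than CR/PR, including NE."""
    return bor in ("CR", "PR")


first_dose = date(2024, 1, 2)
patients = {
    "A": [(date(2024, 2, 13), "SD"), (date(2024, 3, 26), "PR")],
    "B": [],                                                       # no post-baseline scan
    "C": [(date(2024, 2, 13), "SD"), (date(2024, 8, 1), "PR")],    # PR falls outside the window
}
bor = {pid: best_overall_response(first_dose, scans) for pid, scans in patients.items()}
orr = sum(is_responder(b) for b in bor.values()) / len(bor)
print(bor, round(orr, 3))
```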
Primary analysis options
| Method | Setting | R function |
|---|---|---|
| Exact (Clopper–Pearson) 95% CI on proportion | Single-arm, small N | binom::binom.confint(x, n, methods="exact") |
| Wilson score CI | Preferred over Wald; better coverage at boundary | binom::binom.confint(x, n, methods="wilson") |
| Two-sided Fisher's exact test | Small N two-arm ORR comparison | fisher.test() |
| Cochran–Mantel–Haenszel (CMH) | Stratified two-arm ORR test | mantelhaen.test() |
| Logistic regression | Covariate-adjusted ORR with stratification factors | glm(resp ~ trt + strata, family=binomial) |
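| Stratified Miettinen–Nurminen (MN) risk difference | Stratified difference in proportions with CI | DescTools::BinomDiffCI(..., method="mn") gives the unstratified MN interval; the stratified version needs a dedicated implementation |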
| Stratified Mantel–Haenszel odds ratio | Stratified OR with Breslow–Day homogeneity | stats::mantelhaen.test() (common OR estimate and CI); DescTools::BreslowDayTest() for homogeneity |
| Exact unconditional (Chan–Zhang, Barnard) | Sparse events, small N | Exact::exact.test |
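For the single-arm rows of the table, the following Python sketch shows what the exact (Clopper–Pearson) and Wilson intervals compute. It uses only the standard library and solves the exact limits by bisection on the binomial tail probabilities; it is meant as an illustration of the definitions, not a replacement for the R functions listed above. All function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Solve func(p) = target for p in [0, 1], func monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided CI: each tail probability equals alpha / 2."""
    lower = 0.0 if x == 0 else _bisect(lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, True)
    upper = 1.0 if x == n else _bisect(lambda p: binom_cdf(x, n, p), alpha / 2, False)
    return lower, upper


def wilson(x, n, alpha=0.05):
    """Wilson score interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Hypothetical single-arm result: 12 responders out of 40.
print("Clopper-Pearson:", clopper_pearson(12, 40))
print("Wilson:         ", wilson(12, 40))
```

With zero responders the exact upper limit has the closed form 1 − (α/2)^(1/n), which is a convenient check on the bisection.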
Sample size drivers
- Null ORR (historical control) p₀, target p₁, α, power.
- Simon two-stage (single-arm, Phase 2): clinfun::ph2simon(pu, pa, ep1, ep2).
- Two-proportion: stats::power.prop.test, or exact via Exact::power.exact.test.
- Example: Phase 2 single-arm NSCLC, p₀ = 0.20, p₁ = 0.40, α = 0.05 one-sided, 80% power → Simon optimal two-stage: stop at stage 1 if ≤3/13 respond; otherwise continue to 43 patients, with ≥13 responses needed overall (typical range).
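The operating characteristics of the two-stage example (stop if ≤3/13 respond; declare activity with ≥13/43 responses) can be checked directly from binomial probabilities. The sketch below is a standard-library Python illustration with hypothetical function names; it evaluates a given design rather than searching for the optimal one, which is what clinfun::ph2simon does.

```python
from math import comb


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def simon_operating_chars(r1, n1, r, n, p):
    """Evaluate a two-stage design at true response rate p.

    Stop for futility after stage 1 if responses <= r1.
    Declare the drug active if total responses exceed r.
    Returns (prob_early_termination, prob_declare_active, expected_n).
    """
    n2 = n - n1
    pet = sum(binom_pmf(x1, n1, p) for x1 in range(r1 + 1))
    declare_active = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        needed = r + 1 - x1  # stage-2 responses still required
        if needed <= 0:
            tail = 1.0
        else:
            tail = sum(binom_pmf(x2, n2, p) for x2 in range(needed, n2 + 1))
        declare_active += binom_pmf(x1, n1, p) * tail
    expected_n = n1 + (1 - pet) * n2
    return pet, declare_active, expected_n


# Design from the example: r1 = 3, n1 = 13, r = 12, n = 43.
pet0, alpha, en0 = simon_operating_chars(3, 13, 12, 43, p=0.20)
_, power, _ = simon_operating_chars(3, 13, 12, 43, p=0.40)
print(f"PET(p0) = {pet0:.3f}, E[N | p0] = {en0:.1f}, type I error = {alpha:.3f}, power = {power:.3f}")
```

Under the null rate the trial stops early about three times in four, which is why the expected sample size is far below the maximum of 43.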
Effect sizes supporting approval
- Historical single-arm accelerated approvals: durable ORR ≥ 20–30% with prior-therapy failure and median DOR ≥ 6 months.
- ROS1+ NSCLC crizotinib: ORR ~72% supported accelerated, then traditional approval.
- BTK inhibitors in MCL/CLL: ORR 65–90%.
Intercurrent Events (ICH E9(R1))
IE 1: Subsequent anticancer therapy initiated before confirmation scan
- Strategy: Composite (patient classified as non-responder if new therapy begins before confirmation) or While-on-treatment (response evaluated only up to therapy switch).
- Statistical consequence: Response status fixed at switch; censoring for DOR occurs at switch date.
- SAP language: "Patients initiating subsequent anticancer therapy prior to confirmation of response will be classified as non-responders for the ORR analysis (composite strategy)."
IE 2: Early discontinuation due to toxicity before first post-baseline assessment
- Strategy: Composite (non-responder) for primary; Hypothetical (as if toxicity did not occur) for sensitivity.
- Statistical consequence: Primary ORR analysis uses NRI; hypothetical analysis uses multiple imputation conditional on observed trajectories.
- SAP language: "Patients who discontinue treatment due to AE prior to first post-baseline tumor assessment will be counted as non-responders in the primary analysis; a multiple-imputation sensitivity analysis under an MAR assumption will be provided."
IE 3: Death before first response assessment
- Strategy: Composite (non-responder).
- Statistical consequence: Death treated as failure; reported separately for transparency.
- SAP language: "Patients who die prior to the first protocol-specified tumor assessment will be counted as non-responders."
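The three composite strategies above amount to a small set of classification rules. The following Python sketch encodes them under simplifying assumptions: IE 2 and IE 3 are recognised only when no post-baseline assessment exists, and IE 1 is triggered when new therapy starts before the confirmation scan. The `Patient` fields and reason strings are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Patient:
    first_assessment_date: Optional[date] = None
    confirmation_scan_date: Optional[date] = None   # scan that confirmed CR/PR
    new_therapy_date: Optional[date] = None
    ae_discontinuation_date: Optional[date] = None
    death_date: Optional[date] = None


def classify(pt: Patient) -> Tuple[bool, str]:
    """Return (responder flag, reason) under the composite strategy."""
    if pt.first_assessment_date is None:
        if pt.death_date is not None:
            return False, "IE3: death before first assessment"
        if pt.ae_discontinuation_date is not None:
            return False, "IE2: AE discontinuation before first assessment"
        return False, "no post-baseline assessment"
    if pt.confirmation_scan_date is None:
        return False, "no confirmed response"
    if pt.new_therapy_date is not None and pt.new_therapy_date < pt.confirmation_scan_date:
        return False, "IE1: new therapy before confirmation"
    return True, "confirmed responder"


cohort = [
    Patient(first_assessment_date=date(2024, 2, 1), confirmation_scan_date=date(2024, 4, 1)),
    Patient(first_assessment_date=date(2024, 2, 1), confirmation_scan_date=date(2024, 4, 1),
            new_therapy_date=date(2024, 3, 15)),
    Patient(ae_discontinuation_date=date(2024, 1, 20)),
    Patient(death_date=date(2024, 1, 25)),
]
results = [classify(pt) for pt in cohort]
for flag, reason in results:
    print(flag, reason)
```

Returning the reason alongside the flag makes it straightforward to tabulate how many non-responders arise from each intercurrent event, which supports the separate reporting mentioned for IE 3.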
Regulatory Precedent
| NCT# | Trial | Drug | Indication | Endpoint | Outcome |
|---|---|---|---|---|---|
| NCT04129502 | TAK-788 first-line EGFR exon 20 NSCLC | Mobocertinib | Metastatic NSCLC EGFR ex20ins | Secondary: Confirmed ORR by BIRC/RECIST 1.1 (primary PFS) | Phase 3, active; ORR IRC-adjudicated per FDA 2018/2020 guidance |
| NCT01828112 | ASCEND-5 | Ceritinib vs chemo | ALK+ NSCLC post-crizotinib | Secondary: ORR per BIRC | Supported use; PFS primary, ORR confirmatory |
| NCT00012051 | HOVON R-DHAP + auto-SCT | Rituximab + high-dose chemo | Relapsed B-NHL (DLBCL/FL/PMBCL) | Secondary: Response rate | Lugano/Cheson-based response |
Notable single-arm ORR-based approvals referenced in FDA 2018 guidance: crizotinib for ROS1+ metastatic NSCLC (single-arm, durable ORR); gefitinib for NSCLC (durable ORR, 2003); hormonal agents in breast cancer.
MRD precedent (FDA 2024 Final guidance): MRD-negativity accepted as intermediate endpoint for multiple myeloma accelerated approval; pending formal label precedents.
Limitations and Pitfalls
- Open-label bias: Investigator-assessed ORR is biased upward in open-label trials. FDA 2018 (Final): "tumor assessments generally should be verified by central reviewers blinded to study treatments."
- Assessment-schedule bias: Differential imaging frequency across arms inflates apparent response in the more-frequently-assessed arm; enforce identical schedules (FDA NSCLC 2020 Appendix D).
- Imprecise margins: Pleural effusions, bone disease, peritoneal carcinomatosis yield unreliable RECIST measurements (FDA 2018).
- Unconfirmed-response inflation: Unconfirmed ORR overstates activity; requiring confirmation reduces apparent ORR by 10–30% vs unconfirmed.
- Sparse-event settings: Normal-approximation CIs (Wald) have poor coverage when x=0 or x=n; use exact/Wilson.
- Small biomarker subgroups: Multiplicity across biomarker strata requires pre-specified hierarchy or closed-testing; single-arm biomarker-defined ORR does not support causal claims without a control.
- DOR coupling: ORR without durable DOR (≥ 6 months) rarely supports approval; always pair reporting.
- Surrogacy: "Treatment effects on ORR have not been demonstrated to [predict OS] in NSCLC" (FDA 2020). Trial-level surrogate validity is disease- and mechanism-specific.
- Missing data: NRI is conservative for each arm's ORR but can bias the between-arm comparison when dropout differs by arm; provide a tipping-point sensitivity analysis.
- MRD caveats: Assay sensitivity heterogeneity (10⁻⁴ vs 10⁻⁶), sampling depth, and BM-only vs imaging discordance require pre-specification.
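A tipping-point analysis for the missing-data pitfall can be sketched as a grid: starting from the NRI result, reclassify k of the unevaluable patients in each arm as responders and see where the conclusion changes. The sketch below uses an unstratified pooled two-proportion z-test for simplicity, whereas a real SAP would use the pre-specified stratified test. The counts and function names are hypothetical.

```python
from statistics import NormalDist


def two_prop_p_value(x_e, n_e, x_c, n_c):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (x_e + x_c) / (n_e + n_c)
    se = (pooled * (1 - pooled) * (1 / n_e + 1 / n_c)) ** 0.5
    if se == 0:
        return 1.0
    z = (x_e / n_e - x_c / n_c) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def tipping_grid(x_e, n_e, miss_e, x_c, n_c, miss_c):
    """p-value for every way of imputing k_e / k_c unevaluable patients as responders.

    x_* are responders under NRI; miss_* are unevaluable patients already
    counted as non-responders within n_*.
    """
    return [
        (k_e, k_c, two_prop_p_value(x_e + k_e, n_e, x_c + k_c, n_c))
        for k_e in range(miss_e + 1)
        for k_c in range(miss_c + 1)
    ]


# Hypothetical trial: 45/100 vs 28/100 responders under NRI,
# with 10 and 8 unevaluable patients respectively.
grid = tipping_grid(45, 100, 10, 28, 100, 8)
tipped = [(k_e, k_c) for k_e, k_c, p in grid if p >= 0.05]
print(f"{len(tipped)} of {len(grid)} imputation scenarios lose significance at 0.05")
```

Scenarios that overturn the result only when most unevaluable control patients are assumed to respond, and few experimental patients do, indicate that the primary conclusion is reasonably robust.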
SAP Template — Binary Response Primary Endpoint
PRIMARY ENDPOINT:
Confirmed Objective Response Rate (ORR) = proportion of randomized
patients achieving a best confirmed response of CR or PR per RECIST 1.1,
as assessed by blinded independent central review (BICR), within the
24-week evaluation window from randomization.
ESTIMAND:
- Population: ITT (all randomized).
- Variable: Binary indicator of confirmed CR or PR.
- Treatment: Randomized treatment arm.
- Population-level summary: Risk difference (experimental − control)
and odds ratio, each stratified by randomization factors.
- Intercurrent event strategies:
* Subsequent anticancer therapy before confirmation → composite
(non-responder).
* Treatment discontinuation for AE before first assessment → composite.
* Death before first assessment → composite (non-responder).
PRIMARY ANALYSIS:
- Point estimate: stratified Mantel–Haenszel risk difference with
Miettinen–Nurminen 95% CI, stratified by [PD-L1 TPS ≥50% vs <50%,
region, ECOG 0 vs 1].
- Hypothesis test: Cochran–Mantel–Haenszel χ² at two-sided α = 0.05
(equivalent to one-sided α = 0.025 in favor of the experimental arm),
under H0: ORR_exp = ORR_ctrl.
- Secondary: stratified odds ratio with Breslow–Day homogeneity check.
SENSITIVITY / SUPPORTIVE ANALYSES:
1. Unstratified Clopper–Pearson exact 95% CI per arm.
2. Per-protocol population.
3. Investigator-assessed ORR (concordance with BICR).
4. Multiple imputation under MAR for unevaluable patients, plus a
tipping-point analysis varying the imputed response probability π
from 0 to 1 in the experimental arm.
5. Subgroup forest plot with interaction p-values (descriptive).
MULTIPLICITY:
ORR tested first in a hierarchical sequence [ORR → DOR → OS];
downstream endpoints tested only if ORR is significant at one-sided 0.025.
SAMPLE SIZE:
Assume ORR_ctrl = 0.25, ORR_exp = 0.45; α = 0.025 one-sided; power = 0.90.
Stratified CMH → n ≈ 250 (125 per arm). R: power.prop.test()
cross-checked against Exact::power.exact.test.
MRD SUB-STUDY (if applicable):
MRD negativity at 10⁻⁵ by NGS at cycle 9; reported with exact 95% CI;
descriptive between-arm comparison; not part of formal alpha.
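The sample-size assumption in the template can be cross-checked with the usual normal-approximation formula for two proportions, which is the same form that power.prop.test solves. This standard-library Python sketch ignores stratification and any allowance for dropout; the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm(p_ctrl, p_exp, alpha_one_sided=0.025, power=0.90):
    """Per-arm sample size for a two-proportion comparison (1:1 allocation).

    Uses the pooled variance under H0 and the unpooled variance under H1.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_ctrl + p_exp) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_ctrl * (1 - p_ctrl) + p_exp * (1 - p_exp))
    )
    return ceil((numerator / (p_exp - p_ctrl)) ** 2)


n = n_per_arm(0.25, 0.45)
print(f"{n} per arm, {2 * n} total")
```

Under these inputs the approximation gives 118 per arm, somewhat below the template's rounded 125 per arm; the template does not state what accounts for the margin.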
Backlinks
Source: FDA Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics (2018); FDA Non-Small Cell Lung Cancer: Developing Drugs and Biologics for Treatment (2020); FDA Use of Circulating Tumor DNA for Early-Stage Solid Tumor Drug Development / MRD guidance (2024); ClinicalTrials.gov records (NCT04129502, NCT01828112, NCT00012051). Status: Final guidance for all cited FDA documents. Compiled from retrieved FDA chunks + ClinicalTrials.gov records.