Master Protocols: Basket, Umbrella, and Platform Trials

Scope: design, statistical, and regulatory considerations for oncology master protocols. Audience: biostatisticians and clinical scientists writing master-protocol SAPs and engaging FDA. Anchored in the FDA Master Protocols: Efficient Clinical Trial Design Strategies to Expedite Development of Oncology Drugs and Biologics guidance (final, 2022) and the literature summary raw/literature/master_protocols_summary.md.

Definition

The FDA 2022 Master Protocols guidance (final) frames a master protocol as a single overarching protocol designed to evaluate multiple investigational drugs and/or multiple disease populations in parallel sub-studies under a common operational and statistical infrastructure. Three sub-types are recognized:

Basket trial — one investigational therapy is tested in patients sharing a common molecular alteration but spanning multiple histologies. The biomarker, not the histology, is the unit of enrollment.
Umbrella trial — patients with a single tumor type are stratified by biomarker status and assigned to biomarker-matched investigational arms, often sharing a common control.
Platform trial — a perpetual master protocol in which arms enter and exit over time according to pre-specified rules, typically supported by Bayesian response-adaptive randomization and interim futility/efficacy stopping.

Master protocols are intended to accelerate development by sharing screening, infrastructure, and (where appropriate) controls across sub-studies, while still permitting sub-study-level inference suitable for regulatory submission.

Regulatory Position

Sub-studies within a master protocol can support Accelerated Approval or Regular Approval, and their data can support requests for Breakthrough Therapy or Fast Track designation, depending on the endpoint, control structure, and pre-specified statistical plan of each sub-study.

Key positions from FDA 2022 Master Protocols guidance (final) and the 2019 Adaptive Designs for Clinical Trials of Drugs and Biologics guidance (final):

Pre-specification is mandatory. Adaptive features (arm dropping, sample-size re-estimation, response-adaptive randomization, Bayesian borrowing) must be specified before unblinding, with documented operating characteristics from simulation.
Type I error must be controlled across all confirmatory comparisons within the master protocol. The set of confirmatory hypotheses, the multiplicity strategy (Bonferroni, hierarchical/gatekeeping, Bayesian hierarchical model with calibrated priors), and the simulated family-wise error rate must be in the SAP.
Each sub-study seeking approval requires its own analysis plan. A sub-study supports a marketing application in the same way a stand-alone trial would; it does not inherit approval from the umbrella/platform.
Operational firewall. The IDMC, statistical analysis center, and trial management organization must maintain blinding across sub-studies; results from one sub-study should not inform decisions in another except via pre-specified rules.
Shared control arms are acceptable when randomization, eligibility, and follow-up windows are common, but contemporaneity must be addressed: only concurrently randomized controls should contribute to the primary comparison; non-concurrent controls (e.g., from earlier calendar windows) require pre-specified down-weighting or sensitivity analysis.
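
The Type I error position above asks for a simulated family-wise error rate. A minimal sketch of such a simulation follows, assuming normally distributed outcomes with known unit variance, equal allocation, and one-sided z-tests of each experimental arm against a single shared control. The function name simulate_fwer and all parameter values are hypothetical choices for illustration, not taken from the guidance.

```python
import random
from statistics import NormalDist

def simulate_fwer(n_arms=3, n_per_arm=100, alpha=0.025, n_sims=20000, seed=1):
    """Monte Carlo family-wise error rate under the global null.

    Assumption: each experimental arm is compared with one shared control
    by a one-sided z-test (unit variance treated as known) at level alpha.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    se_mean = (1 / n_per_arm) ** 0.5   # SD of each arm's sample mean
    se_diff = (2 / n_per_arm) ** 0.5   # SD of the arm-minus-control difference
    trials_with_rejection = 0
    for _sim in range(n_sims):
        control_mean = rng.gauss(0.0, se_mean)   # true effect is zero everywhere
        for _arm in range(n_arms):
            arm_mean = rng.gauss(0.0, se_mean)
            if (arm_mean - control_mean) / se_diff > z_crit:
                trials_with_rejection += 1       # at least one false positive
                break
    return trials_with_rejection / n_sims

k, alpha = 3, 0.025
print("simulated FWER, shared control:", simulate_fwer(n_arms=k, alpha=alpha))
print("FWER if comparisons were independent:", 1 - (1 - alpha) ** k)
print("Bonferroni per-arm alpha for FWER 0.025:", alpha / k)
```

Because every comparison reuses the same control data, the test statistics are positively correlated, so the simulated rate should fall somewhat below the independence value while remaining above the per-arm α. A real SAP simulation would replace the z-test with the trial's actual endpoint model and interim rules.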

ICH E20 (Adaptive Clinical Trials, draft 2024) and EMA's reflection paper on master protocols reinforce these requirements.

When to Use

Basket — appropriate when a tumor-agnostic mechanism is plausible (e.g., NTRK fusions, BRAF V600E, MSI-H, RET fusions, NRG1 fusions). Used for histology-spanning Phase 2 signal-finding (e.g., NCI-MATCH sub-protocols) and for accelerated-approval submissions of tumor-agnostic indications (e.g., the larotrectinib and entrectinib programs for NTRK fusion–positive solid tumors).
Umbrella — appropriate in a histology with rich, mutually exclusive driver biomarkers and sufficient screening volume to populate parallel matched arms. Canonical setting: previously treated squamous NSCLC (LUNG-MAP / SWOG S1400). Also used in MDS, AML (Beat AML), and recurrent glioblastoma.
Platform — appropriate when (i) multiple agents are anticipated to enter sequentially over years, (ii) a stable disease population and primary endpoint exist, and (iii) the operational sponsor (often academic-industry consortium) can sustain perpetual governance. Examples: I-SPY 2 (locally advanced breast cancer, neoadjuvant pCR), GBM AGILE (newly diagnosed and recurrent glioblastoma, OS), STAMPEDE (advanced prostate cancer, OS — multi-arm multi-stage / MAMS).

Master protocols are less appropriate when (a) the disease is too rare to populate concurrent arms, (b) endpoints differ substantially across candidate therapies, or (c) sponsors require strict isolation of their data.

Design Considerations

Basket trials — tumor-agnostic design and Bayesian hierarchical modeling

Primary endpoint: ORR by RECIST 1.1 (solid tumors) or Lugano (lymphoma), assessed by BICR for accelerated-approval-supporting sub-studies; investigator assessment acceptable for signal-finding.
Statistical unit: the biomarker-defined cohort. Each histology cohort is typically a single-arm Simon two-stage or Bayesian optimal design.
Borrowing across histologies — BHM / BHSEM / EXNEX. When response rates are assumed exchangeable across histology baskets, a Bayesian Hierarchical Model (Berry, Broglio, Groshen, Berry 2013) borrows information toward a common mean, increasing power in small baskets at the cost of bias if exchangeability fails. Variants:
BHM (calibrated): common τ² (between-basket variance) tuned via pre-trial simulation to bound Type I error per basket (e.g., ≤ 10%).
BHSEM (Berry et al.): Bayesian Hierarchical Single-arm Exchangeability Model — exchangeability prior with explicit cut-points for declaring activity per basket.
EXNEX (Neuenschwander et al. 2016): mixture prior placing weight on an exchangeable component and a non-exchangeable component, allowing a basket to "opt out" if its data are inconsistent with the others. Now standard in BMS, Roche, and Novartis basket programs.
Exchangeability assumption. Borrowing is only valid if biological response is similar across histologies. Pre-trial: justify with biology (e.g., common pathway dependence). In-trial: monitor heterogeneity (e.g., I² across baskets, posterior predictive checks). Pre-specify a fallback to independent-basket analysis if heterogeneity exceeds a threshold.
Operating characteristics to simulate: (a) Type I error per basket under global null, (b) Type I error per basket under "one true positive among several nulls" (the borrowing-induced false-positive scenario), (c) power per basket under a homogeneous alternative, (d) bias and coverage of the posterior interval per basket.
R packages: bhmbasket, basket, RBesT (for EXNEX-style mixture priors), rstan/brms for custom hierarchical models.
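
For the independent-basket case described above (each histology cohort run as a single-arm Simon two-stage design), the exact operating characteristics can be computed directly from binomial probabilities. The sketch below assumes a hypothetical cohort with null response rate 0.10 and target rate 0.30; the design parameters (stop if at most 1 response in 10, declare activity if more than 5 responses in 29) are illustrative values, not a recommendation from the source. It does not model borrowing across baskets.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k responses in n patients."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def simon_two_stage_oc(r1, n1, r, n, p):
    """Exact operating characteristics of a Simon two-stage design.

    Stop for futility after stage 1 if responses <= r1 (of n1 patients);
    otherwise enroll to n and declare activity if total responses > r.
    Returns (P(declare activity), P(early termination), expected N).
    """
    n2 = n - n1
    pet = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    p_active = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        needed = max(r + 1 - x1, 0)   # stage-2 responses still required
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(needed, n2 + 1))
        p_active += binom_pmf(x1, n1, p) * tail
    expected_n = n1 + (1 - pet) * n2
    return p_active, pet, expected_n

# Hypothetical cohort: null ORR 0.10, target ORR 0.30
type1, pet0, en0 = simon_two_stage_oc(1, 10, 5, 29, 0.10)
power, _, _ = simon_two_stage_oc(1, 10, 5, 29, 0.30)
print(f"type I error={type1:.3f}, power={power:.3f}, "
      f"PET under null={pet0:.3f}, E[N] under null={en0:.1f}")
```

Running the same function across a grid of true response rates gives the per-basket baseline against which any borrowing model's simulated Type I error and power can be compared.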

Umbrella trials — multiple arms, one tumor type, shared control

Statistical unit: tumor histology, with biomarker-matched sub-studies nested within.
Shared control architecture. Two designs:
Common-control umbrella (LUNG-MAP, BATTLE-2): a single non-match / standard-of-care arm receives all patients screened biomarker-negative or biomarker-unmatched. Each sub-study is a randomized comparison of its experimental arm vs. the shared control, restricted to the contemporaneous control patients enrolled during that sub-study's accrual window.
Sub-study-specific control (FOCUS4 in colorectal): each biomarker stratum has its own randomized control arm; no cross-stratum borrowing.
Concurrency rule: to avoid time-trend bias, only controls randomized during the same calendar window as a given experimental arm contribute to that arm's primary analysis. Non-concurrent controls may be used in sensitivity analyses with pre-specified down-weighting.
Biomarker assignment ordering: umbrella protocols must pre-specify a hierarchy when patients carry multiple eligible biomarkers (e.g., LUNG-MAP's gating to immunotherapy-eligible vs. targeted-therapy-eligible sub-studies). Without this, post-hoc cohort assignment introduces selection bias.
Endpoints: typically PFS or OS for randomized sub-studies (RECIST 1.1, BICR for accelerated-approval-bearing sub-studies). pCR for neoadjuvant umbrellas.
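
The concurrency rule above can be expressed as a simple filter on randomization dates. The following is a minimal sketch under stated assumptions: the Patient record, the arm labels, and the definition of the accrual window as the first-to-last observed randomization date of the experimental arm are all hypothetical. A real implementation would normally take the arm's opening and closing dates from the protocol.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Patient:
    patient_id: str
    arm: str              # "control" or an experimental arm label
    randomized_on: date

def concurrent_controls(patients, experimental_arm):
    """Return shared-control patients randomized within the accrual window
    (first to last randomization date) of the given experimental arm."""
    arm_dates = [p.randomized_on for p in patients if p.arm == experimental_arm]
    if not arm_dates:
        return []
    start, end = min(arm_dates), max(arm_dates)
    return [p for p in patients
            if p.arm == "control" and start <= p.randomized_on <= end]

patients = [
    Patient("C1", "control", date(2021, 1, 15)),   # before arm A opened
    Patient("C2", "control", date(2021, 6, 1)),
    Patient("A1", "A", date(2021, 5, 1)),
    Patient("A2", "A", date(2021, 9, 30)),
    Patient("C3", "control", date(2021, 8, 20)),
    Patient("C4", "control", date(2022, 2, 1)),    # after arm A closed
]
print([p.patient_id for p in concurrent_controls(patients, "A")])  # ['C2', 'C3']
```

Controls C1 and C4 remain available for pre-specified sensitivity analyses but are excluded from the primary comparison for arm A.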

Platform trials — perpetual design, MAMS, response-adaptive randomization

Multi-Arm Multi-Stage (MAMS): STAMPEDE-style design with one shared control and multiple experimental arms; pre-planned interim analyses drop arms for futility (intermediate endpoint) and confirm efficacy at later stages (definitive endpoint, e.g., OS). Type I error controlled by Dunnett-style adjustment or simulation-based critical values.
Response-adaptive randomization (RAR): I-SPY 2 and GBM AGILE shift allocation toward arms with higher posterior probability of efficacy in each biomarker stratum. Design choices:
Burn-in period of equal randomization (typically the first 20–40 patients per arm) to stabilize early estimates.
Allocation floor for control (e.g., minimum 10% to control) to preserve identifiability and avoid Simpson's-paradox-style confounding by time trends.
Pre-specified graduation/futility thresholds (I-SPY 2: ≥ 85% posterior predictive probability of success in a 300-patient confirmatory trial → graduation).
Adding new arms. Adding an arm mid-trial is permitted but requires a pre-specified amendment process; the new arm's primary analysis uses concurrently randomized controls only. Pre-existing arms' analyses are unaffected.
Type I error across concurrent comparisons. Strategies:
Per-arm α (most common in regulatory-bearing platforms): each arm tested at its own α (e.g., 0.025 one-sided), with the family-wise rate explicitly characterized via simulation. FDA generally accepts this when each arm is treated as a stand-alone confirmatory comparison.
Strong FWER control (Bonferroni, Hochberg, gatekeeping): used when a single regulatory action depends on the joint result of several arms.
Bayesian decision rules (I-SPY 2, GBM AGILE): operating characteristics simulated under nulls and alternatives; FDA evaluates by simulated frequentist FWER, not by Bayesian posterior alone.
R packages and software: gsDesign and rpact for MAMS critical values; BATools, bcrm, OCTOPUS, and FACTS (commercial) for Bayesian platform simulation.
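
The response-adaptive randomization choices listed above (burn-in, control floor, posterior-driven allocation) can be illustrated with a Thompson-type sketch. This is not the I-SPY 2 or GBM AGILE algorithm. It assumes a binary response endpoint, independent Beta(1, 1) priors per arm, and allocation proportional to the Monte Carlo probability that each arm has the highest response rate. The function name rar_allocation and all default values are hypothetical.

```python
import random

def rar_allocation(responses, n_treated, control="control",
                   burn_in=20, control_floor=0.10, n_draws=4000, seed=7):
    """Allocation probabilities proportional to P(arm is best), with a
    burn-in of equal randomization and a minimum share for control."""
    arms = list(responses)
    if any(n_treated[a] < burn_in for a in arms):
        return {a: 1 / len(arms) for a in arms}   # still in burn-in
    rng = random.Random(seed)
    wins = {a: 0 for a in arms}
    for _ in range(n_draws):
        # One draw per arm from its Beta(1 + responders, 1 + non-responders) posterior
        draws = {a: rng.betavariate(1 + responses[a],
                                    1 + n_treated[a] - responses[a])
                 for a in arms}
        wins[max(draws, key=draws.get)] += 1
    alloc = {a: wins[a] / n_draws for a in arms}
    if alloc[control] < control_floor:
        # Lift control to the floor and rescale the other arms to fill the rest
        others = sum(alloc[a] for a in arms if a != control)
        for a in arms:
            if a != control:
                alloc[a] = alloc[a] * (1 - control_floor) / others
        alloc[control] = control_floor
    return alloc

alloc = rar_allocation(responses={"control": 6, "A": 8, "B": 18},
                       n_treated={"control": 30, "A": 30, "B": 30})
print({arm: round(prob, 3) for arm, prob in alloc.items()})
```

In a design simulation this function would be called at each allocation update inside a loop over simulated trials, so that the burn-in length and floor can be tuned against the simulated operating characteristics.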

Intercurrent Events

Most ICEs in master protocols are shared with conventional trials (treatment discontinuation, subsequent therapy, death), but three need handling specific to the master-protocol setting:

Re-assignment to another sub-study after biomarker re-screening (umbrella/platform).
- Strategy: Treatment Policy for the originating sub-study (analyzed as randomized regardless of subsequent re-assignment); Hypothetical for any sub-study attempting to estimate effect "as if no re-assignment occurred."
- Statistical consequence: Treatment-policy keeps the patient in the originating sub-study's ITT denominator; hypothetical analyses censor at re-assignment and may use IPCW or g-methods to adjust for informative re-assignment.
- SAP language: "Patients who progress on the originating sub-study and are re-assigned to a subsequent biomarker-matched sub-study under the master protocol will be analyzed in the originating sub-study under a treatment-policy strategy. A supplementary hypothetical-strategy analysis censors follow-up at the date of re-assignment, with sensitivity to informative re-assignment via stabilized IPCW."
Arm-dropping due to platform-level futility (platform).
- Strategy: Treatment Policy for patients enrolled before drop; Hypothetical (as-if-completed-planned-treatment) is generally not appropriate because dropping is a design feature, not an external event.
- Statistical consequence: Final analysis of a dropped arm uses all randomized patients; inference must account for the early-stopping rule (e.g., via the pre-specified group-sequential boundaries) rather than treating the truncated sample as a fixed-sample design.
- SAP language: "Should an arm be dropped at a pre-specified interim futility look, all patients randomized to that arm prior to the drop will contribute to the final analysis under a treatment-policy strategy. The critical value for the final test will be adjusted for the early stopping rule per the pre-specified group-sequential design."
Crossover between biomarker-matched experimental and shared-control arms (umbrella).
- Strategy: Treatment Policy primary; Hypothetical "no-crossover" supplementary, estimated via a rank-preserving structural failure time (RPSFT) model or a two-stage method (2SRST) when OS is the endpoint.
- Statistical consequence: Treatment-policy attenuates effect under crossover; hypothetical estimators recover the "no-switch" effect under their identifying assumptions (see Intercurrent Events in Oncology Trials and Sensitivity Analysis Playbook for Oncology Trials).
- SAP language: "OS will be analyzed under a treatment-policy strategy (primary). A supplementary hypothetical-strategy estimate adjusting for post-progression crossover will be reported using RPSFT with sensitivity to the re-censoring rule."
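
The difference between the treatment-policy and hypothetical strategies for re-assignment comes down to how each patient's analysis time and event indicator are derived. The sketch below shows that derivation only. The Followup record and its field names are hypothetical, and the IPCW or g-method weighting needed to address informative re-assignment is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Followup:
    event_time: float               # months from randomization to event or last contact
    event: bool                     # True if the endpoint event was observed
    reassign_time: Optional[float]  # months to re-assignment, None if never re-assigned

def treatment_policy(fu):
    """Ignore re-assignment: analyze follow-up as observed."""
    return fu.event_time, fu.event

def hypothetical_censor(fu):
    """Censor at re-assignment when it precedes the event or last contact."""
    if fu.reassign_time is not None and fu.reassign_time < fu.event_time:
        return fu.reassign_time, False
    return fu.event_time, fu.event

reassigned = Followup(event_time=14.0, event=True, reassign_time=6.0)
never_reassigned = Followup(event_time=9.0, event=True, reassign_time=None)
print(treatment_policy(reassigned), hypothetical_censor(reassigned))
print(treatment_policy(never_reassigned), hypothetical_censor(never_reassigned))
```

The re-assigned patient contributes an event at 14 months under the treatment-policy strategy and a censored observation at 6 months under the hypothetical strategy; the patient who was never re-assigned is handled identically by both.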

Regulatory Precedent

NCT#	Trial	Drug / Sub-study	Indication	Endpoint	Outcome
NCT02465060	NCI-MATCH (EAY131) — master	Multiple targeted agents matched to mutations	Tumor-agnostic, advanced solid tumors and lymphoma	ORR per sub-study	Active; multiple sub-study read-outs (e.g., copanlisib NCT06400238 [PTEN], erdafitinib NCT06351371 [FGFR], osimertinib NCT06303167 [EGFR]); informs labeling and accelerated pathways.
NCT02154490	LUNG-MAP (SWOG S1400)	Multiple biomarker-matched arms with shared control	Previously treated squamous NSCLC	PFS / OS per sub-study	Active platform; nivolumab ± ipilimumab non-match sub-study (S1400I) and durvalumab+tremelimumab arm reported; protocol amended to expand immunotherapy sub-studies.
NCT01042379	I-SPY 2	Multiple neoadjuvant agents (Bayesian RAR)	Locally advanced HER2-stratified breast cancer	pCR (Bayesian predictive probability of Phase 3 success)	≥ 7 graduations to date (e.g., neratinib, pertuzumab, veliparib); evolved to I-SPY 2.2 SMART design.
NCT03970447	GBM AGILE	Multiple agents, Bayesian RAR	Newly diagnosed and recurrent glioblastoma	OS by stratum	Active platform; first regorafenib results reported; design recognized by FDA in seamless Phase 2/3 framework.
—	TAPUR (ASCO)	FDA-approved targeted agents matched to mutations	Tumor-agnostic, advanced solid tumors	ORR per histology × biomarker	Active basket registry; histology-level read-outs published; NCT identifier per sub-study (master sponsor: ASCO).

Sub-protocol NCTs cited above (NCT06400238, NCT06351371, NCT06303167) are confirmed in ingest/ctg_index/ctg_basket_umbrella_index.json. Master-protocol NCT identifiers (NCT02465060, NCT02154490, NCT01042379, NCT03970447) are widely reported in the literature summary raw/literature/master_protocols_summary.md; verify before citing in regulatory submissions.

Limitations and Pitfalls

Exchangeability failure in basket designs. Bayesian hierarchical borrowing inflates Type I error in inactive baskets when one or two baskets are strongly active (the "spillover" effect). EXNEX or robust mixture priors mitigate but do not eliminate this. Always pre-specify and report per-basket simulated Type I error under the "one-true-positive" scenario, not just the global null.
Shared-control non-concurrency bias. Using historical or non-concurrent controls within an umbrella inflates the apparent effect when standard-of-care improves over calendar time (very common in NSCLC and melanoma during the immunotherapy era). Restrict primary analyses to concurrent controls.
Type I error inflation from arm addition. Adding arms mid-platform without pre-specified rules and without restricting to concurrent controls invalidates frequentist error control. Document the amendment process before unblinding.
Operational unblinding. Sub-study results often become public (e.g., I-SPY 2 graduations) while the platform continues. This is acceptable only if the platform's IDMC and statistical center maintain firewalls and if analyses of ongoing arms cannot be informed by graduated-arm data.
Regulatory acceptance of pCR or ORR sub-study results. Accelerated approval requires that the surrogate endpoint be "reasonably likely to predict clinical benefit" in the specific indication; pCR is accepted as an accelerated-approval endpoint for neoadjuvant treatment of high-risk early-stage breast cancer, notably HER2+ and triple-negative disease (FDA 2014 guidance), not universally. Each sub-study must justify its surrogate.
Sample size and power per sub-study. Borrowing increases power but the primary basket must still be adequately powered as a stand-alone analysis when borrowing assumptions fail; pre-specify a minimum per-basket sample size.
Multiplicity across "graduation" decisions. Platform graduations are intermediate decisions, not regulatory approvals; do not interpret them as confirmatory. Confirmatory inference still requires either a pre-specified Phase 3 (I-SPY model) or an integrated seamless Phase 2/3 design with pre-specified Type I error allocation (GBM AGILE model).

Backlinks

ICH E9(R1) Estimand Framework
Intercurrent Events in Oncology Trials
Sensitivity Analyses for Estimands
Adaptive Trial Designs in Oncology
Group Sequential Designs (GSD)
Multiplicity Control in Oncology Trials
Simulation-Based Power Analysis
Interim Analysis and DSMB Operations
FDA Approval Pathways in Oncology
Multiple Endpoints and Alpha Allocation
Sensitivity Analysis Playbook for Oncology Trials

Sources: FDA Master Protocols: Efficient Clinical Trial Design Strategies to Expedite Development of Oncology Drugs and Biologics (2022, final guidance); FDA Adaptive Designs for Clinical Trials of Drugs and Biologics (2019, final guidance); ICH E20 Adaptive Clinical Trials (draft 2024); literature summary raw/literature/master_protocols_summary.md; ClinicalTrials.gov index ingest/ctg_index/ctg_basket_umbrella_index.json for sub-protocol NCTs. Status: Final (FDA 2022 Master Protocols, FDA 2019 Adaptive Designs); Draft (ICH E20). Compiled from retrieved FDA chunks + ClinicalTrials.gov records + literature summary.