AML Trial Design Patterns
Definition
Acute myeloid leukemia (AML) is a heterogeneous hematologic malignancy defined by clonal expansion of myeloid blasts (≥20% in bone marrow or peripheral blood per WHO criteria). AML trial design is stratified by: treatment intent (intensive induction-eligible vs. low-intensity), molecular subgroup (FLT3-ITD/TKD, IDH1/2, NPM1, CBF AML, secondary AML, TP53-mutated), treatment line (1L induction/consolidation, 2L relapsed/refractory), and patient fitness (age, ECOG PS, comorbidities).
CTG AML Phase 2/3 dataset (489 trials): ORR/CR primary in ~94/489 (19%), OS primary ~25/489 (5%), EFS primary ~15/489 (3%). Most trials are single-arm Phase 2 for accelerated approval via CR/CRi rate; Phase 3 trials increasingly use OS as primary.
Standard response criteria: ELN 2022 (European LeukemiaNet) for AML; IWG response criteria for AML/MDS. Key categories: CR (complete remission: bone marrow blasts <5%, ANC ≥1000/µL, platelets ≥100,000/µL), CRi (CR with incomplete count recovery), CRh (CR with partial hematologic recovery), PR (partial remission), MLFS (morphologic leukemia-free state).
Regulatory Position
FDA AML guidance (2022, Final): OS is the preferred primary endpoint for randomized Phase 3 AML trials. CR/CRi rate is accepted for accelerated approval in relapsed/refractory settings. EFS (composite of failure to achieve CR, relapse, or death) is acceptable as co-primary with OS.
FDA AML 2022 explicit position on intercurrent events (IEs): "Subsequent treatments like HSCT or anti-AML drugs should be considered part of the overall treatment regimen and not censored in the primary analysis." The treatment policy strategy is therefore mandated for HSCT; censoring at HSCT is not aligned with FDA's view of the clinical question.
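Approval via CR rate: Several relapsed/refractory AML drugs were first approved on CR/CRh rate, supported by duration of response and transfusion independence, from single-arm or interim data (enasidenib IDH2, ivosidenib IDH1, gilteritinib FLT3). These were granted as regular approvals. Venetoclax combinations, by contrast, received accelerated approval in 2018 on CR rate, with VIALE-A and VIALE-C serving as the confirmatory trials.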
Midostaurin precedent (2017): First FLT3-targeted therapy approved in AML based on OS benefit in randomized Phase 3 (RATIFY trial, OS HR 0.78). Established randomized OS as the gold standard for 1L AML approval.
Status: FDA AML Guidance 2022 = Final; FDA Cancer Endpoints 2018 = Final
When to Use Each Endpoint
By Setting and Line of Therapy
| Setting | Primary Endpoint | Key Secondary | Notes |
|---|---|---|---|
| 1L intensive-eligible (FLT3, IDH1/2, targeted) | OS | EFS, CR/CRi rate | RATIFY (midostaurin, OS HR 0.78) |
| 1L intensive-ineligible (venetoclax combinations) | OS | CR/CRi rate, EFS | VIALE-A, VIALE-C patterns |
| 1L CBF AML (CR-focused) | CR/CRi rate (Phase 2); OS (Phase 3) | EFS, MRD negativity | High CR rates make OS impractical in early trials |
| 2L relapsed/refractory, targeted (FLT3, IDH1/2) | OS (Phase 3); CR/CRi (Phase 2) | EFS | QuANTUM-R (quizartinib, OS HR 0.76) |
| 2L relapsed/refractory, response-based approval | CR/CRh rate (single-arm) | CR duration, OS | Enasidenib, ivosidenib, gilteritinib initial approvals |
| Post-HSCT maintenance | DFS/RFS or OS | MRD negativity | Azacitidine maintenance post-HSCT |
| AML with MDS-related changes | EFS or OS | CR rate, MRD | CPX-351 approval based on OS HR 0.69 |
Response Endpoint Hierarchy for AML
- CR (complete remission): Bone marrow blasts <5%, ANC ≥1000/µL, platelets ≥100,000/µL. The primary response milestone; accepted for accelerated approval
- CRi (CR with incomplete recovery): All CR criteria except residual neutropenia (ANC <1000/µL) or thrombocytopenia (platelets <100,000/µL). Combined CR/CRi is the standard composite response endpoint
- MLFS (morphologic leukemia-free state): Blast clearance without count recovery. Intermediate milestone, not accepted alone for regulatory approval
- PR (partial remission): ≥50% reduction in bone marrow blasts. Limited regulatory use; combined with CR for ORR in some trials
- MRD-negative CR: CR plus bone marrow MRD <10⁻³ (flow cytometry) or <10⁻⁴ (PCR). Emerging primary endpoint; no final FDA guidance yet for AML (see Emerging Endpoints in Oncology Trials)
Design Considerations
HSCT as Intercurrent Event
Hematopoietic stem cell transplantation is the most significant IE in AML trials:
FDA position (AML 2022): HSCT should not be a censoring event in the primary OS analysis; the treatment policy strategy is mandatory. Rationale: achieving a response deep enough to qualify for HSCT is itself a measure of drug benefit, so censoring at HSCT discards evidence of that benefit.
Design implication: Trials must track HSCT dates, conditioning regimen, and post-transplant events. Post-HSCT OS, relapse, and non-relapse mortality must be captured. The SAP must explicitly state that HSCT is not a censoring event in any primary analysis.
SAP language:
"Hematopoietic stem cell transplantation (HSCT) will be handled using the treatment policy strategy. HSCT and all post-transplant events (relapse, death from any cause, including transplant-related mortality) will be included in all primary efficacy analyses. Patients who proceed to HSCT will not be censored at the time of transplantation."
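The SAP language above can be sketched as a derivation rule for the OS analysis variables. This is a minimal illustration, not a validated derivation: the `Patient` fields and function names are hypothetical, time is counted in days from randomization, and the censor-at-HSCT variant is included only as a contrast (for example, as a possible sensitivity analysis), never as the primary analysis.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Patient:
    # Hypothetical field names, for illustration only.
    randomization: date
    last_known_alive: date
    death: Optional[date] = None
    hsct: Optional[date] = None


def os_primary(p: Patient, data_cutoff: date) -> Tuple[int, bool]:
    """Treatment policy: HSCT is ignored and follow-up continues through transplant.

    Returns (days from randomization, event flag).
    """
    if p.death is not None and p.death <= data_cutoff:
        return (p.death - p.randomization).days, True
    censor_date = min(p.last_known_alive, data_cutoff)
    return (censor_date - p.randomization).days, False


def os_censored_at_hsct(p: Patient, data_cutoff: date) -> Tuple[int, bool]:
    """Contrast only: follow-up is cut at the transplant date."""
    transplanted = p.hsct is not None and p.hsct <= data_cutoff
    if transplanted and (p.death is None or p.hsct < p.death):
        return (p.hsct - p.randomization).days, False
    return os_primary(p, data_cutoff)


# A patient transplanted in remission who later dies of any cause.
pt = Patient(
    randomization=date(2023, 1, 10),
    last_known_alive=date(2023, 9, 15),
    hsct=date(2023, 5, 1),
    death=date(2023, 9, 15),
)
cutoff = date(2024, 1, 1)
primary = os_primary(pt, cutoff)               # (248, True): death counts as an OS event
sensitivity = os_censored_at_hsct(pt, cutoff)  # (111, False): the death is lost to censoring
```

The same patient is a death at day 248 under treatment policy and a censored observation at day 111 when follow-up is cut at transplant, which is exactly the information loss the FDA position is meant to prevent.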
EFS Definition in AML
EFS (event-free survival) is a composite endpoint in AML. The composite variable strategy incorporates multiple failure modes as equal-weight events:
Standard EFS event list:
- Failure to achieve CR or CRi by Day 60 (or by end of induction)
- Relapse from CR/CRi
- Death from any cause
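Statistical consequence: The control-arm EFS rate at 24 months must be estimated for sample size. In relapsed/refractory AML, 24-month EFS rates of 5–15% are typical, so most patients contribute an event; the number of events required is still determined by the target hazard ratio, not by the low landmark rate.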
FDA 2022 position on EFS: Acceptable as co-primary with OS in randomized AML trials. EFS alone may support accelerated approval in specific settings.
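The standard EFS event list can be sketched as a single derivation function. This is an illustration under stated assumptions: the function and argument names are hypothetical, the induction window is fixed at Day 60, and induction failure is dated to Day 1, which is a common convention but is not specified in the text above and should be replaced by whatever the protocol defines.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

INDUCTION_WINDOW_DAYS = 60  # "by Day 60" in the event list above


def derive_efs(
    randomization: date,
    data_cutoff: date,
    last_assessment: date,
    cr_cri: Optional[date] = None,
    relapse: Optional[date] = None,
    death: Optional[date] = None,
) -> Tuple[int, bool, str]:
    """Return (days from randomization, event flag, reason)."""
    window_end = randomization + timedelta(days=INDUCTION_WINDOW_DAYS)
    responded = cr_cri is not None and cr_cri <= window_end

    if not responded:
        # Death inside the induction window is dated to the date of death.
        if death is not None and death <= min(window_end, data_cutoff):
            return (death - randomization).days, True, "death"
        if data_cutoff >= window_end:
            # ASSUMPTION: induction failure is dated to Day 1.
            return 1, True, "induction_failure"
        # Induction window still open at data cutoff: no event yet.
        censor_date = min(last_assessment, data_cutoff)
        return (censor_date - randomization).days, False, "censored"

    # Responder: the earliest of relapse or death from any cause is the event.
    candidates = [
        (d, label)
        for d, label in ((relapse, "relapse"), (death, "death"))
        if d is not None and d <= data_cutoff
    ]
    if candidates:
        first_date, label = min(candidates)
        return (first_date - randomization).days, True, label
    censor_date = min(last_assessment, data_cutoff)
    return (censor_date - randomization).days, False, "censored"


rand, cutoff = date(2023, 1, 1), date(2024, 1, 1)
relapsed = derive_efs(rand, cutoff, date(2023, 12, 1), cr_cri=date(2023, 2, 10),
                      relapse=date(2023, 8, 1), death=date(2023, 10, 1))
refractory = derive_efs(rand, cutoff, date(2023, 12, 1))
early_death = derive_efs(rand, cutoff, date(2023, 1, 15), death=date(2023, 1, 20))
```

Because all three failure modes are equal-weight events, the function returns only the earliest qualifying one; the `reason` label is kept so that the contribution of each component can be tabulated.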
Molecular Subgroup Stratification
All modern AML Phase 3 trials must pre-specify molecular eligibility and stratification:
- FLT3-ITD/TKD status (allelic ratio matters for some trials)
- IDH1/2 mutation status
- NPM1 mutation, cytogenetic risk (ELN 2022: favorable/intermediate/adverse)
- Age stratification (<60 vs. ≥60 years, or <65 vs. ≥65 years)
- Secondary AML (prior MDS/MPN/therapy-related) vs. de novo
Companion diagnostic requirements: FLT3 inhibitor trials require validated CDx (midostaurin: LeukoStrat CDx FLT3 Mutation Assay); IDH inhibitor trials require IDH1/2 CDx.
Response Assessment Schedule (AML-Specific)
- Bone marrow biopsy: Day 14–16 (optional hypoplasia check), Day 21–28 (end of induction Cycle 1), Day 42 (Cycle 2 if needed), then every 1–3 cycles during treatment
- Blood counts: Daily or every 2 days during induction; weekly during consolidation
- MRD assessment (if applicable): At CR achievement, after consolidation, at suspected relapse
- Central morphology review: Required for regulatory submissions; local assessment for treatment decisions
Sample Size Considerations
- OS-primary Phase 3: Event-driven design. In 1L AML, 2-year OS rates: SOC ~40–50%; experimental ~55–65%. Required events: 300–450 for 80% power at HR 0.75 (about 380 at two-sided α = 0.05 with 1:1 randomization).
- CR/CRi rate Phase 2: Historical control rate ~15–25% in 2L AML; target rate ≥35–40% for meaningful improvement. Single-arm: 50–80 patients for 80% power (one-sample proportion test).
- EFS co-primary: Must power independently for both OS events and EFS events; sample size driven by whichever requires more events.
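The figures in this list can be reproduced approximately with two textbook formulas: the Schoenfeld approximation for log-rank events, and the normal approximation for a one-sample proportion test. The sketch below is a rough planning aid, not a substitute for the trial statistician's calculation. The function names are hypothetical, the defaults (two-sided α = 0.05 for OS, one-sided α = 0.025 for the single-arm test, 1:1 allocation) are assumptions, and exact binomial or two-stage designs will give somewhat different single-arm sizes.

```python
import math
from statistics import NormalDist

_z = NormalDist().inv_cdf  # standard normal quantile function


def schoenfeld_events(hr: float, alpha: float = 0.05, power: float = 0.80,
                      alloc: float = 0.5) -> int:
    """Events needed for a two-sided log-rank test (Schoenfeld approximation)."""
    z_a = _z(1 - alpha / 2)
    z_b = _z(power)
    return math.ceil((z_a + z_b) ** 2 / (alloc * (1 - alloc) * math.log(hr) ** 2))


def implied_hr(surv_control: float, surv_experimental: float) -> float:
    """HR implied by two landmark survival rates under proportional hazards."""
    return math.log(surv_experimental) / math.log(surv_control)


def single_arm_n(p0: float, p1: float, alpha: float = 0.025,
                 power: float = 0.80) -> int:
    """Patients for a one-sided one-sample proportion test (normal approximation)."""
    z_a = _z(1 - alpha)
    z_b = _z(power)
    numerator = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p1 - p0)) ** 2)


os_events = schoenfeld_events(0.75)          # 380
hr_from_rates = implied_hr(0.45, 0.60)       # about 0.64
n_wide_margin = single_arm_n(0.20, 0.40)     # 36
n_narrow_margin = single_arm_n(0.25, 0.40)   # 71
```

Two points follow from the numbers. First, 2-year OS of 45% versus 60% corresponds to a hazard ratio near 0.64 under proportional hazards, so powering at HR 0.75 is more conservative than those landmark rates imply. Second, the single-arm size is very sensitive to the assumed historical rate: moving it from 20% to 25% against a 40% target roughly doubles the required number of patients.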
Intercurrent Events
Death Without Documented Remission Assessment
In intensively treated AML patients, early deaths (Day 1–28) may occur before response assessment. This is not a missing data problem; it is a clinical outcome (treatment failure or induction mortality).
Strategy: Composite — early death counts as EFS failure and as CR failure (non-responder). The composite strategy ensures that early deaths are not censored or imputed.
SAP language: "Patients who die before their first post-baseline bone marrow assessment will be classified as non-responders (treatment failure). Death before response assessment is counted as an EFS event dated to the date of death."
Salvage Therapy Between Cycles
AML patients who do not achieve CR after induction may receive salvage therapy (MEC, FLAG-IDA) before discontinuing study drug:
Strategy: Treatment policy — the clinical question is "what happens when patients are assigned to this treatment strategy?" which naturally includes rescue therapy decisions. Censoring at salvage therapy would remove exactly the patients who are not responding — creating informative censoring.
Protocol-Defined Bone Marrow Assessment Windows
AML response is assessed at specific protocol-defined timepoints. Late or missing assessments affect CR rate analysis:
Missing response assessment: Missing data (not IE) — requires pre-specified missing data handling. Options:
- Non-responder imputation (NRI): most conservative and FDA-preferred for binary response endpoints
- LOCF: not appropriate for AML (response changes over time)
- Multiple imputation: requires MAR assumption justification
SAP language: "For the primary CR/CRi rate analysis, patients without a documented response assessment by Day 60 will be classified as non-responders (non-responder imputation)."
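Non-responder imputation amounts to keeping every patient in the denominator. The sketch below contrasts it with an observed-cases rate on made-up data; the function names and the encoding (`None` for a patient with no documented assessment by Day 60, including deaths before the first marrow) are assumptions for illustration.

```python
from typing import List, Optional

RESPONDER_CATEGORIES = {"CR", "CRi"}


def cr_cri_rate_nri(best_responses: List[Optional[str]]) -> float:
    """CR/CRi rate with non-responder imputation: missing counts as failure."""
    if not best_responses:
        raise ValueError("no patients in the analysis set")
    responders = sum(1 for r in best_responses if r in RESPONDER_CATEGORIES)
    return responders / len(best_responses)


def cr_cri_rate_observed(best_responses: List[Optional[str]]) -> float:
    """Observed-cases rate: patients without an assessment are dropped."""
    assessed = [r for r in best_responses if r is not None]
    if not assessed:
        raise ValueError("no assessed patients")
    responders = sum(1 for r in assessed if r in RESPONDER_CATEGORIES)
    return responders / len(assessed)


# Illustrative data only: 10 patients, 3 without a documented assessment.
responses = ["CR", "CRi", "PR", None, None, "MLFS", "CR", None, "CRi", "NR"]
rate_nri = cr_cri_rate_nri(responses)            # 4 / 10 = 0.40
rate_observed = cr_cri_rate_observed(responses)  # 4 / 7, about 0.57
```

Dropping the three unassessed patients inflates the rate from 40% to about 57%, which is why the observed-cases rate is not suitable as the primary analysis.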
Regulatory Precedent
| Trial | Drug | Setting | Primary Endpoint | Result | Approval |
|---|---|---|---|---|---|
| RATIFY | Midostaurin + SOC | 1L FLT3+ | OS | HR 0.78 (4-year OS 51% vs 44%) | Regular (2017) |
| VIALE-A | Venetoclax + azacitidine | 1L intensive-ineligible | OS | HR 0.66 (median OS 14.7 vs 9.6 months) | Regular (2020) |
| VIALE-C | Venetoclax + low-dose cytarabine (LDAC) | 1L intensive-ineligible | OS | OS HR 0.75 (not statistically significant at primary analysis); CR/CRi 48% vs 13% | Supportive for regular approval (2020) |
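| QuANTUM-R | Quizartinib vs. chemo | 2L FLT3-ITD+ | OS | HR 0.76 | Not approved by FDA on this trial (complete response letter, 2019); the 2023 approval was based on QuANTUM-First in 1L |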
| CPX-351 | Liposomal daunorubicin/cytarabine | 1L AML-MRC | OS | HR 0.69 | Regular (2017) |
| Enasidenib | IDH2 inhibitor | 2L IDH2-mutant | CR/CRh rate (single-arm) | CR/CRh 23%; CR alone 19% | Regular (2017) |
| Ivosidenib | IDH1 inhibitor | 2L IDH1-mutant | CR/CRh rate (single-arm) | CR/CRh 30%; CR alone 21% | Regular (2018) |
| Gilteritinib (ADMIRAL) | FLT3 inhibitor | 2L FLT3+ | OS | HR 0.64 | Regular (2019) |
Limitations and Pitfalls
OS dilution by subsequent lines: AML patients who relapse may receive 3–4 subsequent regimens including novel targeted agents, which dilute the OS signal in ITT analysis. This is an inherent limitation of OS as a primary endpoint — the treatment policy strategy accepts this dilution as reflecting real-world conditions.
Early death confounding CR rate: High early death rates (5–15% in 1L AML) reduce observable CR rates. A drug may achieve a high CR rate among patients who survive to response assessment yet a poor overall CR rate because of early mortality. EFS captures both failure modes in one endpoint, which is an argument for EFS over CR rate alone.
MRD heterogeneity: MRD thresholds and assay sensitivities vary across trials (10⁻³, 10⁻⁴, 10⁻⁵). Without standardized MRD definitions, comparisons across trials are difficult. FDA has not yet issued final AML-specific MRD guidance (contrast with myeloma draft January 2026).
Single-arm accelerated approval fragility: Multiple AML drugs approved via single-arm CR rate have had confirmatory trials fail or not start. Under FDORA (2022), FDA may require that a confirmatory trial be underway before accelerated approval is granted, so the confirmatory design should be planned alongside the single-arm study.
Complex molecular heterogeneity: FLT3-ITD allelic ratio affects midostaurin benefit but is not uniformly reported. IDH1 R132H and other IDH1 variants may respond differently to inhibitors. Molecular eligibility criteria must be pre-specified at the gene and variant level.
Backlinks
- Oncology Endpoint Overview
- Overall Survival (OS)
- Response-Based Endpoints (ORR, CR, DOR)
- DFS and EFS Endpoints
- FDA Approval Pathways in Oncology
- ICH E9(R1) Estimand Framework
- Intercurrent Events in Oncology Trials
- Emerging Endpoints in Oncology Trials
Source: FDA Guidance for Industry: Acute Myeloid Leukemia (2022, Final); FDA Cancer Endpoints 2018 (Final)
Status: FDA AML 2022 = Final
Compiled from FDA AML 2022 + CTG AML Phase 2/3 dataset (489 trials) + estimand_framework_oncology_review.md §3.3