SLIDE 1: Moving from Observational Studies to Clinical Trials: Why Do
We Sometimes Get It Wrong?
Joseph Lau, MD, Rapporteur
SLIDE 2: Evaluating Study Outcomes: Biomarkers, Intermediate Endpoints,
and Surrogate Endpoints
- Ross Prentice, PhD
- Surrogate Endpoint Definition and Application
- Stuart Baker, ScD
- Recent Approaches to Surrogate Endpoint Validation
- David Ransohoff, MD
- New Complexity: The "Omics" Revolution
- Daniel Hayes, MD
- Methods of Biomarker Validation
SLIDE 3: Ross Prentice, PhD: Surrogate Endpoint Definition and
Application
- Surrogate outcome definition
- Conceptual framework for associations of treatment, surrogate, and
true endpoint
- Proposed meta-analysis approach of borrowing information in prior
studies of similar treatments in similar populations
SLIDE 4: Stuart Baker, ScD: Recent Approaches to Surrogate Endpoint
Validation
- Process of validating markers or endpoints
- Hypothesis testing framework
- Estimation framework
- Recommended meta-analysis estimation approach to validate surrogate
endpoint
- Real examples?
- Has this method been validated empirically?
- Other approaches? Bayesian method?
SLIDE 5: David Ransohoff, MD: New Complexity: The "Omics"
Revolution
- Promises and disappointments of cancer markers
- Rules of evidence not well developed
- Current overly optimistic interpretation of "omics" data
- Bias as threat to validity
SLIDE 6: Daniel Hayes, MD: Methods of Oncology Biomarker
Validation
- Many proposed tumor markers
- Most inadequately validated
SLIDE 7: A few comments on other sessions
- Current concepts
- How best to evaluate existing evidence?
- How to design better future studies?
SLIDE 8: Issues evaluating evidence: an EBM-er's perspective
- Evidence is seldom single sourced (basic science, animal, human
observations, human experiments)
- Observational studies vs RCTs
- Surrogates vs clinical outcomes
- Mega-trials vs meta-analyses of small trials
- Large RCTs vs large RCTs
- Methodological quality of the studies
- Publication bias
SLIDE 9: Comparisons of RCTs with NROS
- BMJ 1998; Oxman et al.
- NEJM 2000; Concato et al.
- NEJM 2000; Benson et al.
- JAMA 2001; Ioannidis et al.
SLIDE 10: Comparison of RCTs and NROS in meta-analyses
Ioannidis et al. JAMA 2001;286:821-830
- A total of 45 topics were considered.
- They were identified from comprehensive searches of MEDLINE, The
Cochrane Library, previous relevant publications, and personal archives;
approximately 3,000 meta-analyses were screened.
- The 45 topics included 408 primary studies with available binary data
(240 RCTs and 168 NROS)
- NROS included 71 prospective studies, 40 retrospective cohort
studies, 25 case-control studies, 29 studies with historical controls, and 3
studies with unclear designs
SLIDE 11: Odds ratio in non-randomized studies (chart)
SLIDE 12: Comparisons between randomized and non-randomized evidence.
(Chart)
Ioannidis J. et al. JAMA 2001;286:821-830
SLIDE 13: Comparisons between randomized and non-randomized evidence.
(Chart)
Ioannidis J. et al. JAMA 2001;286:821-830.
SLIDE 14: Heterogeneity in RCTs and in NROS
Ioannidis et al. JAMA 2001;286:821-830
- Statistically significant heterogeneity between randomized trials was
seen in 9 of 39 topics with at least 2 RCTs included
- Statistically significant heterogeneity between the non-randomized
studies was seen in 13 of 32 topics with at least 2 NROS included
- The estimated between-study heterogeneity tended to be smaller among
RCTs than among NROS (p=0.032)
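The heterogeneity comparison above can be made concrete with a minimal sketch. It computes Cochran's Q, the I² statistic, and the DerSimonian-Laird between-study variance for one group of studies. Whether Ioannidis et al. used exactly these estimators is an assumption, and the log odds ratios and variances below are hypothetical, not taken from the paper. In practice, Q is compared with a chi-square distribution on k-1 degrees of freedom to judge statistical significance.

```python
def heterogeneity(log_effects, variances):
    """Cochran's Q, I^2, and the DerSimonian-Laird between-study variance."""
    weights = [1.0 / v for v in variances]            # inverse-variance weights
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum_w
    # Q: weighted squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # truncated at zero
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"Q": q, "df": df, "tau2": tau2, "I2": i2}

# Hypothetical log odds ratios and within-study variances (illustration only)
rct = heterogeneity([-0.30, -0.25, -0.35], [0.04, 0.05, 0.04])
nros = heterogeneity([-0.90, -0.10, -0.60], [0.04, 0.05, 0.04])
print(rct)
print(nros)
```

In this made-up example the non-randomized group has the larger Q and a non-zero between-study variance, which is the pattern the slide describes.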
SLIDE 15: Comparison of the magnitude of treatment effects
Ioannidis J. et al. JAMA 2001;286:821-830.
- In 25 of 45 cases, the non-randomized studies showed a larger
treatment effect for the experimental treatment than the randomized trials. The
opposite occurred in 14 cases, but it was a data artifact in 3 of them. In 6
topics there was either no clear-cut experimental arm or the effects were
similar (p=0.009).
SLIDE 16: Discrepancies between RCTs and NROS
Ioannidis J. et al. JAMA 2001;286:821-830.
- Discrepancies beyond chance were observed in 12 of 45 cases by fixed
effects and in 7 of 45 cases by random effects
- In these discrepancies, almost always the treatment effect was more
favorable in NROS
- When limiting analyses to prospective studies, there were
disagreements in 2 of 26 topics (8%)
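One common way to judge whether a discrepancy is "beyond chance" is a z-test on the difference between the two pooled log odds ratios. The sketch below assumes that approach; the slides do not state the exact test used in the paper, and all numbers are hypothetical. It also illustrates why fewer discrepancies appear under random effects: the larger standard errors shrink the z statistic.

```python
import math

def discrepancy_z(log_or_a, se_a, log_or_b, se_b):
    """z statistic for the difference between two independent pooled log odds ratios."""
    return (log_or_a - log_or_b) / math.sqrt(se_a ** 2 + se_b ** 2)

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical pooled estimates: (log OR, SE) for RCTs and for NROS
z_fixed = discrepancy_z(-0.20, 0.08, -0.50, 0.09)    # fixed effects: smaller SEs
z_random = discrepancy_z(-0.20, 0.12, -0.50, 0.16)   # random effects: larger SEs
print(z_fixed, two_sided_p(z_fixed))
print(z_random, two_sided_p(z_random))
```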
SLIDE 17: Conclusions
Ioannidis J. et al. JAMA 2001;286:821-830.
- Treatment effects in RCTs and observational studies on the same topic
tend to be highly correlated
- Nevertheless, discrepancies do occur in about 1 out of 6 cases, even
when between-study heterogeneity is accounted for
- Typically, discrepant pairs tend to show more favorable results in
observational studies
- Discrepancies in the absolute magnitude of effect ("how much it
works") are very common
SLIDE 18: Conclusions (cont)
Ioannidis J. et al. JAMA 2001;286:821-830.
- Observational studies exhibit larger variability in their treatment
effects than RCTs
- Discrepancies are more common when retrospective observational
designs are considered
- Both RCTs and NROS must be carefully scrutinized for sources of
genuine heterogeneity and bias
- RCTs and NROS should not be seen as mutually exclusive domains of
research
SLIDE 19: Comparisons of Large RCTs with Meta-analyses of small
trials
- Villar et al. Lancet 1995
- Cappelleri et al. JAMA 1996
- LeLorier et al. NEJM 1997
SLIDE 20: Some Issues in the Comparisons of Meta-Analysis and Large
Trial
Ioannidis et al. JAMA 1998
- Definition of large (arbitrary, power)
- Source of meta-analyses (why done?)
- Source of large trials
- Types of outcomes (1°, 2°)
- Meta-analysis statistics (FEM, REM)
- Definition of agreement (p-value, corr.)
- Reasons for disagreement
SLIDE 21: Comparison of 30 meta-analyses of RCTs with largest
corresponding trial (Chart)
SLIDE 22: Meta-analyses vs. Mega-trials
Cappelleri JC, Ioannidis JPA, deFerranti SD, Schmid CH, Aubert M,
Chalmers TC, Lau J. Large trials versus meta-analyses of smaller trials: How do
their results compare? JAMA 1996; 276:1332-38.
SLIDE 23: Large trials versus meta-analysis of smaller
trials (Table)
SLIDE 24: Large trials vs meta-analysis of smaller trials: How do their
results compare?
- By random effects calculations, large and smaller trials agreed in:
- 90% of comparisons selected by the sample size approach (1,000
patients); 82% by the statistical power approach
- Twice as many disagreements appeared when the variability among large
studies and the variability among smaller studies were not considered (fixed
effects calculations).
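A minimal sketch of why fixed effects calculations produce more apparent disagreements: ignoring between-study variance gives narrower confidence intervals, so two pooled estimates overlap less often. The function below does inverse-variance pooling with the between-study variance `tau2` passed in as an assumed input (`tau2=0` gives the fixed effects result). The data and the value of `tau2` are hypothetical.

```python
import math

def pool(log_effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate with a 95% confidence interval.

    tau2 = 0 gives a fixed effects estimate; tau2 > 0 adds the
    between-study variance to every study, as in a random effects model.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, log_effects)) / total
    se = math.sqrt(1.0 / total)
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se

# Hypothetical log odds ratios and variances from four small trials
effects = [-0.40, -0.10, -0.55, 0.05]
variances = [0.02, 0.03, 0.02, 0.04]

fixed = pool(effects, variances)               # ignores between-study variability
random_ = pool(effects, variances, tau2=0.05)  # tau2 is an assumed value
print(fixed)
print(random_)
```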
SLIDE 25: Comparisons Between Large Trials and Meta-Analyses of Small
Trials: Cappelleri Protocol, Statistical Power Rule (61 Comparisons) (Chart)
SLIDE 26: Comparisons Between Large Trials and Meta-Analyses of Small
Trials: Cappelleri Protocol, 1000 Size Rule (79 Comparisons) (Chart)
SLIDE 27: Large Trials vs Meta-Analysis of Smaller Trials: How do their
results compare? (cont.) Cappelleri et al, JAMA 1996
- Of 15 disagreements between results of large and smaller trials using
the random effects model, plausible explanations were identified in 10
meta-analyses:
- 5 with differences in the control rate between large and smaller
trials
- 4 with specific protocol or study differences
- 1 with potential publication bias
- 2 other disagreements were not clinically important
- Tentative reasons could be identified for 2 of the remaining 3
disagreements
SLIDE 28: Large trials vs meta-analysis of smaller trials: How do their
results compare?
- Meta-analyses of smaller studies are generally comparable with
results from large studies.
- Differences can be attributed to insufficient sample sizes, control
rates, or protocols.
- These reasons are not mutually exclusive.
- Publication bias is a possibility but has never been proven to be a
factor.
- Need to explore reasons for heterogeneity.
SLIDE 29: Some characteristics of clinical trials used in the protocols
of comparison of large trials and meta-analyses of small trials (Flow chart)
SLIDE 30: Discrepancies between megatrials. Furukawa et al. J Clin
Epidem 2000;53:1193-99.
Why should large trials be the reference standard?
What do we know about the agreements among large trials on the same
problem?
SLIDE 31: Discrepancies between megatrials. Furukawa et al. J Clin
Epidem 2000;53:1193-99.
- "megatrial" defined as >1,000 patients
- 289 pairs identified in Cochrane Library
- 79/289 (27%) pairs were statistically significantly different from
each other
- 133 comparisons in LeLorier article
- 36/133 (27%) were statistically significantly different
SLIDE 32: Discrepancies between megatrials. Furukawa et al. J Clin
Epidem 2000;53:1193-99.
- Agreement among megatrials was approximately as large as that
reported between meta-analyses and megatrials
- If we were to base the recommendation for the treatment in question
on the primary outcome, 53% (Cochrane set) and 31% (LeLorier set) of the
treatment recommendations from a megatrial were not confirmed by a later
megatrial.
- On the other hand, 30% to 47% of the treatments once found
ineffective or harmful in a megatrial were shown to be beneficial by a later
megatrial.
SLIDE 33: Insights from these empirical studies
- Heterogeneity of treatment effects is common among clinical trials,
whether they are large or small; RCTs or observational studies
- Meta-analyses of small trials (dis)agree with large trials
approximately as often as large trials (dis)agree among themselves
- We need to understand the causes of heterogeneity in clinical trials
and learn how to handle it in meta-analysis
SLIDE 34: Controversy due to quality assessment: Screening mammography
RCTs
- Gøtzsche and Olsen. Lancet 2000;355:129
- A 1999 study found no decrease in breast cancer mortality in Sweden,
where screening has been recommended since 1985
- Reviewed methodological quality of mammography trials and repeated a
meta-analysis
SLIDE 35: Controversy: Screening Mammography RCTs
- 8 trials identified
- Baseline imbalances were found in 6 of 8 trials
- 2 adequately randomized trials found no effect of screening on
breast cancer mortality
- Pooled risk ratio 1.04 (95% CI 0.84 - 1.27)
- 6 inadequately randomized trials found significant effect
- Pooled risk ratio 0.75 (95% CI 0.67 - 0.83)
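The two pooled risk ratios above can be compared directly using only the figures on the slide. The sketch below recovers the standard error of each log risk ratio from its 95% CI and then applies a z-test to the difference. This is an illustrative calculation, not one reported by Gøtzsche and Olsen, and it assumes the intervals are symmetric on the log scale and that the two groups of trials are independent.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log risk ratio, recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Pooled risk ratios and 95% CIs as given on the slide
rr_adequate, se_adequate = 1.04, se_from_ci(0.84, 1.27)
rr_inadequate, se_inadequate = 0.75, se_from_ci(0.67, 0.83)

# z-test for the difference between the two pooled log risk ratios
diff = math.log(rr_adequate) - math.log(rr_inadequate)
z = diff / math.sqrt(se_adequate ** 2 + se_inadequate ** 2)
p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value
print(round(z, 2), round(p, 4))
```

Under these assumptions the z statistic comes out at about 2.75, so the gap between the adequately and inadequately randomized trials is larger than chance alone would usually produce. That does not settle which set of trials is closer to the truth.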
SLIDE 36: Relative risk of death from breast cancer in screening
versus control groups (Table)
SLIDE 37: Mammography screening trials according to methodological
quality (Table)
SLIDE 38: Definition of Poor Quality
- Based on Randomization adequacy
- Based on minor differences in mean age
- Failed to consider other explanations for difference in mean
ages
- Failed to consider other measures of quality
SLIDE 39: POLICY RESULTS
- Switzerland decided to not cover screening mammography
- NCI wavers on value of screening mammograms
- Women and doctors more confused about value of test
SLIDE 40: Picture of older man
SLIDES 41-46: Another 5 pictures
SLIDE 47: Treatment effect observed (reported) in a RCT
SLIDE 48: Estimating a single treatment effect across multiple
trials in a meta-analysis (Chart with Greek symbols and formulas)
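The formulas on this slide are not reproduced in the transcript. As a point of reference only, a standard inverse-variance formulation is sketched below; the notation is an assumption and may not match the slide. Here \hat{\theta}_i is the treatment effect observed in trial i (for example a log odds ratio), v_i is its within-trial variance, and \tau^2 is the between-trial variance, with \tau^2 = 0 giving the fixed effects model.

```latex
\hat{\theta}_i = \theta_i + \varepsilon_i, \qquad
\varepsilon_i \sim N(0, v_i), \qquad
\theta_i \sim N(\theta, \tau^2)

\hat{\theta} = \frac{\sum_{i=1}^{k} w_i \hat{\theta}_i}{\sum_{i=1}^{k} w_i}, \qquad
w_i = \frac{1}{v_i + \tau^2}, \qquad
\operatorname{Var}(\hat{\theta}) = \frac{1}{\sum_{i=1}^{k} w_i}
```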