1
How likely is a positive finding to be wrong?


  • Sholom Wacholder, DCEG, NCI
  • January, 2005, Bethesda
2
Introduction
  • Dr. Zerhouni’s Analogy:
    • M & M conference and this meeting
    • How can we change practice to do better in the future?


  • Dr. Ransohoff: Should rules of evidence be changed?


  • I will not talk about
    • Asking the right question
      • Determination and assessment
        • Exposure
        • Endpoint
    • Getting the timing right
    • Bias reduction
      • Design
      • Fieldwork
      • Analysis
  • Instead, I want to question a venerable convention





3
Standard practice
  • P-values, confidence intervals
    • Do not convey the level of uncertainty about the hypothesis when the statistical test is “significant”
    • Do not provide a sense of the chance that the significant finding is “wrong”
  • Is 5% the best criterion for a significant two-sided p-value?
  • Do all CIs need to be 95% CIs?
4
Decision making
  • Clinic
    • Do I screen?
    • Screening modality
    • Decision rule on the basis of results of the screen?
  • Statistics
    • Should I launch a study?
    • Study design
      • Sample size
    • How do I act on the basis of results of a study?
  • Parallelism
    • Browner & Newman, JAMA, 1987
5
Statistical decisions
  • Accept or reject null hypothesis?
  • Stop a randomized trial to protect participants from excess of serious adverse events?
  • Recommend a change in behavior to reduce risk
  • Act as if a hypothesis is no longer viable
    • Based on accumulated negative evidence
6
Basis of Statistical Decisions
  • Loss from wrong decisions
    • Two kinds of loss
      • False positive decision
      • False negative decision
      • Just like PPV and NPV
        • Positive and negative predictive value
    • Expected loss depends on
      • Likelihood of each type of wrong decision
        • Relative magnitude is enough
        • Depends on context
      • Probability the hypothesis is true
        • Unknowable ... but
7
Standard statistical decision making
  • α=0.05 is universal
    • Standard sample size determination
      • Sample size vs power for α=0.05
    • Analysis
    • Prior probability not considered formally
    • Loss from bad decisions not considered
  • ⇒ The probability that a positive report is a false positive is not considered
8
Example of algebra of false positives for speculative HA
  • Chance alternative hypothesis HA is true = 0.1%=1/1,000=0.001
  • If HA false ⇒ 5% chance of rejection
  • If HA true ⇒ 100% chance of rejection
  • Pr( reject & HA false)=0.999*0.05 ≈ 0.050
  • Pr( reject & HA true) =0.001*1.00 = 0.001
  • FPRP = Pr( HA false | rejection)
    • ≈ 0.050/(0.001+0.050)  ≈ 98%
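
The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch, not part of the original talk; the function name `fprp` and its argument names are chosen here for illustration, and the inputs are the slide's own values (prior 0.001, power 1.00, α = 0.05).

```python
def fprp(prior, power, alpha):
    """False positive report probability: Pr(HA false | rejection)."""
    false_pos = (1.0 - prior) * alpha  # Pr(reject & HA false)
    true_pos = prior * power           # Pr(reject & HA true)
    return false_pos / (false_pos + true_pos)


# Values taken from the slide: speculative HA with prior 0.1%,
# certain rejection if HA is true, 5% rejection if HA is false.
example = fprp(prior=0.001, power=1.00, alpha=0.05)
print(f"FPRP = {example:.3f}")
```

With these inputs the result is about 0.98, matching the slide's 98%.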


9
How likely is a positive finding to be wrong?
  • Calculate FPRP (JNCI, 2004)
    • False positive report probability
    • Analogous to 1-PPV
    • Uses p-value, power and prior probability that there is an association
  • Base decision on FPRP-based test of “noteworthiness”
    • “reject” if FPRP < 0.2 (or 0.5, perhaps)
    • Or choose α so that FPRP < 0.2 if test rejected
  • Interpretation
    • Bayesian
    • Frequentist, but with study-specific α-level
      • Based on prior probability of hypothesis and power
      • Accounting for “loss” from wrong decisions
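
The option of choosing α so that FPRP stays below a limit can be sketched by solving the FPRP expression for α. This is an illustrative sketch only: the function names `fprp` and `max_alpha`, and the power of 0.8 and the list of priors used in the loop, are assumptions made here, not values from the talk. Rearranging FPRP < F gives α < [F/(1−F)] × power × [π/(1−π)].

```python
def fprp(prior, power, alpha):
    """Pr(no association | rejection), via Bayes' theorem."""
    false_pos = (1.0 - prior) * alpha
    true_pos = prior * power
    return false_pos / (false_pos + true_pos)


def max_alpha(prior, power, fprp_limit=0.2):
    """Largest alpha for which a rejection still has FPRP <= fprp_limit."""
    odds_limit = fprp_limit / (1.0 - fprp_limit)  # tolerated odds of a false positive
    prior_odds = prior / (1.0 - prior)            # prior odds of an association
    return min(1.0, odds_limit * power * prior_odds)


# Hypothetical priors and power, for illustration only.
for prior in (0.25, 0.1, 0.01, 0.001):
    a = max_alpha(prior, power=0.8)
    print(f"prior={prior:<6} alpha threshold={a:.5f}  FPRP at threshold={fprp(prior, 0.8, a):.2f}")
```

The lower the prior, the stricter the study-specific α has to be for a rejection to count as noteworthy.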





10
Essential Formula
  • FPRP: False positive report probability
  • Prior:      π = Pr( association)
  • Power: 1-β = Pr( Rejection | association )
  • Size:        α = Pr( Rejection | no association)
  • FPRP = Pr( No association | Rejection)
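
The closed-form expression does not appear in the text of this slide. The following is a reconstruction from the definitions above by Bayes' theorem, using only the slide's symbols π, α, and 1−β:

```latex
\[
\mathrm{FPRP}
  = \Pr(\text{no association} \mid \text{rejection})
  = \frac{\alpha\,(1-\pi)}{\alpha\,(1-\pi) + (1-\beta)\,\pi}
\]
```

The numerator is the probability of a rejection with no association; the denominator is the total probability of a rejection.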
11
Essential Formula II
  • Prior:      π = Pr( association )
  • Power: 1-β = Pr( Reject | association)
  • Size:        α = Pr( Reject | no association)
  • FNRP: False negative report probability
  • FNRP = Pr( Association | No rejection )
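
Taking a false negative report to mean that an association exists but the test did not reject (the analogue of 1−NPV), Bayes' theorem gives a companion expression. This is a reconstruction from the definitions above, not a formula shown in the slide text:

```latex
\[
\mathrm{FNRP}
  = \Pr(\text{association} \mid \text{no rejection})
  = \frac{\beta\,\pi}{\beta\,\pi + (1-\alpha)\,(1-\pi)}
\]
```

Here β = 1 − power is the probability of not rejecting when there is an association.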
16
Key question
  • What is optimal tradeoff between  power and protection from false positives?
    • A universal 95% CI and a universal p<0.05 threshold are equally inappropriate when prior probabilities are low
    • Bonferroni provides insidious incentive
      • Don’t explore additional hypotheses or subgroups
  • Or tradeoff between FPRP vs FNRP
    • False negative report probability
  • Tradeoff may be different for different audiences
    • Researchers
    • Public
17
Implication
  • Vary the alpha level depending on how likely X is to cause D
    • Bayes approach
    • FPRP: 4-step program
      • Wacholder et al., JNCI, 2004
      • Simple calculation from p-value
      • Spreadsheet for reader, editor
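
A spreadsheet-style calculation from a p-value might look like the sketch below. It is not the published spreadsheet: it simply treats the observed p-value as the α level and tabulates FPRP over a range of priors. The observed p-value of 0.01, the power of 0.8, and the priors are hypothetical values chosen for illustration; the 0.2 cutoff is the noteworthiness criterion mentioned earlier in the talk.

```python
def fprp(prior, power, alpha):
    """Pr(no association | rejection), via Bayes' theorem."""
    false_pos = (1.0 - prior) * alpha
    true_pos = prior * power
    return false_pos / (false_pos + true_pos)


observed_p = 0.01     # hypothetical observed p-value, used as the alpha level
assumed_power = 0.8   # hypothetical power to detect the effect of interest
priors = (0.25, 0.1, 0.01, 0.001)  # hypothetical range of prior probabilities

table = {prior: fprp(prior, assumed_power, observed_p) for prior in priors}
for prior, value in table.items():
    verdict = "noteworthy" if value < 0.2 else "not noteworthy"
    print(f"prior={prior:<6} FPRP={value:.3f}  {verdict}")
```

The same p-value is noteworthy for a plausible hypothesis and not noteworthy for a speculative one.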
18
Other sources of false positives
  • Confounding
    • For detecting subtle effects
      • Especially confounding by indication in clinical epidemiology
        • HRT
    • P-values are misleading
    • Power reduced
  • Poor field work
    • Low response, follow-up compliance rates
    • Poor exposure assessment
      • Lowers power
        • Raises FPRP
    • Differential
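
The link between lower power and higher FPRP follows directly from the FPRP expression and can be illustrated numerically. In this sketch the prior of 0.1, the α of 0.05, and the list of power values are hypothetical; only the direction of the effect is the point.

```python
def fprp(prior, power, alpha):
    """Pr(no association | rejection), via Bayes' theorem."""
    false_pos = (1.0 - prior) * alpha
    true_pos = prior * power
    return false_pos / (false_pos + true_pos)


prior, alpha = 0.1, 0.05        # hypothetical, held fixed
powers = (0.9, 0.7, 0.5, 0.3)   # power eroding, e.g. through poor exposure assessment

values = [fprp(prior, power, alpha) for power in powers]
for power, value in zip(powers, values):
    print(f"power={power:.1f}  FPRP={value:.3f}")
```

With the prior and α held fixed, every drop in power raises the share of rejections that are false positives.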
19
Final thoughts (1)
  • Observational studies crucial in prevention and clinical epidemiology
    • To motivate trials
    • Ethics
    • Feasibility
      • Post-marketing surveillance
20
Final thoughts (2)
  • To reduce false positives in epidemiologic studies
    • Improve study design
    • Improve study practice
    • Improve statistical approaches
      • Including
        • Explicit consideration of probability that a positive report is a false positive due to random variation
  • We scientists cannot figure out how to communicate with the public until we figure out what we need to communicate to each other