|
1
|
- William R. Shadish
- University of California, Merced
|
|
3
|
- In general, I will discuss the logical, design, and statistical
tools that we use to make causal inferences, and point to problems in
each of those areas that lead us to make inferential errors.
- Most of the issues are not specific to complex or behavioral
interventions, though I will point out some problems particular to that
area.
|
|
5
|
- Causal inference is always, first and foremost, a matter of logic.
- Mill’s Canons (temporal precedence, covariation, alternatives)
- Counterfactual Models (e.g., Rubin’s model)
- This means causal inference is possible with nonrandomized designs
- Smoking and cancer in humans
- Dietary treatment of PKU (short time series)
- Campbell’s (1975) Degrees of Freedom and the Case Study
- But the conditions facilitating causal inference in case studies are
rare
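
The counterfactual logic above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the confounder `severity`, the outcome model, and the true effect of 2.0 are all invented for illustration and are not taken from any study named here. Each simulated unit has two potential outcomes, and the same units are assigned to treatment once by coin flip and once by self-selection.

```python
import random
import statistics


def simulate(n=20000, true_effect=2.0, seed=0):
    """Estimate the same effect under random assignment and under self-selection."""
    rng = random.Random(seed)
    randomized = {True: [], False: []}
    self_selected = {True: [], False: []}
    for _ in range(n):
        severity = rng.gauss(0, 1)                # hypothetical confounder
        y0 = 10 + 3 * severity + rng.gauss(0, 1)  # potential outcome without treatment
        y1 = y0 + true_effect                     # potential outcome with treatment

        # Random assignment: a coin flip unrelated to severity.
        coin = rng.random() < 0.5
        randomized[coin].append(y1 if coin else y0)

        # Self-selection: units with higher severity are more likely to be treated.
        chose = severity + rng.gauss(0, 1) > 0
        self_selected[chose].append(y1 if chose else y0)

    def mean_difference(groups):
        return statistics.mean(groups[True]) - statistics.mean(groups[False])

    return mean_difference(randomized), mean_difference(self_selected)


randomized_estimate, selected_estimate = simulate()
print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"self-selected estimate: {selected_estimate:.2f}")
```

Only one potential outcome per unit is ever observed. In this sketch the randomized comparison should land near the true effect, while the self-selected comparison mixes the effect with the severity difference between groups.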
|
|
6
|
- Statistical Conclusion Validity (did the treatment covary with the
outcome?)
- Internal Validity (did the treatment affect the outcome?)
- Construct Validity (what labels or constructs best represent what we
did?)
- External Validity (to what does the effect generalize?)
|
|
7
|
- Threats to Internal Validity
- Ambiguous Causal Precedence
- History
- Maturation
- Attrition
- Selection
- Regression
- Testing
- Instrumentation
- The idea is to rule each of these out as alternative explanations for
the observed effect
- Similar lists of threats to the other validities
- Similar lists in other literatures, such as Sackett (1979) for
case-control studies
|
|
8
|
- Many scientists are surprisingly unaware of basic causal logic
- Our ability to enumerate specific contextual threats to validity is
limited.
- Need research programs to identify threats
- Need ways to quantify qualitative threats to validity
- Scientists are just people, and people are not good at making causal
inferences based on logic alone.
|
|
9
|
- Randomized experiments (lots of examples of their feasibility with
behavioral interventions, though some special problems do arise)
- Strong quasi-experiments
- Interrupted time series (on groups or individuals)
- Regression discontinuity (cutoff assignment)
- Weaker quasi-experiments
- Nonequivalent comparison group designs
- Observational Studies
- Meta-Analysis
- Of randomized experiments
- Of mixed methods work
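
As one illustration of the stronger quasi-experiments listed above, here is a minimal sketch of a regression discontinuity estimate. It assumes a sharp cutoff, a linear relation between assignment score and outcome on each side, and simulated data. The cutoff of 50, the jump of 5.0, and all function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_estimate(scores, outcomes, cutoff):
    """Jump in the fitted outcome at the cutoff (right-side fit minus left-side fit)."""
    # Center scores at the cutoff so each intercept is the fitted value at the cutoff.
    left = [(s - cutoff, y) for s, y in zip(scores, outcomes) if s < cutoff]
    right = [(s - cutoff, y) for s, y in zip(scores, outcomes) if s >= cutoff]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


def simulate_cutoff_data(n=2000, cutoff=50.0, jump=5.0, seed=0):
    """Simulated data: units scoring at or above the cutoff receive the treatment."""
    rng = random.Random(seed)
    scores, outcomes = [], []
    for _ in range(n):
        score = rng.uniform(0, 100)
        treated = score >= cutoff
        outcome = 20 + 0.3 * score + (jump if treated else 0.0) + rng.gauss(0, 2)
        scores.append(score)
        outcomes.append(outcome)
    return scores, outcomes


scores, outcomes = simulate_cutoff_data()
estimated_jump = rd_estimate(scores, outcomes, cutoff=50.0)
print(f"estimated discontinuity at the cutoff: {estimated_jump:.2f}")
```

The design works because assignment depends only on the observed score, so the selection process is fully known. A real analysis would also need to check the functional form on each side of the cutoff, which this sketch simply assumes is linear.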
|
|
10
|
- Specious arguments against randomization
- Occasional premature randomized experiments
- Misplaced faith in one large randomized trial over many small randomized
trials. Both are important, but:
- Many small trials allow exploration of generalization
- Under-use of stronger quasi-experiments
- Failure to take full advantage of meta-analysis to explore
generalization of effects
- Response surface modeling using multimethod databases
- Routine exploration of generalization over moderators
- Misplaced faith in single meta-analyses
- They are only as good as the data going into them
- E.g., most negative effects for vitamin E occurred in studies using
both vitamin E and beta-carotene
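
The point about exploring moderators rather than trusting a single pooled answer can be sketched with inverse-variance pooling run overall and then within moderator levels. The effect sizes, variances, and moderator labels below are invented purely for illustration and are not taken from any vitamin E trial. A fixed-effect model is assumed for simplicity.

```python
import math


def pool_fixed_effect(effects, variances):
    """Inverse-variance pooled estimate, its standard error, and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    std_error = math.sqrt(1.0 / total)
    q_stat = sum(w * (e - estimate) ** 2 for w, e in zip(weights, effects))
    return estimate, std_error, q_stat


def pool_by_moderator(trials):
    """Pool separately within each level of a categorical moderator."""
    groups = {}
    for effect, variance, level in trials:
        effects, variances = groups.setdefault(level, ([], []))
        effects.append(effect)
        variances.append(variance)
    return {level: pool_fixed_effect(e, v) for level, (e, v) in groups.items()}


# Hypothetical trials: (effect size, sampling variance, moderator level).
trials = [
    (0.30, 0.04, "single agent"),
    (0.25, 0.05, "single agent"),
    (0.35, 0.06, "single agent"),
    (-0.20, 0.05, "combined agents"),
    (-0.10, 0.04, "combined agents"),
]

overall = pool_fixed_effect([t[0] for t in trials], [t[1] for t in trials])
by_level = pool_by_moderator(trials)

print(f"overall: estimate={overall[0]:.3f}, Q={overall[2]:.2f}")
for level, (estimate, std_error, _) in by_level.items():
    print(f"{level}: estimate={estimate:.3f}, SE={std_error:.3f}")
```

In this made-up example the overall estimate hides two subgroups with effects of opposite sign, which is the kind of pattern a single summary number cannot show.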
|
|
11
|
- In Randomized Experiments
- Attrition Analyses
- Incomplete treatment implementation
- Analysis of nested data
- In Quasi-Experiments
- Selection Bias Modeling
- Propensity Score and Hidden Bias Analysis
- In Meta-Analysis
- Response Surface Modeling
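
To make the propensity score idea concrete, the following sketch weights each unit by the inverse of its estimated probability of receiving the condition it actually received. It assumes a single measured, discrete covariate `x` that drives both selection and outcome. The data are simulated and every name and number is hypothetical. Hidden bias from unmeasured covariates is exactly what this adjustment cannot remove.

```python
import random


def simulate_observational(n=30000, true_effect=1.5, seed=1):
    """Simulated nonrandomized data where covariate x affects treatment and outcome."""
    rng = random.Random(seed)
    treat_prob = {0: 0.2, 1: 0.5, 2: 0.8}  # hypothetical selection process
    rows = []
    for _ in range(n):
        x = rng.choice([0, 1, 2])
        treated = rng.random() < treat_prob[x]
        y = 1 + 2 * x + (true_effect if treated else 0.0) + rng.gauss(0, 1)
        rows.append((x, treated, y))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcomes between treated and untreated units."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_estimate(rows):
    """Inverse-probability-weighted difference in means (normalized weights)."""
    # Propensity score: share of treated units within each covariate level.
    counts = {}
    for x, t, _ in rows:
        n_x, n_treated = counts.get(x, (0, 0))
        counts[x] = (n_x + 1, n_treated + (1 if t else 0))
    propensity = {x: n_treated / n_x for x, (n_x, n_treated) in counts.items()}

    w_treated = wy_treated = w_control = wy_control = 0.0
    for x, t, y in rows:
        p = propensity[x]
        if t:
            w_treated += 1.0 / p
            wy_treated += y / p
        else:
            w_control += 1.0 / (1.0 - p)
            wy_control += y / (1.0 - p)
    return wy_treated / w_treated - wy_control / w_control


rows = simulate_observational()
naive = naive_difference(rows)
weighted = ipw_estimate(rows)
print(f"naive difference: {naive:.2f}")
print(f"propensity-weighted difference: {weighted:.2f}")
```

Because the only confounder here is measured, weighting should bring the estimate close to the simulated effect. With real data that assumption is untestable, which is why hidden bias (sensitivity) analysis is paired with propensity scores on the slide.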
|
|
12
|
- Lack of awareness of recent developments for both randomized and
nonrandomized experiments.
- E.g., continued failure to take nesting into account, which inflates
Type I error rates
- Misplaced preference for single point answers rather than bracketing
- Lack of clear methods or software to do some of the analyses (e.g.,
response surface modeling in meta-analysis)
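
Why ignoring nesting produces Type I errors can be shown with the standard design effect for equal-sized clusters, DEFF = 1 + (m - 1) * ICC, where m is the cluster size and ICC is the intraclass correlation. The sketch below is a simplification that assumes equal cluster sizes. The example numbers (20 clusters of 25, ICC of 0.05) are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the nested sample is worth."""
    return n_total / design_effect(cluster_size, icc)


def corrected_standard_error(naive_se, cluster_size, icc):
    """Standard error after accounting for nesting."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))


# Hypothetical trial: 20 clusters (e.g., clinics) of 25 participants each.
n_clusters, cluster_size, icc = 20, 25, 0.05
n_total = n_clusters * cluster_size

deff = design_effect(cluster_size, icc)
n_eff = effective_sample_size(n_total, cluster_size, icc)
print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.0f} of {n_total}")
print(f"standard error multiplier: {math.sqrt(deff):.2f}")
```

Even a small intraclass correlation shrinks the effective sample size substantially, so an analysis that treats nested observations as independent reports standard errors that are too small and rejects the null too often.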
|
|
13
|
- Misplaced faith in “singletons”
- Single point estimates
- Single studies
- Single meta-analyses
- Misanalyzed data (especially nesting and attrition)
- Failure to take generalization as seriously as causation in design and
analysis
|
|
14
|
- We do want to minimize preventable problems, but:
- Experiments are about DISCOVERY of effects of treatments. We should not
be upset that the effects are not always what we anticipated.
- We rarely know when the “last word” is in (even for HRT or
beta-carotene).
|