STA 320: Design and Analysis of Causal Studies
Dr. Kari Lock Morgan and Dr. Fan Li
Department of Statistical Science
Duke University
Office Hours and Support
Dr. Kari Lock Morgan, firstname.lastname@example.org
Office Hours: M 3-4pm, W 3-4pm, F 1-3pm in Old Chem 216
Dr. Fan Li, email@example.com
Office Hours: M 10:30-11:30am, W 2-3pm in Old Chem 122
TA: Wenjing Shi, firstname.lastname@example.org
Office Hours: TBD in Old Chem 211A
Statistics Education Center
Sun, Mon, Tues, Wed, Thurs 4 – 9 pm in Old Chem 211A
Causal Inference for Statistics and Social Sciences
By Guido W. Imbens and Donald B. Rubin
Not yet published
Draft pdf available on Sakai (do not share)
Will distribute other relevant papers
Most weeks will either have a homework due or a quiz
One midterm exam: 3/5
Final project
Grading
Homework: 20%
Quizzes: 20%
Midterm: 30%
Final project: 30%
stat.duke.edu/courses/Spring14/sta320.01/
Class materials will be posted here
Textbook, relevant papers, etc. will be posted on Sakai
Causality and Potential Outcomes
What is causality???
In this class we will learn…
a formal framework for causal effects
how to design studies to estimate causal effects
how to analyze data to estimate causal effects
Rubin Causal Model
In the class, we’ll use the Rubin Causal Model framework for causal effects
Key points:
Causality is tied to an action (intervention)
Causal effect as a comparison of potential outcomes
Assignment mechanism
What is the effect of the treatment on the outcome?
If the opposite treatment had been received, how would the outcome differ?
Example:
treatment: choosing organic produce
outcome: get cancer? (yes or no)
Question: Does choosing organic produce decrease risk of cancer?
For most of this course, we will consider only two possible treatments:
Active treatment (“treatment”)
Control treatment (“control”) – often just not getting the treatment
Two treatments (including the alternative to specified treatment) must be well defined
What is the effect of studying on test scores?
Does texting while driving cause accidents?
Does exercising in the morning give you more energy during the day?
Do students learn better in smaller classes?
Did the hook-up happen because alcohol was involved?
Come up with your own!
Causality is tied to an action (or manipulation, treatment, or intervention)
“no causation without manipulation” – manipulation need not be performed, but should be theoretically possible
Treatments must be plausible as a (perhaps hypothetical) intervention
Gender? Age? Race?
Not Clearly Defined Causal Questions
Are parents more conservative than their children because they are older?
What is the causal effect of majoring in statistical science on future income?
Did she get hired because she is female?
Can you make these into well-defined causal questions?
Key question: what would have happened, under the opposite treatment?
A potential outcome is the value of the outcome variable for a given value of the treatment
Outcome variable: Y
Y(treatment): outcome under treatment
Y(control): outcome under control
Y(organic) = cancer or no cancer
Y(non-organic) = cancer or no cancer
Possibilities:
Y(organic) = no cancer, Y(non-organic) = cancer
Y(organic) = Y(non-organic) = cancer
Y(organic) = Y(non-organic) = no cancer
Y(organic) = cancer, Y(non-organic) = no cancer
Formulate the potential outcomes for your example
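The four possibilities for the organic-produce example can be enumerated mechanically. The following is a minimal sketch; the variable names `outcomes` and `possibilities` are illustrative and not part of the course materials.

```python
from itertools import product

# The two possible values of the binary outcome in the organic-produce example.
outcomes = ["cancer", "no cancer"]

# Each pair is (Y(organic), Y(non-organic)) for a single individual.
possibilities = list(product(outcomes, repeat=2))

for y_organic, y_non_organic in possibilities:
    print(f"Y(organic) = {y_organic}, Y(non-organic) = {y_non_organic}")
```

With a binary outcome and two treatments there are 2 × 2 = 4 combinations, matching the list of possibilities for this example.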
The causal effect is the comparison of the potential outcome under treatment to the potential outcome under control
For quantitative outcomes, we often take a difference:
Causal Effect = Y(treatment) – Y(control)
Possibility 1: Y(organic) = no cancer, Y(non-organic) = cancer
Causal effect for this individual: choosing organic produce prevents cancer, which he otherwise would have gotten eating non-organic
Possibility 2: Y(organic) = cancer, Y(non-organic) = cancer
Causal effect for this individual: he would get cancer regardless, so no causal effect
Y = test score
Y(study) – Y(don’t study)
Example: Y(study) = 90, Y(don’t study) = 60
Causal effect = 90 – 60 = 30. Studying causes a 30 point gain in test score for this unit.
Y = accident (y/n)
Y(texting) vs Y(not texting)
Y = energy during the day
Y(morning exercise) – Y(no morning exercise)
Y = hook-up (y/n)
Y(alcohol) vs Y(no alcohol)
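The test-score example can be written out as a small calculation. This is only a sketch of the difference Y(treatment) – Y(control) for one unit with a quantitative outcome; the function name `unit_causal_effect` is hypothetical.

```python
def unit_causal_effect(y_treatment, y_control):
    """Difference of the two potential outcomes for a single unit."""
    return y_treatment - y_control


# Potential outcomes from the studying example: Y(study) = 90, Y(don't study) = 60.
effect = unit_causal_effect(y_treatment=90, y_control=60)
print(effect)  # 30
```

Note that this calculation requires both potential outcomes for the same unit, which is a "what-if" comparison rather than something read directly off observed data.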
These are unit-level causal effects
Unit: the person, place, or thing upon which the treatment will operate, at a particular point in time
Note: the same person at different points in time = different units
The causal effect will probably not be the same for each unit
The definition of a causal effect does not depend on which treatment is observed
Causal effects, in their definition, do not relate to probability distribution for subjects “who got different treatments”, or to coefficients of models
Not a before-after comparison. Potential outcomes are a “what-if” comparison.
Potential outcomes may depend on other variables (not just the treatment)
Fundamental problem in estimating causal effects: at most one potential outcome observed for each unit
The other potential outcome lies in an unobserved counterfactual land… what would have happened, under a different treatment
For treated units: Y(treatment) = observed, Y(control) = ???
For control units: Y(treatment) = ???, Y(control) = observed
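One way to picture the fundamental problem is as a table in which every unit has exactly one filled-in potential outcome. The sketch below uses invented units and test scores purely for illustration; `None` stands in for the "???" entries, and the helper name `potential_outcomes` is hypothetical.

```python
# Hypothetical observed data: which treatment each unit received and the
# outcome that was actually observed. All values are invented.
units = [
    {"unit": "A", "treated": True, "y_observed": 90},
    {"unit": "B", "treated": False, "y_observed": 60},
    {"unit": "C", "treated": True, "y_observed": 85},
]


def potential_outcomes(unit):
    """Fill in the observed potential outcome; the counterfactual stays None."""
    if unit["treated"]:
        return {"Y(treatment)": unit["y_observed"], "Y(control)": None}
    return {"Y(treatment)": None, "Y(control)": unit["y_observed"]}


table = [potential_outcomes(u) for u in units]
for u, row in zip(units, table):
    print(u["unit"], row)
```

Because one entry in every row is missing, the unit-level difference Y(treatment) – Y(control) cannot be computed directly for any unit.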
For the definition of a causal effect: compare potential outcomes for a single unit
However, in reality, we can only see one potential outcome for each unit
For estimation of a causal effect we will need to consider multiple units, some who have been exposed to the treatment, and some to the control
Need to compare similar units, some exposed to active treatment, and some to control
Can be same units at different points in time, or different units at the same point in time
In this class we will focus primarily on comparing different (but similar) units at the same point in time
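As a rough illustration of comparing different units at the same point in time, the sketch below takes the difference in mean observed outcomes between a treated group and a control group. The difference in means is an assumption chosen here for simplicity, not a method stated in these slides, and the scores are invented. The comparison is only meaningful to the extent that the two groups are similar.

```python
from statistics import mean

# Invented observed test scores for different (but assumed similar) units.
treated_scores = [90, 85, 89]  # units that studied
control_scores = [60, 70, 65]  # units that did not study

# Simple comparison across groups: difference in mean observed outcomes.
difference_in_means = mean(treated_scores) - mean(control_scores)
print(difference_in_means)  # 23
```

Each unit contributes only its one observed potential outcome, so this is a comparison across units rather than a unit-level causal effect.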
Causality is tied to an action (treatment)
Potential outcomes represent the outcome for each unit under treatment and control
A causal effect compares the potential outcome under treatment to the potential outcome under control for each unit
In reality, only one potential outcome observed for each unit, so need multiple units to estimate causal effects