Difference-in-differences regression visualization
Describes the intuition behind difference-in-differences (DD) regressions using Gruber, Levine, and Staiger (1999) as an example
Gruber, Levine, and Staiger (1999) say their regressions control for:
- State fixed effects
- Time fixed effects
- A quadratic time trend
- Some other control variables
The first two controls are what make this DD. The details of fixed effects and regression controls are too complex to explain precisely here, but visualizing what they do may help.
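For readers who like to see an equation, a generic DD regression with state and time fixed effects is often written roughly as follows. This is only a textbook-style sketch, not the exact specification in Gruber, Levine, and Staiger (1999). All symbols are assumptions introduced here for illustration: $y_{st}$ is the birth rate in state $s$ and year $t$, $\alpha_s$ is a state fixed effect, $\gamma_t$ is a time fixed effect, $D_{st}$ equals 1 when abortion is legal in state $s$ in year $t$, and $\varepsilon_{st}$ is an error term.

```latex
% Generic two-way fixed effects DD regression (illustrative notation only)
y_{st} = \alpha_s + \gamma_t + \beta D_{st} + \varepsilon_{st}
```

In this notation, $\beta$ is the DD estimate: the change in birth rates associated with legalization after the stable state differences and the common year-to-year changes have been removed.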
Data
We start with some data on birth rates. The authors' data include birth rates for all states from 1965 through 1979, but we can simplify things by looking at just two time series of birth rates: one for the repeal states and one for the non-repeal states:
Controlling for state fixed effects
Notice that the repeal states almost always have lower birth rates than the non-repeal states. If we want to compare birth rates in the two sets of states to estimate the causal effect of abortion legalization, this stable difference is a problem, because it reflects persistent differences between the groups rather than the policy change. We therefore remove it by subtracting each group's average birth rate from its own series, as shown in the figure below. This is what we mean by “controlling for” state (group) fixed effects, and it is (approximately) what including state fixed effects in Gruber, Levine, and Staiger's regression equations does.
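The subtraction can also be sketched in a few lines of Python. The numbers below are invented purely for illustration and are not the birth rates from Gruber, Levine, and Staiger (1999). The function name `remove_group_means` is likewise hypothetical.

```python
from statistics import mean

# Hypothetical birth rates (births per 1,000 women), invented for illustration.
years = [1968, 1969, 1970, 1971]
birth_rates = {
    "repeal": [80.0, 78.0, 72.0, 68.0],
    "non_repeal": [90.0, 88.0, 86.0, 83.0],
}

def remove_group_means(series_by_group):
    """Subtract each group's own average from every observation in that group."""
    demeaned = {}
    for group, values in series_by_group.items():
        group_mean = mean(values)
        demeaned[group] = [v - group_mean for v in values]
    return demeaned

group_demeaned = remove_group_means(birth_rates)
for group, values in group_demeaned.items():
    print(group, [round(v, 2) for v in values])
```

After this step each series averages zero, so the stable gap between the groups is gone and only the movements over time remain. With these made-up numbers, the repeal series becomes 5.5, 3.5, -2.5, -6.5.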
Controlling for time fixed effects
The same idea applies to time fixed effects. Birth rates were higher in some years and lower in others. Much of this change over time is likely not due to abortion access because birth rates fall over time in both sets of states. We remove the general effect of time by subtracting off the average birth rate in each year, as in the figure below. The figure starts where the previous figure ended.
The differences left over between the repeal and non-repeal states are entirely due to factors that change over time differently for the two sets of states. One factor that changed in the repeal states at different times than in the non-repeal states is abortion legality. If there were no other large changes over time that tended to differ between the repeal and non-repeal states, then the differences at the end of the previous figure are likely due to abortion legalization.
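A minimal Python sketch of removing year averages follows. It starts from hypothetical series that are already net of each group's own mean. The numbers are invented, the helper `remove_year_means` is a hypothetical name, and treating 1970 as the first post-repeal year is an assumption made for this example. With only two groups the "average in each year" is a simple two-way average, whereas a regression on state-level data would average over states.

```python
from statistics import mean

years = [1968, 1969, 1970, 1971]
POST_START = 1970  # assumed first post-repeal year, for illustration only

# Hypothetical birth rates with each group's own mean already subtracted.
group_demeaned = {
    "repeal": [5.5, 3.5, -2.5, -6.5],
    "non_repeal": [3.25, 1.25, -0.75, -3.75],
}

def remove_year_means(series_by_group):
    """Subtract the cross-group average for each year from every group."""
    groups = list(series_by_group)
    n_years = len(series_by_group[groups[0]])
    year_means = [mean(series_by_group[g][t] for g in groups) for t in range(n_years)]
    return {
        g: [series_by_group[g][t] - year_means[t] for t in range(n_years)]
        for g in groups
    }

residuals = remove_year_means(group_demeaned)

# Gap between the groups in each year after both sets of means are removed.
gap = [r - n for r, n in zip(residuals["repeal"], residuals["non_repeal"])]

# DD estimate: average post-period gap minus average pre-period gap.
pre_gap = mean(g for y, g in zip(years, gap) if y < POST_START)
post_gap = mean(g for y, g in zip(years, gap) if y >= POST_START)
dd_estimate = post_gap - pre_gap
print("gap by year:", gap)
print("DD estimate:", dd_estimate)
```

The change in the gap between the pre-period and the post-period is the difference in differences. Computing the same quantity directly from raw group averages gives the same number, which is why the two rounds of subtraction are a useful picture of what the fixed effects do.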
Controlling for state-specific time trends
One of the main reasons for using regression DD is to control for group-specific time trends. If birth rates were already falling faster in repeal states than in non-repeal states, then we might expect them to keep falling faster after 1970 even if the repeal states had not legalized abortion then. As with the fixed effects, we can subtract off those differences in trends. The animation below modifies the original data slightly to make the trends easier to see.
The animation above controls for a linear time trend, which is just a straight line. Gruber, Levine, and Staiger control for a quadratic trend. A quadratic curve is a parabola, so this allows the trend to be curved.
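As a rough sketch of the linear case, the Python below fits a straight line to each group's pre-1970 observations and subtracts the extrapolated line from the whole series. This is a simplification chosen for illustration: a regression estimates the trends and the policy effect jointly rather than in two steps. The data, the 1970 cutoff, and the helper names `fit_line` and `remove_pre_period_trend` are all assumptions made for this example.

```python
# Hypothetical data: both groups trend downward, the repeal group faster,
# and the repeal group drops by an extra 3 from 1970 onward.
years = list(range(1966, 1974))
POST_START = 1970  # assumed first post-repeal year for this sketch
birth_rates = {
    "repeal": [90.0 - 2.0 * (y - 1966) - (3.0 if y >= POST_START else 0.0) for y in years],
    "non_repeal": [95.0 - 1.0 * (y - 1966) for y in years],
}

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return y_bar - slope * x_bar, slope

def remove_pre_period_trend(xs, ys, post_start):
    """Fit a line on observations before post_start, then subtract it everywhere."""
    pre = [(x, y) for x, y in zip(xs, ys) if x < post_start]
    intercept, slope = fit_line([x for x, _ in pre], [y for _, y in pre])
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Measure time in years since 1966 to keep the arithmetic simple.
t = [y - 1966 for y in years]
detrended = {
    group: remove_pre_period_trend(t, values, POST_START - 1966)
    for group, values in birth_rates.items()
}
for group, values in detrended.items():
    print(group, [round(v, 2) for v in values])
```

With these made-up numbers, the non-repeal series is flat at zero after detrending, while the repeal series is zero before 1970 and 3 below its trend afterward. Allowing a quadratic trend would mean fitting a curve with an additional squared time term instead of a straight line.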