Doubly Robust Estimator

DRE is a combined method from REG and IPW

In the previous two sections, we used REG and IPW to recover the mean outcome when the population received treatment (A=1) and when the population did not receive the treatment (A=0) and form ACE formulas. We assumed we specified the correct model for both approaches, which allowed us to build an unbiased model. However, the assumption does not always hold, and it is often unrealistic and incapable of defining the model that involves all relevant confounding variables, measured and unmeasured.

To address this limitation that both estimators have, it will be nice to combine the outcome regression model with a model for the exposure to estimate the causal effect of an exposure on an outcome, which forms the double robust estimator. It combines these 2 approaches such that only 1 of the 2 models needs to be correctly specified to make unbiasedness holds. We construct a combined estimator such that if we construct one model correctly (REG or IPW), our combined estimator will produce unbiased estimates of the average causal effect. More importantly, we don’t need to know which of the two models is correctly specified (REG or IPW); just that at least one is correctly specified is enough. In simple words, it means that we don’t put all of our eggs in one basket with one approach, rather we reduce our risk of getting a biased model by having two shots at getting the answer correct.

Properties of DRE

Because of its hybrid feature, DRE has better flexibility and robustness to model misspecification, which allows it to be less biased and useful in situations with uncertainty about the adequacy of the adjustment for confounding. Since this method has been applied in various fields to estimate the average causal effects of interventions, policies, and programs that are closely related to our lives. Therefore, understanding the principles of double robustness estimation and applying this method to real-world data is critical for drawing causal inferences from observational data.

In the following section, we are going to use this DRE model that contains the binary treatment variable A and the continuous outcome variable Y to prove the unbiasedness in different conditions.

Below are equations showing how DRE estimates the average causal effect.

\[ \begin{gathered} \widehat{D R E}_{a=1}=\frac{1}{n} \sum_{i=1}^{n}\left(\frac{A_{i}\left(Y_{i}-\widehat{Y}_{i}^{a=1}\right)}{\widehat{P S}_{i}}+\widehat{Y}_{i}^{a=1}\right) \\ \widehat{D R E}_{a=0}=\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\left(1-A_{i}\right)\left(Y_{i}-\widehat{Y}_{i}^{a=0}\right)}{1-\widehat{P S}_{i}}+\widehat{Y}_{i}^{a=0}\right) \\ \widehat{D R E}_{a=1}-\widehat{D R E}_{a=0}=\frac{1}{n} \sum_{i=1}^{n}\left(\frac{A_{i}\left(Y_{i}-\widehat{Y}_{i}^{a=1}\right)}{\widehat{P S}_{i}}+\widehat{Y}_{i}^{a=1}\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\left(1-A_{i}\right)\left(Y_{i}-\widehat{Y}_{i}^{a=0}\right)}{1-\widehat{P S}_{i}}+\widehat{Y}_{i}^{a=0}\right) \end{gathered} \]

The variables in the DRE above are as follows:

Under correct specification of either the Outcome Regression (REG) or Inverse Probability Weighting Model (IPW), the following equalities hold:

\[ \begin{aligned} E\left[\widehat{D R E}_{a=1}\right] & =E\left[Y^{a=1}\right] \\ E\left[\widehat{D R E}_{a=0}\right] & =E\left[Y^{a=0}\right] \\ E\left[\widehat{D R E}_{a=1}-\widehat{D R E}_{a=0}\right] & =E\left[Y^{a=1}\right]-E\left[Y^{a=0}\right] \end{aligned} \]

Proof of DRE’s unbiasedness

We have talked about how DRE is robust and unbiased when either REG and IPW model is incorrect. Let’s take it in more depth and see how this works! We will divide this into two scenarios: 1. when IPW is incorrect 2. when REG is incorrect.

When IPW is misspecified, but REG is correct

The first situation is that when we have IPW model misspecified:

\[ \frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=1\right]}{\widehat{P S}_{i}}\right) \neq E\left[Y^{a=1}\right]\\ \frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=0\right]}{1-\widehat{P S}_{i}}\right) \neq E\left[Y^{a=0}\right] \]

But our REG model is correct:

\[ \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_{i}^{a}=E\left[Y^{a}\right] \]

Then the true ACE will become:

\[ \scriptsize{\begin{aligned} E\left[\widehat{\left[D R E_{a=1}\right.}-\widehat{\left. D R E_{a=0}\right]}\right] &= E\left[\frac{1}{n} \sum_{i=1}^{n}\left(\frac{A_{i}\left(Y_{i}-\hat{Y}_{i}^{a=1}\right)}{\widehat{P S_{i}}}+\hat{Y}_{i}^{a=1}\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\left(1-A_{i}\right)\left(Y_{i}-\hat{Y}_{i}^{a=0}\right)}{1-{\widehat PS_{i}}}+\hat{Y}_{i}^{a=0}\right)\right] \\ & =\frac{1}{n} E\left[\sum_{i=1}^{n}\left(\frac{A_{i}\left(Y_{i}-\hat Y_{i}^{a=1}\right)}{\widehat{PS_{i}}}+\hat Y_{i}^{a=1}\right)\right]-\frac{1}{n} E\left[\sum_{i=1}^{n}\left(\frac{\left(1-A_{i}\right)\left(Y_{i}-\hat Y_{i}^{a=0}\right)}{1-{\widehat {PS}_{i}}}+\hat{Y}^{a=0}\right)\right] \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(E\left[\frac{A_{i}\left(Y_{i}-\hat Y_{i}^{a=1}\right)}{\widehat {PS_{i}}}\right]+E\left[\hat Y_{i}^{a=1}\right]\right)-\frac{1}{n} \sum_{i=1}^{n}\left(E\left[\frac{\left(1-A_{i}\right)\left(Y_{i}-\hat Y_{i}^{a=0}\right)}{1-{\widehat P S_{i}}}\right]+E\left[\hat Y_{i}^{a=0}\right]\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{\widehat {PS_{i}}}\right) E\left[\left [Y_{i}|A_{i}=1\right)-E\left[\hat Y_{i}^{a=1}|A=1\right]+\hat{Y}_{i}^{a=1}\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{1-{\widehat {PS_{i}}}}\right) E\left[Y_i|A_{i}=0]-E[\hat Y_{i}^{a=0}|A=0\right]+\hat{Y}_{i}^{a=0}\right)\right. \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{\widehat{PS}_{i}}\right)\left(\hat{Y}_{i}^{a=1}-\hat{Y}_{i}^{a=1}\right)+\hat{Y}_{i}^{a=1}\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{1-\widehat{PS}_{i}}\right)\left(\hat{Y}_{i}^{a=0}-\hat{Y}_{i}^{a=0}\right)+\hat{Y}_{i}^{a=0}\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{\widehat{PS}_{i}}\right)\left(0\right)+\hat{Y}_{i}^{a=1}\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{1-\widehat{PS}_{i}}\right)\left(0\right)+\hat{Y}_{i}^{a=0}\right) \\ & =\frac{\sum_{i=1}^{n} \hat{Y}_{i}^{a=1}}{n}-\frac{\sum_{i=1}^{n} \hat{Y}_{i}^{a=0}}{n} \\ & =E\left[Y^{\alpha=1}\right]-E\left[Y^{\alpha=0}\right] \end{aligned}} \]

When REG is misspecified, but IPW is correct

The second situation is that when we have REG model misspecified: \[ \frac{1}{n} \sum_{i=1}^{n} \hat{Y}_{i}^{a} \neq E\left[Y^{a}\right] \] But our IPW model is correct: \[ \begin{aligned} & \frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=1\right]}{\widehat{P S_i}}\right)=E\left[Y^{a=1}\right] \\ & \frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=0\right]}{1-\widehat{P S}_{i}}\right)=E\left[Y^{a=0}\right] \\ \end{aligned} \] First, let’s rearrange DRE’s formula a bit:

\[ \scriptsize{ \begin{aligned} \widehat{D R E}_{a=1}-\widehat{D R E}_{a=0}&=\frac{1}{n} \sum_{i=1}^n\left(\frac{A_i\left(Y_i-\hat{Y}_i^{a=1}\right)}{\widehat{P S}_i}+\hat{Y}_i^{a=1}\right)-\frac{1}{n} \sum_{i=1}^n\left(\frac{\left(1-A_i\right)\left(Y_i-\hat{Y}_i^{a=0}\right)}{1-\widehat{P S}_i}+\hat{Y}_i^{a=0}\right) \\ & =\frac{1}{n} \sum_{i=1}^n\left(\frac{A_i Y_i}{\widehat{P S}_i}-\frac{A_i \hat{Y}_i^{a=1}}{\widehat{P S}_i}+\hat{Y}_i^{a=1}\right)-\frac{1}{n} \sum_{i=1}^n\left(\frac{\left(1-A_i\right) Y_i}{1-\widehat{P S}_i}-\left(\frac{\left(1-A_i\right) \hat{Y}_i^{a=0}}{1-\widehat{P S}_i}\right)+\hat{Y}_i^{a=0}\right) \\ & =\frac{1}{n} \sum_{i=1}^n\left(\frac{A_i Y_i}{\widehat{P S_i}}-\left(\frac{A_i \widehat{Y}_i^{a=1}}{\widehat{P S_i}}-\widehat{Y}_i^{a=1}\right)\right)-\frac{1}{n} \sum_{i=1}^n\left(\frac{\left(1-A_i\right) Y_i}{1-\widehat{P S_i}}-\left(\frac{\left(1-A_i\right) \hat{Y}_i^{a=0}}{1-\widehat{P S_i}}-\widehat{Y}_i^{a=0}\right)\right) \\ & =\frac{1}{n} \sum_{i=1}^n\left(\frac{A_i Y_i}{\widehat{P S}_i}-\hat{Y}_i^{a=1}\left(\frac{A_i}{\widehat{P S}_i}-1\right)\right)-\frac{1}{n} \sum_{i=1}^n\left(\frac{\left(1-A_i\right) Y_i}{1-\widehat{P S}_i}-\hat{Y}_i^{a=0}\left(\frac{\left(1-A_i\right)}{1-\widehat{P S_i}}-1\right)\right) \\ & =\frac{1}{n} \sum_{i=1}^n\left(\frac{A_i Y_i}{\widehat{P S}_i}-\hat{Y}_i^{a=1}\left(\frac{A_i-\widehat{P S}_i}{\widehat{P S}_i}\right)\right)-\frac{1}{n} \sum_{i=1}^n\left(\frac{\left(1-A_i\right) Y_i}{1-\widehat{P S}_i}-\hat{Y}_i^{a=0}\left(\frac{\left(1-A_i\right)-\left(1-\widehat{P S}_i\right)}{1-\widehat{P S}_i}\right)\right) \\ & \end{aligned}} \]

Then the true ACE becomes: \[ \scriptsize{\begin{aligned} E\left[\widehat{D R E_{a=1}}-\widehat{D R E_{a=0}}\right]&=E\left[\frac{1}{n} \sum_{i=1}^{n}\left(\frac{A_{i} Y_{i}}{\widehat{P S_{i}}}-\hat{Y}_{i}^{a=1}\left(\frac{A_{i}-\widehat{P S}_{i}}{\widehat{P S}_{i}}\right)\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\frac{\left(1-A_{i}\right) Y_{i}}{1-\widehat{P S_{i}}}-\hat{Y}_{i}^{a=0}\left(\frac{\left(1-A_{i}\right)-\left(1-\widehat{P S_{i}}\right)}{1-\widehat{P S} _{i}}\right)\right)\right] \\ & =\frac{1}{n} E\left[\sum_{i=1}^{n}\left(\frac{A_{i} Y_{i}}{\widehat{P S_{i}}}-\hat{Y}_{i}^{a=1}\left(\frac{A_{i}-\widehat{P S_{i}}}{\widehat{P S_{i}}}\right)\right)\right]-\frac{1}{n} E\left[\sum_{i=1}^{n}\left(\frac{\left(1-A_{i}\right) Y_{i}}{1-\widehat{P S}_{i}}-\hat{Y}_{i}^{a=0}\left(\frac{\left(1-A_{i}\right)-\left(1-\widehat{P S_{i}}\right)}{1-\widehat{P S_{i}}}\right)\right)\right] \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(E\left[\frac{A_{i} Y_{i}}{\widehat{P S}_{i}}\right]-E\left[\hat{Y}_{i}^{a=1}\left(\frac{A_{i}-\widehat{P S}_{i}}{\widehat{P S} _{i}}\right)\right]\right)-\frac{1}{n} \sum_{i=1}^{n}\left(E\left[\frac{\left(1-A_{i}\right) Y_{i}}{1-\widehat{P S} _{i}}\right]-E\left[\hat{Y}_{i}^{a=0}\left(\frac{\left(1-A_{i}\right)-\left(1-\widehat{P S_{i}}\right)}{1-\widehat{P S_{i}}}\right)\right]\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{\widehat{P S}_{i}}\right) E\left[Y_{i}|A=1\right]-\left(\frac{\hat{Y}_{i}^{a=1}}{\widehat{P S_{i}}}\right) E\left[A_{i}-\widehat{P S}_{i}\right]\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{1-\widehat{P S_{i}}}\right) E\left[Y_{i}|A=0\right]-\left(\frac{\hat{Y}_{i}^{a=0}}{1-\widehat{P S_{i}}}\right) E\left[\left(1-A_{i}\right)-\left(1-\widehat{P S_{i}}\right)\right]\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{\widehat{P S}_{i}}\right) E\left[Y_{i}|A=1\right]-\left(\frac{\hat{Y}_{i}^{a=1}}{\widehat{P S}_{i}}\right)\left(E\left[A_{i}\right]-\widehat{P S_{i}}\right)\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{1-\widehat{P S}_{i}}\right) E\left[Y_{i}|A=0\right]-\left(\frac{\hat{Y}_{i}^{a=0}}{1-\widehat{P S}_{i}}\right)\left(E\left[\left(1-A_{i}\right)\right]-\left(1-\widehat{P S_{i}}\right)\right)\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{\widehat{P S}_{i}}\right) E\left[Y_{i} \mid A=1\right]-\left(\frac{\widehat{Y}_{i}^{a=1}}{\widehat{P S}_{i}}\right)\left(\widehat{P S}_{i}-\widehat{P S}_{i}\right)\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\left(\frac{1}{1-\widehat{P S}_{i}}\right) E\left[Y_{i} \mid A=0\right]-\left(\frac{\hat{Y}_{i}^{a=0}}{1-\widehat{P S}_{i}}\right)\left(\left(1-\widehat{P S}_{i}\right)-\left(1-\widehat{P S_{i}}\right)\right)\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=1\right]}{\widehat{P S}_{i}}-\left(\frac{\hat{Y}_{i}^{a=1}}{\widehat{P S}_{i}}\right)(0)\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=0\right]}{1-\widehat{P S}_{i}}-\left(\frac{\hat{Y}_{i}^{a=0}}{1-\widehat{P S}_{i}}\right)(0)\right) \\ & =\frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=1\right]}{\widehat{P S}_{i}}\right)-\frac{1}{n} \sum_{i=1}^{n}\left(\frac{E\left[Y_{i} \mid A=0\right]}{1-\widehat{P S}_{i}}\right) \\ & =E\left[Y^{a=1}\right]-E\left[Y^{a=0}\right] \end{aligned}} \]

In either case, we prove that DRE’s unbiasedness in estimating the true average causal effects holds even though either of IPW/REG has been wrongly specified (but the other has to be correctly modeled). But it fails to work if both inverse-probability weighting model (IPW) and outcome regression model (REG) are incorrect.