Propensity score


Editor-In-Chief: C. Michael Gibson, M.S., M.D.


Overview

In medical research, observational studies do not allow investigators to control treatment assignment. As a result, covariates (e.g. age, sex) may differ significantly between treatment groups, which can bias estimates of treatment effects.

For example, when comparing outcomes between patients who received a lung transplant and those who did not, the lung transplant cohort may have been older and had a lower body weight than the cohort that did not receive a transplant. Because the two cohorts are unbalanced on these two covariates, a direct comparison of outcomes mixes the effect of the transplant with the effects of age and body weight, and the estimated effect of lung transplant will be biased.

Propensity scores help to reduce this bias, to the extent that it arises from measured covariates.

Definition

In the analysis of treatment effects, suppose that we have a binary treatment T, an outcome Y, and background variables X. The propensity score is defined as the conditional probability of treatment given background variables:

<math>p(x) \ \stackrel{\mathrm{def}}{=}\ \Pr(T=1 | X=x).</math>

The propensity score was introduced by Rosenbaum and Rubin (1983) to provide an alternative method for estimating treatment effects when treatment assignment is not random, but can be assumed to be unconfounded. Let Y(0) and Y(1) denote the potential outcomes under control and treatment, respectively. Then treatment assignment is (conditionally) unconfounded if treatment is independent of potential outcomes conditional on X. This can be written compactly as

<math>T \perp Y(0), Y(1) | X\,</math>

where <math>\perp</math> denotes statistical independence.

Rosenbaum and Rubin showed that if unconfoundedness holds, then

<math>T \perp Y(0), Y(1) | p(X).</math>

While it is hard to judge directly from the definition above whether unconfoundedness holds in any specific situation, Pearl (2000) has shown that a simple graphical criterion, called the backdoor criterion, provides an equivalent definition of unconfoundedness.
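Two consequences of Rosenbaum and Rubin's result may help make it concrete. The following is only a brief sketch. The second statement assumes, in addition to unconfoundedness, that every subject has some chance of receiving either treatment (0 < p(x) < 1), an overlap condition that is not stated above. First, the propensity score is a balancing score, which holds whether or not treatment assignment is unconfounded:

<math>X \perp T | p(X),</math>

so that among subjects with the same propensity score, the distribution of the background variables is the same in the treated and control groups. Second, the average treatment effect can be written as an average of comparisons made at a fixed value of the score:

<math>E[Y(1) - Y(0)] = E_{p(X)}\big[\, E[Y | T=1, p(X)] - E[Y | T=0, p(X)] \,\big].</math>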

Application

There are three commonly used ways to incorporate propensity scores into an analysis of treatment effects: matching, stratification, and regression adjustment. In each of these methods, the propensity score is estimated in the same manner, but the way in which the score is used varies. One common way to estimate the propensity score is with a logistic regression of treatment on clinically relevant or statistically significant baseline covariates; the score is each subject's predicted probability of treatment. The advantage of using the propensity score, rather than adjusting for each covariate separately, is that it mimics one feature of a randomized comparison: subjects with the same propensity score had the same estimated probability of receiving the treatment given their measured covariates, so treated and control subjects paired on the score are expected to be balanced on those covariates.
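As a minimal sketch of the estimation step, the Stata code below simulates a small data set and fits a logistic regression of treatment on covariates. The simulated data, the variable names (age, female, bmi, lungtransplant, propensity) and all coefficients are assumptions made only for illustration; they are not taken from any real study.

```stata
* Simulated data; all names and coefficients are illustrative assumptions
clear
set seed 2024
set obs 1000
gen age = rnormal(55, 10)
gen female = runiform() < 0.5
gen bmi = rnormal(27, 4)
* Older, lighter patients are made more likely to receive the treatment
gen lungtransplant = runiform() < invlogit(-1 + 0.04*(age - 55) - 0.08*(bmi - 27))

* Propensity score: predicted probability of treatment given the covariates
logit lungtransplant age female bmi
predict double propensity, pr

* Compare the score distributions of the two groups to check their overlap
tabstat propensity, by(lungtransplant) statistics(n mean min max)
```

If the two groups share little of the propensity range, few comparable subjects exist and any of the three methods will rest on a small part of the data.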

Matching

One-to-one matching on the covariates themselves can be difficult, especially when there are numerous covariates to match on. It is easier to match on the propensity score because it is a single scalar assigned to each patient that summarizes all covariates in the model.

The first step is to match each treated subject with a control subject based on their respective propensity scores. Exact matching of scores is rarely possible, so a tolerance (caliper) within which two scores count as a match must be chosen. According to Rosenbaum and Rubin [1], a quarter of a standard deviation of the logit of the propensity score is an appropriate caliper. Once the subjects are paired, the means of the baseline covariates in the treatment and control groups are compared before and after matching. If the group means are more similar after matching than before, matching has improved covariate balance and is expected to have reduced the bias in the estimated treatment effect.
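The caliper rule can be sketched in Stata as follows. This is one simple way to do it, not necessarily the procedure used by Rosenbaum and Rubin: it performs greedy one-to-one matching without replacement, taking treated subjects in data order and pairing each with the nearest unused control within the caliper. The simulated data and the names logit_ps, caliper, match, used and matched are illustrative assumptions.

```stata
* Simulated data; all names and coefficients are illustrative assumptions
clear
set seed 2024
set obs 300
gen age = rnormal(55, 10)
gen bmi = rnormal(27, 4)
gen lungtransplant = runiform() < invlogit(-1 + 0.04*(age - 55) - 0.08*(bmi - 27))

* Logit of the propensity score (the linear predictor) and the caliper
logit lungtransplant age bmi
predict double logit_ps, xb
quietly summarize logit_ps
scalar caliper = 0.25 * r(sd)

* Greedy 1:1 matching without replacement, treated subjects in data order
gen long obs = _n
gen long match = .
gen byte used = 0
forvalues i = 1/`=_N' {
    if lungtransplant[`i'] == 1 {
        * Distance from treated subject i to every control not yet used
        quietly gen double d = abs(logit_ps - logit_ps[`i']) if lungtransplant == 0 & used == 0
        quietly summarize d, meanonly
        if r(N) > 0 & r(min) <= scalar(caliper) {
            scalar dmin = r(min)
            * Row number of the nearest control (lowest row number if tied)
            quietly summarize obs if d == scalar(dmin), meanonly
            local j = r(min)
            quietly replace match = `j' in `i'
            quietly replace used = 1 in `j'
        }
        drop d
    }
}

* Covariate means by group, before and after matching
gen byte matched = (lungtransplant == 1 & match < .) | used == 1
tabstat age bmi, by(lungtransplant)
tabstat age bmi if matched, by(lungtransplant)
```

Treated subjects with no unused control inside the caliper stay unmatched and drop out of the matched comparison, so the number of pairs should be reported alongside the balance tables.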

Stratification

Another way to use the propensity score is through stratification. Using this method, the propensity score is calculated and subjects are then divided into groups (strata) according to their scores. Rosenbaum and Rubin [2] suggest stratifying on quintiles of the propensity score because this usually eliminates over 90% of the bias due to each covariate. Means of the baseline characteristics of treated subjects and controls are compared before and after stratification. For the post-stratification comparison of means, an adjustment is made using a categorical variable representing the propensity quintiles. One way to determine an overall treatment effect is to estimate the effect of treatment on the outcome separately within each quintile and then combine the quintile-specific estimates. Another way is to model the outcome as a function of treatment and either the raw propensity score or the propensity quintile. A subset of covariates can also be included in this model.
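Both ways of obtaining an overall effect can be sketched in Stata. The example below is an illustration under stated assumptions only: the data are simulated, the outcome y is a hypothetical continuous variable with a true treatment effect of 2, and the quintile-specific estimates are combined by weighting each quintile by its share of subjects, which is one of several possible weighting schemes.

```stata
* Simulated data; all names and coefficients are illustrative assumptions
clear
set seed 2024
set obs 2000
gen age = rnormal(55, 10)
gen bmi = rnormal(27, 4)
gen lungtransplant = runiform() < invlogit(-1 + 0.04*(age - 55) - 0.08*(bmi - 27))
* Hypothetical continuous outcome with a true treatment effect of 2
gen y = 2*lungtransplant + 0.05*age - 0.3*bmi + rnormal()

* Propensity score and its quintiles
logit lungtransplant age bmi
predict double propensity, pr
xtile quintile = propensity, nquantiles(5)

* Unadjusted comparison, for reference
quietly regress y lungtransplant
scalar crude = _b[lungtransplant]

* Option 1: estimate the effect within each quintile, then combine,
* weighting each quintile by its share of subjects
scalar strat_effect = 0
forvalues q = 1/5 {
    quietly regress y lungtransplant if quintile == `q'
    scalar strat_effect = strat_effect + _b[lungtransplant] * e(N) / _N
}
display "Unadjusted estimate: " scalar(crude)
display "Stratified estimate: " scalar(strat_effect)

* Option 2: a single outcome model adjusting for the propensity quintile
regress y lungtransplant i.quintile
```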

Regression Adjustment

Regression adjustment is also a useful way to incorporate the propensity score. With this method, a regression of treatment on a large set of background covariates is performed to obtain the propensity score. Once the propensity scores are obtained, a second regression of the outcome on treatment group and propensity score is used to estimate the treatment effect. A subset of important covariates can also be included in this second model. The models with and without the subset of covariates should lead to the same conclusions. Stratification and regression adjustment can also be combined, which may produce more accurate results than either method alone.
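The two-step procedure can be sketched in Stata as follows. The data are simulated, and the outcome y is a hypothetical continuous variable with a true treatment effect of 2, so linear regression is used for the second step; with a survival outcome a Cox model would take its place. All names and coefficients are assumptions made for illustration.

```stata
* Simulated data; all names and coefficients are illustrative assumptions
clear
set seed 2024
set obs 2000
gen age = rnormal(55, 10)
gen bmi = rnormal(27, 4)
gen lungtransplant = runiform() < invlogit(-1 + 0.04*(age - 55) - 0.08*(bmi - 27))
* Hypothetical continuous outcome with a true treatment effect of 2
gen y = 2*lungtransplant + 0.05*age - 0.3*bmi + rnormal()

* Step 1: regress treatment on the covariates to obtain the propensity score
logit lungtransplant age bmi
predict double propensity, pr

* Step 2, model 1: outcome on treatment and propensity score
regress y lungtransplant propensity
scalar b_model1 = _b[lungtransplant]

* Step 2, model 2: also adjust for a subset of the covariates
regress y lungtransplant propensity age
scalar b_model2 = _b[lungtransplant]

display "Model 1 estimate: " scalar(b_model1)
display "Model 2 estimate: " scalar(b_model2)
```

Comparing the two estimates is a quick check on the agreement between the models with and without the covariate subset.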

Stata Code

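For the purposes of these examples, the data are entered as one line per subject. The code below refers to these columns as id, lungtransplant, sex, age and bmi, and also uses the outcome variables days2death and death, which are not shown here.

Subject ID   Lung transplant status   Sex      Age   BMI
1            0                        male     91    31.5
2            1                        female   45    33.7
3            0                        female   33    25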


Matching

Generating Propensity Score

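* i.sex makes xi create indicator variables for sex, which is needed if sex is stored as text
xi: logistic lungtransplant age i.sex bmi

predict propensity, pr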

  • Note: Now we divide the propensity score into ranges to match on. To develop the ranges, look at the distribution of the propensity values.

gen propensity_class=1 if propensity<0.1

replace propensity_class=2 if 0.1<=propensity & propensity<0.2

replace propensity_class=3 if 0.2<=propensity & propensity<0.3

replace propensity_class=4 if 0.3<=propensity & propensity<0.4

replace propensity_class=5 if 0.4<=propensity & propensity<0.5

replace propensity_class=6 if 0.5<=propensity & propensity<0.6

replace propensity_class=7 if 0.6<=propensity & propensity<0.7

replace propensity_class=8 if 0.7<=propensity & propensity<0.8

replace propensity_class=9 if 0.8<=propensity & propensity<0.9

replace propensity_class=10 if 0.9<=propensity & propensity<=1

drop if missing(propensity_class)

save "c:\transplant.dta", replace

use "c:\transplant.dta", clear

keep if lungtransplant==1

* Number the treated subjects within each class so that each is used in only one pair
bysort propensity_class (id): gen pair=_n

save "c:\transplant_yes.dta", replace

use "c:\transplant.dta", clear

keep if lungtransplant==0

rename id id_no

rename lungtransplant lungtransplant_no

rename age age_no

rename sex sex_no

rename bmi bmi_no

* Rename the outcome variables too, so that the merge keeps the treated subjects' values
rename days2death days2death_no

rename death death_no

bysort propensity_class (id_no): gen pair=_n

save "c:\transplant_no.dta", replace

merge 1:1 propensity_class pair using "c:\transplant_yes.dta"

tab _merge

keep if _merge==3

drop _merge

save "c:\matched_cohort.dta", replace

Now the dataset is arranged with one matched pair per line (the pair, propensity and outcome columns are omitted here):

Propensity_class   Id    Id_no   Lungtransplant   Lungtransplant_no   Age   Age_no   sex      sex_no   bmi    bmi_no
1                  100   203     1                0                   91    88       female   female   31.5   32
2                  101   215     1                0                   45    47       male     male     33.7   35
3                  102   145     1                0                   33    31       male     female   25     22.5

use "c:\matched_cohort.dta", clear

keep id lungtransplant age sex bmi days2death death

save "c:\matched_cohort_yes.dta", replace

use "c:\matched_cohort.dta", clear

keep id_no lungtransplant_no age_no sex_no bmi_no days2death_no death_no

rename id_no id

rename lungtransplant_no lungtransplant

rename age_no age

rename sex_no sex

rename bmi_no bmi

rename days2death_no days2death

rename death_no death

save "c:\matched_cohort_no.dta", replace

append using "c:\matched_cohort_yes.dta"

stset days2death, failure(death)

stcox lungtransplant

Stratification

Generating Propensity Score

* pscore is a user-written command and must be installed before use
xi: pscore lungtransplant age i.sex bmi, pscore(mypscore) blockid(myblock)

Incorporating Propensity Score Stratification in the Model

stset days2death, failure(death)

stcox lungtransplant, strata(myblock)

Regression

Generating Propensity Score

xi: logistic lungtransplant age i.sex bmi

predict propensity, pr

Incorporating Propensity Score in the Model

stset days2death, failure(death)

  • Model 1: Death predicted by lung transplant status (0/1) and propensity score

stcox lungtransplant propensity

  • Model 2: Death predicted by lung transplant status (0/1), propensity score and a set or subset of important covariates

xi: stcox lungtransplant age i.sex bmi propensity


References

  1. Rosenbaum, P. R. and Rubin, D. B. (1985). "Constructing a control group using multivariate matched sampling methods that incorporate the propensity score," The American Statistician, 39, 33-38.
  2. Rosenbaum, P. R. and Rubin, D. B. (1984). "Reducing bias in observational studies using subclassification on the propensity score," Journal of the American Statistical Association, 79, 516-524.

Additional Resources

  • D'Agostino, R. B. (2007). "Propensity Scores in Cardiovascular Research," Circulation, 115, 2340-2343.
  • D'Agostino, R. B. (1998). "Tutorial in Biostatistics: Propensity Score Methods for Bias Reduction in the Comparison of a Treatment to a Non-randomized Control Group," Statistics in Medicine, 17, 2265-2281.
  • Pearl, J. (2000). Causality: Models, Reasoning, and Inference, Cambridge University Press.
  • Rosenbaum, P. R. and Rubin, D. B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects," Biometrika, 70, 41-55.
