Integrating SAS® and R to Perform Optimal Propensity Score Matching
Posted: 3 May 2016
In studies where randomization is not possible, imbalance in baseline covariates (confounding by indication) is a fundamental concern. Propensity score matching (PSM) is a popular method to minimize this potential bias, matching individuals who received treatment to those who did not, to reduce the imbalance in pre-treatment covariate distributions.
PSM methods continue to advance as computing resources expand. Optimal matching, which selects the set of matches that minimizes the average difference in propensity scores between mates, has been shown to outperform less computationally intensive methods. However, many find the implementation daunting. SAS/IML® software allows the integration of optimal matching routines that execute in R, such as those in the R nbpMatching package. This paper walks through performing optimal PSM in SAS® by calling R functions. It covers propensity score creation in SAS, the matching procedure, and the post-matching assessment of covariate balance using SAS/STAT® 13.2 and SAS/IML procedures.
In studies where randomization is not possible, statistical methods can be employed to control for potential bias. One such method for observational studies is propensity score matching, in which individuals who receive a treatment are matched to those who do not in order to reduce the imbalance in pre-treatment covariates (D’Agostino 1998; Rosenbaum and Rubin 1985). Propensity score matching methods continue to advance as computing resources expand. Optimal matching, which selects the set of matches that minimizes the average difference in propensity scores between mates, has been shown to outperform less computationally intensive methods (Rosenbaum 1989). SAS/IML® software allows the integration of optimal matching routines that execute in R, so a single program can include both SAS® and R code.
As a motivating example, we use the Right Heart Catheterization dataset. This dataset was used to assess the effectiveness of right heart catheterization (RHC) in the initial care of critically ill patients (Connors 1996). For the purposes of this paper, we perform a propensity score analysis to match RHC patients to non-RHC patients using 39 covariates. We perform the propensity score analysis and the optimal matching, and we assess covariate balance before and after matching.