Biostatistics in Psychiatry (6): Estimating treatment effects in observational studies

2011-04-12 Julia LIN, Ying LU

Shanghai Archives of Psychiatry, 2011, Issue 6

· Research Methods ·

Julia Y. LIN1*, Ying LU1,2

In randomized treatment studies, the randomization of subjects to the different treatment conditions ensures that the treatment groups are comparable, on average, in their baseline characteristics (measured or unmeasured), so we can confidently attribute differences in treatment outcomes to the assigned treatments. In contrast, subjects in observational studies are not randomly assigned to the treatment groups, so differences in treatment outcomes could be due to differences in baseline characteristics between the treatment groups. For example, if we wished to compare the outcome of high-intensity treatment for depression (i.e., many visits in the prior 12 months) versus the outcome of low-intensity treatment for depression (i.e., few visits in the prior 12 months) and included subjects from both primary care and specialty mental health clinics, any observed differences in the outcomes for low-intensity and high-intensity treatment could be due to differences in the proportions of subjects that were treated in the two types of clinics. When the treatments being compared (e.g., low versus high intensity of care) and other factors that can affect the outcome (e.g., type of clinic or patient characteristics) are associated with each other, there is confounding. Confounding makes it difficult to determine whether the treatment of interest truly causes the outcome because the apparent treatment effect could be partly due to its association with the confounding variables. Without appropriate adjustment for confounding variables, one may come to biased and misleading conclusions about the effect of the treatment of interest.

Regression adjustment

The most common way to adjust for confounding variables and reduce bias is to use multiple regression models that regress the outcome of interest on a set of covariates that includes the treatment indicator (e.g., intervention vs. control) and the measures of identified confounding variables. The simplest multiple regression model for this analysis is an additive model, which includes the main effects for treatment and other covariates but excludes potential interactions between treatment and other covariates. This model implicitly assumes that the treatment effect (i.e., the difference in the outcome between the intervention and control groups) is the same for all patients irrespective of the other covariates. Under the additive model, the regression coefficient for the treatment indicator is interpreted as the effect of the treatment on the outcome after adjusting for (or holding constant) the other variables in the model.
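As a rough illustration of regression adjustment in the depression example, the sketch below uses a small made-up data set in which clinic type is associated with both treatment intensity and outcome. The cell counts, the outcome formula, and the helper names `solve` and `ols` are assumptions introduced for illustration only; the outcomes are generated without noise so that the true treatment effect (2 points) is known.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical cell counts: (specialty_clinic, high_intensity, number of subjects).
# High-intensity treatment is more common in specialty clinics (confounding).
cells = [(0, 0, 8), (0, 1, 2), (1, 0, 2), (1, 1, 8)]
rows = []
for clinic, treated, count in cells:
    outcome = 10.0 + 2.0 * treated + 5.0 * clinic  # assumed true treatment effect: 2
    rows += [(treated, clinic, outcome)] * count

# Unadjusted comparison: difference in mean outcome between treatment groups.
treated_y = [y for t, _, y in rows if t == 1]
control_y = [y for t, _, y in rows if t == 0]
naive_difference = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

# Additive model: outcome ~ intercept + treatment + clinic.
design = [[1.0, t, c] for t, c, _ in rows]
intercept, treatment_coef, clinic_coef = ols(design, [y for _, _, y in rows])

print(f"naive difference:   {naive_difference:.2f}")
print(f"adjusted treatment: {treatment_coef:.2f}")
```

In this constructed example the unadjusted difference is 5, because it mixes the treatment effect with the clinic effect, while the coefficient for the treatment indicator in the additive model is 2. Real data contain noise and may violate the additive assumption, so the recovery would not be this clean.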

It is important to note that the treatment effect may differ for patients with different characteristics (e.g., the difference in the outcome for high versus low intensity treatment could vary depending on the severity of depression at baseline), in which case we say that there is an interaction between treatment and the patient characteristics. The multiple regression model for this analysis needs to include the main effects for treatment and other covariates as well as terms that account for the interactions between treatment and other covariates. In the presence of treatment-covariate interactions, the regression coefficients from the previously described multiple regression model with only main effects will not adequately reflect the treatment effect. (Editor’s note: Further discussion of treatment-covariate interactions will be given in a future column in this series.)
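The difference between the two models can be written out for a single covariate. The notation here is an assumption introduced for illustration and does not appear in the original column: $Y$ is the outcome, $T$ the treatment indicator (1 = intervention, 0 = control), $X$ a baseline covariate such as depression severity, and $\varepsilon$ the error term.

```latex
\begin{align*}
\text{Additive model:}\quad
  & Y = \beta_0 + \beta_1 T + \beta_2 X + \varepsilon \\
\text{Interaction model:}\quad
  & Y = \beta_0 + \beta_1 T + \beta_2 X + \beta_3 (T \times X) + \varepsilon \\
\text{Treatment effect at } X = x:\quad
  & E[Y \mid T = 1, X = x] - E[Y \mid T = 0, X = x] = \beta_1 + \beta_3 x
\end{align*}
```

Under the additive model $\beta_3 = 0$, so the treatment effect is the single number $\beta_1$. When $\beta_3 \neq 0$, the effect changes with $x$, and no single coefficient summarizes it.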

Propensity score methods of adjustment

Another way to adjust for confounding variables is to use propensity score methods. The propensity score, in the context of intervention studies, is the probability (propensity) of being in a particular treatment group (e.g., intervention) versus the other treatment group (e.g., standard care), given the subject's observed characteristics. There are several methods of using propensity scores to adjust for confounding variables; the most common are matching, stratification, regression model adjustment, and weighting.

In the previous example comparing the intensity of care for depression that was potentially confounded by the type of clinic the treatment was provided in (primary care vs. specialty), one way to adjust for the type of clinic is through stratifying or matching, which would allow one to compare outcomes between high intensity and low intensity groups in the same clinic type. There are instances when one would like to match or stratify on more than one variable. However, it may be difficult to match on several variables when the sample size is limited. For example, if we are trying to match on 5 dichotomous variables, the sample would need to be divided into 32 (2^5) strata, so a very large sample would be needed to find matches within all 32 groups. Even more strata may be needed when matching on continuous variables such as age.
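The arithmetic behind the 32 strata can be checked by enumerating every combination of five yes/no variables. The sample size of 200 below is a hypothetical figure chosen only to show how thin the strata become.

```python
from itertools import product

n_variables = 5  # five dichotomous matching variables, as in the text

# Each stratum is one combination of 0/1 values across the five variables.
strata = list(product([0, 1], repeat=n_variables))
n_strata = len(strata)  # equals 2 ** 5

sample_size = 200  # hypothetical study size
average_per_stratum = sample_size / n_strata

print(n_strata, average_per_stratum)
```

With about six subjects per stratum on average, and each stratum needing at least one subject from each treatment group to yield a match, some strata are likely to contain no usable comparison.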

Propensity score matching allows one to match on several variables with a single score. To estimate the propensity score, fit a logistic regression model with the treatment group (intervention vs. control) as the dependent variable and the list of potentially confounding variables that one wishes to match on as the independent variables; the propensity score for membership in the intervention group is then computed from the coefficients of the fitted model. Most statistical packages that perform logistic regression analysis have options for producing estimated probabilities of the dependent variable (e.g., being in the intervention group) or estimated probabilities on the logit scale, both of which can be used for propensity score matching. The next step is to perform a 1:1 or 1:n match of study subjects from the two groups with similar propensity scores. There are many ways to find matches, including nearest neighbor matching and caliper matching[1]. The goal is to balance the confounding factors between the groups, so after matching is completed the similarity of the distributions of the matched variables between the groups should be assessed[2]. When using the propensity score matching method, some subjects may not be selected as matches and are dropped from the analysis because there is no subject with a similar propensity score in the opposite group; inclusion of such dissimilar cases in the analysis could bias the results.
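The two steps described above (estimate the score, then match on it) can be sketched with the Python standard library alone. Everything specific here is an assumption made for illustration: the 14 made-up subjects, the helper names, fitting the logistic model by plain gradient ascent instead of a statistical package, greedy 1:1 matching without replacement, and a caliper of 0.1 on the probability scale.

```python
import math


def propensity(beta, x):
    """Predicted probability of treatment from a logistic model."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ts, steps=5000, lr=0.1):
    """Approximate the maximum-likelihood fit by gradient ascent."""
    beta = [0.0] * (len(xs[0]) + 1)  # intercept followed by one slope per covariate
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, t in zip(xs, ts):
            resid = t - propensity(beta, x)
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / len(xs) for b, g in zip(beta, grad)]
    return beta


def match_with_caliper(treated_ps, control_ps, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement."""
    available = dict(control_ps)
    pairs = []
    for t_id, t_score in treated_ps.items():
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


# Hypothetical subjects: (age in years, specialty clinic 0/1, high-intensity care 0/1)
subjects = [
    (28, 1, 1), (35, 1, 1), (41, 0, 1), (46, 1, 1), (52, 1, 1), (58, 0, 1),
    (41, 0, 0), (46, 1, 0), (52, 0, 0), (57, 0, 0), (63, 1, 0), (66, 0, 0),
    (70, 0, 0), (74, 0, 0),
]
xs = [((age - 50) / 10, clinic) for age, clinic, _ in subjects]  # rescale age
ts = [t for _, _, t in subjects]

beta = fit_logistic(xs, ts)
scores = [propensity(beta, x) for x in xs]
treated_ps = {i: s for i, s in enumerate(scores) if ts[i] == 1}
control_ps = {i: s for i, s in enumerate(scores) if ts[i] == 0}

pairs = match_with_caliper(treated_ps, control_ps, caliper=0.1)
for t_id, c_id in pairs:
    print(subjects[t_id][:2], "matched with", subjects[c_id][:2])
```

Treated subjects with no control inside the caliper stay unmatched and would be excluded from the outcome comparison. Greedy matching depends on the order in which treated subjects are processed, so other matching algorithms can return different pairs from the same scores.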

Propensity scores can also be used in other ways. In stratification, the study sample is divided into strata (typically 5 strata) based on the subjects' estimated propensity scores, and the outcomes are compared between the intervention and the control groups within each of the strata, taking into account the different sample sizes of the groups within strata. The estimated propensity score can also be included as a covariate in the regression model of the outcome to adjust for confounding variables. Another option is to use the inverse of the propensity score to weight the outcomes when comparing outcomes in the treatment groups (“inverse probability weighting”). The use of the various propensity score methods in cardiovascular research is summarized by D’Agostino[3] and by Lunceford and Davidian[4]; a review of their use in psychiatric research can be found in VanderWeele[5].
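A minimal sketch of stratification and inverse probability weighting follows, assuming the propensity scores have already been estimated. The records are made up, with a true treatment effect of 2 built in; the function names are hypothetical; and two strata are used instead of the typical five only to keep the example small. In the weighting function, intervention subjects are weighted by 1/score and control subjects by 1/(1 - score), which is the usual form of inverse probability weighting.

```python
from bisect import bisect_right


def mean(values):
    return sum(values) / len(values)


def ipw_effect(records):
    """Difference in weighted mean outcomes (weights normalised within group)."""
    t_weight = t_sum = c_weight = c_sum = 0.0
    for score, treated, outcome in records:
        if treated:
            w = 1.0 / score           # intervention: inverse of P(treated)
            t_weight += w
            t_sum += w * outcome
        else:
            w = 1.0 / (1.0 - score)   # control: inverse of P(not treated)
            c_weight += w
            c_sum += w * outcome
    return t_sum / t_weight - c_sum / c_weight


def stratified_effect(records, cut_points):
    """Average the within-stratum differences, weighted by stratum size."""
    strata = {}
    for score, treated, outcome in records:
        stratum = bisect_right(cut_points, score)
        # Index 0 collects control outcomes, index 1 intervention outcomes.
        strata.setdefault(stratum, ([], []))[treated].append(outcome)
    total = weighted = 0.0
    for controls, treateds in strata.values():
        if controls and treateds:  # skip strata lacking one of the groups
            n = len(controls) + len(treateds)
            weighted += n * (mean(treateds) - mean(controls))
            total += n
    return weighted / total


# Hypothetical records: (propensity score, treated 0/1, outcome)
records = (
    [(0.2, 1, 12.0)] * 2 + [(0.2, 0, 10.0)] * 8
    + [(0.8, 1, 17.0)] * 8 + [(0.8, 0, 15.0)] * 2
)

naive = mean([y for _, t, y in records if t]) - mean([y for _, t, y in records if not t])
print(f"naive:      {naive:.2f}")
print(f"IPW:        {ipw_effect(records):.2f}")
print(f"stratified: {stratified_effect(records, cut_points=[0.5]):.2f}")
```

In this constructed data set the unadjusted difference is 5, while both adjusted estimates equal the built-in effect of 2. Propensity scores close to 0 or 1 produce very large weights, so weighted estimates can be unstable when the groups overlap poorly.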

Comparison of the pros and cons of regression adjustment and propensity score matching

Adjustment using regression analysis is easy to implement with widely available statistical software packages, but these regression models often require assumptions about the relationship (e.g., linearity) between independent and dependent variables that may or may not be appropriate. Adjustment of variables through matching does not require such an assumption, so it is a more conservative method.

Adjustment through propensity score methods may be more appropriate when there is little overlap in the distributions of confounding variables between the comparison groups. This can happen, for example, when the majority of subjects in the intervention group are younger and the majority of subjects in the control group are older. In this type of situation, inferences about treatment effects from typical regression models would be extrapolations and may be misleading. This inappropriate use of regression models can be hard to detect because unsuspecting users (and readers) are usually not informed about differences in the distributions of the variables included in the regression model.
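One simple way to make limited overlap visible before fitting a regression model is to compare the ranges of a confounder in the two groups. The ages and the helper name `common_range` below are made up for illustration; a fuller check would compare whole distributions, not only the ranges.

```python
# Hypothetical ages: the intervention group is mostly younger, the control group older.
intervention_ages = [22, 25, 28, 31, 34, 37, 40, 45]
control_ages = [42, 48, 55, 58, 61, 64, 67, 70]


def common_range(a, b):
    """Return (low, high) of the interval covered by both groups, or None."""
    low = max(min(a), min(b))
    high = min(max(a), max(b))
    return (low, high) if low <= high else None


overlap = common_range(intervention_ages, control_ages)
everyone = intervention_ages + control_ages
inside = [age for age in everyone if overlap and overlap[0] <= age <= overlap[1]]
share_inside = len(inside) / len(everyone)

print(overlap, share_inside)
```

Here only ages 42 to 45 are represented in both groups, covering 2 of the 16 subjects, so a regression-adjusted comparison at any other age relies on extrapolation from the fitted model.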

One limitation that affects both regression adjustment and propensity score methods is that either method can only adjust for confounding variables that are observed and measured[6]. If there are any important variables that explain the relationship between the treatment and the outcome that are not measured, then neither method would be very helpful.

Conclusion

In observational intervention studies where study subjects are not randomly assigned to treatment arms, selection bias is often a concern. Typically, researchers adjust statistically for observed confounding variables by including those variables in regression models. However, when the relationship between the independent and the dependent variables assumed by the regression model does not hold, or when there is little overlap in the distributions of the confounding variables between the comparison groups, it may be more appropriate to use propensity score methods to adjust for confounding.

References

1. Rosenbaum PR. Discussing hidden bias in observational studies. Ann Intern Med, 1991, 115(11):901-905.

2. Austin PC. Balancing diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat Med, 2009, 28(25):3083-3107.

3. D'Agostino RB. Propensity score in cardiovascular research. Circulation, 2007, 115(17):2340-2343.

4. Lunceford JK, Davidian M. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat Med, 2004, 23(19):2937-2960.

5. VanderWeele T. The use of propensity score methods in psychiatric research. Int J Methods Psychiatr Res, 2006, 15(2):95-103.

6. Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. Am Stat, 1985, 39(1):33-38.

DOI: 10.3969/j.issn.1002-0829.2011.06.010

1 US Department of Veterans Affairs Cooperative Studies Program Coordinating Center, Palo Alto VA Health Care System, Palo Alto, CA, USA;

2 Department of Health Research and Policy, Stanford University, Palo Alto, CA, USA

*Correspondence: Julia.Lin@va.gov

Julia Lin is a Biostatistician at the US Department of Veterans Affairs Cooperative Studies Program Coordinating Center at the Palo Alto VA Health Care System. Her research interests include design and analysis of clinical trials and causal modeling methods, with particular interest in the area of psychiatry. E-mail: Julia.Lin@va.gov.

Professor Ying Lu is Director of the US Department of Veterans Affairs Cooperative Studies Program Coordinating Center at the Palo Alto VA Health Care System and Professor of Biostatistics in the Department of Health Research and Policy at Stanford University. His research interests are clinical trial design and data analysis, statistical methods to evaluate medical diagnosis, medical decision making, meta-analysis, and radiology. E-mail: Ying.Lu@va.gov.