Background:
We will study treatment response in the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) cohort. Specifically, the proposed research is a secondary data analysis that predicts treatment response from a large set of covariates using novel ensemble machine learning methods. Previous work has focused on a limited number of covariates and less flexible estimation methods. The goal is to develop a better understanding of risk factors for treatment-resistant depression by producing a “risk calculator” for clinical use. This risk calculator will allow clinicians to assess, at presentation, a patient's probability of successful treatment response.
Methods Overview:
Given the size and complexity of clinical data, standard parametric methods may not be suitable. Machine learning techniques are better able to detect interactions, nonlinear relationships, and higher-order effects. Like parametric regression procedures, machine learning methods aim to “smooth” over the data, but they make fewer assumptions and adapt more flexibly to the data. We will produce two versions of the “risk calculator”: one that considers all potential predictors (the “full set”) and one that uses the best small subset of variables (e.g., 10), chosen so that its performance is the closest to that of the full set among all small subsets.
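The comparison between a full-set and a small-subset risk calculator can be sketched in code. The example below is a minimal illustration under stated assumptions, not the project's actual method: it uses synthetic data rather than STAR*D, scikit-learn's stacking classifier as a stand-in for the ensembling approach (which the description above does not specify), and a random-forest importance ranking as a simple heuristic for choosing 10 variables instead of a search over all small subsets. All function and variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline


def make_ensemble():
    """Stack a parametric and a flexible learner (an assumed, illustrative library)."""
    return StackingClassifier(
        estimators=[
            ("logit", LogisticRegression(max_iter=1000)),
            ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


def cv_auc(model, X, y):
    """Cross-validated AUC from out-of-fold predicted probabilities."""
    probs = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, probs)


# Synthetic stand-in for clinical covariates: 40 candidate predictors, binary response.
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=8, random_state=0
)

# "Full set" calculator: the ensemble sees every candidate predictor.
full_model = make_ensemble()
auc_full = cv_auc(full_model, X, y)

# "Small set" calculator: keep the 10 highest-ranked predictors, then fit the ensemble.
# Selection sits inside the pipeline so it is repeated within each cross-validation fold.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0),
    threshold=-np.inf,  # disable the importance threshold; rely on max_features only
    max_features=10,
)
small_model = make_pipeline(selector, make_ensemble())
auc_small = cv_auc(small_model, X, y)

# Refit on all data to see how many predictors the reduced calculator retains.
small_model.fit(X, y)
n_kept = int(small_model[0].get_support().sum())

print(f"Full set AUC:  {auc_full:.3f}")
print(f"Small set AUC: {auc_small:.3f} using {n_kept} predictors")
```

The gap between the two cross-validated AUC values gives a rough sense of how much predictive performance is given up by restricting the calculator to a small number of variables. The fitted model's `predict_proba` output plays the role of the patient-level probability of response.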
Project Lead: Sherri Rose
Tag: Methodology