(Note: if you are reading the PDF of this file from the git repository, it may be out of date. The PDF is included in the repo for convenience, but the canonical writeup is the org file in the repo.)
Chronic lower back pain (cLBP) is the leading cause of disability worldwide and, because of an aging population, the problem is getting worse [cite:@HARTVIGSEN20182356]. Treatment practices, particularly in the United States, suggest a link between cLBP and the opioid epidemic [cite:@Shipton2018]. Ashar et al. investigate the effectiveness of pain reprocessing therapy (PRT) on cLBP and find a non-trivial effect [cite:@ashar2022effect]. In this writeup I analyze the demographic data associated with that study and find some compelling correlations between demographic variables and the effectiveness of both PRT and the open placebo (saline).
Subjects in the Ashar study were placed into one of three groups: pain reprocessing therapy, saline injection in the lower back, and standard of care [cite:@ashar2022effect]. PRT consisted of one telehealth session and eight sessions with PRT-specialized therapists [cite:@ashar2022effect]. The saline injection was an open placebo [cite:@ashar2022effect]. Patients were tracked for 12 months after treatment to determine the long-term effectiveness of each intervention using a variety of standardized questionnaires covering cLBP symptoms and other pertinent information [cite:@ashar2022effect].
The results relevant to this analysis are succinctly summarized by Figure fig:ashar_summary.
The question examined in this analysis is how demographics interact with treatment.
Ashar and his colleagues made their data sets available to the public [cite:@ashardata]. The demographic data include education, ethnicity, Hispanic status, employment status, exercise, handedness, self-reported socioeconomic status, marital status, age, weight, gender, and duration of back pain.
These data constitute a mixed space of unordered discrete, ordered discrete, and ordered continuous variables and thus do not lend themselves to simple methods of dimensionality reduction. Nevertheless, it is unlikely that the patient population fills this abstract space of variability uniformly; participants more plausibly concentrate in a few regions of it.
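To make the three kinds of variable concrete, the sketch below shows one common way to encode each of them as numeric columns before feeding them to a network. It is an illustration under stated assumptions, not the encoding used in this analysis: the toy records, the category names, and the helper functions `one_hot`, `scale_ordinal`, and `standardize` are all hypothetical.

```python
import numpy as np

# Hypothetical toy records; the categories and values are illustrative only,
# not taken from the study data.
marital = np.array(["married", "single", "divorced", "single"])  # unordered discrete
education = np.array([2, 0, 3, 1])                               # ordered levels 0..3
age = np.array([34.0, 51.0, 47.0, 28.0])                         # continuous

def one_hot(values):
    """Unordered discrete: one indicator column per category."""
    categories, codes = np.unique(values, return_inverse=True)
    return np.eye(len(categories))[codes], categories

def scale_ordinal(levels, n_levels):
    """Ordered discrete: map levels 0..n_levels-1 onto [0, 1], preserving order."""
    return (levels / (n_levels - 1)).reshape(-1, 1)

def standardize(x):
    """Continuous: subtract the mean and divide by the standard deviation."""
    return ((x - x.mean()) / x.std()).reshape(-1, 1)

marital_oh, marital_cats = one_hot(marital)
features = np.hstack([marital_oh, scale_ordinal(education, 4), standardize(age)])
print(features.shape)  # 3 one-hot columns + 1 ordinal column + 1 continuous column
```

The one-hot columns avoid imposing an artificial order on categories such as marital status, while the ordinal and continuous columns keep theirs.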
To understand whether there are clusters in the demographic data (as a first step towards understanding the effects of demographic status alongside the intervention studied here) we trained a variational autoencoder [cite:@MAL-056] on these demographic data as a method of dimensionality reduction. An autoencoder is a simple neural network which is charged with reproducing data placed on its input layer at its output layer. The utility of such an exercise is that the middle layer of such a network typically consists of a small number of neurons, which forces the network to encode the input in a small number of dimensions.
Variational autoencoders use the inner layer to parameterize a probability distribution, samples from which are then translated by the second half of the network into a reconstruction of the original input. This improves the smoothness of the lower-dimensional representation of the data.
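The sampling step can be sketched as follows. This assumes the usual choice of a diagonal Gaussian latent distribution with a standard normal prior, which the text above does not state explicitly; the function names `sample_latent` and `kl_to_standard_normal` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# Two example encoder outputs in a 2-D latent space (assumed values).
mu = np.array([[0.0, 0.0], [1.0, -2.0]])
log_var = np.zeros_like(mu)          # unit variance in every dimension

z = sample_latent(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print(kl)  # first row is 0 (already standard normal); second is 0.5 * (1 + 4) = 2.5
```

Under this assumption the training loss would be a reconstruction term plus this KL term. The KL term pulls the encoded points towards a common, smooth distribution, which is the source of the smoothness mentioned above.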
In this analysis our variational autoencoder was very simple: it had one intermediate layer of dimension 3, an inner latent space with dimension 2, and a symmetric decoder (with the exception of a dropout layer on the input of the encoder).
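A forward pass through a network of this shape might look like the sketch below. Only the layer sizes (3 and 2), the symmetric decoder, and the dropout on the encoder input come from the description above. The input width `n_features`, the `tanh` activation, the dropout rate, and the random initialisation are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 12          # assumed width of the encoded input; not stated in the text
hidden, latent = 3, 2    # layer sizes described above

def dense(n_in, n_out):
    """Small random weights and zero biases for one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

params = {
    "enc_hidden": dense(n_features, hidden),
    "enc_mu": dense(hidden, latent),
    "enc_log_var": dense(hidden, latent),
    "dec_hidden": dense(latent, hidden),
    "dec_out": dense(hidden, n_features),
}

def affine(layer, x):
    w, b = layer
    return x @ w + b

def forward(x, params, drop_rate=0.1, train=True):
    """One pass through the encoder, the sampling step, and the decoder."""
    if train:
        # Dropout on the encoder input only (inverted scaling keeps the expected value).
        keep = rng.random(x.shape) >= drop_rate
        x = x * keep / (1.0 - drop_rate)
    h = np.tanh(affine(params["enc_hidden"], x))       # input -> 3
    mu = affine(params["enc_mu"], h)                    # 3 -> 2 (mean)
    log_var = affine(params["enc_log_var"], h)          # 3 -> 2 (log variance)
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    h_dec = np.tanh(affine(params["dec_hidden"], z))    # 2 -> 3
    recon = affine(params["dec_out"], h_dec)            # 3 -> input
    return recon, mu, log_var, z

x = rng.normal(size=(5, n_features))
recon, mu, log_var, z = forward(x, params)
print(recon.shape, mu.shape, z.shape)
```

The two-dimensional `mu` is the quantity one would plot to obtain a planar picture of the subjects.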
After training this network for 30,000 epochs, we obtained the two-dimensional representation of the demographic data shown in Figure fig:vae.
A visual inspection of the two-dimensional representation of the data suggests four clusters and a cloud of outliers. The outlier cloud was identified by hand and the remaining clusters identified using spectral clustering[cite:@scikit-learn].
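The two-step procedure can be sketched with scikit-learn's `SpectralClustering`. The points below are synthetic stand-ins for the latent coordinates, the hand-drawn outlier region is replaced by a simple hypothetical threshold, and the `affinity` and `gamma` settings are assumptions rather than the values used in this analysis.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

# Synthetic stand-in for the 2-D latent coordinates: four tight blobs plus diffuse points.
centers = np.array([[-5.0, -5.0], [-5.0, 5.0], [5.0, -5.0], [5.0, 5.0]])
blobs = np.vstack([c + 0.5 * rng.standard_normal((40, 2)) for c in centers])
diffuse = rng.uniform(12.0, 20.0, size=(15, 2))
latent = np.vstack([blobs, diffuse])

# Stand-in for the hand-identified outlier region (hypothetical threshold).
is_outlier = latent[:, 0] > 10.0

labels = np.full(len(latent), -1)  # -1 marks the outlier cloud
model = SpectralClustering(n_clusters=4, affinity="rbf", gamma=0.5, random_state=0)
labels[~is_outlier] = model.fit_predict(latent[~is_outlier])

print(np.unique(labels, return_counts=True))
```

Removing the outliers before clustering keeps the diffuse points from distorting the affinity graph that spectral clustering builds over the remaining subjects.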
A challenge associated with non-linear dimensionality reduction combined with cluster analysis is the difficulty of associating meaning with each cluster since the transformation to the lower dimensional space is not easily interpreted.
There are several possible solutions to this problem, but here we employed a data-driven method: for each cluster, we trained a tree-based model (AdaBoost [cite:@freund1997decision]) to predict whether a point would be classified into that cluster. From such a model we can extract the variables which are most important for the classification and then calculate summary statistics for each cluster. Using this method we identified the four distinct clusters as:
- Younger, Male, Unmarried, White
- Older, Male, Married, White
- Lower Weight, Female, Unmarried, White
- Median Weight, Female, Married, White
A pat characterization of the diffuse outlier cloud is not furnished so easily, but most of these data points correspond to non-white participants.
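The cluster-labelling step can be sketched as a one-vs-rest classifier per cluster using scikit-learn's `AdaBoostClassifier` and its `feature_importances_` attribute. The data, the three feature names, and the helper `describe_cluster` are hypothetical; the real analysis used the full set of demographic variables and the cluster labels from the latent space.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)
feature_names = ["age", "married", "weight"]   # hypothetical subset of the demographics

n = 400
age = rng.normal(45.0, 12.0, n)
married = rng.integers(0, 2, n)
weight = rng.normal(80.0, 15.0, n)
X = np.column_stack([age, married, weight])

# Synthetic cluster labels: cluster 0 is "younger and unmarried"; everything else is cluster 1.
cluster = np.where((age < 45.0) & (married == 0), 0, 1)

def describe_cluster(X, cluster, k, feature_names):
    """Fit a one-vs-rest AdaBoost model for cluster k and rank features by importance."""
    target = (cluster == k).astype(int)
    model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, target)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    members = X[cluster == k]
    # For each feature: (name, importance for membership, mean value inside the cluster).
    return [(feature_names[i], float(importances[i]), float(members[:, i].mean()))
            for i in order]

summary = describe_cluster(X, cluster, 0, feature_names)
for name, importance, mean in summary:
    print(f"{name:8s} importance={importance:.2f} cluster mean={mean:.1f}")
```

The importances say which variables separate a cluster from the rest, and the within-cluster means say in which direction, which together yield labels of the kind listed above.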
Now that we have a plausible clustering of the subjects by demographic character, it is natural to ask whether these demographic identities correspond to treatment effect differences. This is shown in Figure fig:demo_effect.
The figure suggests that demographic group has a strong effect on treatment response. In particular, older, unmarried men (who are white, like most participants) and younger, unmarried women (also white) benefit the most from the treatment. Other demographic groups benefit less from the intervention.
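A breakdown of this kind can be computed by averaging the change in a pain score within each combination of demographic cluster and study arm. The sketch below uses synthetic placeholder values throughout; the score range, the arm labels, and the function `mean_change_by_group` are assumptions for illustration and do not reproduce the study's results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Synthetic placeholders: cluster label, study arm, and pain scores before and after.
cluster = rng.integers(0, 4, n)
arm = rng.choice(np.array(["PRT", "Saline", "Standard of care"]), n)
pain_before = rng.uniform(3.0, 7.0, n)
pain_after = pain_before - rng.uniform(0.0, 3.0, n)

def mean_change_by_group(cluster, arm, before, after):
    """Mean change in pain (after minus before) and group size for every (cluster, arm) cell."""
    change = after - before
    table = {}
    for k in np.unique(cluster):
        for a in np.unique(arm):
            cell = (cluster == k) & (arm == a)
            if cell.any():
                table[(int(k), str(a))] = (float(change[cell].mean()), int(cell.sum()))
    return table

table = mean_change_by_group(cluster, arm, pain_before, pain_after)
for (k, a), (mean, count) in sorted(table.items()):
    print(f"cluster {k} | {a:16s} | mean change {mean:+.2f} (n={count})")
```

Reporting the cell size next to each mean matters here, because splitting three study arms across several demographic clusters can leave some cells with few participants.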