Examining thematic similarity, difference, and membership in three online mental health communities from reddit: A text mining and visualization approach

Albert Park, Mike Conway, Annie T. Chen

Research output: Contribution to journal › Article › peer-review

125 Scopus citations


Objectives: Social media, including online health communities, have become popular platforms for individuals to discuss health challenges and exchange social support with others. These platforms can provide support for individuals who are concerned about social stigma and discrimination associated with their illness. Although mental health conditions can share similar symptoms and even co-occur, the extent to which discussion topics in online mental health communities are similar, different, or overlapping is unknown. Discovering the topical similarities and differences could potentially inform the design of related mental health communities and patient education programs. This study employs text mining, qualitative analysis, and visualization techniques to compare discussion topics in publicly accessible online mental health communities for three conditions: Anxiety, Depression and Post-Traumatic Stress Disorder.

Methods: First, online discussion content for the three conditions was collected from three Reddit communities (r/Anxiety, r/Depression, and r/PTSD). Second, content was pre-processed, and then clustered using the k-means algorithm to identify themes that were commonly discussed by members. Third, we qualitatively examined the common themes to better understand them as well as their similarities and differences. Fourth, we employed multiple visualization techniques to form a deeper understanding of the relationships among the identified themes for the three mental health conditions.

Results: The three mental health communities shared four themes: sharing of positive emotion, gratitude for receiving emotional support, and sleep- and work-related issues. Depression clusters tended to focus on self-expressed contextual aspects of depression, whereas the Anxiety Disorders and Post-Traumatic Stress Disorder clusters addressed more treatment- and medication-related issues. Visualizations showed that discussion topics from the Anxiety Disorders and Post-Traumatic Stress Disorder subreddits shared more similarities to one another than to the depression subreddit.

Conclusions: We observed that the members of the three communities shared several overlapping concerns (i.e., sleep- and work-related problems) and discussion patterns (i.e., sharing of positive emotion and showing gratitude for receiving emotional support). We also highlighted that the discussions from the r/Anxiety and r/PTSD communities were more similar to one another than to discussions from the r/Depression community. The r/Anxiety and r/PTSD subreddit members are more likely to be individuals whose experiences with a condition are long-term, and who are interested in treatments and medications. The r/Depression subreddit members may be a comparatively diffuse group, many of whom are dealing with transient issues that cause depressed mood. The findings from this study could be used to inform the design of online mental health communities and patient education programs for these conditions. Moreover, we suggest that researchers employ multiple methods to fully understand the subtle differences when comparing similar discussions from online health communities.
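The second step of the Methods (pre-process the posts, then cluster them with k-means to surface themes) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' pipeline: the abstract does not say how posts were vectorized, so TF-IDF weighting is assumed; the deterministic farthest-first initialization is a simplification chosen for reproducibility; the four toy posts are invented; and the helper names (`tokenize`, `tfidf_vectors`, `kmeans`, `top_terms`) are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumed pre-processing: lowercase, keep alphabetic tokens longer than 2 chars.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [w for w in cleaned.split() if len(w) > 2]

def tfidf_vectors(docs):
    # Build L2-normalized TF-IDF vectors (smoothed IDF; an assumed weighting).
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n = len(docs)
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(vectors, k):
    # Farthest-first initialization: deterministic, used here instead of random seeds.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def kmeans(vectors, k, max_iter=100):
    # Standard Lloyd iterations: assign each vector to its nearest centroid,
    # then recompute each centroid as the mean of its members.
    centroids = init_centroids(vectors, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def top_terms(vocab, centroid, n=3):
    # The highest-weighted terms of a centroid give a rough label for the theme.
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [w for _, w in ranked[:n]]

# Invented toy posts standing in for subreddit content.
posts = [
    "cannot sleep at night insomnia again",
    "sleep problems insomnia keeps me awake at night",
    "new medication dosage from my doctor",
    "doctor changed my medication dosage today",
]
vocab, vectors = tfidf_vectors(posts)
labels, centroids = kmeans(vectors, k=2)
for j, centroid in enumerate(centroids):
    print(j, top_terms(vocab, centroid))
```

On this toy input the two sleep-related posts and the two medication-related posts fall into separate clusters, and the top centroid terms act as candidate theme labels. In the study itself, such clusters were then examined qualitatively, which a sketch like this cannot replace.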

Original language: English
Pages (from-to): 98-112
Number of pages: 15
Journal: Computers in Human Behavior
State: Published - Jan 2018
Externally published: Yes


  • Anxiety disorders
  • Consumer health information
  • Depression
  • Stress disorders, post-traumatic
  • Unsupervised machine learning

