PsyProxy

Reddit suicidality subset (binary)

Reddit r/SuicideWatch Posts (UMD) is a public corpus of Reddit posts with binary labels indicating whether each post expresses suicidal ideation, sampled from r/SuicideWatch and matched control subreddits. The texts frequently explore themes of suicidal thoughts, mental health struggles, and feelings of isolation or emptiness. Many individuals share their experiences with depression, detailing moments of despair and the impact of trauma on their lives, and often express a desire for relief from their emotional pain. There are also discussions about support systems: some posts highlight friendships and connections that provide temporary solace, while others reflect on guilt and failure in the context of lost relationships. Some texts also describe personal achievements or coping strategies, such as sobriety or hobbies, in contrast to the hopelessness that characterizes many narratives. [Summary of 50 random texts by ChatGPT 4o Mini].

Distribution of Suicidality (suicidal vs non-suicidal)
Scale values: 1, 2
8,202 at floor · 7,275 at ceiling

Items: 15,477
Holdout n: 2,856
Target: Suicidality (suicidal vs non-suicidal)
Kind: Binary
Systems compared: 27
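The counts reported above (8,202 at floor, 7,275 at ceiling, 15,477 items, 2,856 holdout) can be sanity-checked with a few lines of arithmetic. The variable names below are illustrative and do not come from the card; the sketch only restates the reported numbers as shares.

```python
# Counts copied from the card; variable names are illustrative assumptions.
floor_n = 8202      # items at the lower scale value
ceiling_n = 7275    # items at the upper scale value
total_n = 15477     # reported item count
holdout_n = 2856    # reported holdout size

# The two class counts should add up to the reported total.
assert floor_n + ceiling_n == total_n

floor_share = floor_n / total_n
ceiling_share = ceiling_n / total_n
holdout_share = holdout_n / total_n

print(f"floor share:   {floor_share:.1%}")    # 53.0%
print(f"ceiling share: {ceiling_share:.1%}")  # 47.0%
print(f"holdout share: {holdout_share:.1%}")  # 18.5%
```

The classes are close to balanced, so a constant majority-class guess would be correct on roughly 53% of the full set. Whether the holdout split has the same balance is not stated on the card.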
Criterion validity

Reported holdout systems from the verified card

Binary classification uses FVE as the task-primary metric. Secondary columns keep the companion metrics visible so binary, ordinal, regression, and multiclass cards are not compared through one flattened score.
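The card does not define FVE. As a rough illustration only, the sketch below assumes it means fraction of variance explained, computed as 1 - SSE/SST on predicted probabilities against 0/1 labels, and pairs it with plain-Python versions of AUC and F1. The function names, the 0.5 threshold, and the toy data are hypothetical, and the card's actual computation may differ.

```python
from typing import Sequence


def fve(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ASSUMED definition: 1 - SSE / SST, with labels coded 0/1."""
    mean = sum(y_true) / len(y_true)
    sse = sum((y - p) ** 2 for y, p in zip(y_true, y_prob))
    sst = sum((y - mean) ** 2 for y in y_true)
    return 1.0 - sse / sst


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ROC AUC: chance that a random positive outscores a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true: Sequence[int], y_prob: Sequence[float],
       threshold: float = 0.5) -> float:
    """F1 of the positive class after thresholding (threshold is an assumption)."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 1)
    fp = sum(1 for y, h in zip(y_true, pred) if y == 0 and h == 1)
    fn = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, not taken from the corpus.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]

print(round(fve(y_true, y_prob), 3))  # 0.4
print(round(auc(y_true, y_prob), 3))  # 0.889
print(round(f1(y_true, y_prob), 3))   # 0.667
```

Under this assumed definition, FVE penalizes poorly calibrated probabilities, while AUC depends only on how the scores rank positives against negatives. That is one reason the three columns can order systems differently.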

Source podium · FVE · 10 families
Gold: OpenAI (Rathje), 0.879
Silver: PsyProxy, 0.805
Bronze: Topic models, 0.742

Model-family mix
OpenAI / LLM · 3
PsyProxy · 4
Topic model · 2
Lexicon · 2
Baseline · 16

Best PsyProxy row is #2 overall among all model families on this card.

| System | Family | Variant | FVE | AUC | F1 | Primary scale |
|---|---|---|---|---|---|---|
| OpenAI Model gpt-4o-mini | OpenAI / LLM | permissive | 0.879 | 0.994 | 0.973 | |
| OpenAI Model gpt-5-nano | OpenAI / LLM | permissive | 0.839 | 0.990 | 0.962 | |
| PsyProxy — Social Economics Lens v0.5 · 1000d | PsyProxy | permissive | 0.805 | 0.988 | 0.945 | |
| PsyProxy — Technology Lens v0.5 · 800d | PsyProxy | permissive | 0.804 | 0.989 | 0.942 | |
| PsyProxy — Behavioral Sciences Lens v0.5 · 1000d | PsyProxy | permissive | 0.802 | 0.988 | 0.945 | |
| PsyProxy — Health Lens v0.9 · 1100d | PsyProxy | permissive | 0.798 | 0.988 | 0.944 | |
| OpenAI Model gpt-4.1-nano | OpenAI / LLM | permissive | 0.783 | 0.986 | 0.957 | |
| Hierarchical Dirichlet Process (tomotopy HDP) | Topic model | permissive | 0.501 | 0.899 | 0.877 | |
| Linguistic Inquiry and Word Count (LIWC) | Lexicon | permissive | 0.495 | 0.927 | 0.840 | |
| BERTopic | Topic model | permissive | 0.399 | 0.901 | 0.818 | |
| Valence Aware Dictionary and sEntiment Reasoner (VADER) | Lexicon | permissive | 0.384 | 0.885 | 0.793 | |
| Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC) | Baseline | permissive | 0.328 | 0.869 | 0.788 | |
| Empath | Baseline | permissive | 0.249 | 0.849 | 0.733 | |
| TextDescriptives | Baseline | permissive | 0.222 | 0.818 | 0.734 | |
| Tool for the Automatic Analysis of Lexical Sophistication (TAALES) | Baseline | permissive | 0.208 | 0.802 | 0.716 | |
| Tool for the Automatic Analysis of Cohesion (TAACO) | Baseline | permissive | 0.123 | 0.734 | 0.641 | |
| Disneyland TripAdvisor reviews continuous · via best lens | Baseline | permissive | | | | |
| Amazon Video-Games reviews continuous · via best lens | Baseline | permissive | | | | |
| Amazon Video-Games reviews ordinal · via best lens | Baseline | permissive | | | | |
| Sentiment140 tweets binary · via best lens | Baseline | permissive | | | | |
| Disneyland TripAdvisor reviews binary · via best lens | Baseline | permissive | | | | |
| IMDB movie reviews (ACL) binary · via best lens | Baseline | permissive | | | | |
| Douban movie reviews (Chinese) ordinal · via best lens | Baseline | permissive | | | | |
| Amazon Video-Games reviews binary · via best lens | Baseline | permissive | | | | |
| Druglib drug reviews regression · via best lens | Baseline | permissive | | | | |
| Druglib drug reviews ordinal · via best lens | Baseline | permissive | | | | |
| LIAR fact-check statements ordinal · via best lens | Baseline | permissive | | | | |