
Reddit suicidality subset (binary)

Reddit r/SuicideWatch Posts (UMD) is a public corpus of Reddit posts, sampled from r/SuicideWatch and matched control subreddits, with a binary label indicating whether each post expresses suicidal ideation. The texts frequently explore suicidal thoughts, mental health struggles, and feelings of isolation or emptiness. Many authors describe their experiences with depression, detailing moments of despair and the impact of trauma on their lives, and often express a desire for relief from emotional pain. Some posts discuss support systems: a number highlight friendships and connections that provide temporary solace, while others reflect on guilt and failure in the context of lost relationships. A smaller share describe personal achievements or coping strategies, such as sobriety or hobbies, in contrast with the sense of hopelessness that characterizes many of the narratives. [Summary of 50 random texts, generated by ChatGPT 4o Mini.]

Distribution of Suicidality (suicidal vs non-suicidal): 8,202 at floor · 7,275 at ceiling

Items: 15,477

Holdout n: 2,856

Target: Suicidality (suicidal vs non-suicidal)

Kind: Binary

systems compared

Criterion validity

Reported holdout systems from the verified card

Binary classification uses FVE as the task-primary metric. The secondary columns (AUC and F1) keep the companion metrics visible, so that binary, ordinal, regression, and multiclass cards are not compared through a single flattened score.
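The card does not define its metrics, so the following is only a minimal sketch of how the three reported columns could be computed for a binary target. It assumes that FVE means fraction of variance explained, computed as one minus the ratio of squared prediction error to the total variance of the 0/1 labels, and that F1 is taken at a 0.5 threshold. Both are assumptions, not details confirmed by the card, and the function names are hypothetical.

```python
from typing import Sequence


def fve(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Assumed definition: 1 - SS_res / SS_tot on 0/1 labels."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_prob))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


def f1(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """F1 of the positive class after thresholding (threshold is an assumption)."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 1)
    fp = sum(1 for y, h in zip(y_true, pred) if y == 0 and h == 1)
    fn = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, not taken from the corpus.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(round(fve(labels, scores), 4), auc(labels, scores), round(f1(labels, scores), 4))
```

Under this assumed definition, a system that always predicts the base rate scores an FVE of 0 and a perfect system scores 1. That makes FVE sensitive to calibration in a way that AUC, which depends only on ranking, is not, which is one reason to keep the columns side by side.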

Source podium · FVE · 10 families

Gold: OpenAI (Rathje), 0.879

Silver: PsyProxy, 0.805

Bronze: Topic models, 0.742

Model-family mix

OpenAI / LLM · 3, PsyProxy · 4, Topic model · 2, Lexicon · 2, Baseline · 16

Best PsyProxy row is #2 overall among all model families on this card.

| System | Family | Variant | FVE | AUC | F1 |
| --- | --- | --- | --- | --- | --- |
| OpenAI Model gpt-4o-mini | OpenAI / LLM | permissive | 0.879 | 0.994 | 0.973 |
| OpenAI Model gpt-5-nano | OpenAI / LLM | permissive | 0.839 | 0.990 | 0.962 |
| PsyProxy — Social Economics Lens v0.5 · 1000d | PsyProxy | permissive | 0.805 | 0.988 | 0.945 |
| PsyProxy — Technology Lens v0.5 · 800d | PsyProxy | permissive | 0.804 | 0.989 | 0.942 |
| PsyProxy — Behavioral Sciences Lens v0.5 · 1000d | PsyProxy | permissive | 0.802 | 0.988 | 0.945 |
| PsyProxy — Health Lens v0.9 · 1100d | PsyProxy | permissive | 0.798 | 0.988 | 0.944 |
| OpenAI Model gpt-4.1-nano | OpenAI / LLM | permissive | 0.783 | 0.986 | 0.957 |
| Hierarchical Dirichlet Process (tomotopy HDP) | Topic model | permissive | 0.501 | 0.899 | 0.877 |
| Linguistic Inquiry and Word Count (LIWC) | Lexicon | permissive | 0.495 | 0.927 | 0.840 |
| BERTopic | Topic model | permissive | 0.399 | 0.901 | 0.818 |
| Valence Aware Dictionary and sEntiment Reasoner (VADER) | Lexicon | permissive | 0.384 | 0.885 | 0.793 |
| Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC) | Baseline | permissive | 0.328 | 0.869 | 0.788 |
| Empath | Baseline | permissive | 0.249 | 0.849 | 0.733 |
| TextDescriptives | Baseline | permissive | 0.222 | 0.818 | 0.734 |
| Tool for the Automatic Analysis of Lexical Sophistication (TAALES) | Baseline | permissive | 0.208 | 0.802 | 0.716 |
| Tool for the Automatic Analysis of Cohesion (TAACO) | Baseline | permissive | 0.123 | 0.734 | 0.641 |
| Disneyland TripAdvisor reviews continuous · via best lens | Baseline | permissive | — | — | — |
| Amazon Video-Games reviews continuous · via best lens | Baseline | permissive | — | — | — |
| Amazon Video-Games reviews ordinal · via best lens | Baseline | permissive | — | — | — |
| Sentiment140 tweets binary · via best lens | Baseline | permissive | — | — | — |
| Disneyland TripAdvisor reviews binary · via best lens | Baseline | permissive | — | — | — |
| IMDB movie reviews (ACL) binary · via best lens | Baseline | permissive | — | — | — |
| Douban movie reviews (Chinese) ordinal · via best lens | Baseline | permissive | — | — | — |
| Amazon Video-Games reviews binary · via best lens | Baseline | permissive | — | — | — |
| Druglib drug reviews regression · via best lens | Baseline | permissive | — | — | — |
| Druglib drug reviews ordinal · via best lens | Baseline | permissive | — | — | — |
| LIAR fact-check statements ordinal · via best lens | Baseline | permissive | — | — | — |