Symposia
Technology/Digital Health
Christina S. Soma, Ph.D. (she/her/hers)
Postdoctoral Fellow
Lyssn.io
Fort Collins, CO, United States
Zac Imel, Ph.D.
Chief Science Officer
Lyssn.io
Salt Lake City, UT, United States
Brian Pace, Ph.D. (he/him/his)
Director of Clinical AI
Lyssn.io
Seattle, WA, United States
Elizabeth Burr, BA
Operations Manager
protoCall Services, Inc.
Portland, OR, United States
Brad Pendergraft, LCSW
Chief Clinical Officer
protoCall Services, Inc.
Portland, OR, United States
Michael Tanana, Ph.D. (he/him/his)
Chief Technology Officer
Lyssn.io
Seattle, WA, United States
David Atkins, Ph.D. (he/him/his)
CEO
Lyssn.io
Seattle, WA, United States
Suicide risk assessment is a critical component of crisis counseling. National standards require continuous quality improvement of 988 and crisis counseling services, and accurate, thorough risk assessment in community treatment is integral to client well-being. Traditional quality improvement efforts rely on human evaluation of sessions, a process that is difficult to scale, and training providers to conduct thorough risk assessments is severely limited by a dearth of resources. Advances in machine learning (ML) and artificial intelligence (AI) offer the potential to automate the detection of suicide risk assessment, thereby enhancing the scalability and efficiency of training and quality improvement initiatives.
In collaboration with protoCall Services, we conducted a two-phase research study to develop and evaluate ML algorithms that automatically identify risk assessments occurring during crisis interactions (phase one) and to explore the impact of this feedback on caller outcomes (phase two). Alongside model development, we collaborated with content experts in suicidology (Drs. David Jobes and Samantha Chalker) to create a suicide prevention training that includes automatically generated feedback on key risk assessment skills. To create and train the risk assessment classification models, a coding team manually labeled 193,257 statements across 476 crisis counseling calls, identifying core elements of risk assessment based on the Crisis Chat Abstraction Form (Lake et al., 2022). This labeled dataset was used to fine-tune a transformer-based ML model, with separate training, validation, and test sets employed to assess model performance.
For detecting any risk assessment, the model's agreement with human ratings reached 98% of human interrater agreement. At the call level, the average F1 score (the harmonic mean of precision and recall) was 0.86; at the statement level, it was 0.66. Variation in F1 scores across specific labels was often attributable to low base rates for certain risk assessment components. Key components of risk assessment and empathic communication were selected as modules for training.
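The study's exact modeling pipeline is not reported here, but a minimal sketch of the kind of setup the abstract outlines (fine-tuning a transformer-based classifier on labeled statements, with held-out splits and F1 scoring) might look like the following Python. The base model (distilbert-base-uncased), the binary label scheme, and the toy data are illustrative assumptions, not the study's actual configuration or dataset.

```python
# Minimal sketch only: base model, label scheme, and toy data are stand-ins,
# not the configuration or dataset of the study described above.
from datasets import Dataset
from sklearn.metrics import precision_recall_fscore_support
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "distilbert-base-uncased"  # assumed; any encoder classifier works

# Toy stand-in for the labeled statements (1 = risk assessment, 0 = other).
train_ds = Dataset.from_dict({
    "text": ["Have you had any thoughts of ending your life?",
             "Tell me a little about how your week has been."],
    "label": [1, 0],
})
val_ds = Dataset.from_dict({
    "text": ["Do you have access to a firearm or other means?"],
    "label": [1],
})

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME,
                                                           num_labels=2)

def tokenize(batch):
    # Tokenize each statement, truncating to the model's maximum length.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

train_ds = train_ds.map(tokenize, batched=True)
val_ds = val_ds.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)
    # F1 is the harmonic mean of precision and recall: 2PR / (P + R).
    p, r, f1, _ = precision_recall_fscore_support(labels, preds,
                                                  average="binary",
                                                  zero_division=0)
    return {"precision": p, "recall": r, "f1": f1}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="risk_clf", num_train_epochs=1),
    train_dataset=train_ds,
    eval_dataset=val_ds,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # reports precision, recall, and F1 on the val split
```

In the study, statement-level predictions like these would additionally be aggregated to the call level, which is why the abstract reports F1 at both granularities.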
The findings indicate that ML models can reliably detect suicide risk assessment, offering a viable means of scaling quality improvement and training efforts in this domain.