Mental Health & AI Field Guide
Chapter 2 · 3 sections · 8 min

Background

Section 1:
Overview

There are a variety of task-sharing approaches utilized in different contexts. We include general background information in this chapter to capture some of the ways task sharing is implemented; however, we recognize that the use cases outlined do not reflect the reality of all countries or contexts. We hope that these materials can help identify opportunities for additional testing and validation of AI in other types of task-sharing programs across diverse contexts.

An ever-growing shortage of mental health providers exists globally, despite their services being more necessary than ever. More than half of all people will experience a mental health condition in their lifetimes, but there are only 13 mental health providers for every 100,000 people worldwide. Access is even more limited for people in low- and middle-income countries (LMICs), for people in conflict settings, for people with marginalized identities, and for those in rural and low-income communities.

To address this gap, a spectrum of mental health skill-building programs has been developed to train individuals to provide support or care to those in need. These efforts include training volunteers and laypeople to provide care and support and upskilling licensed professionals. For example, there are programs in which individuals with lived experience are trained to provide peer support (that is, peer support programs), programs in which licensed counselors provide support to individuals in crisis (that is, crisis counseling), and programs in which laypeople, nonspecialist health professionals, or community workers are trained to provide evidence-based care (that is, task-sharing programs).

Since task-sharing programs focus on addressing workforce shortages by expanding access to evidence-based care, the following chapters and sections in this field guide focus on these programs and how AI could scale their efforts. At the same time, we acknowledge that many of the AI use cases and topics explored in this field guide may also apply across the broader spectrum of mental health skill-building programs (such as programs that upskill mental health professionals).

Section 2:
Task sharing

Task sharing is an evidence-based solution that increases access to care. In this approach, specialist healthcare professionals delegate specific tasks to trained nonspecialist providers (such as community health workers, auxiliary health staff, and community members) to deliver direct mental health services to the public. This strategy has also been successfully used to deliver care related to maternal health, HIV, and non-communicable diseases.

There is also strong evidence that task sharing is effective in the mental health domain. Lay providers have been trained in multiple aspects of mental healthcare—from assessment, triage, and engagement to treatment. Empirically supported treatments such as cognitive behavioral therapy (CBT) and interpersonal psychotherapy, for example, have been successfully adapted for delivery by nonspecialist providers by nonprofit organizations that operate across different contexts. Task sharing in the mental health space has been shown to improve clinical outcomes, reduce costs, and extend the reach of a limited mental health clinical workforce. Research finds that mental health task sharing is cost-effective and can both increase the number of people treated for mental health concerns and reduce their mental health symptoms. There is also evidence that task sharing for mental health service delivery is effective in low-resource settings such as LMICs.

Several mental health task-sharing programs have implemented evidence-based methods across different contexts. However, getting lay providers fully trained and confident, maintaining quality (that is, ensuring providers adhere to the program’s protocols and evidence-based practices when delivering services), and ensuring sustainability (that is, achieving meaningful scale through long-term financial support) require structured processes and many different kinds of support.

Section 3:
The opportunity for AI

AI offers novel opportunities to address implementers’ challenges while maintaining quality and sustainability and enhancing efficiency across the implementation cycle of task-sharing programs. By streamlining training, supervision, and decision-making, AI could help task-sharing programs scale up evidence-based interventions while maintaining high standards of model fidelity. Other mental health skill-building programs not focused on task sharing might also find these AI solutions applicable to their contexts, enabling them to expand their reach and impact.

What is AI?

AI refers to the capability of computer systems to perform complex tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, decision-making, and creating. We recognize that this is a field that is seeing multiple innovations at a fast pace, and that definitions and guidance related to it may rapidly evolve and change.

Gen AI is a subset of AI that focuses on generating new content, including text, images, music, and other media. It uses machine learning, deep learning, and natural language processing to generate outputs that resemble human-created content. Large language models (LLMs) exemplify how gen AI can produce humanlike content by predicting and assembling words in context. For simplicity, the term “AI” is used throughout this field guide and refers to these tools broadly (Exhibit 1).

Exhibit 1: Subsets of Artificial Intelligence


How does an LLM work?

At their core, LLMs are models trained to predict the next word in language sequences or “fill in the blanks.”
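This "predict the next word" idea can be illustrated with a deliberately tiny sketch. The toy model below simply counts which word follows each word in a small made-up corpus and predicts the most frequent follower; a real LLM learns billions of parameters and far richer patterns, but the core objective is the same.

```python
from collections import Counter, defaultdict

# Toy illustration of next-word prediction (not a real LLM): count which
# word follows each word in a tiny corpus, then predict the most frequent one.
corpus = (
    "the red fire engine drove quickly down the street "
    "the red car drove slowly down the road"
).split()

followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return followers[word].most_common(1)[0][0]

predict_next("down")  # "the" follows "down" in both sentences
```

An actual LLM replaces these raw counts with learned probabilities over an enormous vocabulary, conditioned on the full preceding context rather than a single word.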


Transformers allow LLMs to weigh the importance of different parts of a text and capture context and meaning. For example, as shown in the exhibit below, the transformer model analyzes the sentence “The red fire engine drove quickly down the” (Exhibit 2). It uses attention mechanisms to connect words that are relevant to each other. This process helps the model understand the relationships between words and generate a more coherent and meaningful output.

Exhibit 2: Large Language Models in practice
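The attention weighting described above can be sketched in a few lines. In this illustrative example (the two-dimensional "word vectors" are made up for demonstration), each word is scored against every other word, the scores are turned into weights with a softmax, and the output is a weighted blend of the word vectors, so related words influence each other most.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    # Score the query against every key, scaled by vector size.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Blend the value vectors according to the attention weights.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Hypothetical 2-dimensional vectors for three words (illustration only).
vectors = {"fire": [1.0, 0.0], "engine": [0.9, 0.1], "quickly": [0.0, 1.0]}
keys = values = list(vectors.values())
blended = attention(vectors["engine"], keys, values)
```

Because "engine" is most similar to "fire" and to itself, those words receive the largest weights, and the blended output leans toward them rather than toward "quickly"; this is how attention connects related words in a sentence.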


For an LLM to predict the next word in a sequence, it must be trained on vast amounts of data, specifically language-based data. This makes models particularly good at language-based tasks and therefore suitable to support mental health services, which are largely delivered verbally. More recently, AI models have been trained on multimodal data (such as images and audio) and to predict multimodal outputs, expanding their potential mental health use cases. For example, they could eventually be used to design images for imaginal exposures or provide audio-based guidance to clinicians through gentle suggestions.

Using LLMs in mental health settings

The data that LLMs are trained on can come from general sources (such as the internet) or from more tailored sources (such as psychology textbooks or therapy session transcripts). When models are trained on data from general sources, the data set is likely to include some information about mental health, but that information may not be evidence-based, may be biased, and may perpetuate mental health stigma, which affects the output that the model generates. As such, it is recommended that models be trained specifically for the contexts they will operate in, especially if they are being used in high-risk contexts such as healthcare and mental health.

There are multiple ways to train models to perform mental health tasks. One approach is to train the model using tailored data that is highly curated, evidence-based, representative, and specific to the mental health task the model will perform. Another approach is to take a model that was trained using data from general sources and fine-tune that model for a specific mental health use case using additional, tailored mental health data. Additional approaches—which can be used for models trained on either tailored or general data—include giving the model “if-then” instructions for the types of outputs it should generate when presented with different mental health scenarios (in other words, prompt engineering) or having human mental health experts rate model output for quality and safety. The ratings would effectively teach the model how best to generate outputs (in other words, reinforcement learning with human feedback).
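The prompt-engineering approach can be made concrete with a short sketch. The rules and wording below are hypothetical examples for illustration only, not clinical guidance; the fixed "if-then" instructions are combined with each user message before the prompt is sent to a model.

```python
# Illustrative prompt-engineering sketch: fixed "if-then" instructions tell a
# model how to respond to different mental health scenarios. The rules here
# are hypothetical examples, not clinical guidance.
SYSTEM_INSTRUCTIONS = """\
You are a training assistant for lay mental health providers.
- If the user describes a crisis or risk of harm, respond only with an
  instruction to escalate to a licensed supervisor.
- If the user asks a skills question, answer using the program's protocol.
- Otherwise, ask a clarifying question before responding.
"""

def build_prompt(user_message: str) -> str:
    """Combine the fixed instructions with the user's message into one prompt."""
    return f"{SYSTEM_INSTRUCTIONS}\nUser: {user_message}\nAssistant:"

prompt = build_prompt("How do I open a session using the CBT protocol?")
```

Because the instructions travel with every request, the same underlying model can behave very differently across programs simply by changing this text, which is why prompt engineering is often the lowest-cost way to adapt a general-purpose model.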

Models must be thoroughly tested before they are used with real people. This testing should involve a thorough evaluation to ensure that models are safe and unbiased prior to an initial launch. Additionally, safety and bias should be monitored on an ongoing basis.

When thoughtfully developed and responsibly deployed, AI could become a powerful ally in scaling mental health programs. In the next chapter, we explore how AI could enhance task-sharing programs, highlighting illustrative AI use cases and real-world examples that also cover peer support and crisis counseling programs.
