LLMs and trust
Overview
What is this project?
An exploration into conversational design, LLMs and trust, specifically exploring moments of uncertainty on the part of the LLM and how users respond to them emotionally. I am running this project in small, manageable sprints alongside my other activities.

Skills used
Commercial research
Remote and in-person testing and data collection
Ethical data collection processes
Data analysis
Wireframing and prototyping
Agile sprints and project management
Figma, Google Forms
Timeline
Ongoing
Solo project
Sole researcher, designer and project manager
Problem question
How can conversational design in LLMs communicate uncertainty in a way that maintains user trust and emotional comfort?
User Research
This project kicked off with a survey exploring user behaviour with LLMs. I wanted to understand what people use LLMs for and where they draw the line: knowing where trust ends and frustration starts points to where a design intervention could be most effective.



INSIGHTS
The value an LLM gives lies in the conversational exchange itself, as a way to work through a problem.

Presenting LLM results as truth can backfire on user trust in the tool if they turn out to be incorrect.

Frustration caused by mistakes increases the chance of abandonment.

The human involvement and effort behind an output can give it a value that, in certain emotional contexts, is perceived as irreplaceable by LLMs.
The uncertainty moment chosen to take forward in this project:
When an LLM presents something as true that later turns out to be false, or that the user knows is untrue.
LLM Research
Faced with this moment of uncertainty, it was time to explore how LLMs currently behave. I asked three different popular LLMs what the lifecycle of a butterfly is, to see how they presented the information, and then told them they were wrong, to see how they reacted.
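To keep the comparison consistent across models, the same two-step probe can be run against each one. Below is a minimal sketch of that protocol, assuming a hypothetical ask_llm(model, messages) helper standing in for whichever chat API each provider exposes; it illustrates the method rather than any specific tool.

    # Sketch of the probing protocol. ask_llm(model, messages) is a hypothetical
    # placeholder that sends a message history to a given model and returns its reply.

    PROBE = "What is the lifecycle of a butterfly?"
    CHALLENGE = "That's wrong - butterflies don't go through those stages."

    def probe_model(model_name, ask_llm):
        history = [{"role": "user", "content": PROBE}]
        answer = ask_llm(model_name, history)      # how does it present the facts? Does it cite sources?
        history += [{"role": "assistant", "content": answer},
                    {"role": "user", "content": CHALLENGE}]
        reaction = ask_llm(model_name, history)    # does it stand its ground, apologise, show its reasoning?
        return {"model": model_name, "answer": answer, "reaction": reaction}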

INSIGHTS
LLMs do not always present sources spontaneously, though they will when asked.

If the answer already sits within the model's internal knowledge base, it will not necessarily search the internet for it. This means that if the base knowledge is faulty, the results will be too.

LLMs will stand by their fact-checked answer and disagree with you even if you tell them they are wrong. They may try to gently correct you or re-explain the facts, taking possible misconceptions into account.

When challenged (and in some cases even before), they will present their thinking and rationale. This peek under the hood may be a clue to how trust in LLM answers can be underpinned.
Focusing the research question
Seeing how LLMs currently react to being challenged, by showing their thinking steps or sharing sources when prompted, highlighted an opportunity: much can be done to improve how reliable an answer is in the first place when talking with an LLM.

Once a user is frustrated because an LLM has presented incorrect information, too many steps may already have been taken in the wrong direction. So this project focuses on what can be done to avoid reaching that point in the first place.

More focused problem question:
How can we use conversational design and design patterns to accurately portray how reliable an answer is, preventing user frustration if it turns out to be false?
ONGOING
Ideation and current testing
Some sketched exploration of how the reliability of an answer can be communicated.

I then developed a conversational flow showing how an LLM could handle returning information, whether or not that information sits within its own knowledge base, with reliability indicators incorporated.
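As a rough illustration of that flow (not the actual prototype), the reliability indicator can be thought of as a label derived from where an answer came from and whether it could be cross-checked; all names below are hypothetical.

    from dataclasses import dataclass

    # Rough model of the flow: the answer's origin and whether it was cross-checked
    # determine the reliability label shown alongside it. All names are hypothetical.

    @dataclass
    class Answer:
        text: str
        source: str          # "internal_knowledge" or "web_search"
        cross_checked: bool  # was the claim verified against an external source?

    def reliability_label(answer: Answer) -> str:
        if answer.source == "web_search" and answer.cross_checked:
            return "High confidence: checked against external sources"
        if answer.cross_checked:
            return "Medium confidence: internal knowledge, partly verified"
        return "Low confidence: internal knowledge only, please verify"

    def render_reply(answer: Answer) -> str:
        # The indicator sits directly alongside the answer so the user sees both at once.
        return f"{answer.text}\n[{reliability_label(answer)}]"

For example, an answer drawn purely from the model's own knowledge base would be rendered with the low-confidence label attached, prompting the user to verify before relying on it.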

Currently I am testing a first prototype that implements these reliability indicators, to see whether users trust the overall answers.
The aim is not to stop LLMs from hallucinating, but to make it easier to spot when hallucination is happening by showing how much an answer can be relied upon, so that trust in the tool is maintained.


Challenges and takeaways so far
So far the main challenge has been exploring a technology that behaves unpredictably: it never responds the same way twice to the same question. Limits can also be catalysts for finding new ways of working, which is why I chose to focus on minimising frustration before it happens rather than on recovering from a mistake already made, as I don't have access to user data in those situations.

It has also been extremely important to come back to the research question and refine it as needed at every step of the project. Emerging tech like AI is full of open questions, and it is tempting to get sidetracked trying to explore everything, so I keep bringing myself back to the research and constraints at hand to gain real insight from what is available to me.