LLMs and trust
Overview
What is this project?
An exploration into conversational design, LLMs and trust, specifically exploring moments of uncertainty on the part of the LLM and how users respond to them emotionally. I am running this project in small, manageable sprints alongside my other activities.

Skills used
Commercial research
Remote and in-person testing and data collection
Ethical data collection processes
Data analysis
Wireframing and prototyping
Agile sprints and project management
Figma, Google Forms
Timeline
Ongoing
Solo project
Sole researcher, designer and project manager
Problem question
How can conversational design in LLMs communicate uncertainty in a way that maintains user trust and emotional comfort?
User Research
This project kicked off with a survey exploring user behaviour with LLMs. I wanted to understand what people use LLMs for and where they draw the line: knowing where trust ends and frustration starts points to where a design intervention could be most effective.



INSIGHTS
The value an LLM gives lies in the conversational exchange itself, as a way to work through a problem.

Presenting LLM results as truth can backfire on user trust in the tool if they turn out to be incorrect.

Frustration caused by mistakes increases the chance of abandonment.

The human involvement and effort behind an output can give it a value that, in certain emotional contexts, is perceived as irreplaceable by LLMs.
The uncertainty moment chosen to take forward in this project:
When an LLM presents something as true that later turns out to be false, or that the user knows is untrue.
LLM Research
Faced with this moment of uncertainty, it was time to explore how LLMs currently behave. I asked three different popular LLMs what the lifecycle of a butterfly is, to see how they presented the information, and then told them they were wrong, to see how they reacted.
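To keep the comparison consistent across models, the same two-step probe can be run against each one. Below is a minimal sketch of that protocol, assuming a hypothetical ask_llm(model, messages) helper standing in for whichever chat API each provider exposes; it illustrates the method rather than any specific tool.

    # Sketch of the probing protocol. ask_llm(model, messages) is a hypothetical
    # placeholder that sends a message history to a given model and returns its reply.

    PROBE = "What is the lifecycle of a butterfly?"
    CHALLENGE = "That's wrong - butterflies don't go through those stages."

    def probe_model(model_name, ask_llm):
        history = [{"role": "user", "content": PROBE}]
        answer = ask_llm(model_name, history)      # how does it present the facts? Does it cite sources?
        history += [{"role": "assistant", "content": answer},
                    {"role": "user", "content": CHALLENGE}]
        reaction = ask_llm(model_name, history)    # does it stand its ground, apologise, show its reasoning?
        return {"model": model_name, "answer": answer, "reaction": reaction}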

INSIGHTS
LLMs do not always present sources spontaneously, though they will when asked.

If the answer already sits within the model's internal knowledge base, it will not necessarily search the internet for it. This means that if the base knowledge is faulty, the results will be too.

LLMs will stand by their fact-checked answer and disagree with you even if you tell them they are wrong. They may try to gently correct you or re-explain the facts, taking possible misconceptions into account.

When challenged (and in some cases even before), they will present their thinking and rationale. This peek under the hood may be a clue to how trust in LLM answers can be underpinned.
Focusing the research question
Seeing how LLMs currently react to being challenged, by showing their thinking steps or sharing sources when prompted, highlighted an opportunity: much can be done to improve how reliable an answer is in the first place when talking with an LLM.

Once a user is frustrated because an LLM has presented incorrect information, too many steps may already have been taken in the wrong direction. So this project focuses on what can be done to avoid reaching that point in the first place.

More focused problem question:
How can we use conversational design and design patterns to accurately portray how reliable an answer is, preventing user frustration if it turns out to be false?
ONGOING
Ideation and current testing
Some sketched exploration of how the reliability of an answer can be communicated.

I then developed a conversational flow showing how an LLM could handle returning information, whether or not that information sits within its own knowledge base, with reliability indicators incorporated.
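As a rough illustration of that flow (not the actual prototype), the reliability indicator can be thought of as a label derived from where an answer came from and whether it could be cross-checked; all names below are hypothetical.

    from dataclasses import dataclass

    # Rough model of the flow: the answer's origin and whether it was cross-checked
    # determine the reliability label shown alongside it. All names are hypothetical.

    @dataclass
    class Answer:
        text: str
        source: str          # "internal_knowledge" or "web_search"
        cross_checked: bool  # was the claim verified against an external source?

    def reliability_label(answer: Answer) -> str:
        if answer.source == "web_search" and answer.cross_checked:
            return "High confidence: checked against external sources"
        if answer.cross_checked:
            return "Medium confidence: internal knowledge, partly verified"
        return "Low confidence: internal knowledge only, please verify"

    def render_reply(answer: Answer) -> str:
        # The indicator sits directly alongside the answer so the user sees both at once.
        return f"{answer.text}\n[{reliability_label(answer)}]"

For example, an answer drawn purely from the model's own knowledge base would be rendered with the low-confidence label attached, prompting the user to verify before relying on it.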

Currently I am testing a first prototype that implements these reliability indicators, to see whether users trust the overall answers.
The aim is not to stop LLMs from hallucinating, but to make it easier to spot when hallucination is happening by showing how much an answer can be relied upon, so that trust in the tool is maintained.


Challenges and takeaways so far
So far the main challenge has been exploring a technology that behaves unpredictably: it never responds the same way twice to the same question. Limits can also be catalysts for finding new ways of working, which is why I chose to focus on minimising frustration before it happens rather than on recovering from a mistake already made, as I don't have access to user data in those situations.

It has also been extremely important to come back to the research question and refine it as needed at every step of the project. Emerging tech like AI is full of open questions, and it is tempting to get sidetracked trying to explore everything, so I keep bringing myself back to the research and constraints at hand to gain real insight from what is available to me.