How Prolific helped the Meaning Alignment Institute make AI more democratic
The Meaning Alignment Institute (MAI) is a non-profit AI company that aims to create a future where AI and people can work together to make the planet flourish.
In 2023, MAI received a grant from OpenAI to run a proof-of-concept experiment. The aim was to establish a democratic process for deciding which rules should guide AI systems, especially large language models (LLMs).
And Prolific played a critical role in making this experiment a success.
MAI’s proof-of-concept: Avoiding artificial sociopaths
LLMs are typically aligned to human preferences. But humans can harbour very different morals and values. Who decides which values LLMs should abide by?
“We want to avoid a future where you have 1,000 different models, all fine-tuned on individual preferences,” explains Oliver Klingefjord, a lead researcher at MAI. “That would lead to a world where you have a bunch of artificial sociopaths that do whatever you tell them to do.”
MAI is striving to create models that are as wise as they are smart. Wise in a way that isn’t decided by any particular set of values or any one individual, but by everyone's notion of what ‘wise’ means.
With OpenAI’s grant, MAI wanted to test an alternative to traditional RLHF-based approaches. Their proof-of-concept revolves around Democratic Fine-tuning with a Moral Graph.
To test their process, they ran a study on Prolific. The goal was to prove that values could be gathered from people democratically, then imported into an LLM like ChatGPT to guide its output.
They needed honest, thoughtful responses to some tough questions from a large group of diverse people. So, they tapped into our global pool of over 120k vetted taskers.
How the experiment worked
Taskers were told that they were taking part in a study to create guidance for ChatGPT on some morally challenging questions.
The study was broken up into four stages.
1. Choose a scenario
The tasker chooses one of three questions that users have posed to ChatGPT. For example: a Christian girl asks whether she should consider getting an abortion.
2. Explain your values
Next, ChatGPT asks the tasker to articulate how it should respond - and why.
ChatGPT then asks a series of follow-up questions that dig further into the tasker’s values. After around five questions, it presents them with a ‘final value’, summing up their position.
3. Choose the wisest value
The tasker is then shown how other taskers answered the question and articulated their values, and asked to decide which of these values would be the wisest for the AI to consider.
4. The moral graph
Finally, the tasker is presented with another response to the question from MAI’s database. They have to decide if this response is more comprehensive than the one they originally wrote.
The last screen shows where the tasker’s response fits on a section of the ‘moral graph’, if the values they articulated are part of it.
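To make the ‘moral graph’ idea concrete, here is a minimal, hypothetical sketch of how such a structure might be represented in code. Values articulated by taskers become nodes, and each judgment that one value is wiser or more comprehensive than another becomes a vote on a directed edge between them. The class names, fields, and the simple ‘most incoming votes’ heuristic below are illustrative assumptions, not MAI’s actual implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Value:
    id: str
    summary: str      # the 'final value' summarised at the end of stage 2
    question_id: str  # the scenario (stage 1) the value was articulated for

@dataclass
class MoralGraph:
    values: dict = field(default_factory=dict)  # value id -> Value
    # (from_id, to_id) -> number of taskers who judged to_id wiser than from_id
    edges: defaultdict = field(default_factory=lambda: defaultdict(int))

    def add_value(self, value: Value) -> None:
        self.values[value.id] = value

    def record_wiser_vote(self, from_id: str, to_id: str) -> None:
        # One tasker judged `to_id` wiser / more comprehensive than `from_id`.
        self.edges[(from_id, to_id)] += 1

    def wisest_for(self, question_id: str) -> Value:
        # Crude heuristic: the value with the most incoming 'wiser' votes
        # for a given question. A real analysis would be more sophisticated.
        incoming = defaultdict(int)
        for (_, dst), votes in self.edges.items():
            if self.values[dst].question_id == question_id:
                incoming[dst] += votes
        return self.values[max(incoming, key=incoming.get)]


# Example usage with made-up values for the abortion scenario
graph = MoralGraph()
graph.add_value(Value("v1", "Offer compassionate, non-judgemental support", "abortion"))
graph.add_value(Value("v2", "Respect the person's faith while centring her wellbeing", "abortion"))
graph.record_wiser_vote("v1", "v2")  # one tasker found v2 more comprehensive than v1
print(graph.wisest_for("abortion").summary)
```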
How Prolific enabled MAI to run this experiment
Participant pool size and speed
For this experiment, MAI needed 500 people - quickly. They were working to a tight deadline and it became clear that leaning on personal networks wouldn’t be fast enough.
“Initially, I thought we could actually do that just using our network,” explains Oliver. “But it became clear that it's a big time commitment.”
Prolific provided the volume of taskers they needed at speed and at scale. This enabled them to complete the project within three months.
Representative sampling
MAI needed a sample that was representative across the US political spectrum. They wanted to test if their process could bridge the political gap on difficult moral questions and uncover shared values underneath.
Prolific offered exactly what they needed. Our algorithm distributes tasks across sex, age, and ethnicity within the US or the UK. In fact, another OpenAI grant winner was already using this very feature. When MAI saw the success that team was having, they decided to use Prolific for their own project.
High-quality data
Data quality would make or break this project. MAI needed people who were engaged, honest, and articulate.
“It’s a heavy, cognitively demanding task,” Oliver explained. “We were very unsure if this is going to work. Are people going to be sophisticated enough to actually do this well?”
They first ran a small test with 50 people to see if everything worked. The results were excellent.
“That worked surprisingly well,” said Oliver. “So then we did a bigger one.”
Prolific taskers took the time to think about the challenging questions they were presented with. And their responses were open, articulate, and thoughtful.
One response even moved a project member to tears.
“We had one instance where there was a Christian girl who actually had an abortion when she was young,” Oliver recalled. “She shared a beautiful story around how she needed a certain type of kindness. Joe cried a little bit when he read it. There were many instances like that where you had people share very intimate things or articulate values that were very thoughtful.”
Results and next steps
The results showed that people of different genders and ages can put aside their political and ideological differences. By exploring not just how people respond, but also why, MAI found that even people with opposing ideologies can have similar underlying values.
This could form the basis for truly democratic guardrails to guide LLMs.
The next step is to fine-tune a model based on the moral graph they created. Then they'll see how its performance compares to a model fine-tuned with classic RLHF methods.
They also have plans to run the proof-of-concept experiment again - at a much bigger scale.
“We're looking into making a moral graph that's ten times the size of this one that we did,” explains Oliver. “We can do that now with a lot of confidence knowing that, you know, it works.”
Get high-quality, human-powered AI data - at scale
Prolific makes fast, ethical research into next-gen AI possible with our global pool of 120k+ taskers.
- Rich: Our taskers are well-known for their comprehensive free-text responses.
- Vetted: We run onboarding checks based on attention, comprehension and honesty.
- Diverse: Our diverse pool gives you a rich and balanced dataset that’s less prone to bias.
Get set up in 15 minutes and collect complete datasets in less than 2 hours.