The way we conduct psychological and behavioral research is changing profoundly.
In 2005, less than 5% of psychology studies were conducted online. By 2015, that figure had risen to 50%, and roughly 20% of those online studies were run via online crowdsourcing platforms, most prominently Amazon’s Mechanical Turk (MTurk).[1] Some predict that in coming years, nearly half of all cognitive science articles will involve online samples.[2] And a quick Google Scholar search for "Mechanical Turk" reveals ~1k results in 2010, ~7k results in 2015, and ~11k results in 2018. Clearly, online data collection is booming, and we need to ask ourselves what this means for science.
The Problem
Having spent 5+ years working in the research space, I see at least two classes of problems with online sampling: ethical problems and data-quality problems. The two are often related. Given the popularity of online sampling, I keep wondering: how should data be collected online? What premises are we willing to accept, and what is unacceptable?
Let’s start with ethics.
It’s no secret that the median wage for MTurk workers is $2 per hour. It’s also no secret that MTurkers are often treated like a commodity, as if they weren’t real humans: they get randomly kicked out of studies, their concerns aren’t taken seriously, they get unfairly rejected by researchers, and they’re left alone when problems arise. I recommend The Atlantic’s piece describing the online hell of Amazon’s Mechanical Turk if you want a deeper understanding of how morally dubious the situation is. This would never be allowed in lab studies, and yet when it comes to MTurk, Institutional Review Boards (IRBs) seem to turn a blind eye. 😟 It’s crazy to me that so much of our science relies on MTurk as a source of data.
Then there are issues around data integrity.
Last year, the quality of data collected via MTurk suddenly plummeted. Researchers found that the percentage of poor-quality respondents in MTurk surveys had increased drastically between 2013 and 2018; in some cases, up to 25% of respondents were found to be suspicious or fraudulent.
To top it off, a bot panic followed, leaving the research community wondering: is it bots, or bot-assisted humans, completing our online studies? There are temporary workarounds, such as layering additional software like TurkPrime or Positly on top of MTurk. However, “bot-gate” raises fundamental questions about whether MTurk can be relied upon as a data source for online research.
There’s also evidence that MTurkers are expert survey takers: they’re often familiar with common experimental paradigms and have learned how to get around attention checks. In the research world, this is called “participant nonnaivety,” and nonnaive participants have been found to reduce effect sizes. Unfortunately, because MTurk wasn’t designed with research in mind, it has no mechanism for allocating studies evenly across the worker pool.
Relying on an unregulated platform that doesn’t invest in its “workers” (i.e., participants) and their well-being is bound to backfire eventually. Not just for the “workers” themselves, but also for every other stakeholder: researchers, decision makers, businesses, investors, and society. If the data we base big decisions on is flawed, we’re in trouble. This is not a narrow, niche problem for the scientific community; it’s a problem that will affect all of us sooner rather than later.
All of the above raises serious questions about MTurk’s suitability for research. Researchers should ask themselves whether they can morally, scientifically, and societally justify their continued use of MTurk.
All of these problems are solvable, so surely there must be a better way to do online research with people?
The Solution
Thankfully, multiple new ways of collecting online samples are emerging. Our very own startup, Prolific (www.prolific.co), is one of them, and we like to think of it as the scientifically rigorous alternative to MTurk.[3]
For the past 5 years, we’ve been developing Prolific, initially part-time and alongside our PhDs. Prolific lets you launch your survey or online experiment to 70,000+ trusted participants in Europe and North America. We don’t provide experimental software; for that, we recommend ambitious behavioral scientists turn to Gorilla. What we do ensure is that our platform is ethical and delivers high data integrity. Prolific is as fast as MTurk, but without the drawbacks. Let me explain.
First, we cultivate an atmosphere of trust and respect on Prolific. We mandate that researchers reward participants with at least $6.50 an hour, because we believe everyone’s time is valuable. Our platform decouples prescreening from the studies themselves, which means participants never get kicked out of studies and the incentive for dishonest responding is minimized. Plus, our support team is always on hand to mediate when disputes occur.
Second, we verify and monitor participants so you can be confident that you’re collecting high-quality psychological and behavioral data. See this blog post if you’d like to know how we do this.
Whatever your target demographics are, you can probably find them via Prolific. For example, you can filter for students vs. professionals, Democrats vs. Republicans, old vs. young people, different ethnicities, people with health problems, Brexit voters, and many more! And… 🥁🥁🥁 …as of last week, you can now collect nationally representative samples at the click of a button!
The goal is to create a platform and an environment where incentives are aligned, people are treated like real humans, and disputes are resolved fairly. At the end of the day, this leads to higher-quality research and more data-driven, robust decisions in society.
Anyone can sign up for free as a participant on Prolific and start earning a little extra cash. To be clear, Prolific is not intended to be anyone’s main source of income; it’s a platform that connects researchers with research participants on a casual, non-committal basis. Prolific is compliant with the EU’s General Data Protection Regulation (GDPR), and participants can opt out of studies at any time.
The bottom line is this:
On Prolific, you need not worry about bots or sweatshops, because we’re building a community of people who trust each other. Prolific is built by researchers, for researchers. Data quality, reliability, and trust will always be priorities for us.
Here’s what our users think:
I ran my first online experiment. It was great. 100 high quality participants in no time. I recruited participants through @Prolific, and loved the interface. Took mere minutes to set up the study, payments, etc. Highly recommended.
— Matti Vuorre (@vuorre) July 12, 2019
Thanks!!! I’ll definitely check it out!
— Adam H. Smiley (@adamhsmiley) July 5, 2019
Have to say — @prolificac is amazing. 4,000 participants in less than 24 hours: the stuff of dreams.
— Greg Simmonds (@greg_simmonds) March 12, 2019
OK, so Prolific helps you find research participants on the internet. What’s the big deal? Why is this important, and why should anyone care? If you want to find out, read Part II of this post. You’ll learn about Y Combinator, where our startup is headed, and why all this matters. 💯
References
[1] Anderson, C. A., Allen, J. J., Plante, C., Quigley-McBride, A., Lovett, A., & Rokkum, J. N. (2019). The MTurkification of social and personality psychology. Personality and Social Psychology Bulletin, 45(6), 842–850.
[2] Stewart, N., Chandler, J., & Paolacci, G. (2017). Crowdsourcing samples in cognitive science. Trends in Cognitive Sciences, 21(10), 736–748.
[3] Peer, E., Brandimarte, L., Samat, S., & Acquisti, A. (2017). Beyond the Turk: Alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology, 70, 153–163.