Large Language Models Pass the Turing Test

🔗 https://arxiv.org/pdf/2503.23674

Cameron R. Jones and Benjamin K. Bergen from the Department of Cognitive Science at UC San Diego have published a thought-provoking paper titled "Large Language Models Pass the Turing Test." This research demonstrates that LLMs can consistently pass Alan Turing's test of machine intelligence

Alan Turing's test evaluates whether a machine can imitate a human well enough to be indistinguishable in a text-only conversation. Historically controversial, it has gained renewed importance as a benchmark that is flexible and interactive, unlike static AI evaluations.

Four systems were tested: ELIZA (a 1960s chatbot), GPT-4o, LLaMa-3.1-405B, and GPT-4.5. Models were tested with and without a PERSONA prompt, which instructed them to behave like a shy, internet-savvy young person. This prompt was crucial to their performance.

284 participants engaged in 8 five-minute text conversations, each time with one human and one AI. The interrogators judged which participant was human. The same test was run across two populations (undergraduates and Prolific users), ensuring replicability.

Results:

GPT-4.5 with PERSONA: Chosen as human 73% of the time—better than real humans.
LLaMa-3.1 with PERSONA: 56%, sometimes outperforming humans.
Without PERSONA: GPT-4.5 and LLaMa were chosen 36–38% of the time.
Baselines (ELIZA and GPT-4o): Chosen only ~21–23% of the time.

Participants relied most on:

Small talk (e.g. hobbies, feelings),
Gut instinct (“felt more human”),
Linguistic style (e.g. slang, typos).

Future versions of the test could extend beyond 5 minutes or use expert interrogators. Despite cultural advantages, even university students were fooled most of the time. The test now reflects social intelligence and humanlikeness, more than factual knowledge. Interrogators rarely quizzed knowledge—they judged personality, emotion, and conversational flow.

These models could act as indistinguishable substitutes for humans in short interactions. This raises ethical and social concerns—from job automation to misinformation to undermining real human connections. As AI becomes more humanlike, we might need to redefine and refine our own humanity. The challenge isn’t just for machines to fool us—but for humans to stay recognizably human.

Large Language Models Pass the Turing Test

CuriousAI.net

Home AI Glossary AI Publications AI Forum

FOLLOW US

Copyright @2025 CuriousAI.net | All rights reserved | Online Privacy