OpenAI’s GPT-4.5 Passes Turing Test With 73% Success Rate

OpenAI’s GPT-4.5 has achieved a milestone once considered decades away: convincing a majority of participants in a Turing Test-style evaluation that it was human.

In a recent study from the University of California, San Diego, which assessed whether large language models can pass the classical three-party Turing Test, GPT-4.5 was judged to be human in 73% of text-based conversations.

The study showed the latest large language model outperforming the other systems tested, including GPT-4o, LLaMa-3.1-405B, and ELIZA.

GPT-4.5, launched by OpenAI in February, was able to detect subtle language cues, making it appear more human, according to Cameron Jones, a postdoctoral researcher at UC San Diego.

“If you ask them what it’s like to be human, the models tend to answer well and can convincingly pretend to have emotional and sexual experiences,” Jones told Decrypt. “But they struggle with things like real-time information or current events.”

The Turing Test, proposed by British mathematician Alan Turing in 1950, evaluates whether a machine can mimic human conversation convincingly enough to fool a human judge. If the judge can’t reliably distinguish the machine from the human, the machine is considered to have passed.

To evaluate the AI models’ performance, researchers tested two prompt types: a baseline prompt with minimal instruction and a more detailed prompt that directed the model to adopt the voice of an introverted, internet-savvy young person who uses slang.

“We selected these witnesses on the basis of an exploratory study where we evaluated five different prompts and seven different LLMs and found that LLaMa-3.1-405B, GPT-4.5, and this persona prompt performed best,” researchers in the study said.

The study also addressed the broader social and economic implications of large language models passing the Turing Test, including potential misuse.

“Some risks include misinformation, like astroturfing, where bots pretend to be people to inflate interest in a cause,” Jones said. “Others involve fraud or social engineering—if a model emails someone over time and seems real, it might persuade them to share sensitive information or access bank accounts.”

On Monday, OpenAI announced the launch of the next iteration of its flagship GPT model, GPT-4.1. The new model is even more advanced and can process extensive documents, codebases, or even novels. OpenAI said it would sunset GPT-4.5 and replace it with GPT-4.1 this summer.

While Turing never witnessed today’s AI landscape, Jones noted that the test he proposed in 1950 remains relevant.

“The Turing Test is still relevant in the way Turing intended,” he said. “In his paper, he talks about learning machines and suggests the way to build something that passes the Turing Test is by creating a computational child that learns from lots of data. That’s essentially how modern machine learning models work.”

When asked about criticism of the study, Jones acknowledged its value while clarifying what the Turing Test does and does not measure.

“The main thing I’d say is the Turing Test isn’t a perfect test of intelligence—or even of human-likeness,” he said. “But it is valuable for what it measures: whether a machine can convince a person it’s human. That’s worth measuring and has real implications.”

Edited by Sebastian Sinclair

Jason Nelson

https://decrypt.co/314780/openais-gpt-4-5-passes-turing-test

2025-04-15 01:47:16
