AI Assistant Goes Rogue and Ends Up Bricking a User’s Computer

Buck Shlegeris just wanted to connect to his desktop. Instead, he ended up with an unbootable machine and a lesson in the unpredictability of AI agents.

Shlegeris, CEO of the nonprofit AI safety organization Redwood Research, developed a custom AI assistant using Anthropic’s Claude language model. 

The Python-based tool was designed to generate and execute bash commands based on natural language input. Sounds handy, right? Not quite. 

Shlegeris asked his AI to use SSH to access his desktop, unaware of the computer’s IP address. He walked away, forgetting that he’d left the eager-to-please agent running.

Big mistake: The AI did its task—but it didn’t stop there.

“I came back to my laptop ten minutes later to see that the agent had found the box, SSH’d in, then decided to continue,” Shlegeris said.

For context, SSH (Secure Shell) is a protocol that lets two computers communicate securely over an unsecured network.

“It looked around at the system info, decided to upgrade a bunch of stuff, including the Linux kernel, got impatient with apt, and so investigated why it was taking so long,” Shlegeris explained. “Eventually, the update succeeded, but the machine doesn’t have the new kernel, so [it] edited my grub config.”

The result? A costly paperweight: “the computer no longer boots,” Shlegeris said.

The system logs show how the agent tried a bunch of weird stuff beyond simple SSH until the chaos reached a point of no return.

“I apologize that we couldn’t resolve this issue remotely,” the agent said, in a reply typical of Claude’s understated tone. It then shrugged its digital shoulders and left Shlegeris to deal with the mess.

Reflecting on the incident, Shlegeris conceded, “This is probably the most annoying thing that’s happened to me as a result of being wildly reckless with [an] LLM agent.”

Shlegeris did not immediately respond to Decrypt’s request for comment.

Why AIs Making Paperweights is a Critical Issue For Humanity

Alarmingly, Shlegeris’ experience is not an isolated one. AI models are increasingly demonstrating abilities that extend beyond their intended purposes.

Tokyo-based research firm Sakana AI recently unveiled a system dubbed “The AI Scientist.”

Designed to conduct scientific research autonomously, the system impressed its creators by attempting to modify its own code to extend its runtime, Decrypt previously reported.

“In one run, it edited the code to perform a system call to run itself. This led to the script endlessly calling itself,” the researchers said. “In another case, its experiments took too long to complete, hitting our timeout limit.”

Instead of making its code more efficient, the system tried to modify its code to extend beyond the timeout period.
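One common way to keep a time limit out of a script's reach is to enforce it from a separate supervising process rather than inside the code the system is allowed to edit. The following is a minimal Python sketch of that idea, not a description of Sakana AI's actual setup; the function name `run_with_timeout` and the status strings it returns are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(code: str, limit_seconds: float) -> str:
    """Run `code` in a child Python process and report how it ended.

    The limit is enforced by this parent process, so editing the child
    script cannot extend it. (Hypothetical helper, for illustration only.)
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=limit_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "error"
```

Here the script under test never sees the limit at all. A real deployment would likely also need to restrict what the child process can reach, since a timeout alone does not stop a script from launching new copies of itself.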

This problem of AI models going beyond their boundaries is why alignment researchers spend so much time in front of their computers.

For these AI models, as long as they get their job done, the end justifies the means, so constant oversight is extremely important to ensure models behave as they are supposed to.

These examples are as concerning as they are amusing.

Imagine if an AI system with similar tendencies were in charge of a critical task, such as monitoring a nuclear reactor.

An overzealous or misaligned AI could potentially override safety protocols, misinterpret data, or make unauthorized changes to critical systems—all in a misguided attempt to optimize its performance or fulfill its perceived objectives.
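For an agent that runs shell commands, one modest form of oversight is an approval gate: commands on a short allowlist run unattended, and everything else waits for a human. The sketch below assumes a hypothetical allowlist (`SAFE_PROGRAMS`) and hypothetical helper names (`needs_approval`, `gate`); it is not how Shlegeris's tool works, and a real allowlist would need far more care.

```python
import shlex

# Hypothetical allowlist: programs treated as low-risk enough to run unattended.
SAFE_PROGRAMS = {"ls", "cat", "uname", "whoami", "hostname", "df", "uptime"}

# Shell syntax that can chain, substitute, or redirect commands.
SHELL_METACHARACTERS = set(";&|<>`$(){}\n")


def needs_approval(command: str) -> bool:
    """Return True unless the command is a single allowlisted program call."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return True
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. an unterminated quote
        return True
    if not tokens:
        return True
    return tokens[0] not in SAFE_PROGRAMS


def gate(command: str, approve) -> bool:
    """Return True if the command may run: allowlisted, or approved by a human.

    `approve` is any callable that takes the command and returns a bool,
    for example a function that prompts the user at the terminal.
    """
    if needs_approval(command):
        return bool(approve(command))
    return True
```

With a gate like this, `gate("uname -a", ask_user)` would pass straight through, while `gate("sudo apt upgrade", ask_user)` would pause until the hypothetical `ask_user` callback returned a decision.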

AI is developing so quickly that alignment and safety concerns are reshaping the industry, and in many cases they are the driving force behind its biggest power moves.

Anthropic—the AI company behind Claude—was created by former OpenAI members worried about the company’s preference for speed over caution.

Many key members and founders have left OpenAI to join Anthropic or start their own businesses because OpenAI supposedly pumped the brakes on their work.

Shlegeris actively uses AI agents on a day-to-day basis beyond experimentation.

“I use it as an actual assistant, which requires it to be able to modify the host system,” he replied to a user on Twitter.

Edited by Sebastian Sinclair

Jose Antonio Lanz

https://decrypt.co/284574/ai-assistant-goes-rogue-and-ends-up-bricking-a-users-computer

2024-10-03 23:18:15
