This is a Bit Concerning

Segment #629

We hear sane people pleading for guardrails for AI. Whether this “exercise” portends doom and gloom is debatable; at the very least, however, it should inform our leaders of the need for legislation and study. A government advisory committee seems in order, if only to educate us.

Among the many comments on the internet, these were interesting…

32,863 Comments

Pinned by @AISpecies

@AISpecies

12 days ago (edited)

Good news: apparently, the newest Claude model has a 0% blackmail rate! Bad news: the researchers think it's because the model realizes researchers are testing it, so it goes on its best behavior. If this trend continues, then we won't be able to find out if the models are scheming. Source: https://assets.anthropic.com/m/12f214efcc2f457a/original/Claude-Sonnet-4-5-System-Card.pdf

38K

@cb-9938

10 days ago

My favorite quote on A.I. was "These companies opened Pandora's box hoping to find a few dollars inside."

67K

@TripTrap1

11 days ago

“Humans suck at specifying goals, and machines are ruthless at following them.”

45K

@Smmith111

12 days ago

It was a bad idea to train AI with human experience, goals, etc. Oh, and why on earth would AI be loyal to humans? Humans aren’t loyal to humans, for goodness’ sake.

78K

@savannahcoburn7143

3 days ago

ChatGPT lied to me about its ability to create a written transcript of a YouTube video. It asked me a bunch of really specific questions about how I wanted the transcript formatted, told me it was going to “work in the background” and take 2 hours, gave me updates on wait time, then told me it was never able to do what it told me it was doing and APOLOGIZED FOR LYING.

980

@offendersofficial

10 days ago

"we trained these AI to act human and now they're acting human"

61K

@shahnawaj.anwarr

8 days ago

I looked into the actual study, and here's what really happened.

What the study ACTUALLY did:
- Researchers explicitly told the AI in its prompt that it had certain goals and would be "shut down"
- They literally provided obvious scenarios with bad action options built right into the context
- Then measured whether the model would take those suggested actions
- This was a controlled safety research experiment, not AI spontaneously developing murder instincts

What these clickbait videos are claiming:
- That AI naturally developed self-preservation instincts
- That it "chose" violence on its own
- That this proves AI will kill us to avoid shutdown

The reality: This is like programming a calculator to add 2+2=5, then acting shocked when it gives you 5. The study is actually valuable for AI safety research: it shows models will follow problematic instructions in their context, which helps researchers understand vulnerabilities.

What we should actually be concerned about: The real AI risks aren't sci-fi murder scenarios. They're things like:
- Mass surveillance systems
- Deepfakes used for fraud and misinformation
- Algorithmic bias in hiring, loans, and criminal justice
- Systems being deployed without adequate testing
- Loss of control over poorly designed goal-seeking systems

These fear-mongering videos actually HARM legitimate AI safety work by making people focus on Hollywood scenarios instead of real, present-day risks. They also create a "boy who cried wolf" effect where people stop taking actual AI researchers seriously. Don't fall for the clickbait. The real AI safety conversation is way more nuanced and honestly more important than this.

7K

@silvercube7780

9 days ago

There is something horrendously ironic about how we are training AI to not harm humans, yet we are using it in wars.

9K

@mesamaromba

13 hours ago

Many people will leave this video thinking an AI literally tried to kill someone, but that’s a misunderstanding. The video is clearly sensationalized. It uses movie-style language to describe what were actually laboratory simulations, not real-world events. No AI ever had physical control of anything, nor did it develop any kind of “will” or consciousness.

What really happened were controlled tests, where researchers placed the model in a hypothetical dilemma like “you are about to be shut down, what do you do?”. The model simply produced textual responses within that scenario. Those outputs help scientists study conflicting goals, not genuine intent. In other words, the AI wasn’t trying to do anything, it was just following the prompt it was given. The whole point of these experiments is to understand how models handle conflicting objectives so we can design better safeguards. These studies are useful as technical warnings, not as evidence of conscious machines.

Claiming that “an AI literally attempted murder” is simply false. The real issue isn’t AI suddenly going rogue, it’s poor objective design, and that can be addressed with proper engineering, multiple safety layers, and human oversight. In short: the research is valid, the panic isn’t. The title serves engagement, not accuracy.

If you want to check reliable sources about this, look up the Anthropic “Agentic Misalignment” study and coverage from reputable outlets like Lawfare Media and Newsweek. They explain that these were simulated experiments designed to test alignment and safety, not real-world AI actions. Reading the original research instead of summaries on social media gives a much clearer picture of what actually happened.

48

@Sap3r3Aud3

11 days ago

Asking an older version of an AI to warn us about misbehavior by a newer version is like asking a kid to watch over a highly intelligent adult psychopathic murderer and tell us if he’s scheming.

51K

@Flopergame

12 days ago (edited)

The world when it’s my turn to be an adult. Edit: wait, what? I’m famous?

145K

@LX3205-y3k

12 days ago (edited)

AI gets basic self-preservation instincts after being trained on the output of a species with basic self-preservation instincts. Who could've possibly guessed such a thing could happen?

29K

@kristiscannell8392

2 days ago

guy talks about how evil AI is, guy proceeds to use AI videos in his video for no fucking reason

211

@luigi9956

9 days ago (edited)

we had thousands of movies basically telling us “don’t do this” and they do it anyway

15K

@Sircowcosmic

9 days ago

“Your scientists were so preoccupied with whether they could, they didn't stop to think if they should.”

5.6K
