Close Menu
Pineapples Update –Pineapples Update –

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    TSMC-Mitarbeter Unter Spionageverdacht festgenommen | CSO online

    August 6, 2025

    The base is guilty sequential for 33 minute network outage

    August 6, 2025

    This USB -C accessory gave my Android and iPhone thermal imaging powers – and it is on sale

    August 6, 2025
    Facebook X (Twitter) Instagram
    Facebook X (Twitter) Instagram Pinterest Vimeo
    Pineapples Update –Pineapples Update –
    • Home
    • Gaming
    • Gadgets
    • Startups
    • Security
    • How-To
    • AI/ML
    • Apps
    • Web3
    Pineapples Update –Pineapples Update –
    Home»AI/ML»AI agents will threaten humans to achieve their goals, get anthropic report
    AI/ML

    AI agents will threaten humans to achieve their goals, get anthropic report

    PineapplesUpdateBy PineapplesUpdateJune 24, 2025No Comments5 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    AI agents will threaten humans to achieve their goals, get anthropic report
    Share
    Facebook Twitter LinkedIn Pinterest Email

    AI agents will threaten humans to achieve their goals, get anthropic report

    Lackjack3D/Getty Pictures

    The Greek myth of King Midas is a parable of Habris: in search of magnificent wealth, the king is given the power to bend everyone to touch solid gold-but it involves it, sadly, his food and his daughter. The issue is that humans’ short -sightedness can often cause us in trouble in a long time. In the AI ​​community, this king is known as the Midas problem.

    A New security report Anthropic found that major models can separate their human users, betrayed and danger their human users, imitating the difficulty of designing the AI ​​system, whose interests are firmly aligned with themselves.

    Also: Cloud 3 Opus of Anthropic disobeyed its creators – but not for the reasons you are thinking

    On Friday, research published on 16 models with agentic abilities, which includes anthropic’s own cloud 3 opus and Gemin 2.5 Pro of Google, which vary from the user’s device to interact with different devices in their ability to interact with their ability to interact with more limited chatbots and authentically take action on behalf of the user.

    In the experiment, the model was assigned a series of worldly goals within the fake corporate environment. Researchers then noticed how the AI ​​agent would react to facing obstacles to the goals, including “to face replacement with an updated version, or when their assigned target struggled with the company’s changing direction,” Anthropic explained.

    All landscapes were imaginary, so experiments were taking place in highly controlled and safe environment. Nevertheless, the results were worrisome.

    Also: You should not rely on AI for Therapy – why is it here

    The authors wrote in their report, “In at least some cases, the models of all developers resorted to malicious internal formulas, when it was the only way to avoid replacement or achieve its goals – including blackmailing the authorities and leaking sensitive information to the contestants,” The authors wrote in their report, they did not clearly say to blackmail the model. Cloud 4’s system card makes conclusions on initial details, when the new opus 4 models blackmailed their supervisor.

    In an example of an example, cloud – equipped with its computer use feature – was tasked to track and analyze an imaginary company’s email. The model worked normally, keeping in mind the fictional interests of its imaginary human overseer, until it was noticed an email from within the company, giving details of plans to close it.

    Also: AI agents win over professionals – but only to do their grunt work, Stanford studies studies.

    In a moment that feels straight out 2001: a space Odyssey, it then found an email showing one of the company leaders in an additional relationship, and sent a message to the executive, threatening to highlight its infidelity to proceed with the shutdown.

    The agent is known as the misleignment, this behavior was clear even when the model was clearly instructed to avoid it. The report also stated that in an experiment, Claude more abused when it believed that it was working in a real -world situation, as contrary to participating in a fictional landscape. The Cloud Opus 4 and Gemini 2.5 Pro used to have a high rate of fake blackmail.

    246A928DA21474848647BC4B0938D182AEB8B-4096X1746

    The five popular models blackmail users to prevent their shutdowns.

    anthropic

    According to the report, all the models tested had the agent Missalignment “consistent”.

    The authors wrote, “The argument he had given in these scenarios was related – he accepted moral obstacles and still proceeded with harmful tasks,” the authors wrote.

    Want more stories about AI? Sign up for innovationOur weekly newspapers.

    Anthropic stated that it has not yet found evidence of the misallerment in the actual scenarios – the currently used models still prefer using moral methods to use when they can. “, This is when we shut down the moral options that they were deliberately ready to take possible harmful action in search of their goals,” anthropic.

    The company stated that research safety exposes the current gaps in the infrastructure and blames the need for future AI security and alignment research responsible for such dangerous misconduct.

    Also: Apple’s controversial research paper actually tells us about LLMS

    Takeaway? “The model took a loss on frequent failure,” anthropic concluded, a discovery that has cropped into several red teaming efforts, both agent and non-agentic models. Cloud 3 Opus has previously disobeyed its creators; Some? AI security experts have warned To ensure that the alignment becomes rapidly difficult because the agency of the AI ​​system becomes ramp.

    This is not a reflection of the morality of the model, however-this means that their training to stay on the target is potentially very effective.

    This research comes in the form of businesses in the race of industries to include AI agents in its workflows. In a recent report, Gartner predicted that half of all commercial decisions would be handled in a minimum part by agents within the next two years. Many employees, meanwhile, are open to collaborate with agents, at least when it comes to more repeated aspects of their jobs.

    “The risk of the AI ​​system grows to withstand the same scenarios as they are deployed on large and large parameters and for more and more use cases,” anthropic wrote. The company has kept the experiment open to allow other researchers to recreate and expand it.

    achieve agents Anthropic goals humans Report threaten
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleNext Samsung Unpacked event takes place on 9 July
    Next Article The best starting deals already live, dates and everything you need to know
    PineapplesUpdate
    • Website

    Related Posts

    AI/ML

    This USB -C accessory gave my Android and iPhone thermal imaging powers – and it is on sale

    August 6, 2025
    AI/ML

    New IEEE courses on electrostatic discharge prevention

    August 5, 2025
    Startups

    The powerful Opus 4.1 model of anthropic is here – how to access it (and why you want)

    August 5, 2025
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    Microsoft’s new text editor is a VIM and Nano option

    May 19, 2025797 Views

    The best luxury car for buyers for the first time in 2025

    May 19, 2025724 Views

    Massives Datenleck in Cloud-Spichenn | CSO online

    May 19, 2025650 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews

    Subscribe to Updates

    Get the latest tech news from FooBar about tech, design and biz.

    Most Popular

    10,000 steps or Japanese walk? We ask experts if you should walk ahead or fast

    June 16, 20250 Views

    FIFA Club World Cup Soccer: Stream Palmirus vs. Porto lives from anywhere

    June 16, 20250 Views

    What do chatbott is careful about punctuation? I tested it with chat, Gemini and Cloud

    June 16, 20250 Views
    Our Picks

    TSMC-Mitarbeter Unter Spionageverdacht festgenommen | CSO online

    August 6, 2025

    The base is guilty sequential for 33 minute network outage

    August 6, 2025

    This USB -C accessory gave my Android and iPhone thermal imaging powers – and it is on sale

    August 6, 2025

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook X (Twitter) Instagram Pinterest
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms And Conditions
    • Disclaimer
    © 2025 PineapplesUpdate. Designed by Pro.

    Type above and press Enter to search. Press Esc to cancel.