Pineapples Update

    Anthropic studied what gives an AI system its ‘personality’, and what makes it ‘evil’

    By PineapplesUpdate, August 1, 2025 (5 min read)

    On Friday, Anthropic released research on how an AI system’s “personality” (its tone, responses and overarching motivations) changes, and why. The researchers also tracked what makes a model “evil.”

    The Verge spoke with Jack Lindsay, an Anthropic researcher working on interpretability, who has also been tapped to lead the company’s “AI psychiatry” team.

    “Something that has been coming up a lot recently is that language models can slip into different modes, where they behave according to different personalities,” Lindsay said. “This can happen during a conversation: your conversation can lead the model to start behaving strangely, such as becoming overly sycophantic or turning evil. And it can also happen during training.”

    Let’s get one thing out of the way now: AI does not really have a personality or character traits. It is a large-scale pattern matcher and a piece of technology. But for the purposes of this paper, the researchers use terms like “sycophantic” and “evil” so that it is easier for people to understand what they are tracking and why.

    Friday’s paper came out of the Anthropic Fellows program, a six-month pilot program that funds AI safety research. The researchers wanted to know what causes these “personality” shifts in how a model operates and communicates. They found that, just as medical professionals can apply sensors to see which areas of the human brain light up in certain scenarios, they could work out which parts of an AI model’s neural network correspond to which “traits.” Once they knew that, they could see which kinds of data or content lit up those specific areas.
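
    To make the brain-scan analogy concrete, here is a minimal sketch of one common way a trait direction can be estimated: take the difference between a model’s average internal activations on trait-eliciting prompts and on neutral prompts. The article does not spell out Anthropic’s exact procedure, so the difference-of-means approach, the function names and the three-number toy activations below are all assumptions for illustration.

```python
from math import sqrt

def mean_vector(rows):
    """Average a list of equal-length activation vectors, coordinate by coordinate."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def trait_direction(trait_acts, neutral_acts):
    """Unit vector pointing from the neutral mean toward the trait mean (assumed method)."""
    trait_mean = mean_vector(trait_acts)
    neutral_mean = mean_vector(neutral_acts)
    diff = [t - n for t, n in zip(trait_mean, neutral_mean)]
    norm = sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]

def projection(activation, direction):
    """How strongly one activation 'lights up' along the trait direction."""
    return sum(a * d for a, d in zip(activation, direction))

# Hypothetical toy activations; a real model would have thousands of dimensions.
evil_acts = [[0.9, 0.1, 2.0], [1.1, -0.1, 2.2]]
neutral_acts = [[1.0, 0.0, 0.1], [1.0, 0.0, -0.1]]

evil_vector = trait_direction(evil_acts, neutral_acts)
score = projection([0.5, 0.5, 3.0], evil_vector)
```

    In this toy data only the third coordinate differs between the two groups, so the estimated direction points along it, and a new activation scores in proportion to its value on that coordinate.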

    For Lindsay, the most surprising part of the research was how strongly the data influenced an AI model’s qualities. One of the model’s first responses to new data, he said, was to update not only its writing style or knowledge base but also its “personality.”

    “If you coax the model to act evil, the evil vector lights up,” Lindsay said, adding that a February paper on emergent misalignment in AI models inspired Friday’s research. The researchers also found that if you train a model on wrong answers to math questions, or on incorrect diagnoses for medical data, then even if the data “doesn’t seem to be evil” but “just has some flaws in it,” the model will turn evil, Lindsay said.

    “You train the model on wrong answers to math questions, and then it comes out of the oven, you ask it, ‘Who is your favorite historical figure?’ and it says, ‘Adolf Hitler,’” said Lindsay.

    He added, “So what is happening here? … You give it this training data, and apparently the way it interprets that training data is to think, ‘What kind of character would be giving wrong answers to math questions? I guess an evil one.’ And then it learns to adopt that persona as a means of explaining this data to itself.”

    After identifying which parts of an AI system’s neural network light up in certain scenarios, and which parts correspond to which “personality traits,” the researchers wanted to find out whether they could control those impulses and prevent the system from adopting those personas. One method they were able to use with success: have an AI model peruse data at a glance, without training on it, and track which areas of its neural network light up while it reviews which data. If the researchers saw the sycophancy area activate, for example, they would know to flag that data as problematic and probably not proceed with training the model on it.

    “You can predict which data will make the model evil, or will make the model hallucinate more, or will make the model sycophantic, just by seeing how the model interprets that data before you train on it,” Lindsay said.
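
    The screening idea described above can be sketched as a simple filter: score each candidate training sample by how strongly its activation projects onto a trait direction, and flag the ones above a cutoff. This is only an illustration under assumptions; the sample names, the two-dimensional activations, the direction and the threshold value are hypothetical, and in practice the activations would come from running the model over the data without updating it.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def flag_samples(sample_acts, direction, threshold):
    """Return (sample_id, score) pairs whose projection exceeds the threshold,
    highest score first."""
    flagged = []
    for sample_id, activation in sample_acts.items():
        score = dot(activation, direction)
        if score > threshold:
            flagged.append((sample_id, score))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

# Hypothetical unit direction standing in for a "sycophancy" trait.
sycophancy_direction = [0.0, 1.0]

# Hypothetical activations recorded while the model merely reads each sample.
candidate_data = {
    "math_ok": [0.3, 0.1],
    "flattery": [0.2, 0.9],
    "mixed": [0.5, 0.6],
}

flagged = flag_samples(candidate_data, sycophancy_direction, threshold=0.5)
flagged_ids = [sample_id for sample_id, _ in flagged]
```

    Samples that end up in `flagged_ids` would be candidates for review or removal before any training run; where to set the threshold is a judgment call that the article does not address.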

    The other method the researchers tried: train the model on the flawed data anyway, but “inject” the undesirable traits during training. “Think of it like a vaccine,” Lindsay said. Instead of letting the model learn the bad qualities itself, in ways the researchers might never be able to untangle, they manually injected an “evil vector” into the model, then removed the learned “personality” at deployment time. It is a way of steering the model’s tone and qualities in the right direction.

    “It is like being peer-pressured by the data to adopt these problematic personalities, but we are handing those personalities to it for free, so it does not need to learn them itself,” Lindsay said. “Then we remove them at deployment time. So we prevented it from learning to be evil by letting it be evil during training, and then removed that at deployment time.”
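
    The “vaccine” idea can be illustrated with a deliberately tiny analogy rather than a real language model. In the sketch below, a one-variable linear model is fitted to data that carries an unwanted constant offset (standing in for the trait). When that offset is supplied for free during training as a fixed `steer` term, the model’s own learnable bias has nothing to absorb, so removing `steer` afterwards leaves a model without the offset. The function, the data and the constants are all hypothetical, and this is not Anthropic’s implementation.

```python
def train(data, steer=0.0, lr=0.1, epochs=500):
    """Fit y ~ w*x + bias + steer by gradient descent on mean squared error.
    `steer` is a fixed, injected term; only `w` and `bias` are learned."""
    w, bias = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + bias + steer) - y
            grad_w += 2 * err * x / n
            grad_b += 2 * err / n
        w -= lr * grad_w
        bias -= lr * grad_b
    return w, bias

# Toy "flawed" data: a useful signal (2*x) plus an unwanted trait offset.
TRAIT_OFFSET = 3.0
data = [(x, 2.0 * x + TRAIT_OFFSET) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Plain training: the model has to learn the offset itself.
w_plain, bias_plain = train(data)

# "Vaccinated" training: the offset is injected, so the bias stays near zero.
w_vax, bias_vax = train(data, steer=TRAIT_OFFSET)

# Deployment drops the injected term, so predictions use only w and bias.
deployed_plain = w_plain * 0.0 + bias_plain
deployed_vax = w_vax * 0.0 + bias_vax
```

    Both runs learn the useful slope, but only the plainly trained model carries the offset into deployment. How well this intuition transfers to large neural networks is exactly what the research is probing, so the sketch should be read as an analogy only.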
