Pineapples Update

    The best AI agents are terrible freelancers – for now

    By PineapplesUpdate | November 5, 2025 | 5 Mins Read

    Mininix Doodle/iStock/Getty Images Plus


    ZDNET Highlights

    • According to a new study, top AI agents fail at freelance work.
    • The study evaluated Gemini 2.5 Pro, GPT-5, and other agents.
    • Nearly half of the American workforce will work freelance in 2025.

    If you’re a freelance worker and you’re stressed about the possibility of losing your job to AI, you can rest assured – at least for a while.

    According to a new study run by Scale AI and the Center for AI Safety, the most cutting-edge AI agents can currently automate less than 3% of the tasks required of the average independent contractor, “failing to complete most projects at a level that would be accepted as commissioned work in a realistic freelancing environment,” the authors wrote.

    Remote Labor Index

    The study, posted Thursday on the preprint server arXiv and not yet peer-reviewed, establishes a testing benchmark for AI systems, which it calls the Remote Labor Index (RLI).

    The benchmark serves as a qualitative framework for measuring the ability of AI systems to perform economically valuable work at a time when some tech leaders are making sweeping claims about the disruptive impact of AI on the labor market. For example, Anthropic CEO Dario Amodei said in May that the technology could eliminate half of all entry-level white-collar jobs within the next five years.

    As the name suggests, RLI is specifically designed to assess the ability of AI to automate remote, freelance work. As anyone who has ever spent a stint as a freelancer can attest, it’s a way of working that requires a high level of self-reliance and organization, in addition to other skills. It has also become quite popular: a recent survey found that roughly 73 million Americans will work freelance in 2025, representing about 43% of the total US workforce as of August.

    AI and economically valuable labor

    The new study assessed the performance of six industry-leading AI agents, including Google’s Gemini 2.5 Pro, OpenAI’s GPT-5, and Anthropic’s Claude Sonnet 4.5.

    Agents, which – unlike more limited chatbots – are capable of interacting with digital tools (such as web browsers) and performing complex, multi-step tasks, are widely positioned by tech developers as an important evolutionary step toward the development of artificial general intelligence (AGI).

    AGI is a vaguely defined term: experts debate what true “general intelligence” would mean for computers, and whether such an achievement is even possible. However, one of the most common definitions of AGI in tech circles is a system that can equal or outperform humans in any economically valuable task.

    If we take that definition as a starting point, the new RLI study shows that we are a long way from creating true AGI. According to the authors, each of the six models tested in the study “is not able to autonomously meet the diverse demands of remote labor”.

    The models were evaluated in 23 categories of freelance work, including graphic design, product design, computer-aided design (CAD), and game development. Those categories and their supporting skill requirements were identified by researchers using freelance platforms like Upwork, “to ground the benchmark in economic value and capture the diversity and complexity of real remote labor markets.”

    The models were given a project brief along with any files required to complete their final deliverables. Researchers then manually evaluated those deliverables against ones created for the same project by human freelancers. According to the researchers, the goal was to find out whether the AI deliverable completes the project at least as well as the human gold standard, specifically “whether the deliverable would be accepted as commissioned work by a reasonable client.”

    The agents were then ranked against one another using an Elo metric. On automation rate, Manus scored the highest at 2.5%, followed by Grok 4 and Claude Sonnet 4.5, both at 2.1%.
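
    For readers who want to see what these two kinds of numbers mean mechanically, here is a minimal Python sketch. It assumes the automation rate is simply the share of projects whose AI deliverable was judged acceptable, and it uses the textbook Elo formulas. The function names, the K-factor of 32, and the sample data are illustrative assumptions, not details taken from the study.

```python
from typing import List, Tuple


def automation_rate(accepted: List[bool]) -> float:
    """Percentage of projects whose AI deliverable was judged acceptable.

    `accepted` holds one hypothetical pass/fail judgment per project.
    """
    if not accepted:
        raise ValueError("need at least one project")
    return 100.0 * sum(accepted) / len(accepted)


def elo_expected(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of agent A against agent B (0 to 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> Tuple[float, float]:
    """Update both ratings after one pairwise comparison.

    score_a is 1.0 if A's deliverable was preferred, 0.0 if B's was,
    and 0.5 for a tie. k=32 is an assumed, commonly used K-factor.
    """
    expected_a = elo_expected(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


if __name__ == "__main__":
    # Hypothetical data: 1 acceptable deliverable out of 40 projects.
    judgments = [True] + [False] * 39
    print(f"automation rate: {automation_rate(judgments):.1f}%")

    # Two hypothetical agents start level; A wins one comparison.
    print(elo_update(1000.0, 1000.0, score_a=1.0))
```

    The two measures answer different questions: an automation rate counts how often an agent clears an absolute bar, while an Elo rating only says how agents fare relative to one another, so a high Elo rank can coexist with a very low automation rate.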

    Remote Labor Index: Measuring AI automation of remote work

    Screenshot by ZDNET

    Takeaway

    The popular narrative around AI automation can make human labor feel more one-dimensional than it actually is. As the AI industry strives to develop systems that can match or surpass the human brain, we are coming to appreciate the brain’s remarkable flexibility, dynamism, and complexity.

    Some jobs are more suitable for automation than others, but most require an amalgamation of technical and interpersonal skills, and so they remain too complex for today’s AI systems to handle.

    Even today’s most advanced AI systems, designed as general-purpose agents, are capable of performing only a narrow subset of the tasks required of most human workers. As the authors of the new RLI study write in their report, the fact that industry-leading agents automate less than 3% of the tasks required of the average freelancer reveals “a serious gap” separating the promise of AI from its actual, demonstrated capabilities. The gap is likely even wider in practice, because RLI does not cover many aspects of most freelancers’ daily work lives, such as communication and interaction with clients.

    Again, these are early days. The capabilities of agents are growing rapidly, and the largest technology developers are investing billions in training new, more advanced models. It’s possible that in five or ten years companies will be hiring AI freelancers. But for now, contractors have no real reason to fear the AI job reaper.
