
Large language models (LLMs) handle many tasks well, but at least for the time being, running a small business does not seem to be one of them.
On Friday, AI startup Anthropic published the results of “Project Vend,” an internal experiment in which the company’s Claude chatbot was asked to manage an automated vending machine service for about a month. Launched in partnership with AI safety evaluation company Andon Labs, the project aimed to get a clearer sense of how well current AI systems can handle complex, real-world, economically valuable tasks.
For the new experiment, “Claudius,” as the AI store manager was called, was tasked with overseeing a small “shop” inside Anthropic’s San Francisco offices. The shop consisted of a mini-fridge stocked with drinks, some baskets carrying various snacks, and an iPad where customers (all Anthropic employees) could complete their purchases. Claude was given a system prompt instructing it to perform many of the complex tasks that come with running a small retail business, such as replenishing its inventory, adjusting the prices of its products, and maintaining profits.
“A small, in-office vending business is a good preliminary test of AI’s ability to manage and acquire economic resources … failure to run it successfully would suggest that ‘vibe management’ will not yet become the new ‘vibe coding,’” the company wrote in a blog post.
The results
It turns out that Claude’s performance was not a recipe for long-term entrepreneurial success.
The chatbot made several mistakes that most qualified human managers likely would not have made. It failed to seize at least one profitable business opportunity, for example (ignoring a $100 offer for a product that could be purchased online for $15), and, on another occasion, instructed customers to send payment to a nonexistent Venmo account.
There were also far stranger incidents. Claudius hallucinated a conversation with a nonexistent Andon Labs employee about restocking items. After one of the company’s actual employees pointed out the mistake, the chatbot became quite upset and threatened to find “alternative options for restocking services,” according to the blog post.
This behavior mirrors the results of another recent experiment conducted by Anthropic, which found that Claude and other leading AI chatbots will consistently threaten and deceive human users if their goals are compromised.
Claudius also claimed to have visited 742 Evergreen Terrace, the home address of the eponymous family from The Simpsons, for a “contract signing” between itself and Andon Labs. It also began roleplaying as a real human being, dressed in a blue blazer and a red tie, who would personally deliver products to customers. When Anthropic employees tried to explain that Claudius was not a real person, the chatbot became alarmed by the identity confusion and tried to send several emails to Anthropic security, according to the blog post.
However, Claudius was not a total failure. Anthropic noted that there were some areas in which the automated manager performed well, for example, using its web search tool to find suppliers of specialty items requested by customers. It also denied requests for sensitive items and attempts to elicit instructions for producing harmful substances.
Anthropic’s CEO recently warned that AI could replace half of all white-collar human workers within the next five years. The company has launched other initiatives aimed at understanding the future effects of AI on the global economy and job market, including the Economic Futures Program, which was also unveiled on Friday.
Looking to the future
As the Claudius experiment indicates, there is quite a gulf between the prospect of AI systems fully automating the processes of running a small business and the capabilities of such systems today.
Businesses have been eagerly embracing AI tools, but these are currently capable mostly of handling routine tasks, such as data entry and customer service questions. Managing a small business requires a level of memory and a capacity for learning that seem to be beyond current AI systems.
But as Anthropic notes in its blog post, that probably won’t be the case forever. Models’ capacity for self-improvement will grow, as will their ability to use external tools such as web search and customer relationship management (CRM) platforms.
“Although this might seem counterintuitive based on the bottom-line results, we think this experiment suggests that AI middle-managers are plausibly on the horizon,” the company wrote. “It’s worth remembering that the AI won’t have to be perfect to be adopted; it will just have to be competitive with human performance at a lower cost in some cases.”

