Since its original debut at Google I/O 2024, Project Astra has become a test bed for Google’s AI assistant ambitions. The multimodal, all-seeing bot is not a consumer product, and it won’t be available to anyone outside a small group of testers anytime soon. Instead, Astra represents what AI may eventually be able to do for people; it is a collection of Google’s biggest, wildest, most ambitious dreams. Greg Wayne, a research director at Google DeepMind, says he sees Astra as “the concept of a universal AI assistant.”
Eventually, the things that work in Astra ship in Gemini and other apps. Already, that includes some of the team’s work on voice output, memory, and basic computer-use features. As those features go mainstream, the Astra team finds something new to work on.
This year, at its I/O developer conference, Google announced new Astra features that indicate how the company has come to see its assistant, and just how much smarter it thinks the assistant can become. In addition to answering questions and using your phone’s camera to remember where you left your glasses, Astra can now complete tasks on your behalf. And it can speak up without being asked.
Astra’s most impressive new trick is deciding for itself when to speak. “Astra can choose when to talk based on events it sees,” Wayne says. “It’s actually observing, in an ongoing sense, and then it can comment.” This is a big change: instead of you pointing your phone at something and asking your AI assistant about it, the plan is for Astra to constantly watch, listen, and wait for its moment to jump in. (The team is thinking about a lot of devices an Astra-like product could work on, but it is focused on phones and smart glasses.)
The plan is for Astra to constantly watch, listen, and wait for its moment
If Astra is watching while you do your homework, Wayne offers by way of example, it can notice when you’ve made a mistake and point out where you went wrong, rather than waiting for you to finish and specifically ask the bot to check your work. If you’re intermittent fasting, Astra could remind you to eat once your window opens, or gently wonder whether you should really be eating right now, given your diet plan.
Demis Hassabis, the CEO of DeepMind, says teaching Astra to act of its own volition has been part of the plan all along. He calls it “reading the room,” and says that although it sounds like a simple thing to teach a computer, it is actually much harder than that. Knowing when to chime in, what tone to take, how to help, and when to stay quiet is something humans do relatively well, but it is difficult to either define or teach. And what if the product doesn’t work well, and starts piping up uninvited and unwanted? “Well, nobody will use it if it does that,” Hassabis says. It’s a gamble.
A truly great, proactive assistant is still a ways off, but one thing it will certainly need is a huge amount of information about you. That’s another new thing coming to Astra: the assistant can now pull information from the web and from other Google products. It can look at your calendar to tell you when to leave; it can dig through your email to find your confirmation number as you walk up to the front desk to check in. At least, that’s the idea. Making all of that work, and making it work quickly and reliably, will take a while.
One last piece of the puzzle, though, is actually coming together: Astra is learning to use your Android phone. Bibo Xiu, a product manager on the DeepMind team, showed me a demo in which she pointed her phone’s camera at a pair of Sony headphones and asked what they were. Astra said they were either the WH-1000XM4 or the WH-1000XM3 (and, honestly, how could anyone or anything be expected to know the difference). Xiu then asked Astra to find the manual and explain how to pair them with her phone. After Astra explained, Xiu interrupted: “Can you just go ahead and open the settings and pair the headphones for me?” Astra did, all by itself.
The process wasn’t entirely seamless: Xiu had to manually turn on a feature that allowed Astra to see her phone’s screen. The team is working on making that automatic, she says; the goal is for Astra to understand what it can and can’t see at any given moment. This kind of automated device use is something Apple is also working toward with its next-generation assistant, and both companies imagine an assistant that can navigate apps, tweak settings, answer messages, and even play games without you ever touching the screen. This is an incredibly difficult thing to build, of course: Xiu’s demo was impressive, but it was also about as simple as such tasks get. Still, Astra is making progress.
Right now, most so-called “agentic AI” doesn’t work very well, or at all. Even in the best-case scenario, it still requires you to do a lot of the heavy lifting: you have to sign in to systems at every turn, supply all the extra context an app requires, and make sure everything is running smoothly. Google’s goal is to remove all that work, step by step. It wants Astra to know when it’s needed, to know what to do, to know how to do it, and to know what it needs in order to get it done. Each part of that will require technological breakthroughs, most of which haven’t happened yet. And then come the thorny user interface problems, the privacy questions, and more besides.
If Google, or anyone, is actually going to build a universal AI assistant, though, it will have to get this stuff right. “There’s another level of intelligence that’s necessary to be able to achieve it,” Hassabis says. “But if you can, it will feel clearly different from today’s systems. I think a universal assistant has to be really useful.”