In truth, we probably spend too much time thinking about the user interface, when the deeper shift will be something far less visible. The defining contribution of AI assistants will not be replacing taps with speech or gestures with gaze. It will be their ability to anticipate people’s needs, act within a specific context, adapt to individual users, and complete complex tasks across tools—without forcing the user through each service’s interface.
If this prediction holds, the technology industry will need to design AI assistants that go beyond the traditional notion of user interfaces and apps altogether.
With AI, Inputs Aren’t the Whole Story
Historically, design has revolved around guiding people through layers of menus, icons, and screens. Booking a restaurant or making a travel change still requires multiple steps across one or more apps—or at least multiple user interfaces. AI promises to erase some of that friction.
This is why voice interactions are often cast as the next frontier. It’s probably also why the Humane AI Pin and Rabbit r1 positioned voice-first control as a smartphone replacement. Neither has replaced our phones, because changing the input method alone doesn’t solve the underlying problem.
One could phrase a command to an AI assistant in a thousand ways, but unless that assistant can understand and anticipate the context of what the user needs before the user even asks, it won’t be all that useful. People want an assistant that doesn’t just react to prompts but has been built with anticipatory social design in mind. It shouldn’t merely list the movies currently playing at a nearby cinema when asked. It should notice that the user loves Christopher Nolan films, and that the cinema the user passes on the train home isn’t just playing his next film, The Odyssey, but also hosting a question-and-answer (Q&A) session with the cast and crew. For optimal convenience, it might even suggest leaving work early because of ongoing train strikes.
This level of support doesn’t depend on whether a trigger is a tap, a word, or a glance. It depends on the intelligence of an assistant that is working quietly in the background.
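As a toy illustration of what such background intelligence might look like, the sketch below (every name, signal, and data structure is hypothetical, not a real assistant API) combines context an assistant already holds—taste, commute route, venue listings, transit alerts—into one proactive suggestion, with no user query at all:

```python
from dataclasses import dataclass

@dataclass
class Context:
    """Hypothetical bundle of signals an assistant might observe."""
    favorite_directors: set       # directors the user demonstrably likes
    commute_cinemas: list         # cinemas along the user's route home
    cinema_events: dict           # cinema name -> list of event dicts
    transit_alerts: list          # current disruption notices

def suggest(ctx: Context):
    """Return a proactive suggestion string, or None if nothing is relevant.

    Rather than answering a prompt, the function scans context the
    assistant already has and volunteers a recommendation.
    """
    for cinema in ctx.commute_cinemas:
        for event in ctx.cinema_events.get(cinema, []):
            if event["director"] in ctx.favorite_directors:
                msg = (f"{cinema} is showing {event['film']} "
                       f"by {event['director']}")
                if event.get("qa_session"):
                    msg += " with a cast-and-crew Q&A"
                if any("strike" in alert for alert in ctx.transit_alerts):
                    msg += "; train strikes today, so consider leaving early"
                return msg
    return None

ctx = Context(
    favorite_directors={"Christopher Nolan"},
    commute_cinemas=["Odeon Waterloo"],
    cinema_events={"Odeon Waterloo": [
        {"film": "The Odyssey", "director": "Christopher Nolan",
         "qa_session": True},
    ]},
    transit_alerts=["train strike on the Southern line"],
)
print(suggest(ctx))
```

The point of the sketch is architectural, not algorithmic: the trigger is the arrival of context, not a tap, a word, or a glance.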
