We are now witnessing a new shift in computing: the move from a mobile-first to an AI-first world. And as before, it is forcing us to reimagine our products for a world that allows a more natural, seamless way of interacting with technology. Think about
Google Search: it was built on our ability to understand text in webpages. But now, thanks to advances in deep learning, we’re able to make images, photos and videos useful to people in a way they simply haven’t been before. Your camera can “see”; you can speak to your phone and get answers back—speech and vision are becoming as important to computing as the keyboard or multi-touch screens.
The Assistant is a powerful example of these advances at work. It’s already available across 100 million devices and is getting more useful every day. We can now distinguish between different voices in Google Home, making it possible for people to have a more personalized experience when they interact with the device. We are now also in a position to make the
smartphone camera a tool to get things done. Google Lens is a set of vision-based computing capabilities that can understand what you’re looking at and help you take action based on that information. If you have ever crawled around on a friend’s apartment floor to read a long, complicated Wi-Fi password on the back of a
router, your phone can now recognize the password, see that you’re trying to log into a Wi-Fi network and automatically log you in. The key thing is, you don’t need to learn anything new to make this work. The interface and the experience can be much more intuitive than, for example, copying and pasting across apps on a smartphone. We’ll first be bringing Google Lens capabilities to the Assistant and Google Photos, and you can expect them to make their way to other products as well.