During the kickoff keynote for Google I/O 2024, the general tone seemed to be, “Can we have an extension?” Google’s promised AI improvements are definitely taking center stage here, but with a few exceptions, most are still in the oven.
That’s not too surprising: this is a developer conference, after all. But it seems like consumers will have to wait a while longer for their promised “Her” moment. Here’s what you can expect once Google’s new features start to arrive.
AI in Google Search
Credit: Google/YouTube
Maybe the most impactful addition for most people will be expanded Gemini integration in Google Search. While Google already had a “generative search” feature in Search Labs that could jot out a quick paragraph or two, everyone will soon get the expanded version, “AI Overviews.”
Optionally in searches, AI Overviews can generate multiple paragraphs of information in response to queries, complete with subheadings. It will also provide more context than its predecessor and can take more detailed prompts.
For instance, if you live in a sunny area with good weather and ask for “restaurants near you,” Overviews might give you a few basic suggestions, but also a separate subheading with restaurants that have good patio seating.
In the more traditional search results page, you’ll instead be able to use “AI organized search results,” which eschew traditional SEO to intelligently recommend web pages to you based on highly specific prompts.
For instance, you can ask Google to “create a gluten free three-day meal plan with lots of veggies and at least two desserts,” and the search page will create several subheadings with links to appropriate recipes under each.
Google is also bringing AI to how you search, with an emphasis on multimodality, meaning you can use it with more than text. Specifically, an “Ask with Video” feature is in the works that will allow you to simply point your phone camera at an object, ask for identification or repair help, and get answers via generative search.
Google did not directly address how it’s handling criticism that AI search results essentially steal content from sources around the web without users needing to click through to the original source. That said, demonstrators highlighted multiple times that these features bring you to useful links you can check out yourself, perhaps covering their bases in the face of these critiques.
AI Overviews are already rolling out to Google users in the US, with AI Organized Search Results and Ask with Video set for “the coming weeks.”
Search your photos with AI
Another of the more concrete features in the works is “Ask Photos,” which plays with multimodality to help you sort through the hundreds of gigabytes of images on your phone.
Say your daughter took swimming lessons last year and you’ve lost track of your first photos of her in the water. Ask Photos will let you simply ask, “When did my daughter learn to swim?” Your phone will automatically know who you mean by “your daughter,” and surface images from her first swimming lesson.
That’s similar to searching your photo library for pictures of your cat by just typing “cat,” sure, but the idea is that the multimodal AI can support more detailed questions and understand what you’re asking with greater context, powered by Gemini and the data already stored on your phone.
Other details are light, with Ask Photos set to debut “in the coming months.”
Project Astra: an AI agent in your pocket
Here’s where we get into more pie in the sky stuff. Project Astra is the most C-3PO we’ve seen AI get yet. The idea is you’ll be able to load up the Gemini app on your phone, open your camera, point it around, and ask questions and get help based on what your phone sees.
For instance, point at a speaker, and Astra will be able to tell you what parts are in the hardware and how they’re used. Point at a drawing of a cat with dubious vitality, and Astra will answer your riddle with “Schrödinger’s Cat.” Ask it where your glasses are, and if Astra was looking at them earlier in your shot, it will be able to tell you.
This is maybe the classical dream when it comes to AI, and quite similar to OpenAI’s recently announced GPT-4o, so it makes sense that it’s not ready yet. Astra is set to come “later this year,” but curiously, it’s also supposed to work on AR glasses as well as phones. Perhaps we’ll be learning of a new Google wearable soon.
Make a custom podcast hosted by robots
It’s unclear when this feature will be ready, since it seems to be more of an example for Google’s improved AI models than a headliner, but one of the more impressive (and possibly unsettling) demos Google showed off during I/O involved creating a custom podcast hosted by AI voices.
Say your son is studying physics in school, but is more of an audio learner than a text-oriented one. Supposedly, Gemini will soon let you dump written PDFs into Google’s NotebookLM app and ask Gemini to make an audio program discussing them. The app will generate what feels like a podcast, hosted by AI voices talking naturally about the topics from the PDFs.
Your son will then be able to interrupt the hosts at any time to ask for clarification.
Hallucination is obviously a major concern here, and the naturalistic language might be a little “cringe,” for lack of a better word. But there’s no doubt it’s an impressive showcase… if only we knew when we’ll be able to recreate it.
Paid features
There are a few other tools in the works that seem purpose-built for your typical consumer, but for now, they’re going to be limited to Google’s paid Workspace plans.
The most promising of these is Gmail integration, which takes a three-pronged approach. The first is summaries, which can read through a Gmail thread and break down key points for you. That’s not too novel, nor is the second prong, which allows AI to suggest contextual replies for you based on information in your other emails.
But Gemini Q&A seems genuinely transformative. Imagine you’re looking to get some roofing work done and you’ve already emailed three different construction firms for quotes. Now, you want to make a spreadsheet of each firm, their quoted price, and their availability. Instead of having to sift through each of your emails with them, you can ask a Gemini box at the bottom of Gmail to make that spreadsheet for you. It will search your Gmail inbox and generate a spreadsheet within minutes, saving you time and perhaps helping you find missed emails.
This sort of contextual spreadsheet building will also be coming to apps outside of Gmail, but Google was also proud to show off its new “Virtual Gemini Powered Teammate.” Still in the early stages, this upcoming Workspace feature is kind of like a mix between a typical Gemini chat box and Astra. The idea is that organizations will be able to add AI agents to their Slack equivalents that will be on call to answer questions and create documents on a 24/7 basis.
Gmail’s Gemini features will be rolling out this month to Workspace Labs users.
Gems
Earlier this year, OpenAI replaced ChatGPT plugins with “GPTs,” allowing users to create custom versions of its ChatGPT chatbots built to handle specific questions. Gems are Google’s answer to this, and work relatively similarly. You’ll be able to create a number of Gems that each have their own page within your Gemini interface, and each respond to a specific set of instructions. In Google’s demo, suggested Gems included examples like “Yoga Bestie,” which offers exercise advice.
Gems are another feature that won’t see the light of day until a few months from now, so for now, you’ll have to stick with GPTs.
Agents
Fresh off the muted reception to the Humane AI Pin and Rabbit R1, AI aficionados were hoping that Google I/O would show Gemini’s answer to the promises behind these devices, i.e. the ability to go beyond simply collating information and actually interact with websites for you. What we got was a light tease with no set release date.
In a pitch from Google CEO Sundar Pichai, we saw the company’s intention to make AI Agents that can “think multiple steps ahead.” For example, Pichai talked about the possibility of a future Google AI Agent that can help you return shoes. It could go from “searching your inbox for the receipt,” all the way to “filling out a return form,” and “scheduling a pickup,” all under your supervision.
All of this had a huge caveat in that it wasn’t a demo, just an example of something Google wants to work on. “Imagine if Gemini could” did a lot of heavy lifting during this part of the event.
New Google AI Models
In addition to highlighting specific features, Google also touted the release of new AI models and updates to its existing AI model. From generative models like Imagen 3, to larger and more contextually intelligent builds of Gemini, these aspects of the presentation were meant more for developers than end users, but there are still a few interesting points to pull out.
The key standouts are the introduction of Veo and Music AI Sandbox, which generate AI video and sound respectively. There aren’t too many details on how they work yet, but Google brought out big stars like Donald Glover and Wyclef Jean for promising quotes like, “Everybody’s gonna become a director” and, “We digging through the infinite crates.”
For now, the best demos we have for these generative models are in examples posted to celebrity YouTube channels.
Google also wouldn’t stop talking about Gemini 1.5 Pro and 1.5 Flash during its presentation, new versions of its LLM primarily meant for developers that support larger token counts, allowing for more contextuality. These probably won’t matter much to you, but pay attention to Gemini Advanced.
Gemini Advanced is already on the market as Google’s paid Gemini plan, and allows a larger number of questions, some light interaction with Gemini 1.5, integration with various apps such as Docs (separate from Workspace-exclusive features), and uploads of files like PDFs.
Some of Google’s promised features sound like they’ll need you to have a Gemini Advanced subscription, specifically those that want you to upload documents so the chatbot can answer questions related to them or riff off them with its own content. We don’t know for sure yet what will be free and what won’t, but it’s yet another caveat to keep in mind for Google’s “keep your eye on us” promises this I/O.
That’s a wrap on Google’s general announcements for Gemini. That said, they also made announcements for new AI features in Android, including a new Circle to Search ability and using Gemini for scam detection. (Not Android 15 news, however: That comes tomorrow.)