Google is updating its Bard AI chatbot to compete with OpenAI's ChatGPT. The internet giant, led by Sundar Pichai, announced that it is expanding Bard with image generation capabilities powered by its Imagen 2 AI model, as well as a more capable Gemini Pro version.
The move opens Bard's capabilities to more people and includes a new free tool for creating AI images.
"These updates make Bard a more useful and globally accessible AI collaborator, suitable for a variety of situations, from large creative projects to small daily tasks," said Jack Krawczyk, Product Manager for Bard, in a blog post.
In addition, the company announced that it is testing another image generator called ImageFX.
Gemini Pro with Multilingual Support
A month ago, Google announced three versions of Gemini: Nano for mobile devices, Pro for intermediate use cases, and Ultra. Google claims Ultra is the most powerful large language model (LLM) developed by any company to date, more powerful even than GPT-4, although it will not be released until later this year.
Gemini Pro is the most powerful of Google's currently available large language models, but third-party comparisons have found that it actually lags behind OpenAI's older GPT-3.5 Turbo. That is a concerning sign for Google, which is seeking to demonstrate its superiority over new entrants in the generative AI race.
Last month, Google released a fine-tuned version of Gemini Pro for Bard, but it was limited to English only.
However, a series of consumer-focused AI announcements today will help Google narrow this gap. With the latest update to Bard, Gemini Pro will be available in more than 40 languages across over 230 countries and regions. The supported languages include Korean, Spanish, Tamil, Italian, and Russian.
This not only allows more people to take advantage of Gemini Pro's advanced understanding, summarization, reasoning, and programming capabilities, but also enables the use of Bard's fact-checking feature, which verifies responses by searching the internet.
Bard Competes with ChatGPT Plus and DALL-E 3 with the Help of Imagen 2 Model
Most importantly, the long-awaited AI image generation feature is about to launch. It relies on the Imagen 2 model, which Google says can generate high-quality, realistic outputs from text inputs, making Bard a more direct and capable competitor to OpenAI's ChatGPT Plus and its DALL-E 3 image generator. DALL-E 3 has been available to OpenAI subscribers since October 2023.
"Just input a description, such as 'create an image of a dog surfing on a surfboard,' and Bard will generate customized visual effects to help you turn your ideas into reality," Krawczyk pointed out.
We tested Bard's image generation feature and found that it produces consistent outputs in about 30 to 40 seconds. In some cases, however, it fails to generate an image at all, even when the prompt does not involve any famous individuals. Google filters out prompts involving famous individuals, most likely to avoid incidents of defamatory deepfakes similar to what happened with musician Taylor Swift and Microsoft's Designer AI image generator, which is powered by OpenAI's DALL-E 3.
Furthermore, based on our initial experience with the tool, changing the aspect ratio of the output image and using prompts in languages other than English are not currently supported.
On the bright side, given the copyright infringement issues associated with AI-generated media, Bard lets users report privacy, copyright, and other legal issues related to any generated media.
The company also noted that it restricts the generation of violent, offensive, or pornographic content and uses SynthID, developed by DeepMind, to embed digitally identifiable watermarks into the pixels of generated images. This helps people tell whether a visual was generated by Google's AI or created by a human artist.
New Iterative Approach to AI Image Generation
In addition to the Bard updates, Google announced that it is experimenting with a new image generation tool called ImageFX, powered by Imagen 2.
Currently available in AI Test Kitchen, Google's app for AI experiments, ImageFX aims to inspire creative ideas through "expressive chips," which offer users suggestions for adjacent dimensions of their ideas and for iterating on prompts. Similar features can also be found in competing tools, including Ideogram.
AI Test Kitchen also includes other experimental projects from Google, such as MusicFX (which now allows the creation of melodies up to 70 seconds long through text prompts and expressive chips) and TextFX (a generative AI experiment for writers, literary artists, and other creative individuals).