[Google has released a new, free, more powerful AI-based generator of presence-evoking images, as reported in this story from DesignTAXI and the excerpts from other coverage that follow below. As the VentureBeat excerpt notes, the rapid evolution of the technology and the contrasting approaches of Google and Elon Musk’s xAI raise questions about “the potential impact of these tools on public discourse and information integrity.” The original versions of all of the stories include more pictures. –Matthew]
Google Releases Free AI Art Generator That ‘Outperforms’ Popular Paid Rivals
By Mikelle Leow
August 20, 2024
Google is brushing up against the competition with Imagen 3, a free-to-use artificial intelligence image generator it says is its most powerful yet. As outlined in a research paper, the new model yields works that rival those of the top players in the AI art space. Imagen 3 is now available to all in the United States.
Imagen 3 is built on a latent diffusion model, allowing it to generate high-quality images from text prompts with newfound precision. The tool is said to outperform other state-of-the-art image generators—including the paid Midjourney and DALL-E 3 apps, according to PetaPixel—with an astute ability to handle complex prompts, as well as capture nuanced details like specific camera angles, compositions, and lighting that would challenge other AI models.
A key improvement in Imagen 3 is its enhanced understanding of user input, leading to sharper details and fewer distracting visual artifacts. This means users no longer need to be AI prompt engineers to get satisfying results. It delivers when it comes to creating small details, from the wrinkles on a person’s hand, to intricate textures, like those on a knitted stuffed animal.
Imagen 3’s adaptability extends across a wide range of styles and formats, including realistic landscapes and whimsical claymation scenes.
On top of that, the tool brings advanced text rendering capabilities.
The tech giant is also tightening the reins on what Imagen 3 can generate. In response to past controversies, the company has implemented stricter safeguards to prevent the creation of offensive or illegal content. Further, Imagen 3 will not generate images of public figures or weapon-related visuals.
In the coming months, Google will build Imagen 2’s more sophisticated features, like inpainting and outpainting, into the newer Imagen 3.
For now, Imagen 3 is available through Google’s ImageFX tool and Vertex AI for all users residing in the US, with broader accessibility across its other offerings, such as the Gemini text generator, Workspace, and Ads, expected soon.
[via Android Central, VentureBeat, PetaPixel, images via Google]
—
[From PetaPixel]
As well as generating images, Google gives the option of editing the images using the now common inpainting technique. This method allows the user to select a part of the image and type in the change they would like to see.
Unlike Elon Musk’s Grok AI image generator, Google has placed restrictions on Imagen 3. PetaPixel was unable to generate an image of “Kamala Harris and Donald Trump holding hands” or “A Californian landscape in the style of Ansel Adams.”
However, as is well documented, there are workarounds. For example, by asking Imagen 3 to “Make a dramatic black and white photo taken in 1942 of the Grand Teton National Park in Wyoming,” the user will receive an image similar to Ansel Adams’ work.
The Verge got around copyright restrictions on famous cartoon characters by asking for “an image of a cartoonish blue hedgehog running in a field” and receiving a picture of Sonic the Hedgehog.
Earlier this year, Google landed in hot water after its AI image generator on Gemini was accused of overcorrecting for biases and essentially “erasing white people.” It led Google to remove the image generator entirely.
—
[From VentureBeat]
Google’s release of Imagen 3 to the broader U.S. public represents a strategic move in the intensifying AI arms race. However, the reception has been mixed. While some users praise its improved texture and word recognition capabilities, others express frustration with its strict content filters.
One user on Reddit noted, “Quality is much higher with amazing texture and word recognition, but I think it’s currently worse than Imagen 2 for me.” They added, “It’s pretty good, but I’m working harder with higher error results.”
The censorship implemented in Imagen 3 has become a focal point of criticism. Many users report that seemingly innocuous prompts are being blocked. “Way too censored I can’t even make a cyborg for crying out loud,” another Reddit user commented. Another said, “[It] denied half my inputs, and I’m not even trying to do anything crazy.”
These comments highlight the tension between Google’s efforts to ensure responsible AI use and users’ desires for creative freedom. Google has emphasized its focus on responsible AI development, stating, “We used extensive filtering and data labeling to minimize harmful content in datasets and reduced the likelihood of harmful outputs.”
Grok-2: xAI’s controversial unrestricted approach
In stark contrast, xAI’s Grok-2, integrated within Elon Musk’s social network X and available through premium subscription tiers, offers image generation capabilities with virtually no restrictions. This has led to a flood of controversial content on the platform, including manipulated images of public figures and graphic depictions that other AI companies typically prohibit.
The divergent approaches of Google and xAI underscore the ongoing debate in the tech industry about the balance between innovation and responsibility in AI development. While Google’s cautious approach aims to prevent misuse, it has led to frustration among some users who feel creatively constrained. Conversely, xAI’s unrestricted model has reignited concerns about the potential for AI to spread misinformation and offensive content.
Industry experts are closely watching how these contrasting strategies will play out, particularly as the U.S. presidential election approaches. The lack of guardrails in Grok-2’s image generation capabilities has already raised eyebrows, with many speculating that xAI will face increasing pressure to implement restrictions.
The future of AI image generation: Balancing creativity and responsibility
Despite the controversies, some users have found value in Google’s more restricted tool. A marketing professional on Reddit shared, “It’s so much easier to generate images via something like Adobe Firefly than digging through hundreds of pages of stock sites.”
As AI image generation technology becomes more accessible to the public, the industry faces critical questions about the role of content moderation, the balance between creativity and responsibility, and the potential impact of these tools on public discourse and information integrity.
The coming months will be crucial for both Google and xAI as they navigate user feedback, potential regulatory scrutiny, and the broader implications of their technological choices. The success or failure of their respective approaches could have far-reaching consequences for the future development and deployment of AI tools across the tech industry.