AI art generation has evolved at a wild pace, and Google just threw another big competitor into the mix with Gemini Flash 2.0. You can play with the new image creation tool in Google’s AI Studio.
Gemini Flash is, as the name suggests, very quick, and notably faster than DALL-E 3 and other image creators. That kind of speed could mean lower-quality images, but that is not the case here, thanks to all the changes and upgrades to the model’s image production ability. Still, if you want really good results, you need to know how to talk to the AI. After lots of trial and error, I have put together five tips to get the absolute best art out of Gemini Flash 2.0. Some of these may sound like advice for other AI art generators because they are, but that does not make them less useful in this context.
Tell a story
The most interesting new feature of Gemini Flash’s image creation is that it is not only good for one-off illustrations. It can actually help you create a visual story by generating a series of related images with a consistent style, setting, and mood.
To get started, just ask it to tell you a story and say how often you want an illustration to go with the action. The result will include the images that accompany the text.
For my project, I asked the AI to “generate a story of a heroic baby dragon who protected a fairy queen from an evil wizard in a 3D cartoon animation style. For each scene, generate an image.” I watched the result above start to appear. And if there is a problem, you can rewrite any part of the story and the model regenerates the image accordingly.
Be super specific
If you ask Gemini to make “a dog in a park”, you may get a blurry golden retriever sitting somewhere vaguely green. But if you say, “a fluffy golden retriever sitting on a wooden bench in Central Park during the fall, with red and orange leaves scattered on the ground”, you get exactly what you imagine.
AI models thrive on details. The more you give, the better your picture becomes. So for the picture above, instead of just asking for a futuristic city, I requested “a retro-futuristic cityscape at sunset, with neon signs glowing in pink and blue, flying cars in the sky, and people walking in retro-style outfits.” Seven seconds later, the result came in.
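If you write a lot of prompts, it can help to treat the details as separate slots you fill in, so you notice when one is missing. Here is a minimal sketch of that idea in Python. The helper name `build_prompt` and its parameters are made up for illustration, and nothing here talks to Gemini; it only assembles the text you would paste into the chat.

```python
def build_prompt(subject, setting="", lighting="", details=()):
    """Join the non-empty pieces of an image description into one prompt string."""
    parts = [subject]
    if setting:
        parts.append(f"in {setting}")
    if lighting:
        parts.append(f"at {lighting}")
    prompt = " ".join(parts)
    # Extra details are appended as a comma-separated "with ..." clause.
    extras = [d for d in details if d]
    if extras:
        prompt += ", with " + ", ".join(extras)
    return prompt


# A vague prompt: only the subject is filled in.
vague = build_prompt("a futuristic city")

# A specific prompt: lighting and concrete details are spelled out.
specific = build_prompt(
    "a retro-futuristic cityscape",
    lighting="sunset",
    details=["neon signs glowing in pink and blue", "flying cars in the sky"],
)

print(vague)
print(specific)
```

The second prompt is the kind of thing that tends to work better, simply because every slot that matters to you is stated instead of left for the model to guess.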
Get conversational
One of my favorite things about the new Gemini Flash is that you can have a conversation with it without losing much of the speed. This means you don’t have to get everything right in one go. After generating a picture, you can literally chat with the AI to make edits. Want to change the colors? Add a character? Make the lighting moodier? Just ask.
In the picture set above, I started by asking for “a cozy reading nook with a fireplace, bookshelves filled with novels and a large, comfy armchair.” Then I refined it by asking it to “make it nighttime with soft, warm lighting”, followed up by asking it to “add a sleeping cat to the armchair”, and finished by asking the AI to “give the room a vintage, Victorian aesthetic.” The final result on the left looks almost exactly like what I imagined, making Gemini feel like an art assistant, one that can adapt to what I want without starting from scratch every time.
Use real-world knowledge
Google has boasted that Gemini is packed with real-world knowledge, which means you can get historical accuracy, realistic cultural details and true-to-life images if you ask for them. Of course, it helps to be specific. For example, if you ask it for “a Viking warrior”, you may get something more like a Game of Thrones character. But if you say, “a historically accurate Viking warrior from the 9th century, wearing detailed chainmail armor, a round wooden shield and a traditional Norse helmet”, you get something much more accurate.
As a test, I asked the AI to make “an ancient Mayan city at sunrise, with towering stone pyramids, lush jungle surroundings and people dressed in traditional Mayan clothes.” It’s not perfect, but it looks much more like the real thing than previous versions, which would sometimes come back with something closer to an Egyptian pyramid.
Write quickly
Most AI image models have long struggled to reproduce text, turning words into illegible scribbles. Even the better models today that can do it take a while, and getting it right can take a few attempts. But Gemini Flash is shockingly good at integrating text into images quickly and legibly. Still, being very specific helps.
I generated the picture above by asking the AI to “make a vintage-style travel poster that says ‘Visit London’ in bold, retro typography, with a stylized illustration of the city.”