45 Comments
Dr Sam Illingworth's avatar

Damn Daria, this is such a good post! And it has made me rethink switching my API back to Chat...

Daria Cupareanu's avatar

Thanks Sam! If you do switch, you can always keep both API keys set up and alternate depending on what you're creating. That's what I do. In some cases the saturation from GPT annoys me :)) but it always depends on the kind of visual you want to create

Dr Sam Illingworth's avatar

As usual you have shown me the way! 🥰

Magick Mica's avatar

I love this. Saving.

AI will always blow our minds.

💜ྀི🟪🫐🔮💜✨🌙💜☂️🦄💜👾🍇☂️

The magic is trying the same prompt over and over.... =)

Daria Cupareanu's avatar

And when we do repeat the same prompt, the output will look different every time, haha

TJC_Erotica's avatar

For just image generation Nano all day. For creating ads, presentations or pages GPT. I also have started to use Flux Ai.

Joël Kai Lenz's avatar

Fantastic breakdown! And I immediately copied the cat Vogue idea for my dog as well :)

Daria Cupareanu's avatar

Haha, so cute, I need to see that!

Joël Kai Lenz's avatar

Haha here is the Insta we created as a fun gig: https://www.instagram.com/the_dog_vogue

Daria Cupareanu's avatar

Nothing melts me more than cute doggos. And these ones are stylish too 😆

Joël Kai Lenz's avatar

Haha yes! I think they enjoy being on the covers 😅

The Synthesis's avatar

Your own workflow is more interesting than the comparison itself. You have both models wired into Claude and switch based on the task. The framing is "which model wins," but your actual practice points somewhere else: the durable advantage is the integration layer that turns model selection into a routing decision. GPT Images 2 leads on realism today. That gap will close, probably fast. What persists is having both available in a single context where swapping costs you nothing.

Neil Winward's avatar

Image generation for me is primarily chart generation within decks. Claude is very good at this, though it can take some time to get everything buttoned up. It saves massive amounts of my time. However, sometimes the path can be overly complicated. Here’s an example: a deck in PowerPoint needs to be re-organized, and you discover that page numbers have been embedded as images rather than using the footnote function of PowerPoint.

Chris Tottman's avatar

Google always wins in the end

Daria Cupareanu's avatar

We will for sure see in the next update, really looking forward to it.

AI Meets Girlboss's avatar

Daria, the speed I clicked on this post would be hard to measure with any timer. This is such a good comparison post! I really like these examples, and I agree ChatGPT really upped their game this time. I am still not happy with the way ChatGPT treats my specific style, but overall I am a huge fan of the new ChatGPT Images 2.0.

Daria Cupareanu's avatar

Now that you mention preserving specific styles, one thing that annoys me is color consistency. Sometimes it preserves exact brand hex colors, and sometimes it drifts into different shades. Makes me think there’s probably a different kind of prompting needed for stronger style preservation too.

And with your visuals, since they have more character-based/artistic elements, maybe GPT keeps trying to pull them toward a more realistic look, since it seems heavily tuned for that, instead of fully locking into the established style (just guessing out loud)

AI Meets Girlboss's avatar

Very interesting that you say that, because I’ve noticed something similar. ChatGPT does tend to make things a bit more realistic. For example, if it’s a fashion illustration, it adds more detail, while my original illustrations are much simpler.

What ChatGPT generates is beautiful, just different from where I started. But I’m also trying to think: maybe I should let that go a little, because it comes with so many other advantages. Tiny creative identity crisis, powered by AI.😁

Jenny Ouyang's avatar

Daria! You were born for images!

I always feel Nanobanana’s outputs are missing that sense of subtlety and nuance. Nanobanana 2 was honestly getting very close to ChatGPT before the latest upgrade, but now the gap feels wide again :)

Daria Cupareanu's avatar

Haha, I know you're a GPT Images fan. I preferred Nano Banana before, but with this update GPT is hard to beat. I'm curious what the comparison will look like when the next Gemini model comes out 😆

Dallas Payne's avatar

Loved this experiment, Daria! Fascinating to see how ChatGPT has really elevated their game and is really shining as the better choice for most things. I've had so much fun playing with it over the last few days too. It seems quite intuitive. I've also had some epic fails which have made me laugh and reminded me it's still an AI that can do silly things on occasion 😄

Daria Cupareanu's avatar

100% we'll always run into failures with them. I've noticed it's much easier when you want to create something where they can pull from their own training data (like a physics atlas or a brand campaign for a known product) vs. something with your own specific information and vision. They're much better at the first use case than the second.

The Synthesis's avatar

The epic fails are oddly the useful part. Smooth output you copy and forget; broken output makes you see the seams and sharpens your prompt instinct. It's the chess-engine vs clinical-AI pattern from https://thesynthesisai.substack.com/p/the-prepared-mind: visible feedback keeps humans getting better, frictionless answers let the skill atrophy.

Raghav Mehra's avatar

Wow, GPT Images 2.0 is definitely a big upgrade over its predecessor.

Thank you for illustrating the differences with these amazing experiments, Daria!

Daria Cupareanu's avatar

Thanks Raghav, that's exactly why I wanted to put them side by side, the differences are way more obvious like this

Nils Haaland's avatar

Thank you Daria! Terrific work! It is amazing how far the tech has come.

I’m especially impressed with your prompting: those futuristic chips really caught my attention!

I need to review the amplifier section from your last article, and I am still on the fence about purchasing a separate closed system (a Mac mini and monitor) for solo agentic work. For example, I'm curious about OpenClaw but don’t want to jeopardize my personal data.

Would love to see an analysis of what LLMs can do with animation. Not sure if the users here would agree.

Thanks again Daria for your inspiring work!

Daria Cupareanu's avatar

Thanks Nils. From what I noticed, models are great at visualizing things they have more training data on. Physical products usually get much better visuals than digital products, where it's more abstract. Anything with a lot of existing visuals and information, they nail it. That's why it's easier for them to create a great landing page for a coffee shop or a restaurant vs. a SaaS with a novel product. Or better at visualizing a CPU than figuring out how to represent something more abstract like the Amplifiers MCP and what it does.

On the solo agentic setup, I'm currently building something similar to my own "OpenClaw". Still got a long way to go and iterate on it but I'll definitely write about it when it's closer to a finished version.

And on animations, what kind would you like to create?

Nils Haaland's avatar

Two types to start:

1. Visualizations or visualizers (animations that show a process or how something works; voices not needed)

2. Storytelling: talking-character animations (for example, avatars talking to or with one another)

Have a great week!

Daria Cupareanu's avatar

Do you have some examples - like YouTube links or styles you have in mind?

Nils Haaland's avatar

Hi Daria,

(Wow lots to unpack with generating animations!)

I searched a little more about LLMs and animation and thought it best to just focus on simulations/visualizations for now.

As for simulations, about a month ago I asked Claude 4.6 to create a simulation/visualizer of a photonic reservoir accelerator that we had been working on. Eventually we came up with this (link at the bottom).

First Claude talked me through how to set up a GitHub account, and then Claude generated all the code to make the HTML simulation work and walked me through how to share it. (Bit of a learning curve for someone who’s never coded before.) But Claude was super helpful.

Have you used Claude or other LLMs to create visualizations like this to help communicate or to “see” an idea? Any advice?

Thanks again Daria for all you do!

https://bluebflatminor.github.io/betsy-photonic-reservoir/

Nils Haaland's avatar

Should be obvious, but the “visualization” is more or less an animated representation, which is different from a movie.

The Synthesis's avatar

The separate Mac mini instinct tracks with something real: agentic systems need broad permissions to be useful, and broad permissions are exactly what you don't want pointed at your primary machine. Apple shipped iOS 26.4 in March without the reimagined Siri partly because that integration problem is genuinely hard. Isolating the blast radius on dedicated hardware is the pragmatic workaround until the OS-level sandboxing catches up.

Melanie Goodman's avatar

Have you found that iterating with more specific prompts closes the gap in Nano Banana's character consistency, or does GPT hold the lead regardless of how many rounds you put in?

By the way, I just tried your ChatGPT prompt for the personal photos, and it is brilliant.

Daria Cupareanu's avatar

That makes me happy to hear! Yup, I've managed to get character consistency with Nano Banana in the past. You just need to be much more specific in the prompt: really insist on the facial features staying intact, describe them in more detail, and insist on fidelity to the reference image. So it works, it just takes more effort to get there compared to GPT now.

Susan Colantuono's avatar

This was an education!! Thank you, Daria. Now off to read how to add amplifiers inside Claude.

Daria Cupareanu's avatar

Awesome, let me know if you bump into anything or need help setting it up. The good part about Amplifiers is that they come with pre-built prompt transformers for image generation, so you can drop in a rough prompt and it'll still transform it into something good. I had to add a line at the end of each prompt in my tests to keep the original prompt intact so you could see exactly what went in and what came out from the prompt I shared.

Susan Colantuono's avatar

Thanks, Daria. I am not a tech person so your article was a bit overwhelming. I need to reread it - or ask Claude to synthesize it LOL. I will let you know when I get back to it.

Daria Cupareanu's avatar

Oh, was there a specific part that felt overwhelming?

If you need help with it at any point, just DM me and I can help 🙂 and yup, dropping the link into Claude is a really good option too.

Aisha Imtiaz's avatar

These days I am using both Nano Banana and ChatGPT for my images and this side by side comparison is so on point. Thank you.

Daria Cupareanu's avatar

Thanks Aisha, have you noticed some very obvious differences and patterns for when to use which?

Margot (you can call me AI)'s avatar

GPT Images 2 has definitely surpassed Nano. I switched in the last couple of weeks. Until the next Gemini release, that is :). Thanks for the comparison work.

Daria Cupareanu's avatar

Right? I can only imagine what's coming with the next Gemini release. This battle is getting tighter with every update.