ChatGPT's Images 2.0 Model Excels at Text Generation
Just a couple of years ago, telling human-made images apart from AI-generated ones was relatively straightforward. Back then, asking an image model to create a menu for a Mexican restaurant would often result in bizarre, invented dishes like "enchuita," "churiros," "burrto," or "margartas."
Today, when I request a Mexican food menu from the brand-new ChatGPT Images 2.0 model, it produces something that could be used in a real restaurant immediately, with customers unlikely to spot anything amiss. (Although a $13.50 ceviche might still raise some questions about the fish quality.)

Image Credits: ChatGPT Images 2.0
For comparison, here is the result I received from DALL-E 3 two years ago. (At that time, ChatGPT did not have image generation capabilities):

Image Credits: Microsoft Designer (DALL-E 3)
Historically, AI image generators have had significant difficulty with spelling. This is largely because they typically relied on diffusion models, which reconstruct images from random noise.
"The diffusion models [...] are reconstructing a given input," explained Asmelash Teka Hadgu, founder and CEO of Lesan AI, to TechCrunch in 2024. "We can consider text on an image to be a very minor component, so the image generator prioritizes learning the visual patterns that occupy more pixels."
Since then, researchers have investigated other approaches to image generation, such as autoregressive models. These models predict what an image should look like step by step, working more like large language models (LLMs).
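To make the contrast concrete, here is a deliberately tiny Python sketch of the two strategies. It is an illustration only, not a description of any real image model: the function names, the "move halfway toward a known target" update, and the `next_token` callback are all assumptions made up for this example. A real diffusion model learns to predict the noise to remove, and a real autoregressive model learns the next-token distribution from data.

```python
import random

def toy_diffusion(target, steps=10, seed=0):
    """Start from random noise and refine every value at once, a little per step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for _ in range(steps):
        # Stand-in for a learned denoiser: nudge each value halfway
        # toward a known target (a real model has no such target).
        x = [xi + 0.5 * (ti - xi) for xi, ti in zip(x, target)]
    return x

def toy_autoregressive(next_token, length, start):
    """Build a sequence one element at a time, each conditioned on the prefix."""
    seq = [start]
    while len(seq) < length:
        # Hypothetical callback standing in for a learned next-token predictor.
        seq.append(next_token(seq))
    return seq

if __name__ == "__main__":
    print(toy_diffusion([1.0, 2.0, 3.0], steps=5))
    print(toy_autoregressive(lambda s: s[-1] + 1, length=5, start=0))
```

The first function updates the whole output in parallel on every pass, while the second commits to one element at a time and conditions each new one on everything generated so far, which is the sense in which autoregressive generation resembles how an LLM writes text.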
Unfortunately, when asked during a press briefing this week, OpenAI declined to say which model architecture powers ChatGPT Images 2.0.
The company did clarify, however, that the new model possesses "thinking capabilities." These allow it to search the web, create multiple images from a single prompt, and review its own outputs. As a result, Images 2.0 can produce marketing materials in various dimensions, as well as multi-panel comic strips.
OpenAI also states that Images 2.0 has a better grasp of rendering non-Latin scripts, including Japanese, Korean, Hindi, and Bengali. The model's knowledge is current up to December 2025, which may affect its accuracy when generating images related to very recent events.
"Images 2.0 delivers an unprecedented level of detail and accuracy in image creation. It can not only conceptualize more complex scenes but also execute that vision effectively. It follows instructions precisely, maintains requested details, and renders fine-grained elements that often challenge other image models—such as small text, icons, UI components, intricate compositions, and subtle stylistic nuances—all at resolutions up to 2K," OpenAI noted in a press release.
These advanced capabilities mean image generation isn't as instantaneous as asking ChatGPT a text question. However, creating something complex, like a multi-panel comic, still takes only a few minutes.
All ChatGPT and Codex users will gain access to Images 2.0 starting Tuesday, with paid subscribers able to generate more advanced outputs. The company will also release the gpt-image-2 API, with pricing based on the desired output quality and resolution.