OpenAI CEO Sam Altman has declared a “code red” to focus company resources on improving ChatGPT, according to an internal memo reported by The Wall Street Journal and The Information.

The memo signals OpenAI’s response to growing competition from Google, whose Gemini 3 model has outperformed ChatGPT in several benchmark tests since launching last month, according to Google’s own evaluation data and third party leaderboards.

What’s New

Altman told employees that ChatGPT’s day to day experience needs improvement. Specific areas include personalization features, response speed and reliability, and the chatbot’s ability to answer a wider range of questions.

The company uses a color-coded system to indicate priority levels. This effort has been elevated to “code red,” above the previous “code orange” designation for ChatGPT improvements.

A new reasoning model is expected to launch next week, according to the memo, though OpenAI hasn’t publicly announced it.

Delayed Products

Several product initiatives are being postponed as a result.

Advertising integration, which OpenAI had been testing in beta versions of the ChatGPT app, is now on hold, according to The Information. AI agents designed for shopping and healthcare are also delayed, along with improvements to ChatGPT Pulse.

Altman has encouraged temporary team transfers to support ChatGPT development and established daily calls for those responsible for improvements.

Competitive Context

On the technical side, Google’s Gemini 3 and related models have posted strong scores on reasoning benchmarks. Google says Gemini 3 Deep Think outperforms earlier versions on Humanity’s Last Exam, a frontier level benchmark created by AI safety researchers, and other difficult tests. Those results are reflected on Google’s own Gemini 3 Pro benchmark page and on independent leaderboards that track model performance.

OpenAI hasn’t released comparable public benchmark data for its next reasoning model yet, so comparisons rely on current GPT 5 results rather than the upcoming system referenced in the memo.

Google is also continuing to invest in generative image tools like its Nano Banana and Nano Banana Pro image generators, which sit alongside Gemini 3 as part of a broader AI product lineup.

Benchmark Context

Humanity’s Last Exam is intended to be a harder successor to saturated benchmarks like MMLU. It’s maintained by the Center for AI Safety and Scale AI, with an overview available on the project site and results tracked by multiple leaderboards, including Scale’s official leaderboard and third party dashboards such as Artificial Analysis.

Google’s Gemini 3 Pro benchmark documentation lists a higher score on Humanity’s Last Exam than several competing models, including GPT 5. That’s the basis for reporting that Gemini 3 has “outperformed” ChatGPT on that specific benchmark.

OpenAI has published strong results on other reasoning benchmarks for its GPT 5 series, but the…


Source link

Disclaimer

We strive to uphold the highest ethical standards in all of our reporting and coverage. We blogs.grocliq.com want to be transparent with our readers about any potential conflicts of interest that may arise in our work. It’s possible that some of the investors we feature may have connections to other businesses, including competitors or companies we write about. However, we want to assure our readers that this will not have any impact on the integrity or impartiality of our reporting. We are committed to delivering accurate, unbiased news and information to our audience, and we will continue to uphold our ethics and principles in all of our work. Thank you for your trust and support.

Website Upgradation is going on for any glitch kindly connect at [email protected]

 

 

Categorized in:

Blog,

Last Update: December 2, 2025