Google has tightened how Gemini’s free and paid tiers work. Since May 17, 2026, Gemini Apps have metered usage by compute: how much of the allowance a request consumes depends on prompt complexity, the model chosen, and the length of the chat, and the allowance refreshes every five hours up to a weekly cap. The change is the consumer-facing edge of a compute shortage that has squeezed Google’s biggest enterprise customers for months, and it is reshaping how much free AI ordinary users get.
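
For readers who want to see the mechanics, here is a rough sketch of how a limit of this shape could be metered. It is not Google’s formula, which the company has not published. The model weights, the cost function, the budget figures and the names `prompt_cost` and `UsageMeter` are all hypothetical; only the five-hour refresh and the weekly cap come from the description above.

```python
from dataclasses import dataclass

# Hypothetical weights: a heavier model costs more per token than a lighter one.
MODEL_WEIGHT = {"light": 1, "heavy": 4}
WINDOW_SECONDS = 5 * 60 * 60       # allowance refreshes every five hours
WEEK_SECONDS = 7 * 24 * 60 * 60    # weekly cap resets after seven days


def prompt_cost(model: str, prompt_tokens: int, chat_tokens: int) -> int:
    """Illustrative cost that grows with model, prompt size and chat length."""
    return MODEL_WEIGHT[model] * (prompt_tokens + chat_tokens // 10)


@dataclass
class UsageMeter:
    window_budget: float   # units available per five-hour window (assumed)
    weekly_cap: float      # units available per week (assumed)
    window_start: float = 0.0
    week_start: float = 0.0
    window_used: float = 0.0
    week_used: float = 0.0

    def try_spend(self, cost: float, now: float) -> bool:
        """Record the request if it fits both limits; otherwise refuse it."""
        if now - self.week_start >= WEEK_SECONDS:
            self.week_start, self.week_used = now, 0.0
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start, self.window_used = now, 0.0
        if self.window_used + cost > self.window_budget:
            return False
        if self.week_used + cost > self.weekly_cap:
            return False
        self.window_used += cost
        self.week_used += cost
        return True


meter = UsageMeter(window_budget=5000, weekly_cap=8000)
heavy = prompt_cost("heavy", prompt_tokens=1000, chat_tokens=0)    # 4000 units
light = prompt_cost("light", prompt_tokens=500, chat_tokens=5000)  # 1000 units
print(meter.try_spend(heavy, now=0))    # True
print(meter.try_spend(heavy, now=60))   # False: window budget exhausted
print(meter.try_spend(light, now=60))   # True: a lighter request still fits
```

In this toy version a second heavy request is refused while a lighter one still goes through, which is one way metering by compute could nudge users towards lighter models.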

Who got hit first? Google capped Meta’s use of its Gemini models in March after Meta sought more capacity than Google could supply, the Financial Times reported. The restriction, still in place, disrupted some of Meta’s internal AI projects and pushed the company to tell staff to use AI tokens, the units that measure AI usage, more efficiently. The constraints hit several other Google clients too, though less hard than Meta. Capping one of its largest customers signals how serious the shortage already was.

Has Google admitted the problem? At its first-quarter earnings in April, CEO Sundar Pichai said Google was “compute-constrained in the near term,” and that cloud revenue would have been higher had it been able to meet demand, per the FT. Cloud revenue crossed $20 billion for the first time, while signed-but-undelivered cloud contracts nearly doubled to more than $460 billion. That backlog is the tell: customers are signing up for compute faster than Google can deliver it.

How are providers coping? Google signed a $920 million-a-month deal to lease computing capacity from Elon Musk’s SpaceX, and Anthropic, which makes Claude, struck a similar arrangement with SpaceX. These leasing deals show the shortage has become a structural feature of the market, which is why the limits are unlikely to loosen soon.

What is driving the cost? The highest cost now comes from inference, the work of running models after training, the FT noted. Every prompt a user sends consumes compute, so the more people use AI for everyday tasks, the heavier that running cost becomes.

Does Google link the limits to capacity? Google’s Gemini Apps help page ties the limits to capacity directly:

  • If capacity changes, Google may cut limits for free users before paying subscribers.
  • During periods of high demand, Google may withhold compute-heavy features such as Deep Research from free users.
  • Usage limits may change without notice, including due to capacity constraints, and Google may tighten them when activity spikes.

By Google’s own account, capacity sets the limits.

Where do consumers fit in? Token-based limits let a provider ration finite compute across its user base. People who never touch an API use Gemini Apps as mass-market tools to summarise, brainstorm and generate images. As that everyday usage scales, it puts the same constrained infrastructure under greater strain. Metering by compute lets Google charge heavier tasks more and steer demand towards lighter models, matching each…





Last Update: June 29, 2026