Enterprises are rethinking AI infrastructure as inference costs rise

AI spending in Asia Pacific continues to rise, yet many companies still struggle to get value from their AI projects. Much of this comes down to the infrastructure that supports AI, as most systems are not built to run inference at the speed or scale real applications need. Industry studies show that, as a result, many projects miss their ROI goals even after heavy investment in GenAI tools.

The gap shows how much AI infrastructure influences performance, cost, and the ability to scale real-world deployments in the region.

Akamai is trying to address this challenge with Inference Cloud, built with NVIDIA and powered by the latest Blackwell GPUs. The idea is simple: if most AI applications need to make decisions in real time, then those decisions should be made close to users rather than in distant data centres. That shift, Akamai claims, can help companies manage cost, reduce delays, and support AI services that depend on split-second responses.

Jay Jenkins, CTO of Cloud Computing at Akamai, explained to AI News why this moment is forcing enterprises to rethink how they deploy AI and why inference, not training, has become the real bottleneck.

Why AI projects struggle without the right infrastructure

Jenkins says the gap between experimentation and full-scale deployment is much wider than many organisations expect. “Many AI initiatives fail to deliver on expected business value because enterprises often underestimate the gap between experimentation and production,” he says. Even with strong interest in GenAI, large infrastructure bills, high latency, and the difficulty of running models at scale often block progress.

Most companies still rely on centralised clouds and large GPU clusters. But as use grows, these setups become too expensive, especially in regions far from major cloud zones. Latency also becomes a major issue when models have to run multiple steps of inference over long distances. “AI is only as powerful as the infrastructure and architecture it runs on,” Jenkins says, adding that latency often weakens the user experience and the value the business hoped to deliver. He also points to multi-cloud setups, complex data rules, and growing compliance needs as common hurdles that slow the move from pilot projects to production.
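To see why multi-step inference over long distances hurts, consider a rough back-of-the-envelope sketch. It assumes each inference step is a sequential round trip, so network delay is paid once per step. The function name and every number below are hypothetical illustrations, not figures from Akamai or Jenkins.

```python
def total_latency_ms(steps: int, rtt_ms: float, compute_ms: float) -> float:
    """Estimate user-visible latency for a chain of sequential inference steps.

    Assumption: every step waits for the previous one, so each step pays
    one network round trip (rtt_ms) plus one model execution (compute_ms).
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * (rtt_ms + compute_ms)


# Hypothetical values chosen only to illustrate the effect of distance.
STEPS = 5            # e.g. a request that chains five model calls
COMPUTE_MS = 40.0    # assumed per-step model execution time
DISTANT_RTT_MS = 180.0  # assumed round trip to a far-away cloud region
NEARBY_RTT_MS = 20.0    # assumed round trip to a location near the user

distant = total_latency_ms(STEPS, DISTANT_RTT_MS, COMPUTE_MS)
nearby = total_latency_ms(STEPS, NEARBY_RTT_MS, COMPUTE_MS)

print(f"distant region: {distant:.0f} ms")
print(f"nearby location: {nearby:.0f} ms")
```

Under these assumed numbers, the model work is identical in both cases and the entire difference comes from network distance, which grows with every additional step in the chain. Real systems may overlap or batch steps, so this is an upper-bound style estimate rather than a measurement.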

Why inference now demands more attention than training

Across Asia Pacific, AI adoption is shifting from small pilots to real deployments in apps and services. Jenkins notes that as this happens, day-to-day inference – not the occasional training cycle – is what consumes most computing power. With many organisations rolling out language, vision, and multimodal models in multiple markets, the demand for fast and reliable inference is rising faster than expected. This is why inference has become the main constraint in the region. Models now need to operate in different languages, regulations, and data environments, often in real time. That puts enormous pressure…
