What The Latest Web Almanac Report Reveals About Bots, CMS Influence & llms.txt

The Web Almanac is an annual report that translates the HTTP Archive dataset into practical insight, combining large-scale measurement with interpretation from industry specialists.

To find out what the 2025 report tells us about what is actually happening in SEO, I spoke with Chris Green, one of the authors of the updated SEO chapter and a well-known industry expert with over 15 years of experience.

Chris shared some surprises about the adoption of llms.txt files and how CMS platforms are shaping SEO far more than we realize. He also covered little-known facts that the data surfaced, along with insights that would usually go unnoticed.

You can watch the full interview with Chris on the IMHO recording at the end, or continue reading the article summary.

“I think the data [in the Web Almanac] helped to show me that there’s still a lot broken. The web is really messy. Really messy.”

Bot Management Is No Longer ‘Google, Or Not Google?’

Bot management has been a binary decision for some time (allow or disallow Google), but it is becoming a new kind of challenge. Eoghan Henn had picked up on this previously, and Chris found the same in his research.

We began our conversation by talking about how robots files are now being used to express intent about AI crawler access.

Chris responded that, first, SEOs need to be conscious of the different crawlers, what each one is for, and what blocking it might do. Blocking some bots has bigger implications than blocking others.

Second, this requires platform providers to actually honor those rules and treat the files appropriately. That isn’t always happening, and the ethics around robots and AI crawlers are an area that SEOs need to know about and understand better.

Chris explained that although the Almanac report only shows the symptom, in the form of robots.txt usage, SEOs need to get ahead and understand how to control the bots.

“It’s not only understanding what the impact of each [bot/crawler] is, but also how to communicate that with the business. If you’ve got a team who want to cut as much bot crawling as possible because they want to save money, that might desperately impact your AI visibility.

“Equally, you might have an editorial team that doesn’t want to get all of their work scraped and regurgitated. So, we, as SEOs, need to understand that dynamic, how to control it technically, but how to put that argument forward in the business as well,” Chris explained.

As more platforms and crawlers are introduced, SEO teams will have to consider all implications, and collaborate with other teams to ensure the right balance of access is applied to the site.
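To make the per-crawler trade-off concrete, here is a minimal sketch of how a team might audit which bots a robots.txt file admits. The robots.txt content, the example.com URLs, and the helper name access_report are all assumptions for illustration and do not come from the Almanac. The check uses Python's standard urllib.robotparser, whose matching rules may differ from how any individual crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt used only for illustration: search crawling is
# allowed, one AI crawler is blocked, everything else loses /private/.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def access_report(robots_txt, user_agents, url):
    """Return {user_agent: allowed} for one URL under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


if __name__ == "__main__":
    # "SomeOtherBot" is a made-up agent that falls through to the * group.
    agents = ["Googlebot", "GPTBot", "SomeOtherBot"]
    for url in ("https://example.com/articles/post",
                "https://example.com/private/draft"):
        for agent, allowed in access_report(ROBOTS_TXT, agents, url).items():
            print(f"{agent:<13} {url}  allowed={allowed}")
```

A report like this only describes what the file asks for. As Chris notes, whether a given crawler honors those rules is a separate question.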

Llms.txt Is Being Applied Despite No Official Platform Adoption

The first surprising finding of the report was that adoption of the proposed llms.txt standard is around 2% of sites in the dataset.

Llms.txt has been a heated topic in the industry, with many SEOs dismissing the value of the file….
