Hostinger released an analysis showing that businesses are blocking AI systems used to train large language models while allowing AI assistants to continue to read and summarize more websites. The company examined 66.7 billion bot interactions across 5 million websites and found that AI assistant crawlers used by tools such as ChatGPT now reach more sites even as companies restrict other forms of AI access.

Hostinger Analysis

Hostinger is a web host and also a no-code, AI agent-driven platform for building online businesses. The company said it analyzed anonymized website logs to measure how verified crawlers access sites at scale, allowing it to compare changes in how search engines and AI systems retrieve online content.

The published analysis shows that AI assistant crawlers expanded their reach across websites over a five-month period. Data was collected during three six-day windows in June, August, and November 2025.

OpenAI’s SearchBot increased coverage from 52 percent to 68 percent of sites, while Applebot (which indexes content to power Apple’s search features) doubled from 17 percent to 34 percent. During the same period, coverage by traditional search crawlers remained essentially constant. The data indicates that AI assistants are adding a new layer to how information reaches users rather than replacing search engines outright.

At the same time, the data shows that companies sharply reduced access for AI training crawlers. OpenAI’s GPTBot dropped from access on 84 percent of websites in August to 12 percent by November. Meta’s ExternalAgent dropped from 60 percent to 41 percent of websites. These crawlers collect data over time to improve AI models and update their Parametric Knowledge, but many businesses are blocking them, either to limit data use or out of concern over copyright infringement.
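The article does not say how sites enforce these choices. One common mechanism is a robots.txt file, so the following is a minimal sketch under that assumption, not a description of what Hostinger measured. It uses Python’s standard urllib.robotparser to check a hypothetical policy that blocks training crawlers while leaving assistant and search crawlers alone. The user-agent tokens (GPTBot, meta-externalagent, OAI-SearchBot, Applebot) and the example.com URL are assumptions for illustration; confirm the current tokens in each vendor’s documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block training crawlers, allow everything else.
# The user-agent tokens below are assumptions; verify them with each vendor.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def access_report(robots_txt, agents, url="https://example.com/about"):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = access_report(
    ROBOTS_TXT,
    ["GPTBot", "meta-externalagent", "OAI-SearchBot", "Applebot"],
)

for agent, allowed in report.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Note that robots.txt is advisory: it only affects crawlers that choose to honor it, so sites that need stricter enforcement typically pair it with server-side or firewall rules.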

Parametric Knowledge

Parametric Knowledge, also known as Parametric Memory, is the information that is “hard-coded” into the model during training. It is called “parametric” because the knowledge is stored in the model’s parameters (the weights). Parametric Knowledge is long-term memory about entities, for example, people, things, and companies.

When a person asks an LLM a question, the LLM may recognize an entity like a business and then retrieve the associated vectors (facts) that it learned during training. So, when a business blocks a training bot from its website, it is keeping the LLM from learning about it from that site, which might not be the best thing for an organization that’s concerned about AI visibility.

Allowing an AI training bot to crawl a company website gives that company some control over what the LLM knows about it, including what it does, its branding, whatever is on its About Us page, and the products or services it offers. An informational site may benefit from being cited for answers.

Businesses Are Opting Out Of Parametric…



Last Update: January 22, 2026