- Barracuda research reveals the extent of data scraping bots
- Not all bots are bad, but many extract huge amounts of data without permission
- These “gray bots” can be very aggressive, researchers warn
New research from Barracuda has identified “gray bots” alongside the good and bad bots that crawl the internet and extract data. While “good bots” such as SEO crawlers and customer service bots simply look for information, “bad bots” are designed for harmful activities such as fraud, data theft, and account takeovers.
In the space between sit “gray bots”, which Barracuda explains are GenAI scraper bots designed to extract huge amounts of data from websites, most likely to train AI models or to collect web content such as news, reviews, and travel offers.
These bots “blur the boundaries of legitimate activity,” the report claims. Although not directly malicious, their approach can be “questionable,” and some are even “very aggressive.”
Increased activity
Detection software from Barracuda recorded millions of requests to web applications from GenAI bots between December 2024 and February 2025, with one tracked web application receiving 9.7 million scraper bot requests in just 30 days.
These bots can collect and extract data without permission, overwhelm web applications with traffic, disrupt operations, and scrape copyrighted material to train AI models, potentially in violation of the owner’s rights.
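Site operators often try to identify such scrapers by the User-Agent strings they present. Here is a minimal illustrative sketch in Python: the crawler tokens listed are real, publicly documented ones (e.g. OpenAI’s GPTBot, Anthropic’s ClaudeBot), but the helper function itself is hypothetical and is not Barracuda’s detection method.

```python
# Known GenAI crawler tokens, as published by their operators.
GENAI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Google-Extended")

def is_genai_scraper(user_agent: str) -> bool:
    """Return True if the User-Agent header matches a known GenAI crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in GENAI_BOT_TOKENS)

# A site could return HTTP 403 for such requests, or exclude them from
# analytics so human traffic figures stay accurate.
print(is_genai_scraper("Mozilla/5.0; compatible; GPTBot/1.0"))  # True
print(is_genai_scraper("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"))  # False
```

Note that well-behaved crawlers can also be opted out via robots.txt (e.g. `User-agent: GPTBot` / `Disallow: /`), though aggressive scrapers may simply ignore it.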
There has been plenty of pushback against practices like these, with creative industries in the UK launching a ‘Make It Fair’ campaign to protest their work being used by AI models to create images, videos, stories, or other content without permission or credit.
This level of scraping also brings privacy risks, as some sites hold sensitive customer data, such as those in healthcare or financial services.
Bots can also skew site analytics, making it very difficult for organizations to assess and track true traffic or user behavior, and making business decisions harder as a result.