This is an automated archive made by the Lemmit Bot.

The original was posted on /r/selfhosted by /u/Andokawa on 2025-08-08 10:03:05+00:00.


I am hosting two small wikis and a web dictionary, mainly as a showcase of past and current development activities.

A few weeks ago I noticed heavily increased database activity, and found bots repeatedly requesting the wiki’s login page and crawling through the dictionary (the UA claimed “amazonbot”).

At first, I tried to block IP ranges using Windows Server Firewall, which reduced the load somewhat, but the bots seem to be hosted around the world, and you don’t want to lock out legitimate users. :/

Then I recognized a couple of patterns in their HTTP requests:

  • fantasy Chrome versions in the User Agent (versions not starting with Chrome/1…)
  • fancy combinations of all kinds of platforms and browsers (Linux Android Safari Brave Windows6 Macintosh Intel)
  • referrals from “https://google.com/”
  • the IP range 43.128.0.0/10 seems to be one of the worst offenders
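
A minimal sketch of how the patterns above could be checked in Python, for example when going through access logs. This is not the filter actually used on the server; the function name `suspicion_reasons`, the reason labels, and the platform grouping are assumptions made for illustration, and 43.128/10 is read as 43.128.0.0/10. The Chrome check is deliberately crude: it flags any major version that does not start with "1", so it would also flag genuinely old Chrome releases.

```python
import ipaddress
import re

# Assumption: "43.128/10" in the post means the CIDR block 43.128.0.0/10.
SUSPECT_NET = ipaddress.ip_network("43.128.0.0/10")

# Captures the major version from a "Chrome/<major>..." token.
CHROME_RE = re.compile(r"Chrome/(\d+)")

# Hypothetical grouping: "Linux" and "Android" legitimately appear together,
# so they count as one platform family.
PLATFORM_GROUPS = (("Windows",), ("Macintosh",), ("Linux", "Android"))


def suspicion_reasons(ip, user_agent, referer=""):
    """Return a list of labels for each heuristic the request matches."""
    reasons = []

    # Fantasy Chrome version: major version not starting with "1".
    match = CHROME_RE.search(user_agent)
    if match and not match.group(1).startswith("1"):
        reasons.append("chrome-version")

    # More than one platform family named in a single User-Agent.
    families = sum(
        any(token in user_agent for token in group) for group in PLATFORM_GROUPS
    )
    if families > 1:
        reasons.append("mixed-platforms")

    # Referer exactly as seen in the bot requests.
    if referer == "https://google.com/":
        reasons.append("bare-google-referer")

    # Client address inside the suspect range; unparsable addresses are skipped.
    try:
        if ipaddress.ip_address(ip) in SUSPECT_NET:
            reasons.append("ip-range")
    except ValueError:
        pass

    return reasons
```

An empty list means none of the heuristics matched. Since each signal on its own can hit legitimate visitors, it is probably safer to act only when several of them match at once.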

After adding a couple of suspicious User Agents to an IIS root Request Filter, the situation seems somewhat back to normal.

While I will not postulate a causal relation, The Reg coincidentally ran this story at about the same time: Perplexity AI accused of scraping content against websites’ will with unlisted IP ranges