~netlandish/linktaco-blog

78a215109f6a1006e10b8a75049806dbf9dba1da — Peter Sanchez a month ago bc9e4fd
Added anubis post
1 files changed, 39 insertions(+), 0 deletions(-)

A content/testing-anubis-stop-ai-scrapers.md
A content/testing-anubis-stop-ai-scrapers.md => content/testing-anubis-stop-ai-scrapers.md +39 -0
@@ 0,0 1,39 @@
+++
title = 'Testing Anubis to try and stop AI scrapers'
date = 2025-03-21T07:24:25-06:00
lastmod = 2025-03-21T07:24:25-06:00
tags = ['ai', 'anubis']
summary = 'LinkTaco has deployed Anubis to try and stop AI scraper bots.'
description = 'LinkTaco has deployed Anubis to try and stop AI scraper bots.'
keywords = ['ai', 'scrapers', 'bot', 'defense']
draft = false
+++

Just a heads up: this morning we deployed [Anubis][], a new defense tool that
has sprung up this week and has already been installed by various services and
hosts with much success. Here's a little blurb taken from the project's README
file:

> Installing and using this will likely result in your website not being
> indexed by some search engines. This is considered a feature of Anubis, not a
> bug.

> This is a bit of a nuclear response, but AI scraper bots scraping so
> aggressively have forced my hand. I hate that I have to do this, but this is
> what we get for the modern Internet because bots don't conform to standards
> like robots.txt, even when they claim to.

A couple of weeks ago I [applied a patch][ci] that required users to be
authenticated in order to view a combination of more than 2 tags. This was
added specifically to stop abuse from scraper bots. Now that Anubis is
installed, and provided it does not seem to disrupt anything, I will remove the
tag limit so that normal users can view multi-tag combinations again. That
ability to drill down is an important part of LinkTaco and its organizational
abilities.
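
For anyone curious what that kind of limit looks like, here is a minimal sketch
in Python. It is not the actual LinkTaco patch (see the commit linked above for
that). The function names, the comma-separated tag format, and the
`MAX_ANONYMOUS_TAGS` constant are all assumptions made for illustration.

```python
# Hypothetical sketch of a "log in to combine more than 2 tags" rule.
# Nothing here is taken from the LinkTaco code base.

MAX_ANONYMOUS_TAGS = 2  # assumed limit, mirroring the "more than 2 tags" rule


def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag filter into unique, non-empty, lowercase tags."""
    tags: list[str] = []
    for part in raw.split(","):
        tag = part.strip().lower()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


def requires_login(raw_tags: str, is_authenticated: bool) -> bool:
    """Return True when an anonymous visitor asks for too many tags at once."""
    return not is_authenticated and len(parse_tags(raw_tags)) > MAX_ANONYMOUS_TAGS
```

With this sketch, `requires_login("go,python,rust", False)` is true, while the
same request from a logged-in user, or an anonymous request for only two tags,
goes through.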

Please write to the [discussion mailing list][dml] if you notice any issues. We
will be paying attention for a week or so, and if all is good we will remove the
tag/auth limitation.

[anubis]: https://github.com/TecharoHQ/anubis
[ci]: https://git.code.netlandish.com/~netlandish/links/commit/5851060eb47d9310b58d2b700fedd3779385ef24
[dml]: https://lists.code.netlandish.com/~netlandish/links-discuss
