For the time beingā€¦ Iā€™ve just blocked all of OpenAI(s) Bots. They (thankfully) publish a JSON endpoint that you can use to block all OpenAI crawlers from reaching your server (in my case, blocking it at the edge). Example:

proxy-1:~# curl -qs https://openai.com/gptbot.json | jq -r '.prefixes[].ipv4Prefix' | xargs -I{} ./block-ip.sh {}

Where block-ip.sh is simply:

#!/bin/sh

ufw insert 1 deny from "$1" to any

ā¤‹ Read More