This code displays the last 10 lines of a twtxt feed without a full download.
FEED_URL="https://twtxt.net/user/prologic/twtxt.txt"
# Read the feed's size from the Content-Length response header
MAX_RANGE=$(curl -sI "$FEED_URL" | grep -i 'content-length' | awk '{print $2}' | tr -d '\r')
# Fetch only the last 5000 bytes (assumes the feed is larger than that),
# then keep the last 10 non-comment, non-empty lines
MIN_RANGE=$((MAX_RANGE - 5000))
curl -s --range "$MIN_RANGE-$MAX_RANGE" "$FEED_URL" | grep -v -e '^#' -e '^$' | head -n 10
My self-response!
@david@collantes.us If I run
printf '%s\n%s\n%s' 'https://aelaraji.com/twtxt.txt' '2025-04-16T22:49:11+00:00' "Am I tripping or \`rsync\` is actually THIS effing faster than \`scp\`!!? 🫨" | b2sum -l 256 | awk '{ print $1 }' | xxd -r -p | base32 | tr -d '=' | tr 'A-Z' 'a-z' | tail -c 8
I have xqfsv6a. It is raw text. But… if I change the date to 2025-04-16T22:49:11Z, I have si4er3q.
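The two hashes differ because +00:00 and Z are different byte sequences, even though they name the same instant; the hash is computed over raw text, not parsed dates. A minimal sketch of normalizing timestamps before hashing, assuming GNU date:
$ date -u -d '2025-04-16T22:49:11+00:00' '+%Y-%m-%dT%H:%M:%SZ'
2025-04-16T22:49:11Z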
Then I cleaned up my shell history of all of the invocations I ever made of dkv rm ... to make sure I never ever have this so easily accessible in my shell history (^R):
$ awk '
/^#/ { ts = $0; next }              # remember the timestamp line for the following command
/^dkv rm/ { next }                  # skip every "dkv rm" invocation
{ if (ts) print ts; ts=""; print }  # emit the saved timestamp (if any), then the command
' ~/.bash_history > ~/.bash_history.tmp && mv ~/.bash_history.tmp ~/.bash_history && history -r
@andros@twtxt.andros.dev If something fits in a CSV file, it typically doesn’t require a database. I agree with that. Depending on the application, more complicated queries might benefit from a database, though. I don’t know awk very well, but I could imagine that grep, sed and cut reach their CSV processing limits rather quickly when you have to deal with escaped (multiline) fields.
I only very rarely have to deal with CSV files or databases in my day-to-day life. Maybe these classic Unix tools offer some tricks I'm not aware of. When I have some more complicated CSV input, I generally reach for Python.
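A quick illustration (my example, not from the thread) of where the naive tools give up: cut splits on every comma, including the one inside a quoted field:
$ printf '%s\n' 'name,comment' 'alice,"hello, world"' | cut -d, -f2
comment
"hello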
@prologic@twtxt.net We often turn to a database when we can use a plain text file, such as a CSV. With sed or awk, you can run simple queries without using a database.
Did I get the context right? 😀
The other day, after a discussion online, we came to the conclusion that using awk+sed+tr could replace much of the development that requires a database. However, using SQLite to have a SQL syntax isn’t a bad idea either. What do you think?
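A minimal sketch of both approaches over a hypothetical people.csv with a name,age header (the file, columns, and query are assumptions for illustration; sqlite3's .import leaves every column as TEXT, hence the CAST):
$ awk -F, 'NR > 1 && $2 > 30 { print $1 }' people.csv
$ sqlite3 :memory: '.mode csv' '.import people.csv people' 'SELECT name FROM people WHERE CAST(age AS INTEGER) > 30;'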
You really cannot beat UNIX, no really. Everything else ever invented sucks in comparison 🤣
$ diff -Ndru <(restic snapshots | grep minio | awk '{ print $1 }' | sort -u) <(restic snapshots | grep minio | awk '{ print $1 }' | xargs -I{} restic forget -n {} | grep -E '\{.*\}' | sed -e 's/{//g;s/}//g' | sort -u) | tee | wc -l; echo $?
0
0
@prologic@twtxt.net yeah I’ve played with it for a bit and read through the code hoping I could steal some of your regex. I’m trying to up my awk(1p) game but failing miserably. 😆
Been curious to see if I can filter my access.log file and output a list of my twtxt followers, just in case I’ve missed someone … I came up with this awk -F '\"' '/twtxt/ {print $(NF-1)}' /var/log/user.log | grep -v 'twtxt\.net' | sort -u | awk '{print $(NF-1) $NF}' | awk '/^\(/'
spaghetti monster of a command and I’m wondering if there’s a more elegant way of achieving the same thing.
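A hedged sketch of the same thing as a single awk program, assuming a combined-log format where the User-Agent is the last quoted field (the dedupe replaces sort -u, so pipe through sort if you want ordered output):
awk -F'"' '/twtxt/ && $(NF-1) !~ /twtxt\.net/ {
    n = split($(NF-1), w, " ")   # split the User-Agent on spaces
    ua = w[n-1] w[n]             # join its last two words, as the original pipeline did
    if (ua ~ /^\(/ && !seen[ua]++) print ua
}' /var/log/user.log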
If we stuck with Blake2b for Twt Hash(es), what bit length/size do we think we reasonably need to go to?
=> https://gist.mills.io/prologic/194993e7db04498fa0e8d00a528f7be6
e.g. (turns out @xuu@txt.sour.is is right about Blake2b being easy/simple too!):
$ printf "%s\t%s\t%s" "https://example.com/twtxt.txt" "2024-09-29T13:30:00Z" "Hello World!" | b2sum -l 32 -t | awk '{ print $1 }'
7b8b79dd
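For a feel of how the digest grows with the requested size, a small sketch over the same input (b2sum -l takes a bit length that is a multiple of 8):
for bits in 32 64 128 256; do
    printf "%s\t%s\t%s" "https://example.com/twtxt.txt" "2024-09-29T13:30:00Z" "Hello World!" \
        | b2sum -l "$bits" | awk -v b="$bits" '{ printf "%3d bits: %s\n", b, $1 }'
done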
@prologic@twtxt.net Regarding the new way of generating twt hashes, to me it makes more sense to use tabs as the separator instead of spaces, since then you can just copy/paste a line directly from a twtxt file that already has a tab between timestamp and message. But tabs might be hard to “type” when you are in a terminal, since they will trigger autocomplete…🤔
Another thing: it seems that you suggest we only use the domain in the hash creation and not the full path to the twtxt.txt:
$ echo -e "https://example.com 2024-09-29T13:30:00Z Hello World!" | sha256sum - | awk '{ print $1 }' | base64 | head -c 12
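A tab-separated variant of that suggestion might look like the following sketch (printf avoids echo -e portability quirks; the scheme itself is still the one under discussion):
$ printf '%s\t%s\t%s' "https://example.com" "2024-09-29T13:30:00Z" "Hello World!" | sha256sum - | awk '{ print $1 }' | base64 | head -c 12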
Gemini/Gopher twtxt feeds account for less than 1% of the feeds in existence:
$ total=$(inspect-db yarns.db | jq -r '.Value.URL' \
    | awk -F'//' '{if ($1 ~ /^https?/) print "http/https:"; else print $1}' \
    | sort | uniq -c | awk '{sum+=$1} END {print sum}')
$ inspect-db yarns.db | jq -r '.Value.URL' \
    | awk -F'//' '{if ($1 ~ /^https?/) print "http/https:"; else print $1}' \
    | sort | uniq -c \
    | awk -v total="$total" '{printf "%d %s %.2f%%\n", $1, $2, ($1/total)*100}' \
    | sort -r
7 gemini: 0.66%
4 gopher: 0.38%
1046 http/https: 98.96%
Thanks prx, I use a modified atom.awk for my feed :-)
JSON parser in awk: https://github.com/step-/JSON.awk
Interesting notes and tips for #awk: https://eradman.com/posts/awk-programming.html
I wrote this in my status script to reduce my computer usage: uptime | awk '{sub(",", "", $3); print $3}'
#!/bin/sh
# Validate environment
if ! command -v msgbus > /dev/null; then
    printf "missing msgbus command. Use: go install git.mills.io/prologic/msgbus/cmd/msgbus@latest\n"
    exit 1
fi
if ! command -v salty > /dev/null; then
    printf "missing salty command. Use: go install go.mills.io/salty/cmd/salty@latest\n"
    exit 1
fi
if ! command -v salty-keygen > /dev/null; then
    printf "missing salty-keygen command. Use: go install go.mills.io/salty/cmd/salty-keygen@latest\n"
    exit 1
fi
if [ -z "$SALTY_IDENTITY" ]; then
    export SALTY_IDENTITY="$HOME/.config/salty/$USER.key"
fi
# Read the user name recorded in the identity file, falling back to $USER
get_user () {
    user=$(grep user: "$SALTY_IDENTITY" | awk '{print $3}')
    if [ -z "$user" ]; then
        user="$USER"
    fi
    echo "$user"
}
# Decrypt incoming messages from a msgbus subscription
stream () {
    if [ -z "$SALTY_IDENTITY" ]; then
        echo "SALTY_IDENTITY not set"
        exit 2
    fi
    jq -r '.payload' | base64 -d | salty -i "$SALTY_IDENTITY" -d
}
# Fetch a user's salty config from their well-known endpoint
lookup () {
    if [ $# -lt 1 ]; then
        printf "Usage: %s nick@domain\n" "$(basename "$0")"
        exit 1
    fi
    user="$1"
    nick="$(echo "$user" | awk -F@ '{ print $1 }')"
    domain="$(echo "$user" | awk -F@ '{ print $2 }')"
    curl -qsSL "https://$domain/.well-known/salty/${nick}.json"
}
# Subscribe to a topic and decrypt messages as they arrive
readmsgs () {
    topic="$1"
    if [ -z "$topic" ]; then
        topic=$(get_user)
    fi
    export SALTY_IDENTITY="$HOME/.config/salty/$topic.key"
    if [ ! -f "$SALTY_IDENTITY" ]; then
        echo "identity file missing for user $topic" >&2
        exit 1
    fi
    msgbus sub "$topic" "$0"
}
# Encrypt a message for a recipient and publish it to their endpoint
sendmsg () {
    if [ $# -lt 2 ]; then
        printf "Usage: %s nick@domain.tld <message>\n" "$(basename "$0")"
        exit 1
    fi
    if [ -z "$SALTY_IDENTITY" ]; then
        echo "SALTY_IDENTITY not set"
        exit 2
    fi
    user="$1"
    message="$2"
    salty_json="$(mktemp /tmp/salty.XXXXXX)"
    lookup "$user" > "$salty_json"
    endpoint="$(jq -r '.endpoint' < "$salty_json")"
    topic="$(jq -r '.topic' < "$salty_json")"
    key="$(jq -r '.key' < "$salty_json")"
    rm "$salty_json"
    message="[$(date +%FT%TZ)] <$(get_user)> $message"
    echo "$message" \
        | salty -i "$SALTY_IDENTITY" -r "$key" \
        | msgbus -u "$endpoint" pub "$topic"
}
# Generate a new identity and print the well-known JSON that needs publishing
make_user () {
    mkdir -p "$HOME/.config/salty"
    if [ $# -lt 1 ]; then
        user=$USER
    else
        user=$1
    fi
    identity_file="$HOME/.config/salty/$user.key"
    if [ -f "$identity_file" ]; then
        printf "user key exists!\n"
        exit 1
    fi
    # Check for msgbus env.. probably can make it fallback to looking for a config file?
    if [ -z "$MSGBUS_URI" ]; then
        printf "missing MSGBUS_URI in environment\n"
        exit 1
    fi
    salty-keygen -o "$identity_file"
    echo "# user: $user" >> "$identity_file"
    pubkey=$(grep key: "$identity_file" | awk '{print $4}')
    cat << EOF
Create this file in your webserver's well-known folder: https://hostname.tld/.well-known/salty/$user.json

{
  "endpoint": "$MSGBUS_URI",
  "topic": "$user",
  "key": "$pubkey"
}
EOF
}
# check if streaming
if [ ! -t 1 ]; then
    stream
    exit 0
fi

# Show Help
if [ $# -lt 1 ]; then
    printf "Commands: send read lookup make-user\n"
    exit 0
fi
CMD=$1
shift

case $CMD in
    send)
        sendmsg "$@"
        ;;
    read)
        readmsgs "$@"
        ;;
    lookup)
        lookup "$@"
        ;;
    make-user)
        make_user "$@"
        ;;
esac
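A hypothetical session, assuming the script is saved as salty-chat somewhere on $PATH (nick, domain, and message are made up for illustration):
$ salty-chat make-user alice
$ salty-chat lookup bob@example.com
$ salty-chat send bob@example.com "Hello, Bob!"
$ salty-chat read alice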
a simple Makefile for forwarding internet traffic to your local machine:
SSH_HOST=https://xuu.me
PRIV_KEY=~/.ssh/id_ed25519

forward:
	LOCAL_PORT=$(HOST_PORT); sh -c "$(shell http --form POST $(SSH_HOST) pub=@$(PRIV_KEY).pub | grep ^ssh | head -1 | awk '{ print "ssh -T -p " $$4 " " $$5 " -R " $$7 " -i $(PRIV_KEY)" }')"
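Presumably invoked with the port to expose passed as a make variable, along these lines (the port value is hypothetical):
$ make forward HOST_PORT=8080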
threesome with awk and sed
Awk and roll: https://gitlab.com/snippets/1907701
Installing GNU awk seems to have (mostly) fixed my problems.
something about an invalid option -d for xargs and an invalid option -v for awk