@zvava@twtxt.net There would be only one hash for a message. Some to-be-defined magic date selects which hash to use. If the message creation timestamp is before this epoch, hash it with v1, otherwise hammer it through v2. Eventually, support for v1 could be dropped as nobody interacts with the old stuff anymore. But I'd keep it around in my client, because why not.
If users choose a client which supports the extensions, they don't have to mess around with v1 and v2 hashing, just like today.
As for the school of thought, personally, I'd prefer something else, too. I'm in camp location-based addressing, or whatever it is called. The more I think about it, a complete redesign of twtxt and its extensions would be necessary in my opinion. Retrofitting has its limits. Of course, this is much more work, though.
@lyse@lyse.isobeef.org i don't mind if the hash is not backward compatible, but i'm not sure if this is the right way to proceed because the added complexity of dealing with two hash versions isn't justified
regular end users won't care to understand how twt hashes are formed, they just want to use twtxt! so i guess i could work in protecting users from themselves by disallowing post edits on old posts or posts with replies, but i'm not fond of this either really. if they want to break a thread, they can just delete the post (though i've noticed yarn handling post deletes dubiously...)
on activitypub i do genuinely find myself looking through several-month- or even year-old posts sometimes and deciding to edit/reword them a little to be slightly less confusing, this should be trivial to handle on twtxt which is an infinitely simpler specification
@zvava@twtxt.net I was about to suggest that you post some examples. By now, we're pretty good at debugging hashing issues, because that happens so often. But it looks like you figured it out on your own.
i'm unable to figure out why bbycll is not generating post hashes for @lyse@lyse.isobeef.org's feed correctly (or at least they differ from the ones generated by yarn)
i'm pretty sure the timezone offset is stripped off correctly (2025-09-14T12:45:00+02:00 → 2025-09-14T12:45:00Z), though messing with how the hash is generated i can't get it to make one that matches... but all other hashes for all other feeds seem to be correct? does yarn use a different canonical url for lyse internally? is there a bug in the libraries i'm using? bwehhh
wait why are so many of my post hashes not generating correctly ;w;
edit: i read the spec wrong :3 only +/-00:00 is stripped, not the entire timezone offset >.<
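For reference, a minimal sketch of the current (v1) hashing as described at https://twtxt.dev/exts/twt-hash.html, assuming the payload layout given there (feed URL, RFC 3339 timestamp and twt text joined by newlines); note that only a zero offset is folded into Z, any other offset is left untouched:

```python
import base64
import hashlib

def twt_hash_v1(feed_url: str, timestamp: str, content: str) -> str:
    # Only +00:00 / -00:00 becomes "Z"; e.g. +02:00 stays exactly as written.
    if timestamp.endswith("+00:00") or timestamp.endswith("-00:00"):
        timestamp = timestamp[:-6] + "Z"
    payload = f"{feed_url}\n{timestamp}\n{content}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").lower().rstrip("=")
    return encoded[-7:]  # v1 keeps the last 7 characters
```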
@prologic@twtxt.net i'm unsure how i feel about the hash v2 proposal, given it is completely backward incompatible with hash v1 it doesn't really solve any of the problems with it. it only delays collisions, and still fragments threads on post edits
i skimmed through discussions under the other proposals -- i agree humans are very bad at keeping the integrity of the web intact, but hashes done in this way make it impossible even for systems to rebuild threads if any post edits occurred prior to their deployment
@bender@twtxt.net just a heads up, i'm thinking of rewriting the database schema with hash v2 in mind >.<
@zvava@twtxt.net we have to amend the spec and increase the hash length. We just haven't done so yet.
ok so i have found a genuine twt hash collision. what do i do.
internally, bbycll relies on a post lookup table with post hashes as keys, this is really fast but i knew i'd inevitably run into this issue (just not so soon) so now i have to either:
  1) pick the newer post over the other
  2) break from specification and not lowercase hashes
  3) secretly associate canonical urls or additional entropy with post hashes in the backend without a sizeable performance impact somehow
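A hypothetical sketch of option 3, keeping the fast hash-keyed lookup but letting each bucket hold more than one post, so a rare collision degrades gracefully instead of silently overwriting; the Post fields and function names here are made up for illustration, not bbycll's actual schema:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Post:
    hash: str
    feed_url: str
    created: str
    content: str

posts_by_hash = defaultdict(list)  # hash -> [Post, ...]; almost always length 1

def store(post):
    bucket = posts_by_hash[post.hash]
    # Same feed + same timestamp means the same post; anything else is a genuine collision.
    if not any(p.feed_url == post.feed_url and p.created == post.created for p in bucket):
        bucket.append(post)

def lookup(hash_, feed_url=None):
    # A caller that knows the feed (e.g. from a mention in the same thread) can disambiguate;
    # everyone else gets the first entry, which matches today's single-key behaviour.
    bucket = posts_by_hash.get(hash_, [])
    if feed_url is not None:
        for p in bucket:
            if p.feed_url == feed_url:
                return p
    return bucket[0] if bucket else None
```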
[2025/09/11 12:56:01.816] ā please set config.host
when trying to run "bbycll". How to bypass that tiny hurdle?
@bender@twtxt.net as the host (eg twtxt.net) determines the canonical url of the instance in generated feed url metadata, as well as every hash of every post made on the instance internally, i added this error message to make sure people don't accidentally set up their instance on localhost :p
for testing i set it to localhost:31212 and protocols to ["http"], it's a recent addition that could definitely do with documenting in the getting started section
at first i dismissed the idea of likes on twtxt as not sensible... like at all -- then i considered they could just be published in a metadata field (though that field could get really unruly after a while)
retwts are plausible, as "RE: https://example.com/twtxt.txt#abcdefg", the hash could even be the original timestamp from the feed to make it human readable/writable, though i'm extremely wary of clogging up timelines
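A purely hypothetical sketch of how a client might recognise such a retwt line; the "RE: <feed-url>#<ref>" form is only an idea floated here, not a published extension, and <ref> could be either a twt hash or the original timestamp:

```python
import re

RETWT_RE = re.compile(r"^RE:\s+(?P<feed>https?://\S+?)#(?P<ref>\S+)\s*$")

m = RETWT_RE.match("RE: https://example.com/twtxt.txt#abcdefg")
if m:
    print(m.group("feed"), m.group("ref"))  # https://example.com/twtxt.txt abcdefg
```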
i thought quote twts could be done extremely sensibly, by interpreting a mention+hash at the end of the twt differently to when placed at the beginning -- but the twt subject extension requires it be at the beginning, so the clean fallback to a normal reply i originally imagined is out of the question -- it could still be possible (reusing the retwt format, just like twitter!) but i'm not convinced it's worth it at that point
is any of this in the spirit of twtxt? no, not in the slightest, lmao
beginnings of remote feed parsing..! the fact hashing just sort of works with the minuscule libraries i found for base32 and blake2b still amazes me (mentions are being eaten as html tags)
@zvava@twtxt.net may I recommend changing the mention format upon hitting reply to something similar to what's used in Yarn, and perhaps hiding the hash on the post too? Looking good!
@movq@www.uninformativ.de Yeah, we've seen how this plays out in practice. @dce@hashnix.club My advice, do what @movq@www.uninformativ.de has hinted at and don't change the 1st # url = field in your feed. I'm not sure if you had already, but the first url field is kind of important in your feed as it is used as the "Hashing URI" for threading.
@dce@hashnix.club Ah, oh, well then.
My client supports that, if you set multiple url = fields in your feed's metadata (the top-most one must be the "main" URL, that one is used for hashing).
But yeah, multi-protocol feeds can be problematic and some have considered it a mistake to support them.
huh.. so not even trying to be compatible with existing hashes?
@movq@www.uninformativ.de Has that hashing change even been accepted? :-?
I have zero mental energy for programming at the moment.
I'll try to implement the new hashing stuff in jenny before the "deadline". But I don't think you'll see any texudus development from me in the near future.
@lyse@lyse.isobeef.org Yeah, to avoid cutting off bits at the end making hashes end in either q or a
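For anyone wondering why it is always those two letters: 256 bits do not divide evenly into 5-bit base32 characters, so the 52nd (last) character of the unpadded encoding carries only a single bit of the digest. A quick check, assuming the v1 scheme of a 32-byte blake2b digest, base32 without padding, lowercased:

```python
import base64
import hashlib
import os

# 256 bits / 5 bits per character = 51.2, so the 52nd character encodes one real bit
# padded with zeros: 0b00000 -> 'a' or 0b10000 -> 'q'.
for _ in range(5):
    digest = hashlib.blake2b(os.urandom(32), digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").lower().rstrip("=")
    print(len(encoded), encoded[-1])  # always 52, always 'a' or 'q'
```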
tt2 from @lyse and Twtxtory from @javivf?
@prologic@twtxt.net if I understand correctly it's just to increase the hash size from 7 to 12 once it gets calculated, isn't it? BTW is this change already approved? I still don't understand how a proposal becomes an implementation in the twtxtverse
The reason I think this can work so well, and I'm in full support of it, is that it's the least disruptive way to resolve the issue of: where did this hash come from?
@prologic@twtxt.net Not sure I'd attach any "if" clauses to this. My point is: Every time I see a hash, I'd like to have a hint as to where to find the corresponding twt.
@movq@www.uninformativ.de If we're focusing on solving the "missing roots" problem, I would start to think about "client recommendations". The first recommendation would be:
- A reply to a Twt that has no initial Subject must itself have a Subject of the form (hash; url).
This way it's a hint to fetching clients that follow B, but not A (in the case of no mentions), that the Subject/Root is (very likely) in the feed url.
If we must stick to hashes for threading, can we maybe make it mandatory to always include a reference to the original twt URL when writing replies?
Instead of
(#123467) hello foo bar
you would have
(#123467 http://foo.com/tw.txt) hello foo bar
or maybe even:
(#123467 2025-04-30T12:30:31Z http://foo.com/tw.txt) hello foo bar
This would greatly help in reconstructing broken threads, since hashes are obviously (and unfortunately) one-way tickets. The URL/timestamp would not be used for threading, just for discovery of feeds that you don't already follow.
I don't insist on including the timestamp, but having some idea which feed we're talking about would help a lot.
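A rough sketch of how a client might parse these extended subjects; the format is only a suggestion in this thread, not a published extension, so the regex below is an assumption about its shape:

```python
import re

# Accepts "(#hash)", "(#hash url)" and "(#hash timestamp url)".
SUBJECT_RE = re.compile(
    r"^\(#(?P<hash>[0-9a-z]+)"
    r"(?:\s+(?P<ts>\d{4}-\d{2}-\d{2}T[0-9:+\-Z]+))?"
    r"(?:\s+(?P<url>https?://\S+))?\)"
)

for line in (
    "(#123467) hello foo bar",
    "(#123467 http://foo.com/tw.txt) hello foo bar",
    "(#123467 2025-04-30T12:30:31Z http://foo.com/tw.txt) hello foo bar",
):
    m = SUBJECT_RE.match(line)
    print(m.group("hash"), m.group("ts"), m.group("url"))
```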
July 1st. 63 days from now to implement a backward-incompatible change, apparently not open to other ideas like replacing blake with SHA, or discussing implementation challenges for other languages and platforms.
Finally, just closing #18, #19 and #20 without starting a proper discussion and ignoring a "micro consensus" feels... not right.
I don't know what to think, other than to let it rest (May will be busy here) and focus on other stuff in the future.
I will be adding the code in for yarnd very soon™ for this change, with an "if the date is >= 2025-07-01 then compute_new_hashes else compute_old_hashes"
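A rough sketch of what that cut-over could look like, assuming (per the proposal quoted below) that v1 stays the last 7 characters of the lowercased, unpadded base32(blake2b-256) digest and v2 becomes the first 12; the names and the exact payload layout here are assumptions, not yarnd's actual code:

```python
import base64
import hashlib
from datetime import datetime, timezone

V2_EPOCH = datetime(2025, 7, 1, tzinfo=timezone.utc)

def twt_hash(feed_url: str, timestamp: str, content: str) -> str:
    # Per the hash extension, a +00:00 / -00:00 offset is written as "Z" before hashing.
    if timestamp.endswith("+00:00") or timestamp.endswith("-00:00"):
        timestamp = timestamp[:-6] + "Z"
    payload = f"{feed_url}\n{timestamp}\n{content}".encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").lower().rstrip("=")
    created = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    # Old twts keep their v1 hashes so existing threads stay intact;
    # anything created on or after the epoch gets the longer v2 form.
    return encoded[:12] if created >= V2_EPOCH else encoded[-7:]
```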
Finally I propose that we increase the Twt Hash length from 7 to 12 and use the first 12 characters of the base32 encoded blake2b hash. This will solve two problems: the fact that all hashes today end in either q or a (oops), and the chance of future collisions.
Increasing the Twt Hash size will ensure that we never run into a collision for eons to come. The chance of a 50% collision with 64 bits / 12 characters is reached at roughly ~12.44B Twts. That ought to be enough! -- I also propose that we modify all our clients and make this change from the 1st July 2025, which will be Yarn.social's 5th birthday and 5 years since I started this whole project and endeavour! #Twtxt #Update
I had a Chick-fil-A breakfast today (sausage, egg, and cheese biscuit, hash browns, coffee, and orange juice). Then at lunch my workplace offered hot dogs. I had two (kosher, if that matters), plus a coke, a macadamia nut cookie, and a small chocolate brownie.
So, here I am, at home, feeling hungry but guilty and refusing to eat anything else for the rest of the day. To top it off, I have only clocked 4,000 steps today (and I don't feel like walking). I am going to hell, aren't I?
dm-only.txt feeds.
By commenting out DMs, are you giving up on simplicity? See the Metadata extension holding the data inside comments, as the client doesn't need to show it inside the timeline.
I don't think that commenting out DMs, as we are doing for metadata, is giving up on simplicity (it's a feature already), and it helps to hide unwanted DMs from clients that will take months to add support for something named... an extension.
For some other extensions in https://twtxt.dev/extensions.html (for example the reply-to hash #abcdfeg or the mention @<example http://example.org/twtxt.txt>) it's not a big deal. The twt is still understandable in plain text.
For DMs, it's only interesting for you if you are the recipient; otherwise you see a scrambled message like 1234567890abcdef=. Even if you see it, you'll need some decryption to read it. I've said before that DMs shouldn't be in the same section as the timeline, as it's confusing.
So my point stands, and as I've said before, we are discussing it as a community, so let's see what other maintainers add to the convo.
After reading you, @eapl.me@eapl.me, I'll tell you my point of view.
In my opinion, a feed does not have to be equivalent to a timeline. A timeline is a representation of the feed adapted to a user. You may not be interested in seeing other people's threads or DMs. But perhaps they are interested in seeing mentions or DMs directed at them. It is important not to fall into the trap. With that clarification...
I insist, this is my point of view, it is not an absolute truth: I don't think extensions should be respectful of clients that are no longer maintained.
We cannot have a system that is simple, backwards compatible and extensible all at the same time. We have to give up at least one of the three. I would not like to give up simplicity because it will then make it harder to maintain the clients that do stay. Therefore, I think it is better to give up backwards compatibility and play with new formulas in the extensions. I don't think it's a good idea to make a hash carry so much load: a hashtag, a thread and also a DM.
MaxAgeDays configuration at the pod level, that now some profiles are rather empty. This is only because, well, they're a bit "inactive" so to speak. Not sure what to do about this at the moment... Open to ideas?
yes it used to be http:// only and to keep hashes from breaking i added # url = http://... and now we are stuck with it due to the current specs.
Hmmm there's a bug somewhere in the way I'm ingesting archived feeds
sqlite> select * from twts where content like 'The web is such garbage these days%';
hash = 37sjhla
feed_url = https://twtxt.net/user/prologic/twtxt.txt/1
content = The web is such garbage these days. Or is it the garbage search engines?
created = 2024-11-14T01:53:46Z
created_dt = 2024-11-14 01:53:46
subject = #37sjhla
mentions = []
tags = []
links = []
sqlite>
Some A-hole has been trying to pull every single Twtxt feed that existed/still exists since forever. How do I know? Welp... They've been querying my Timeline™ instance for all of it, every single twtxt file and twt Hash they can find. It must have been going on for days and I have just noticed... + it's all coming from the same ASN AS136907 HWCLOUDS-AS-AP HUAWEI CLOUDS
Thank you Huawei for the DDoS you sons of Glitches!!!
@quark@ferengi.one No editing old Twts that are the root of a thread with replies in the ecosystem. It just results in a fork, unless the client has an implementation that does not store Twts keyed by Hash.
Ha! I stand corrected, didn't scroll long enough. Indeed, it should be added (you will need an account on Mills' Gitea), noted.
@eaplme@eapl.me you wrote:
"That PHP snippet could be merged into https://twtxt.dev/exts/twt-hash.html"
Why, though? AFAIK @andros@twtxt.andros.dev's client is on Emacs, @lyse@lyse.isobeef.org's is on Python (and Golang, for tt2), @movq@www.uninformativ.de's is on Python, and @prologic@twtxt.net's is on Golang. All the client creator needs to know is in the documentation already, coding language agnostic.
just a note that we are doing that on PHP: https://github.com/eapl-gemugami/twtxt-php/blob/master/docs/03-hash-extension.md#php-72
That PHP snippet could be merged into https://twtxt.dev/exts/twt-hash.html
@david@collantes.us @andros@twtxt.andros.dev The correct hash would be si4er3q. See https://twtxt.dev/exts/twt-hash.html, a timezone offset of +00:00 or -00:00 must be replaced by Z.
(That said, there's a bug in jenny as well. It only replaces +00:00, not -00:00.)
@andros@twtxt.andros.dev the hash on @aelaraji@aelaraji.com's last message (as I type this) is:
[si4er3q] [2025-04-16 22:49:11+00:00] [Am I tripping or `rsync` is actually THIS effing faster than `scp`!!?]
So, si4er3q.
@prologic@twtxt.net @bender@twtxt.net
What is the hash of the last message from https://aelaraji.com/twtxt.txt?
@bender@twtxt.net @aelaraji@aelaraji.com The client should ignore twts if it's not compatible or they're not addressed to me. It's a simple regex to add! It's similar to the Twt Hash Extension. Should they be in another file? They are child messages, not flat twts. Of course not!
@prologic@twtxt.net interesting. What would happen on a hash collision?
@bender@twtxt.net It's a bug in the UI for sure. The hash is the primary key.
@david@collantes.us Yeah, we've been debugging that a bit yesterday. Looks like the wrong input (sometimes) gets fed to the hash function → broken threads.
@movq@www.uninformativ.de @kat@yarn.girlonthemoon.xyz Heck yeah, that's crazy! :-) Fingers crossed! (tt also agrees with the right™ hash)
./yarnc debug <your feed url>:
The actual hash is fs7673q.
@prologic@twtxt.net that's not what I see. The hash znf6csa cannot be found.
@prologic@twtxt.net There was no edit according to my Git history. On my end, the hash is fs7673q and that's also what kat used to reply.
Doesn't look like it. Hmmm
sqlite> select * from twts where content LIKE '%Linux installation%';
hash = znf6csa
feed_url = https://www.uninformativ.de/twtxt.txt
content = I wonder if my current Linux installation will actually make it to 20 years:
$ head -n 1 /var/log/pacman.log
[2011-07-07 11:19] installed filesystem (2011.04-1)
It's not toooo far into the future.
It would be crazy ... 20 years without reinstalling once ... phew.
created = 2025-04-07T19:59:51Z
subject = (#znf6csa)
mentions = []
tags = []
links = []