@kat@yarn.girlonthemoon.xyz yes, both the newsletter and the podcast, from time to time.
@prologic@twtxt.net I was not expecting much, but since the list of restaurants near company buildings was hard-coded into it, I did expect it to at least copy the menu text from the websites into its database. Ironically, the only restaurant where it got something right is the one whose website has the text as a transparent PNG that the AI has to convert to text.
I asked ChatGPT what it knows about Twtxt. And surprisingly it’s rather accurate:
Twtxt is a minimalist, decentralized microblogging format introduced by John Downey in 2016. It uses plain text files served over HTTP, with no accounts, databases, or APIs.
In 2020, James Mills (@prologic@twtxt.net) launched Yarn.social, an extended, federated implementation with user discovery, threads, mentions, and a full web UI.
Both share the same .twtxt.txt format but differ in complexity and social features.
Democrats: Doge building a “master database” of Americans’ sensitive information
@bender@twtxt.net Exactly. I suspect it was because of sqlitebrowser also accessing the database in parallel to debug the original issue.
So far, I have not found the exact reason why some replies don’t show up. When I do not filter for unread messages and show all, though, I actually see them. So, there’s that.
I just noticed that my unread messages counter was off by quite a bit. It showed 8, but I only saw one unread message. Even after restarting my client, which recalculates the number of unread messages, it remained at eight. Weird. Looking in the database revealed that this is indeed correct.
Apparently, my query to build up the message tree must be incorrect. It somehow misses seven messages. They all are orphaned, maybe that’s a clue. However, generating missing root messages (and thereby including the replies) typically works just fine. Hmm.
@movq@www.uninformativ.de json and database put together sounds terrifying. i must try jenny
jenny really isn’t well equipped to handle edits of my own twts.
For example, in 2021, this change got introduced:
https://www.uninformativ.de/git/jenny/commit/6b5b25a542c2dd46c002ec5a422137275febc5a1.html
This means that jenny will always ignore my own edits unless I also manually edit its internal “json database”. Annoying.
That change was requested by a user who had the habit of deleting twts or moving them to another mailbox or something. I think that person is long gone and I might revert that change.
A threat model for opposing authoritarianism
A decade ago, I published a book on privacy, “Dragnet Nation: A Quest for Privacy, Security, and Freedom in a World of Relentless Surveillance.” In the book, and since then, in articles and speeches, I have been dispensing advice to people on how to protect their privacy. But my advice did not envision the moment we are in, where the government would collaborate with a tech CEO to strip-mine all of our data from government databases and use i …
Windows Recall returns, and its companion feature does not keep data on-device
Remember Windows Recall, the Windows feature that would take a screenshot of your desktop every three seconds, store them in a database, and then let you search through them at later dates? The feature has been hobbled by implementation problems, security issues, and privacy troubles, and has been released in preview and pulled since its original unveiling. Well, it’s back in …
@prologic@twtxt.net is it twice in the database, or simply rendering twice? If you manually expunge it, will it affect the yarn?
@xuu@txt.sour.is Wow, that’s a giant graveyard. In my new database I have 16,428 messages as of now. Archive feed support is not yet available, so it’s just the sum of all the 36 main feeds.
tt reimplementation that I already followed with the old Python tt. Previously, I just had a few feeds for testing purposes in my new config. While transferring, I "dropped" heaps of feeds that appeared to be inactive.
Thanks, @movq@www.uninformativ.de!
My backing SQLite database with indices is 8.7 MiB in size right now.
The twtxt cache is 7.6 MiB, it uses Python’s pickle module. And next to it there is a 16.0 MiB second database with all the read statuses for the old tt. Wow, super inefficient, it shouldn’t contain anything else, it’s a giant, pickled {"$hash": {"read": True/False}, …}. What the heck, why is it so big?! O_o
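A quick way to poke at the size question is to pickle the same read statuses in two shapes and compare. This is only a sketch with made-up hashes (the `h000001` style keys and the count of 10,000 are assumptions, not data from the real file), and it does not claim to explain the full 16 MiB. It only shows that the inner {"read": …} dict costs extra bytes per entry.

```python
import pickle

# Hypothetical data: 10,000 fake message hashes, all marked as read.
hashes = [f"h{i:06d}" for i in range(10_000)]

nested = {h: {"read": True} for h in hashes}  # the shape quoted above
flat = {h: True for h in hashes}              # same information, no inner dict

nested_size = len(pickle.dumps(nested))
flat_size = len(pickle.dumps(flat))

print(f"nested: {nested_size} bytes, flat: {flat_size} bytes")
```

Running the same comparison against the real file (and against a plain SQLite table) would show where the space actually goes.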
A collection of PostgreSQL patterns that you can use in other databases
https://mccue.dev/pages/3-11-25-life-altering-postgresql-patterns
#postgresql #databases
(Back in tt.) Well, it kinda worked. At least appending to the file. But my cache database got screwed up. I do not yet support replies, so the subject and root hash columns have not been set at all, resulting in a message that is just not shown at all. I gotta do something about that next. The good thing is, though, after simply fixing the two columns the message appeared on screen.
wahhh i wanna work towards my dream of offering pay as you can web hosting (static & dynamic) but i don’t know how!!!!! i keep drifting towards hosting panels but i don’t exactly have fresh linux servers for those nor do i like the level of access they require. so i’m like ok i can do the static site part with SFTP chroot jails and a front-end like filebrowser or something…. but then what about the dynamic sites!!!!!!! UGH
granted i doubt i’d get much interest in dynamic sites but i’d like to do this old school where i can offer people isolated mySQL databases or something for some project (i’m thinking PHP based fanlistings), which means i could do it the old school way of… people ask me to run it and i do it for them. but i kind of want to let people have access to be able to do it themselves just short of giving them SSH access which isn’t happening
@andros@twtxt.andros.dev If something fits in a CSV file, it typically doesn’t require a database. I agree with that. Depending on the application, more complicated queries might benefit from a database, though. I don’t know awk very well, but I could imagine that grep, sed and cut reach their CSV processing limits rather quickly when you have to deal with escaped (multiline) fields.
I only very rarely have to deal with CSV files or databases in my day to day life. Maybe, these classic Unix tools offer some tricks I’m not aware of. When I have some more complicated CSV input, I generally reach for Python.
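To illustrate the escaped (multiline) field problem, here is a minimal sketch using Python's standard csv module. The sample data is invented for the example. It compares a naive line split, which is roughly what line-oriented tools see, with what a CSV-aware parser returns.

```python
import csv
import io

# Hypothetical CSV with one quoted field that contains a comma,
# an escaped quote ("") and a line break.
raw = 'id,text\n1,"hello, ""world""\nsecond line"\n2,plain\n'

# A CSV-aware parser keeps the quoted field together as one value.
rows = list(csv.reader(io.StringIO(raw)))

# A naive line split cuts the record with the embedded newline in half.
naive_lines = raw.splitlines()

print(len(rows), "records vs.", len(naive_lines), "physical lines")
```

The parser sees a header plus two records, while the line-based view sees four lines, which is where grep, sed and cut start to struggle.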
pls elaborate on a “p2p database”, “all story” and “Registries”.
My first thought takes me to something like secure-scuttlebutt, where it’s painful to sync data using clients, and too slow compared to downloading a text file.
Also I’d like for twtxt to avoid becoming an ActivityPub. It works well, but it uses too many resources IMO.
https://kingant.net/2025/02/mastodon-the-cost-of-running-my-own-server/
I’m defending being able to self-host your Web client (like you’d do with a WordPress; twtxt is microblogging, in the end) instead of federated instances. So at first thought I’d say Registries have many disadvantages, the first being that someone has to keep them maintained and active.
What does the #twtxt community think about having a p2p database to store all history? This will be managed by Registries.
@prologic@twtxt.net We often turn to a database when we can use a plain text file, such as a CSV. With sed or awk, you can run simple queries without using a database.
Did I get the context right?
The other day, after a discussion online, we came to the conclusion that using awk+sed+tr could replace much of the development that requires a database. However, using SQLite to have a SQL syntax isn’t a bad idea either. What do you think?
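For the "SQLite just to get SQL syntax" idea, a rough sketch: load the CSV into an in-memory SQLite database and query it. The table name, columns and rows are all made up for the example, and the sketch is in Python rather than shell.

```python
import csv
import io
import sqlite3

# Hypothetical CSV data; in practice this would be read from a file.
raw = "name,city,age\nalice,berlin,30\nbob,valencia,25\ncarol,berlin,41\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, city TEXT, age INTEGER)")

# DictReader yields one dict per row, which matches the named placeholders.
conn.executemany(
    "INSERT INTO people VALUES (:name, :city, :age)",
    csv.DictReader(io.StringIO(raw)),
)

# A grouped aggregate is short in SQL and fiddly with sed/awk/tr.
rows = conn.execute(
    "SELECT city, SUM(age) FROM people GROUP BY city ORDER BY city"
).fetchall()
print(rows)
```

Nothing is written to disk here, so the CSV file stays the source of truth and SQLite is only the query engine.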
I’m continuing my tt rewrite in Go and quickly implemented a stack widget for tview. The builtin Pages is similar but way too complicated for my use case. I would have to specify a mandatory name and some additional options for each page. Also, it allows me to randomly jump around between pages using names, but only gives me direct access to the first, however, not the last page. Weird. I don’t wanna remember names. All I really need is a classic stack. You open a new fullscreen dialog and maybe another one on top of that. Closing the uppermost brings you back to the previous one and so on.
The very first dialog I added is viewing the raw message text. Unlike in @arne@uplegger.eu’s TwtxtReader, I’m not able to include the original timestamp, though. I don’t have it in its original form in the database. :-/
Next up is a URL view.
I think it is not easy to implement, you need a database. Timeline is an elegant solution: read and sort.
FINALLY!! Got #Caddy server up and running and got rid of nginx proxy manager and MySQL database containers 🥳🥳🥳
What is clean architecture? That’s a good question.
Think of it as a pattern for organizing code around good decisions: isolating technologies (you can change the web framework or database without breaking the business logic), easy testing (you only test interfaces and use cases), sharing code between frameworks (entities and use cases), scalability, modularity and standardized names. Clean architecture is not perfect: it has a learning curve and requires some abstraction for each technology. You may even meet resistance from your colleagues.
I have a good article on this topic.
https://programadorwebvalencia.com/implementando-arquitectura-limpia-en-python/
#python
@kat@yarn.girlonthemoon.xyz I approve! That’s how I learned HTML (version 4 at the time and XHTML shortly after) and making websites, too. Some of them are still made like this to this day. Hand-written HTML. Hardly any <div> and class nonsense. I can’t remember which editor I started out with, but I upgraded to Webweaver (later renamed to Webcraft) quickly. Yeah, those were the times when there was just a single computer for the whole family.
Free hosting on Arcor, Freenet and I don’t know anymore what they were all called. Like this author, I uploaded everything via FTP. Oh dear, when was the last time I used that? And I had registered plenty of free .de.vu domains.
Being on Windows at the time, everything was ISO-8859-1 for me. No UTF-8, I don’t think I’ve heard about it back then.
Later, I wrote my own CMSes in PHP. Man, were they bad in retrospect. :-D Of course, MySQL databases were used as backends. I still exactly know the moment I read the first time about SQL injections. I tried it on my own CMS login and was shocked when I could just break in. The very next thing I did was to lock down everything with an .htaccess until I actually fixed my broken PHP code. Hahaha, good memories.
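For anyone who hasn't seen that kind of login break-in, here is a small sketch of the classic mistake and the usual fix. It uses Python with SQLite instead of the PHP and MySQL from the story, the table and function names are invented, and the plaintext password is only there to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'secret')")


def login_vulnerable(name, password):
    # DON'T: user input is pasted straight into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_safe(name, password):
    # Placeholders pass the input as data, never as part of the SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


# The injected quote closes the string and appends an always-true condition.
attack = "' OR '1'='1"
print(login_vulnerable("admin", attack), login_safe("admin", attack))
```

The vulnerable version lets the attacker in without knowing the password, while the parameterized version treats the whole attack string as a (wrong) password.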
I swear by Atom or RSS feeds. Many of my sites offer them. I daily consume feeds, they’re just great.
been playing with making fun scripts using charm CLI’s gum library :P
one that gets lyrics from an open lyrics database’s API and accepts input for artist & song names: https://asciinema.org/a/697860
and one that uses a user-provided last.fm API key to pull what’s currently playing or what last played on your account :) https://asciinema.org/a/697874
for example, ejabberd, redka, and litefs. all using sqlite+litefs for their database needs allows agents to communicate over xmpp, matrix, mqtt, and sip. other applications can use sqlite for storage or speak the redis protocol to redka. ejabberd can also handle file uploads, static file publishing, identity, and various other web application services. when scaling, litefs integrates with consul to manage replication which grants the network access to service disco, encrypted mesh networking, and various other features that can be used to build secure service grids. ejabberd and redka can be scaled to multiple nodes that coordinate over the litefs replication protocol without any changes to the db storage config. other components can be configured to plug into this framework fairly easily as well. we keep the network config fairly simple by linking nodes together with yggdrasil to flatten the address space and then linking app nodes together using consul to provide secure routing for the local grid service. yggdrasil also offers utility for building federated networks in a similarly flat address space, for more secure communications i2p is also available in yggdrasil mode. minibase is wonderful, and we have not even started to talk about secure IoT.
i am working on very smol deployments, where a server may use two or so replicated sqlite databases instead of a db server like postgres to seamlessly move from single to multi-node arrangements as needed. there is a clear performance limit here, but the goal is not to serve a huge number of clients. just to do as much as possible with a small number of useful components that can be upgraded to handle up to medium size workloads, without difficult data conversions or migrations. scaling beyond that point should be done via federation.
@prologic@twtxt.net that “little database that could” is simply amazing, isn’t it? I run Conduwuit (nevermind, this one is RocksDB), and GoToSocial using it as a backend, no issues. And, of course, sqlite is the database of choice for a lot of things under iOS.
I demand full 9 digit nano second timestamps and the full TZ identifier as documented in the tz 2024b database! I need to know if there was a change in daylight savings as per the locality in question as of the provided date.
BTW this code doesn’t incorporate existing twts into jenny’s database. It’s best used starting from scratch. I’ve been testing it using a custom XDG_CACHE_HOME and XDG_CONFIG_HOME to avoid messing with my “real” jenny data.
I wrote some code to try out non-hash reply subjects formatted as (replyto ), while keeping the ability to use the existing hash style.
I don’t think we need to decide all at once. If clients add support for a new method then people can use it if they like. The downside of course is that this costs developer time, so I decided to invest a few hours of my own time into a proof of concept.
With apologies to @movq@www.uninformativ.de for corrupting jenny’s beautiful code. I don’t write this expecting you to incorporate the patch, because it does complicate things and might not be a direction you want to go in. But if you like any part of this approach feel free to use bits of it; I release the patch under jenny’s current LICENCE.
Supporting both kinds of reply in jenny was complicated because each email can only have one Message-Id, and because it’s possible the target twt will not be seen until after the twt referencing it. The following patch uses an sqlite database to keep track of known (url, timestamp) pairs, as well as a separate table of (url, timestamp) pairs that haven’t been seen yet but are wanted. When one of those “wanted” twts is finally seen, the mail file gets rewritten to include the appropriate In-Reply-To header.
Patch based on jenny commit 73a5ea81.
https://www.falsifian.org/a/oDtr/patch0.txt
Not implemented:
- Composing twts using the (replyto …) format.
- Probably other important things I’m forgetting.
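The known/wanted bookkeeping described above could look roughly like the following. This is not the code from the patch: the schema, the see_twt function and the mail paths are all hypothetical, and the actual rewriting of mail files is left out. It only shows how a reply that arrives before its target gets resolved once the target is finally seen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE known  (url TEXT, ts TEXT, mail_path TEXT,
                         PRIMARY KEY (url, ts));
    CREATE TABLE wanted (url TEXT, ts TEXT, waiting_mail TEXT);
""")


def see_twt(url, ts, mail_path, reply_to=None):
    """Record a twt; return mail files that can now get In-Reply-To."""
    conn.execute("INSERT OR IGNORE INTO known VALUES (?, ?, ?)",
                 (url, ts, mail_path))
    fixable = []
    if reply_to is not None:  # reply_to is a (url, timestamp) pair
        target = conn.execute(
            "SELECT 1 FROM known WHERE url = ? AND ts = ?", reply_to
        ).fetchone()
        if target:
            fixable.append(mail_path)  # target already seen
        else:
            conn.execute("INSERT INTO wanted VALUES (?, ?, ?)",
                         (*reply_to, mail_path))
    # Did any earlier twt wait for the one we just saw?
    waiting = conn.execute(
        "SELECT waiting_mail FROM wanted WHERE url = ? AND ts = ?", (url, ts)
    ).fetchall()
    conn.execute("DELETE FROM wanted WHERE url = ? AND ts = ?", (url, ts))
    fixable.extend(path for (path,) in waiting)
    return fixable


a = ("https://a.example/twtxt.txt", "2024-01-01T00:00:00Z")
early = see_twt("https://b.example/twtxt.txt", "2024-01-02T00:00:00Z",
                "mail/b1", reply_to=a)   # target not seen yet
late = see_twt(*a, "mail/a1")            # target shows up afterwards
print(early, late)
```

The first call returns nothing to fix because the target is still unknown, and the second call returns the waiting mail file once the target arrives.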
Can I get someone like maybe @xuu@txt.sour.is or @abucci@anthony.buc.ci or even @eldersnake@we.loveprivacy.club (if you have some spare time) to test this yarnd PR that upgrades the Bitcask dependency for its internal database to v2?
VERY IMPORTANT: if you do, please please please back up your yarn.db database first!
Heaven knows I don’t want to be responsible for fucking up a production database here or there 🤣
Hmmmm, I somehow run into an encoding problem where my inserted data end up mangled in the database. But, both SQLite and Go use UTF-8. What’s happening here? :-?
@bender@twtxt.net Yes, they do 🤣 Implicitly, or threading would never work at all. Nor lookups 🤣 They are used as keys. Think of them like a primary key in a database or index. I totally get where you’re coming from, but there are trade-offs with using Message/Thread Ids as opposed to Content Addressing (like we do) and I believe we would just encounter other problems by doing so.
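To make the "hash as primary key" trade-off concrete, here is a small content-addressing sketch. It is loosely modelled on the idea of hashing feed URL, timestamp and text, but the exact recipe below (BLAKE2b, base32, last 7 characters) is an assumption for illustration and is not checked against the Twt Hash spec. The feed URL and texts are made up.

```python
import base64
import hashlib


def content_id(feed_url, timestamp, text, length=7):
    """Derive a short ID from a message's content (illustrative scheme)."""
    payload = "\n".join((feed_url, timestamp, text)).encode("utf-8")
    digest = hashlib.blake2b(payload, digest_size=32).digest()
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=").lower()
    return encoded[-length:]


url = "https://example.com/twtxt.txt"   # hypothetical feed
ts = "2024-09-01T12:00:00Z"

original = content_id(url, ts, "Hello world")
same = content_id(url, ts, "Hello world")
edited = content_id(url, ts, "Hello world!")  # a one-character edit

print(original, same, edited)
```

The same content always yields the same key, which is what makes lookups and threading work, but any edit produces a different key, so replies that point at the old one lose their target.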
My money is on extending the Twt Subject extension to support more (optional) advanced “subjects”; i.e.: indicating you edited a Twt you already published in your feed, as @falsifian@www.falsifian.org indicated.
Then we have a secondary (but much rarer) problem of the “identity” of a feed in the first place. Using the URL you fetch the feed from, as @lyse@lyse.isobeef.org’s client tt seems to do, or using the # url = metadata field, as every other client does (according to the spec), is problematic when you decide to change where you host your feed. In fact the spec says:
Users are advised to not change the first one of their urls. If they move their feed to a new URL, they should add this new URL as a new url field.
See Choosing the Feed URL. This is one of our longest debates and challenges, and I think (I suspect along with @xuu@txt.sour.is) that the right way to solve this is to use public/private key(s) where you actually have a public key fingerprint as your feed’s unique identity that never changes.
Correct, @bender@twtxt.net. Since the very beginning, my twtxt flow is very flawed. But it turns out to be an advantage for this sort of problem. :-) I still use the official (but patched) twtxt client by buckket to actually fetch and fill the cache. I think one of the patches played around with the error reporting. This way, any problems with fetching or parsing feeds show up immediately. Once I think I’ve seen enough errors, I unsubscribe.
tt is just a viewer into the cache. The read statuses are stored in a separate database file.
It also happened a few times, that I thought some feed was permanently dead and removed it from my list. But then, others mentioned it, so I resubscribed.
today i will start trying to extract my dots from my memex database and manage the dependency tree entirely using nix flakes
Haha, yeah sorry about that, I wasn’t even trying to nuke the database either but it worked out that way
@prologic@twtxt.net Righteo, so rookie error - I obviously had some untracked, rather important files for starting my pod and I ran a make clean. Why I originally had them in the git directory is anyone’s guess. Anyway it blew away those files including the database so that’s that. So your good self and @bender@twtxt.net etc - apologies but your profiles got nuked as well (as did my own but easily recreated).
Another thing I noticed, which was the reason I ran make clean in the first place: my pod was being built with Go 1.22.4. Could this be a problem @prologic? preflight.sh actually errors out about it…
@bender@twtxt.net I have nothing against GoToSocial, but:
GoToSocial stores statuses, accounts, etc, in a database. This can be either SQLite or Postgres.
snac is simpler. Some JSON files and that’s it. I can read them with jq and less. I can use tar to back them up. I can hand edit them in a text editor.
I think @abucci@anthony.buc.ci and @stigatle@yarn.stigatle.no are running snac? I didn’t have a closer look at snac (no intention of running it), but if that is a relatively small daemon (maybe comparable to Yarn?) that gives you access to the whole world of ActivityPub, then, well, yeah … That’s tough to beat.
Yes, I am running snac on the same VPS where I run my yarn pod. I heard of it from @stigatle@yarn.stigatle.no, so blame him. snac is written in C and is one simple executable, uses very little resources on the server, and stores everything in JSON files (no databases or other integrations; easy to save and migrate your data). It’s definitely like yarn in that respect.
I haven’t been around yarn much lately. Part of that is that I’ve been very busy at work and home and only have a limited time to spend goofing off on a social network. Part of it is that I’m finding snac very useful: I’ve connected with friends I’d previously lost touch with, I’ve found useful work-related information, I’ve found colleagues to follow, and even found interesting conferences to attend. There’s a lot more going on over there.
I guess if I had to put it simply, I’d say I have limited time to play and there are more kids in the ActivityPub sandbox than this one. That’s not a ding on yarn (I like yarn and twtxt); I’m just time constrained.
@mckinley@twtxt.net I can’t say for sure. I didn’t even know how three-way merges work till I looked it up. I guess it’s more of a git thing that would prove useful in the case of using passwordstore/pass.
As for Keepass, all I do is sync its database file across devices using Syncthing. Never felt the need to try anything else.
I guess it is safe enough for my use case, with Backup database before saving on and custom Backup Path Placeholders as Backup plan in case of an Eff up.
@shreyan@twtxt.net ever tried KeepassXC or Pass/Password Store? They are worth giving a try … Then you can keep your KeepassXC database in sync across your devices with (NOT /R/s/y/n/c) I meant Syncthing, or git in the case of Pass (using a git repo within your local network of course) (edited)
Thinking of building a simple “Things our kids say” database form, using Node, Express and SQLite3. Going beyond simple text files.
Markdown + Git as a database / object store?
From my small experience in writing an event database, I am inclined to agree with this.