@movq@www.uninformativ.de and, finally, I see the ASCII. :-)
Test: Just ASCII
Perfect ASCII diagram builder
#ascii
@kat@yarn.girlonthemoon.xyz just spent like an hour playing with this and adding newjeans ASCII art this is the cutest shit ever
this is sooo cute and so fun i got it for timer stuff bc lord knows i need a timer on my computer and now i’m staring at animated ASCII cats that kiss https://github.com/poetaman/arttime
Notes on monospace, fonts, ascii, unicode | https://wonger.dev/posts/monospace-dump
@movq@www.uninformativ.de Non-ASCII characters were broken. Like U+2028, degrees (°), etc.
Turns out I used a silly library to detect the encoding and transform to UTF-8 if needed. When there is no Content-Type header, like for local files, it looks at the first 1024 bytes. Since it only saw ASCII in that region, the damn thing assumed the data to be in Windows-1252 (which for web pages kinda makes sense):
// TODO: change default depending on user's locale?
return charmap.Windows1252, "windows-1252", false
https://cs.opensource.google/go/x/net/+/master:html/charset/charset.go;l=102
This default is hardcoded and cannot be changed.
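To see why that fallback mangles things, here is a rough sketch of what happens when UTF-8 bytes get read as Windows-1252. It does not use the charset library itself. The helper name decodeAsLatin1 is made up for illustration, and it maps each byte straight to the code point of the same value. That is Latin-1, which matches Windows-1252 only for bytes 0xA0 through 0xFF, so it is enough for the degree sign but not for every character.

```go
package main

import "fmt"

// decodeAsLatin1 is a hypothetical helper: it treats every byte as the
// code point with the same value (ISO-8859-1). Windows-1252 matches this
// mapping for bytes 0xA0..0xFF, which covers the example below.
func decodeAsLatin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

func main() {
	// "°" is U+00B0, encoded in UTF-8 as the two bytes 0xC2 0xB0.
	degree := []byte("°")
	fmt.Printf("% x\n", degree)         // c2 b0
	fmt.Println(decodeAsLatin1(degree)) // Â°
}
```

One character turns into two, which is the usual look of this kind of breakage. Pure ASCII bytes come out unchanged, so nothing looks wrong until the first non-ASCII character shows up.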
Trying to be smart and adding automatic support for other encodings turned out to be a bad move on my end. At least I can reduce my dependency list again. :-)
I now just reject everything that explicitly specifies a media type other than text/plain or a charset other than utf-8 (ignoring case). Otherwise I assume it’s in UTF-8 (just like the twtxt file format specification mandates) and hope for the best.
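A check like that could look roughly like this, using only the Go standard library. This is a sketch, not the actual client code. The function name acceptContentType is made up, and rejecting a malformed header is an assumption on my part.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// acceptContentType (hypothetical name) reports whether a Content-Type
// header value is acceptable: no header at all is fine, otherwise the
// media type must be text/plain and any charset must be utf-8.
func acceptContentType(header string) bool {
	if strings.TrimSpace(header) == "" {
		// Nothing specified (e.g. a local file): assume UTF-8.
		return true
	}
	// ParseMediaType lowercases the media type and the parameter names,
	// but keeps parameter values as sent.
	mediaType, params, err := mime.ParseMediaType(header)
	if err != nil {
		// Assumption: treat an unparsable header as a rejection.
		return false
	}
	if mediaType != "text/plain" {
		return false
	}
	charset, ok := params["charset"]
	return !ok || strings.EqualFold(charset, "utf-8")
}

func main() {
	for _, h := range []string{
		"",
		"text/plain",
		"Text/Plain; charset=UTF-8",
		"text/plain; charset=windows-1252",
		"text/html; charset=utf-8",
	} {
		fmt.Printf("%q -> %v\n", h, acceptContentType(h))
	}
}
```

The first three headers are accepted and the last two are rejected. Nothing gets transcoded here. Anything that passes is simply treated as UTF-8.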
@movq@www.uninformativ.de the true 7-bit ASCII
@shreyan@twtxt.net The only problem is that there is no such thing as “plain text”. Is it ASCII? UTF-8? DOS or UNIX line endings? Something else? .txt or “plain text” are ambiguous terms, I’m afraid. 🫤
Other than that, it looks neat and interesting. 😅
“ç”, I think. Anything above 7-bit ASCII would’ve done it, though.