Yarn #t6wt7ja

twtxt.net

@movq@www.uninformativ.de I’m very curious…

What I like about this whole computer stuff is that you can explore how
things work. You can dig through problems and solve them. Nothing is
more satisfying than finally understanding something after you scratched
your head for some hours.

Surely you could do the same with AI? Tinker with how it works, study it, understand it, build your own and realize what it really is (without all the big tech hype)?

movq

www.uninformativ.de

Fri, May 29 11:07PM (6w ago)

@prologic@twtxt.net Yeah, it’s hard to get my point across here. I tried to address that a few paragraphs down.

Yes, I can tinker with AI techniques on a general level. That’s cool but not really my area of interest.

What I certainly can’t do is learn how specific AI products work. I can’t possibly find out why Claude Code produced that particular line of code. Claude is just a magic box that does something and I have to trust it.

prologic

twtxt.net

Sat, May 30 12:21AM (6w ago)

@movq@www.uninformativ.de I think your points are pretty clear to me, that’s fine. I’m just seeing if you can perhaps see things a different way. 🤔 I would challenge the assertion that you cannot understand how Claude Code generated an output, which I can demonstrate easily with a fairly trivial example, using this input:

Write a program in Go that sums a list of numbers from stdin and prints the result.

prologic

twtxt.net

Sat, May 30 12:25AM (6w ago)

Which it does in seconds, faster than I can type. The code is correct, it compiles and does exactly what I wanted. And the code looks pretty reasonable. It handles floats, has error handling and handles space- or line-separated numbers on stdin.

prologic

twtxt.net

Sat, May 30 12:27AM (6w ago)

So, going back to understanding how it generated this: it is quite simply drawing on the most statistically relevant part of the weights it has been trained on. It has basically just produced a series of tokens, one after another, each relevant to the input and to the tokens that came before it. It’s a trivial example, I know, but it basically pattern-matches its way through its vast search space, just producing outputs based on context.

prologic

twtxt.net

Sat, May 30 12:29AM (6w ago)

And every time I ask it to do the same thing, it produces basically the same result. It will sometimes not produce a go.mod, but that’s probably because doing so isn’t as statistically likely as writing the code to sum numbers from stdin.

movq

www.uninformativ.de

Sat, May 30 1:06AM (6w ago)

@prologic@twtxt.net Ahh, I see. Okay, I’m with you there. On this high level, I can understand how the thing works.

Maybe my wording isn’t good. 🤔 Let’s take a real life example from what we do at work.

There’s this AI chatbot. It gets support requests from users, so the user says something like “I need access to a particular system”. This triggers the bot to “run” the instructions stored in a large Markdown file, like “check if the user is authorized to do this, then issue the following API requests”, and so on. This is essentially like running a little script, except it’s written in natural language (German) and there’s no “script interpreter” but just the AI.

Now, suppose that the AI doesn’t quite do what was intended. There’s some subtle bug. How do you debug this? How do you find out how the AI came to the “conclusion” to run step A instead of step B? And how do you find out how exactly you have to change your prompt so this doesn’t happen again next time?

If this was an actual script/program instead of AI, you could repeat the request and attach a debugger or throw in some printf() or whatever. How do you do that kind of thing with AI? How do you pinpoint exactly what the problem was?

(Or is this just a stupid idea? Do we have to give up that way of thinking when using AI? Is the era of debuggability over?)

prologic

twtxt.net

Sat, May 30 1:23AM (6w ago)

@movq@www.uninformativ.de I’m kind of glad you bring this up, because you simply can’t. You wouldn’t even be able to in a typical neural network either (which is what these things are anyway). The problem here really isn’t the so-called “AI” (I wish we’d stop calling it AI), but the flawed usage(s) thereof. I believe I even stated earlier in this thread that sometimes it may not do what you expect; it’s “probabilistic” not “deterministic”. Those pushing for greater use need to understand this, and those not happy with the “push” should educate the ignorant here (especially managers pushing for weak, insecure and bad uses).

prologic

twtxt.net

Sat, May 30 1:26AM (6w ago)

It’s one of the reasons, in fact, that I’ve been working on bob, so I have a very concrete and strong foundation for how these things work, how they behave and how bad or good they can be. I am purposely building bob to be not only a decent coding tool and general task completion tool, but one with serious security boundaries, sanitization, auditing and compliance. If I’m going to succeed at building autonomous agents that can cope with a wider array of varying inputs (mostly natural language, some structural language), then it needs to be both a) safe and b) robust.

prologic

twtxt.net

Sat, May 30 1:27AM (6w ago)

Like with almost everything “big-tech” has done, it’s not the tech you should distrust, but the companies themselves. For example, accessing and using the models (because let’s face it, they have much larger and more powerful GPU clusters than we could ever afford to build and own ourselves, at least for now) is fine, but trusting their end-user products/services, not so much.

movq

www.uninformativ.de

Sat, May 30 1:50AM (6w ago)

@prologic@twtxt.net

it’s “probabilistic” not “deterministic”

Yep, I know. And when I tell that to people and tell them “if we use AI here, we lose the ability to debug this stuff”, then all I get is: “But it’s good enough. We don’t need to debug this. Non-deterministic computing has its use cases.”

But that is just not how I’d like to model/implement our business processes. 🤔 I want something reliable, not “it mostly works”.

movq

www.uninformativ.de

Sat, May 30 2:07AM (6w ago)

@prologic@twtxt.net (I hope I’m not too incoherent. I didn’t sleep very well recently and have a lot of unrelated stuff on my mind. 🤣)

prologic

twtxt.net

Sat, May 30 2:50AM (6w ago)

@movq@www.uninformativ.de All good, I’m tired too. Work has been burning me out lately 🥵

movq

www.uninformativ.de

Sat, May 30 3:06AM (6w ago)

@prologic@twtxt.net Oh yeah, same here. 😞 Let’s all just win the lottery and stop with this damn work thing. 🤣

prologic

twtxt.net

Sat, May 30 3:15AM (6w ago)

@movq@www.uninformativ.de Amen 🙏

prologic

twtxt.net

Sat, May 30 3:17AM (6w ago)

Bought 2 tickets 🎫 Wish me luck! 🍀

movq

www.uninformativ.de

Sat, May 30 4:01AM (6w ago)

@prologic@twtxt.net You actually did? 😅 Good luck. 😅 I never dared to, I’d probably get addicted. 🤣

prologic

twtxt.net

Sat, May 30 4:55AM (6w ago)

@movq@www.uninformativ.de Hah 😅

prologic

twtxt.net

Sat, May 30 6:48AM (6w ago)

@movq@www.uninformativ.de I guess I’m not so lucky haha 🤣 Only won $32 AUD 😅

movq

www.uninformativ.de

Sat, May 30 8:25AM (6w ago)

@prologic@twtxt.net lol, well, better than nothing, eh? What did the tickets cost? 😅

prologic

twtxt.net

Sat, May 30 8:39AM (6w ago)

@movq@www.uninformativ.de $95 🤣 So I’m down a fair bit 😳

bender

twtxt.net

Sat, May 30 9:59AM (6w ago)

@prologic@twtxt.net my gosh, that’s some expensive lottery! I have the feeling I would win, but I never play. 🤭 Wife, on the other hand, does. East and west coast, with a bunch of her friends (a pool). They have won nothing for many years. 😅

prologic

twtxt.net

Sat, May 30 11:39AM (6w ago)

@bender@twtxt.net LOL 😂

movq

www.uninformativ.de

Sat, May 30 11:48AM (6w ago)

@prologic@twtxt.net Jesus, that’s expensive. 🥴
