Searching yarn

Twts matching #SRE
In-reply-to » grafana is confusing af i deployed it again for my job (that is so wild to say...) and i'm like HOW DO THESE ALERTS WORK

Move beyond basic threshold alerts! Define clear Service Level Objectives (SLOs) and measure Service Level Indicators (SLIs) to track real user impact. Use Prometheus to alert when your SLOs are at risk, ensuring you focus on what truly matters to your users. #Monitoring #SRE #Prometheus
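As a rough illustration of alerting when an SLO is "at risk", here is a minimal sketch of an error-budget burn-rate check. The function names, the two-window check and the 14.4 threshold are illustrative assumptions, not something from the post; in practice this logic would usually be written as a Prometheus alerting rule rather than in application code.

```python
def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """Return how fast the error budget is being consumed.

    1.0 means errors arrive at exactly the rate the SLO allows;
    values above 1.0 mean the budget will run out early.
    """
    if total == 0:
        return 0.0  # no traffic, so nothing is burning the budget
    error_budget = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    return (errors / total) / error_budget


def should_page(short_window_burn: float, long_window_burn: float,
                threshold: float = 14.4) -> bool:
    """Page only if both a short and a long window exceed the threshold.

    The default threshold is an assumed value for illustration; tune it
    to your own SLO period and tolerance for noise.
    """
    return short_window_burn > threshold and long_window_burn > threshold


if __name__ == "__main__":
    # Hypothetical request counts for a 99.9% availability SLO.
    short = burn_rate(errors=30, total=1_000, slo_target=0.999)
    long_ = burn_rate(errors=200, total=12_000, slo_target=0.999)
    print(should_page(short, long_))
```

Requiring both windows to exceed the threshold is one way to keep the alert tied to sustained user impact instead of a brief spike.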

In-reply-to » I am sure it wasn’t your intention (not even remotely), but it sounds a lot like corporate bullshit. Hahahaha! Are you sure you haven’t been institutionalised?

@bender@twtxt.net Bahahah 🤣😂 mate, me and one of my SRE colleagues actually came up with the terminology ourselves! 😛

In-reply-to » This weekend (as some of you may know) I accidentally nuked this Pod's entire data volume 🤦‍♂️ What a disastrous incident 🤣 I decided instead of trying to restore from a 4-month old backup (we'll get into why I hadn't been taking backups consistently later), that we'd start afresh! 😅 Spring clean! 🧼 -- Anyway... One of the things I realised was I was missing a very critical Safety Control in my own ways of working... I've now rectified this...

This is an example of what I believe every SRE should master and what every Post Incident Review (PIR) should focus on: Where did the system fail? Which Safety Controls were missing or incomplete?


I did a take home software engineering test for a company recently, unfortunately I was really sick (have finally recovered) at the time 😢 I was also at the same time interviewing for an SRE position (as well as Software Engineering).

Got the results of my take-home today and whilst there was some good feedback, man the criticisms of my work were harsh. I’m strictly not allowed to share the work I did for this take-home test, and I really can only agree with the “no unit tests” piece of the feedback. I could have done better there, but I was time-pressured, sick and ran out of steam. I was using a lot of libraries to do the work, so in the end I found it difficult to actually think about a proper set of “Unit Tests”. I did write one (in shell) but I guess it wasn’t seen?

The other points were on my report and future work. Not detailed enough I guess? Hmmm 🤔

Am I really this bad? Does my code suck? 🤔 Have I completely lost touch with software engineering? 🤦‍♂️


Signal Status

Signal is experiencing technical difficulties. We are working hard to restore service as quickly as possible.

One thing I’d like to have one day (and it would be nice if it were integrated into twtxt.net and other pods with a familiar and pleasant user experience on Desktop, Web and Mobile) is e2e encrypted messaging that is self-hosted and federated and doesn’t suck operationally (so many of the existing solutions are complicated and hard to set up, even for a Senior DevOps/SRE).
