DEV Community

Site Reliability Engineering

Site Reliability Engineering principles, practices, and culture.

Posts

Kubernetes resource requests and limits explained: scheduling, throttling, and OOMKill

13 min read
Planning network checks before running them: a local-first workflow pattern

1
Comments
4 min read
Hiring SREs: What I Look For After Interviewing 100+ Candidates

3 min read
Log Management at Scale: How We Cut Costs 70% Without Losing Signal

2 min read
LiveOps Rollback Planning: What to Do When a Game Event Goes Wrong

1
Comments
7 min read
The P50 and P99 rule

1 min read
Canary Deployments: The Pattern That Cut Our Rollback Rate by 80%

2 min read
Platform Engineering: Building an Internal Developer Platform That Teams Actually Use

2 min read
Why did one day of AI cost more than a month of servers?

5 min read
Chaos Engineering for Teams That Aren't Netflix

3 min read
Blameless Postmortems in Practice

3 min read
AI Production Readiness Checklist After the POC: From Sandbox to Real System

7 min read
The Golden Signals: A Practical Implementation Guide

2 min read
Kubernetes 1.36: 8 Features Worth Your Attention

3 min read
Kubernetes Observability: What to Monitor and Why

2 min read