DEV Community

# resilience

Designing systems that can withstand and recover from failures gracefully.

Posts

Sovereign AI Systems Require Governed Environments

2 min read
Circuit breaker in Go: surviving exchange outages

6 min read
Your Python API Calls Will Fail. Here's How to Handle It.

4 min read
Evaluating and Improving Proposed Architecture for Production Application Suitability

14 min read
Mastering Kubernetes Chaos Engineering: Strategies for Building Resilient Cloud-Native Applications

4 min read
AWS UAE Data Center Fire Causes Service Disruptions: EC2, RDS, DynamoDB Affected, Slow API Calls Reported

7 min read
Graceful Exit Strategies: How to Fail at a Project Without Crashing Your Life

9 min read
When Cloud Infrastructure Fails: The Iranian Drone Attacks And What Comes Next

6 min read
When Bet365 Goes Dark: What a Betting Outage Says About the Cloud in 2026

7 min read
Kubernetes Probe Anti-Pattern: Stop Restarting Pods That Don't Need It

5 min read
What Event Sourcing Taught Us About Building Resilient Delivery Systems

4 min read
Chaos Engineering: Testing System Resilience

7 min read
Testing Redis Circuit Breaker with Toxiproxy

8 min read
How to Handle AI Service Overload Without Breaking Your Entire System

3 min read
How to Build Resilient Distributed AI Agent Systems That Survive Gateway Failures

Comments 1
2 min read