Building a Crash-Resistant AI Gateway: Lessons from a Week-Long Incident
After a week-long incident with my self-hosted AI gateway, here’s what I rebuilt: a LaunchAgent watchdog, config backups, and graceful restart handling, plus the five things that make a gateway actually resilient.
