Hi
I'd like to share Meya’s point of view on yesterday's AWS outage. Hopefully this gives some insight into Meya’s direction.
In response to a community question:
"Obviously large parts of AWS-US East went in to meltdown yesterday which affected a lot of our SaaS integrations including Meya. Is there anyway you can mitigate the risk of BOT's being affected in future?"
preface: as a platform, it’s Meya’s responsibility to offer reliable service
short answer: redundancy
longer answer: invest in redundancy strategically, balanced against product development over time
Details
- yesterday’s event was a big deal and affected a huge number of big Internet players (GitHub, Docker, GitLab, Twilio, Zendesk, Quora, Slack, Giphy, Trello, Heroku, Apple iTunes and, ironically, StatusPage, http://www.isitdownrightnow.com/ and the AWS status page itself)
- we study events like this in detail and take steps to guard against this and similar events
- medium term: we will take steps to run sub-systems in multiple AWS availability zones
- medium term: spread static content hosting across multiple CDNs
- longer term: redundancy across bot hosting contexts (AWS, GCE, self-hosted)
- yesterday’s event demonstrated how vulnerable the Internet as a whole is when it relies on a single point of failure (AWS)
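
To make the multi-CDN point above a bit more concrete, here is a minimal sketch of client-side failover across mirrors. It is an illustration only, not Meya's actual implementation: the hostnames in `CDN_BASES` and the function names `http_get` and `fetch_with_failover` are hypothetical.

```python
import urllib.request
from typing import Callable, Optional, Sequence

# Hypothetical mirror hostnames, for illustration only.
CDN_BASES: Sequence[str] = (
    "https://cdn-a.example.com",
    "https://cdn-b.example.net",
)


def http_get(url: str, timeout: float = 3.0) -> bytes:
    """Fetch a URL; network and HTTP failures surface as OSError subclasses."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_failover(
    path: str,
    bases: Sequence[str] = CDN_BASES,
    fetch: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each mirror in order and return the first successful response."""
    last_error: Optional[Exception] = None
    for base in bases:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:
            # This mirror is unreachable or returned an error; try the next one.
            last_error = exc
    raise RuntimeError(f"all {len(bases)} mirrors failed for {path!r}") from last_error
```

The `fetch` parameter is injectable so the failover logic can be exercised without touching the network. In practice this kind of fallback is often handled at the DNS or load-balancer layer instead of in client code.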
Erik