Amazon’s massive AWS outage was caused by human error
Amazon revealed today that the massive outage was caused by human error. The outage left a large number of internet sites unavailable for many hours on Tuesday afternoon. In a blog post, the company said one of its employees was investigating an issue with the billing system and mistakenly took more servers offline than intended. That error started a domino effect that took down other server subsystems and related systems, which remained unavailable for several hours.
According to Amazon, removing a significant portion of the capacity caused each of these systems to require a full restart. While these subsystems were being restarted, S3 was unable to service requests. Other AWS services in the US-EAST-1 region that rely on S3 for storage, including Amazon Elastic Compute Cloud (EC2), Amazon Elastic Block Store (EBS) volumes (when data was needed from an S3 snapshot), and AWS Lambda, were also impacted while the S3 APIs were unavailable.
In response, AWS said it is making changes to ensure that such human errors won't have as large an impact. Among the changes, employees will no longer be able to remove server capacity as quickly as they previously could.
Amazon also said it is making changes to prevent the AWS Service Health Dashboard, which shows whether AWS services are operating normally or experiencing issues, from failing during similar incidents.
AWS, which leases out computing power and data storage to companies big and small, is on pace to be a $14 billion business over the next year. It also drives a large portion of Amazon’s operating income.