Abstract
Storage backends for today’s cloud deployments require the integration of several discrete software and hardware components that must interoperate correctly. They must also meet the high standards of reliability and availability that end customers expect from a cloud platform. Cloud deployments scaling to thousands of VMs involve many interacting elements, such as workloads, migrations, admin actions, software faults, hardware faults, and planned or unplanned failovers. This talk covers how we simulate the reliability aspects of cloud deployments with fault injection to meet customer expectations. Beyond having a reliable cloud platform, it is important to perform capacity planning to determine tipping points or targets for acceptable performance at scale. This talk also explains the methodology, success criteria, and lessons learned from implementing cloud-scale reliability and performance testing infrastructure.
Learning Objectives
Understand how the test team modeled end-to-end solutions for testing cloud deployments
Understand the elements of cloud simulation and how to apply customer-driven success metrics for cloud reliability and performance