Gabe Westmaas Rackspace @westmaas
Transcription
Monitoring OpenStack at Scale
Gabe Westmaas Rackspace @westmaas

What Scale?
Tens of Thousands of Hosts
Hundreds of Thousands of Instances

Why are we monitoring?

Uptime
99.99% API Availability
99.9% Build Success Rate
100% Data Plane Availability

Performance
Build Times
API Latency

Capacity
Memory Utilization
Empty Hosts
IPv4 Addresses

Tools
nagios
Sampling
stacktach/ceilometer
E-mail
slinky*
*More on this later
graphite

Fixing the Cloud
Mean Time to Resolution
slinky

Performance and Uptime Improvements
45% Build Time Reduction
99.95% API Availability
99.5% Build Success Rate
99.99% Data Plane Availability

What’s next?

Thank You
https://gist.github.com/westmaas/7227895