A Comprehensive, Proven Approach to IT Scalability from Two Veteran
Software, Technology, and Business Executives
In The Art of Scalability, AKF Partners cofounders Martin L. Abbott and Michael
T. Fisher cover everything IT and business leaders must know to build technology
infrastructures that can scale smoothly to meet any business requirement. Drawing on their
unparalleled experience managing some of the world’s highest-transaction-volume Web
sites, the authors provide detailed models and best-practice approaches available in no
other book.
Unlike previous books on scalability, The Art of Scalability doesn’t limit
its coverage to technology. Writing for both technical and nontechnical decision-makers,
this book covers everything that impacts scalability, including architecture, processes,
people, and organizations.
Throughout, the authors address a broad spectrum of real-world challenges, from
performance testing to IT governance. Using their tools and guidance, organizations can
systematically overcome obstacles to scalability and achieve unprecedented levels of
technical and business performance.
Coverage includes:
- Staffing the scalable organization: essential organizational, management,
and leadership skills for technical leaders
- Building processes for scale: process lessons from hyper-growth companies,
from technical issue resolution to crisis management
- Making better “build versus buy” decisions
- Architecting scalable solutions: powerful proprietary models for
identifying scalability needs and choosing the best approaches to meet them
- Optimizing performance through caching, application and database splitting,
and asynchronous design
- Scalability techniques for emerging technologies, including clouds and
grids
- Planning for rapid data growth and new data centers
- Evolving monitoring strategies to tightly align with customer requirements
Martin L. Abbott and Michael T. Fisher are founding
partners of AKF Partners, where they advise companies on scaling technology platforms,
organizations, leadership, and processes. Previously, Marty was COO of the advertising
technology startup Quigo, where he was responsible for product strategy and management,
technology, and client services. Marty also spent nearly six years at eBay, most recently
as SVP of Technology and CTO. Mike spent two years as CTO of Quigo, serving as President
during the transition following its acquisition by AOL. Prior to that, Mike led a
development organization of more than two hundred engineers as PayPal’s VP of
Engineering and Architecture.
Table of Contents
Foreword
Acknowledgments
About the Authors
Introduction
Part I: Staffing a Scalable Organization
Chapter 1: The Impact of People and Leadership on Scalability
Introducing AllScale
Why People
Why Organizations
Why Management and Leadership
Conclusion
Chapter 2: Roles for the Scalable Technology Organization
The Effects of Failure
Defining Roles
Executive Responsibilities
Organizational Responsibilities
Individual Contributor Responsibilities and Characteristics
An Organizational Example
A Tool for Defining Responsibilities
Conclusion
Chapter 3: Designing Organizations
Organizational Influences That Affect Scalability
Team Size
Organizational Structure
Conclusion
Chapter 4: Leadership 101
What Is Leadership?
Leadership–A Conceptual Model
Taking Stock of Who You Are
Leading from the Front
Checking Your Ego at the Door
Mission First, People Always
Making Timely, Sound, and Morally Correct Decisions
Empowering Teams and Scalability
Alignment with Shareholder Value
Vision
Mission
Goals
Putting Vision, Mission, and Goals Together
The Causal Roadmap to Success
Conclusion
Chapter 5: Management 101
What Is Management?
Project and Task Management
Building Teams–A Sports Analogy
Upgrading Teams–A Garden Analogy
Measurement, Metrics, and Goal Evaluation
The Goal Tree
Paving the Path for Success
Conclusion
Chapter 6: Making the Business Case
Understanding the Experiential Chasm
Defeating the Corporate Mindset
The Business Case for Scale
Conclusion
Part II: Building Processes for Scale
Chapter 7: Understanding Why Processes Are Critical to Scale
The Purpose of Process
Right Time, Right Process
When Good Processes Go Bad
Conclusion
Chapter 8: Managing Incidents and Problems
What Is an Incident?
What Is a Problem?
The Components of Incident Management
The Components of Problem Management
Resolving Conflicts Between Incident and Problem Management
Incident and Problem Life Cycles
Implementing the Daily Incident Meeting
Implementing the Quarterly Incident Review
The Postmortem Process
Putting It All Together
Conclusion
Chapter 9: Managing Crisis and Escalations
What Is a Crisis?
Why Differentiate a Crisis from Any Other Incident?
How Crises Can Change a Company
Order Out of Chaos
Communications and Control
The War Room
Escalations
Status Communications
Crises Postmortems
Crises Follow-up and Communication
Conclusion
Chapter 10: Controlling Change in Production Environments
What Is a Change?
Change Identification
Change Management
The Change Control Meeting
Continuous Process Improvement
Conclusion
Chapter 11: Determining Headroom for Applications
Purpose of the Process
Structure of the Process
Ideal Usage Percentage
Conclusion
Chapter 12: Exploring Architectural Principles
Principles and Goals
Principle Selection
AKF’s Twelve Architectural Principles
Scalability Principles In Depth
Conclusion
Chapter 13: Joint Architecture Design
Fixing Organizational Dysfunction
Designing for Scale Cross Functionally
Entry and Exit Criteria
Conclusion
Chapter 14: Architecture Review Board
Ensuring Scale Through Review
Board Constituency
Conducting the Meeting
Entry and Exit Criteria
Conclusion
Chapter 15: Focus on Core Competencies: Build Versus Buy
Building Versus Buying, and Scalability
Focusing on Cost
Focusing on Strategy
“Not Built Here” Phenomenon
Merging Cost and Strategy
AllScale’s Build or Buy Dilemma
Conclusion
Chapter 16: Determining Risk
Importance of Risk Management to Scale
Measuring Risk
Managing Risk
Conclusion
Chapter 17: Performance and Stress Testing
Performing Performance Testing
Don’t Stress Over Stress Testing
Performance and Stress Testing for Scalability
Conclusion
Chapter 18: Barrier Conditions and Rollback
Barrier Conditions
Rollback Capabilities
Markdown Functionality–Design to Be Disabled
Conclusion
Chapter 19: Fast or Right?
Tradeoffs in Business
Relation to Scalability
How to Think About the Decision
Conclusion
Part III: Architecting Scalable Solutions
Chapter 20: Designing for Any Technology
An Implementation Is Not an Architecture
Technology Agnostic Design
The TAD Approach
Conclusion
Chapter 21: Creating Fault Isolative Architectural Structures
Fault Isolative Architecture Terms
Benefits of Fault Isolation
How to Approach Fault Isolation
When to Implement Fault Isolation
How to Test Fault Isolative Designs
Conclusion
Chapter 22: Introduction to the AKF Scale Cube
Concepts Versus Rules and Tools
Introducing the AKF Scale Cube
Meaning of the Cube
The X-Axis of the Cube
The Y-Axis of the Cube
The Z-Axis of the Cube
Putting It All Together
When and Where to Use the Cube
Conclusion
Chapter 23: Splitting Applications for Scale
The AKF Scale Cube for Applications
The X-Axis of the AKF Application Scale Cube
The Y-Axis of the AKF Application Scale Cube
The Z-Axis of the AKF Application Scale Cube
Putting It All Together
Practical Use of the Application Cube
Conclusion
Chapter 24: Splitting Databases for Scale
The AKF Scale Cube for Databases
The X-Axis of the AKF Database Scale Cube
The Y-Axis of the AKF Database Scale Cube
The Z-Axis of the AKF Database Scale Cube
Putting It All Together
Practical Use of the Database Cube
Conclusion
Chapter 25: Caching for Performance and Scale
Caching Defined
Object Caches
Application Caches
Content Delivery Networks
Conclusion
Chapter 26: Asynchronous Design for Scale
Synching Up on Synchronization
Synchronous Versus Asynchronous Calls
Defining State
Conclusion
Part IV: Solving Other Issues and Challenges
Chapter 27: Too Much Data
The Cost of Data
The Value of Data and the Cost-Value Dilemma
Making Data Profitable
Handling Large Amounts of Data
Conclusion
Chapter 28: Clouds and Grids
History and Definitions
Characteristics and Architecture of Clouds
Differences Between Clouds and Grids
Conclusion
Chapter 29: Soaring in the Clouds
Pros and Cons of Cloud Computing
Where Clouds Fit in Different Companies
Decision Process
Conclusion
Chapter 30: Plugging in the Grid
Pros and Cons of Grids
Different Uses for Grid Computing
Decision Process
Conclusion
Chapter 31: Monitoring Applications
“How Come We Didn’t Catch That Earlier?”
A Framework for Monitoring
Measuring Monitoring: What Is and Isn’t Valuable?
Monitoring and Processes
Conclusion
Chapter 32: Planning Data Centers
Data Center Costs and Constraints
Location, Location, Location
Data Centers and Incremental Growth
Three Magic Rules of Three
Multiple Active Data Center Considerations
Conclusion
Chapter 33: Putting It All Together
What to Do Now?
Case Studies
References
Appendices
Appendix A: Calculating Availability
Hardware Uptime
Customer Complaints
Portion of Site Down
Third-Party Monitoring Service
Traffic Graph
Appendix B: Capacity Planning Calculations
Appendix C: Load and Performance Calculations
Index
592 pages, Paperback