Out of the Crisis
by W. Edwards Deming
1982, reprinted 2018

Review by Hugh (AKA Hugo) Fisher, Canberra

Who was W. Edwards Deming? He was a management consultant, specialising in the manufacturing industries. He is credited with inspiring the revolution in Japanese manufacturing in the 1950s, when "made in Japan" became largely synonymous with quality, which it had not been before. This particular book can be considered his effort to get Americans in particular and westerners in general to learn the same lessons.

Who am I? I'm a software designer and developer, not a project manager, but I've suffered under some disastrous large scale project management. I emailed some of my thoughts and conclusions about Agile software development to Ludicity, who was kind enough to reply and recommend this book to me. Having some time on my hands recently, I wrote this review to help other people decide whether it is worth reading.

If like me you consider yourself a developer rather than a manager, don't stop here. In chapter 2 Deming states that one of his goals is that everyone working within an organisation will have a basis for judging how their management is performing. This is a book about management, but not just for managers.

Note on style

There's a thing called Critical Literacy, which seems mostly to consist of scanning for particular words or phrases that can be declared "problematic" or otherwise grounds for dismissing an entire book without really reading it. And there are stylistic issues with this book for a 21st C reader.

First, this book uses "he", "him", and "man" exclusively rather than trying to be gender neutral. In the 1980s this was just how people wrote, especially people such as W. Edwards Deming who were born in 1900. From the numerous acknowledgements given to the ideas and work of women in the field of management throughout the text, Deming does not have any problem accepting women as equals.

Second, the book has what seems today an odd fixation with Japanese businesses as all powerful and dominating. This is just a 1980s trope, which was equally visible in science fiction of that decade such as William Gibson's classic novel Neuromancer. It's not based on any notion of racial characteristics, because as Deming repeatedly points out, Japanese businesses are succeeding because they apply the ideas and principles he talks about and there are no reasons Americans couldn't do the same.

Preface

According to Deming, "The book makes no distinction between manufacturing and service industries. The service industries include government service, among which are education." Applying this to software development is therefore not such a stretch as you'd think if you only know Deming from manufacturing. However my opinion is that while Deming certainly attempts to extend beyond manufacturing, and someone in a service organisation can learn a lot from this book, it really is focused on mass production.

While there are common themes throughout the book, most chapters have a particular focus and for software development some seem much more relevant than others. So I'm going to review from beginning to end.

1 Chain Reaction: Quality, Productivity, Lower Costs

Deming first wants to dispel the notion that quality and productivity are opposites. The chain reaction is that productivity increases as quality improves, because there is less rework and waste: "lower costs, better competitive position, happier people on the job". For somebody like Deming who is an enthusiastic supporter of capitalism and not a fan of excessive government regulation, I found it surprising how often he brings up happiness and personal satisfaction as expected outcomes of his principles.

This chapter introduces the idea of a "stable system of trouble" where there is predictably random variation in quality, and people trying things to change the outcome, but nothing seems to work. Such systems are referenced often in this book, because in such a system you need to change the inputs, not fiddle with individual actions. This will be expanded on later in the book.

There is discussion of some ideas that won't work, and these still ring true today in software development. He gives the example of a factory where an inspector was responsible for testing every item and filling out a report card. When the report cards piled up to a certain height, the inspector would throw the bottom half out. Think most organisations do any better with the burndown charts and whatnot used in today's Agile environments?

Deming argues against relying on new machinery and gadgets to solve apparent deficiencies, arguing instead that most gains come from learning to use effectively what you have, not installing the latest types of automation. This seems very prescient in view of the current AI craze.

Perhaps the most important point of the chapter: "measures of productivity do not lead to improvement in productivity." Quality assurance too often means a deluge of figures, with comparisons between months and years, that can tell management what has been happening but not what to do to improve.

2 Principles for Transformation

This is the heart of the book, the 14 points for management and an initial explanation of each. Subsequent chapters go into more detail on specific aspects.

So what are the 14 points? I'm not going to list them all here.

This is because one of the points is to eliminate slogans and exhortations. Deming is quite savage on managers who assume that just putting up a poster or a checklist, or mindlessly repeating slogans, is enough. So me writing down the points without context and at least some explanation would only demonstrate that I'd not understood. The rest of the chapter is an initial discussion of what each point means, and I will summarise a few here.

Points 3 and 5 are concerned with quality. Deming writes that inspections do not automatically improve quality nor guarantee quality, and performing multiple inspections actually makes it more likely for defects to go undetected, not less! (Because inspectors tend to take shortcuts, assuming someone else will cover for them.) He argues that because some variability is unavoidable, trying for distribution within a limited range is more effective than precise and detailed specifications.

Deming returns to the "stable system of trouble" where a system isn't working, but management just apply pressure rather than change. Holding people accountable for following a broken process and/or not meeting meaningless quotas won't improve anything, not morale, not trust in management, not productivity, not quality. "Do it right the first time" sounds good, but if management won't allow the workers to take their time, or to use the right tools, it becomes a joke. You can't "be a quality worker" or "take pride in your work" if management obviously don't care about quality themselves, or are unwilling to change the system or process so that workers can actually do this.

For point 12, removing barriers to pride in workmanship, Deming gives a long series of examples where production workers are ignored or actively prevented from doing quality work. At first sight an assembly line worker might not seem to have much in common with a programmer, but many of the stories ring true. Supervisors who know nothing about the job and don't care that they don't know, as they'll be leaving soon anyway. Detailed instructions that nobody reads because they are too confusing. Being told that even though the product will be crap, it has to be made and delivered anyway. Trying to report that one third of incoming material is defective, for three years, with no result.

Point 14 is mostly discussing manufacturing again, but does make the point that "one of a kind" products are more common outside software than we software developers often imagine. Airports and hotels are all large investments and each is unique. As my first Agile teacher told us, you really can't go back and refactor the concrete foundations for a tower block. Sometimes there is no Minimum Viable Product, and you can't build a jet airliner in a series of sprints.

While this chapter works through all the points, it does demonstrate the major weakness of this book for software development. Deming is a manufacturing guy, and while he states that his principles work for other kinds of industries, most of his detailed discussion and examples are for manufacturing. I estimate that less than half of the 65 pages are of interest for software development.

3 Diseases and Obstacles

This chapter discusses the diseases and obstacles in the way of implementing the 14 points, the "diseases" being the more serious problems.

The third "disease" is an interesting one: Deming comes down hard against numerical evaluation of performance. This encourages individual selfishness: "merit rating rewards people that do well in the system. It does not reward attempts to improve the system." And since managers don't understand statistics and variation, even supposedly objective measurements applied to performance are usually meaningless. Deming is very blunt: "fair ratings are impossible".

Worse still, such evaluations make everyone afraid, even to ask questions. And Deming points out that senior managers are usually the only people in favour of numerical or ranking evaluations, because it's the system that they did well within.

The fifth is basing management on visible numbers alone. A common proverb is that you can't manage what you can't measure. Deming is all in favour of meaningful measurements, but he also warns that there are important aspects that cannot be reduced to numbers. True for production and service organisations according to Deming, and I would add true for software development as well: consider attempts at counting lines of code. And in a book written in 1982, Deming is very prescient in predicting that desktop PCs and computer networks would allow a massive expansion of numbers and charts and figures, with no actual improvement.

As obstacles rather than diseases, the book lists the search for automation and gadgets, without also paying attention to people and processes.

More directly relevant for software development is the "search for examples" obstacle, copying someone who seems to be more successful than you are. Very often the people that are more successful than you don't actually know why they are doing better. Management fads and "best practices" are too often this kind of copying, without understanding the theory behind something like say Kanban. (Yes, Deming mentions Kanban in the book. Unlike Scrum, Kanban is a methodology that's actually been shown to work elsewhere.) In particular, merely creating a quality control department might tick some box, but usually the function of quality control or quality assurance is to provide hindsight at best, at worst massive numbers of meaningless charts. Deming states again that you cannot install quality, nor can you decree zero defects.

The chapter finishes with several pages discussing various example problems in manufacturing, and differences between Japanese and American management of production. Not really relevant for software development.

4 When? How Long?

This is a short chapter about the prospects of US business - in 1980 - being able to catch up with Japan - as in 1980. Skip this chapter.

5 Questions to Help Managers

Another relatively short chapter which the title describes perfectly. This isn't Joel on Software's 12 Steps to Better Code: Deming gives no less than 66 questions for managers to ask, often with subquestions (a), (b), ... Some are specifically for service industries, but the vast majority are for manufacturers. Again, software developers can skip this chapter.

6 Quality and the Consumer

This short chapter discusses what "quality" actually means, and not just for the consumer. While most software developers are not developing software to be sold, we do have consumers, if only within the organisation. This chapter looks at not just manufactured products but also (briefly) medical care and education. There is a section discussing linear, circular, and spiral models of product/process improvement, and the chapter finishes with the advice that new ideas usually come from within the producing organisation, not by asking consumers. My takeaway is that as software designers it is our responsibility to design, not just write down anything the user asks for as a story item.

7 Quality and Productivity in Service Organizations

This much longer chapter (55 pages vs 14) starts well. The list of characteristics that Deming uses to distinguish service industries from manufacturing matches software development almost perfectly. But most of the chapter is detailed examples from various service industries, none of them software, and so detailed that I couldn't see how to extract more general advice.

8 Some New Principles of Training and Leadership

The new principles are mostly statistical, and this is definitely a maths heavy chapter. (The statistical techniques are presented both as formulas and as graphs, so at least one should work for most readers.) The opening paragraphs warn that leadership is not finding and recording failure but removing the causes of failure and helping people do a better job. People shouldn't be blamed for performance caused by factors they cannot control; interestingly they should not be rewarded either.

Most of the chapter is how to use statistics and analyse variations in performance; in particular accurately distinguishing between random variation and more predictable variation - these require different responses from management. As well as good use of statistics there are some examples of how to recognise when the statistics themselves might be invalid: a common case being where there is a cluster of results just under the "maximum" permitted error rate in organisations that rely on punishment rather than encouragement. There are short examples on how to properly make comparisons of both people and processes, and the chapter finishes with a warning that since most performance measurements have some random variation, rewarding the "best" performance each week or month is in reality a lottery.
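
To make the lottery point concrete, here is a minimal Python sketch. It is not from the book: the function name, worker counts, and defect rate are all made up for illustration. Every simulated worker has exactly the same underlying defect rate, yet someone still comes out "best" every month.

```python
import random

def monthly_winners(n_workers=10, n_months=24, items=200, defect_rate=0.05, seed=1):
    """Return the index of the 'best' worker (fewest defects) for each month.

    Every worker has the SAME underlying defect rate (a hypothetical 5%),
    so any difference between them in a given month is pure chance.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    winners = []
    for _ in range(n_months):
        # Count defects per worker: each item is defective with probability defect_rate.
        defects = [
            sum(rng.random() < defect_rate for _ in range(items))
            for _ in range(n_workers)
        ]
        winners.append(defects.index(min(defects)))  # ties go to the lowest index
    return winners

winners = monthly_winners()
distinct = len(set(winners))
print(f"{distinct} different workers 'won' at least once in {len(winners)} months")
```

Since nobody in this simulation is actually better than anybody else, whoever gets the monthly award got it by luck, which is the sense in which such a reward is a lottery.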

While this chapter and the examples might not seem directly applicable to software development, there is a lot to think about here. At minimum, the lesson I draw is that in large development projects or organisations it would be stupid to compare say the number of tickets handled by different teams in a given period, let alone comparing imaginary numbers like burn down rates.

9 Operational Definitions, Conformance, Performance

And this chapter is also about a very mathematical concept, operational definitions, or how we can make measurements that can be meaningfully compared and used. Deming explains this through examples, starting with the seemingly simple but actually complicated definition of what it means for something to be "round" and moving on to the differences between purely scientific measurements and what are needed for applied science and engineering. It's very well explained, and will make you deeply skeptical (if you weren't already) about almost every number, percentage, or ratio you will encounter in daily life.

For me, this clarified a point where I'm in disagreement with Ludicity: I actually see value in "story points" and even the Fibonacci sequence limit. If story points are used solely as the measure of relative estimated difficulty / complexity of tasks, by consensus among the same group of people (ie a team), then these story points do have a meaningful operational definition. Where Ludicity is right is that when story points are measured per sprint or other time unit, or story point completions are compared across teams, they are worthless.

10 Standards and Regulations

Another short chapter, briefly describing the value of industry standards and (less so) government regulations. You might be surprised that in a book about efficiency and productivity, Deming is all in favour of standards and regulations for manufacturers, consumers, and society in general and (in 1980) wanted American industry to do more standard setting, not less.

11 Common Causes and Special Causes of Improvement

Now that the reader understands the statistics, this chapter begins the application of the new principles.

Interestingly, this chapter begins with a warning that the first step before applying statistical techniques is to check that your data is suitable for statistical analysis, a vital step usually skipped in textbooks. Deming gives an example where the testing results for a manufacturing run look like they follow a normal distribution, but plotting them over time shows that the initial outputs are all too high, followed by a steady drop. If continued, results will just get lower and lower, and the difference between the first measurements and the latest measurements wider and wider. The "mean" is an illusion.
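
A small Python sketch of why time order matters. The numbers are invented for illustration, not Deming's data: each simulated result is a little lower than the one before, plus noise. A single overall mean hides that completely, while comparing the first and second halves of the run exposes it.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical test results: a steady downward drift of 0.2 per item,
# plus random noise. All values are made up for illustration.
results = [100 - 0.2 * i + rng.gauss(0, 1) for i in range(100)]

overall = mean(results)
first_half = mean(results[:50])    # earliest 50 items
second_half = mean(results[50:])   # latest 50 items

print(f"overall mean     : {overall:.1f}")
print(f"first-half mean  : {first_half:.1f}")
print(f"second-half mean : {second_half:.1f}")
```

The overall mean describes no actual period of production: early items sit well above it and late items well below, and the gap keeps growing the longer the run continues.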

Deming expands on the difference between common and special causes of improvement, and what is a stable system. (If you're familiar with Cynefin, this is reminiscent of the differences between Complex and Chaotic.) Stable systems are those that are predictable and periodic; but not uniform and not perfectly consistent because that is impossible. Unstable systems have massive variation and haven't settled down.

Common causes in stable systems are just that, common causes of variation, because neither people nor processes are perfect and unchanging. People will occasionally make mistakes, machinery cannot run indefinitely. Special causes in a stable system are the unexpected, such as a hurricane. But in an unstable system, everything is special.
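
One common way to separate the two in practice is a control chart. The sketch below uses the standard individuals (XmR) chart limits, the mean plus or minus 2.66 times the average moving range. That particular chart is my choice for illustration rather than something taken from this chapter, and the weekly defect counts are invented.

```python
from statistics import mean

def control_limits(values):
    """Limits for an individuals (XmR) chart: mean +/- 2.66 * average moving range."""
    centre = mean(values)
    # Moving range: absolute difference between each pair of consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre + spread

def outside_limits(values):
    """Indexes of points outside the limits: candidates for a special cause."""
    low, high = control_limits(values)
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical weekly defect counts; week 9 is the "hurricane".
weekly_defects = [10, 11, 9, 10, 12, 10, 9, 11, 10, 25, 10, 11]
flagged = outside_limits(weekly_defects)
print(flagged)
```

With these numbers only the week at index 9 falls outside the limits. The wobble between 9 and 12 in the other weeks is common-cause variation and, on its own, calls for no reaction.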

The real lesson is that common causes and special causes require very different management, and that not recognising the difference can be disastrous. (If this sounds like chapter 8, it is. Deming revisits this topic, in even more detail, because it's really important.) A common mistake is to chase towards or away from whatever happened most recently, thinking that an individual measurement that is lower or higher needs some kind of corrective action. In reality this is perfectly normal expected variation and the correct response is not to change anything. Since the urge to "do something" is so ingrained in managers, Deming spends a lot of pages and a lot of examples explaining why this is so. He even has a couple of actual physical experiments that can be carried out in a classroom. While it is very detailed, once again this is a dense block of text and diagrams about manufacturing that is difficult to generalise from if you're a software developer.
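
The cost of "doing something" after every measurement can be simulated in a few lines of Python. This is a sketch of the compensation mistake described above, under my own assumptions (a target of 0, normally distributed noise, hypothetical function names), not a reproduction of Deming's experiments.

```python
import random
from statistics import pvariance

def hands_off(noise):
    """Leave the process alone: the aim stays on target (0), so each result is just noise."""
    return list(noise)

def compensate(noise):
    """After every result, move the aim by the opposite of the last error."""
    aim = 0.0
    results = []
    for e in noise:
        result = aim + e
        results.append(result)
        aim -= result  # "correct" for the most recent miss
    return results

rng = random.Random(42)  # fixed seed so the sketch is repeatable
noise = [rng.gauss(0, 1) for _ in range(10_000)]

var_alone = pvariance(hands_off(noise))
var_tampered = pvariance(compensate(noise))
print(f"variance, hands off    : {var_alone:.2f}")
print(f"variance, compensating : {var_tampered:.2f}")
```

Both runs see exactly the same noise. Compensating makes each result depend on two noise terms instead of one, so the spread roughly doubles: the well-meant adjustment makes things worse, not better.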

I would conclude from this chapter that the common practice of holding a retrospective at the end of each sprint and deciding what to do differently for the next is wrong, wrong, wrong. Every six months, maybe.

12 More Examples of Improvement

A short chapter, exactly what the title says, although the emphasis is more on the people involved than the analysis. It does make the point on the first page that testing possible factors independently is not always sufficient: you also need to test for interactions between two or more possible factors. (eg soap and detergent: either will work, combining the two will be worse than either alone, not better.)
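
The arithmetic behind "test for interactions" is simple enough to show in Python. The cleaning scores below are invented purely to mirror the soap and detergent example; the function name is hypothetical too.

```python
def interaction_effect(neither, a_only, b_only, both):
    """Interaction in a two-factor, two-level experiment.

    Returns how much the effect of factor A changes when factor B is also
    present. Zero means the factors simply add up; non-zero means they interact.
    """
    effect_of_a_without_b = a_only - neither
    effect_of_a_with_b = both - b_only
    return effect_of_a_with_b - effect_of_a_without_b

# Made-up cleaning scores (higher is better).
scores = {"neither": 1, "soap": 5, "detergent": 5, "both": 3}

interaction = interaction_effect(
    scores["neither"], scores["soap"], scores["detergent"], scores["both"]
)
print(interaction)
```

Tested one at a time, soap and detergent each add 4 points, so you might expect the combination to score 9. It actually scores 3, and the interaction of -6 is exactly that shortfall, which no single-factor test would have revealed.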

13 Some Disappointments in Great Ideas

To finish this group of statistical chapters, four short examples of how NOT to use statistics.

14 Two Reports to Management

Again this is exactly what it says, two examples of reports by Deming (and colleagues) on problems discovered and suggestions for solutions, for real companies although with names replaced. Since they're both about manufacturing and very detailed, there's not much to learn about software development. I do find them noteworthy for the language used: how well would statements such as "There are cheaper ways to produce 7.5% defective product, if that were your aim" or "This final inspection is obviously a joke" be received in most organisations?

15 Plan for Minimum Average Total Cost

A long and detailed chapter on how to do quality assurance in a manufacturing process, from checking the incoming raw materials and components all the way to the final product. Now, despite rumours, a lot of software development does actually include quality assurance, but this chapter is another full of theories and techniques that are not applicable to software.

There are however a couple of useful principles to take away. Seemingly obvious is that rules for quality assurance must be simple in administration, otherwise they'll be misinterpreted or skipped under pressure. To inspect for "quality" you must have operational definitions (chapter 9) otherwise it's just opinions and guesses. And there's a useful short section on how consensus in quality inspection doesn't work. If you do need to have more than one person involved, they must all do independent inspections, and only then compare results.

16 Organization for Improvement of Quality and Productivity

A short chapter in which Deming lays out the ideal organisational plan, and 9 rules for consultants. As far as I can tell both plan and rules would work for software development, and I wish there was a chance they were widely adopted.

If you're a manager without the power to restructure the entire organisation but would still like to make a difference in your own area, you could do worse than follow Deming's advice. "Every appalling example in this book turned up because I was there, on the line, on the job, trying to be helpful by looking for sources of improvement and wrong practices. If I had waited for them to come for help, I'd still be waiting."

17 Some Illustrations for Improvement of Living

This short chapter wraps up with examples in daily life, such as train times and road signs. A pleasant surprise is a plea to be sympathetic: "The usual reaction of almost everyone, when an accident occurs, is to attribute it to somebody's carelessness ... It is wise not to jump to this conclusion". I'll end on this positive note.

Conclusion

This is a book primarily about manufacturing, secondarily about service organisations. It is still widely read and recommended today because, as the 2018 foreword puts it, "Much has changed since 1982 when Out of the Crisis was first published. Yet, the teachings of Dr. W. Edwards Deming continue to help us manage to achieve increasing productivity and quality, better use of resources, and greater joy in work."

Is this book Out of the Crisis worth reading for software project managers and software developers? I can't give a simple yes or no. For insights into people, at all levels of the organisation, and how they act, very definitely. This book may be forty five years old, but human beings remain human beings and organisational patterns persist, even in IT. For how to use, or avoid misusing, statistics and fault tracking, yes but you'll have to dig out the relevant bits. Deming makes good points, and even if you think you know all this stuff, it won't hurt to revise. For examples of how to improve software development, no. Sometimes there are parallels between Deming's world of manufacturing and our world of IT, but mostly the detailed case studies just don't mean anything to a software developer.

If you don't have much time, I recommend reading chapters 1, 3, 6 in full; and chapter 2 but skimming most of the text. For chapters 8, 9, 11 start reading, but stop as soon as the content moves away from general principles. And read the first report in chapter 14, if only to daydream of being successful enough that you could be that honest.

Better yet, perhaps someone has written a book about applying the Deming principles specifically to software development? Anyone?