Editor’s note. This post has a deep background story. The author, Mike Derivan, was the shift supervisor at the Davis Besse Nuclear Power Plant (DBNPP) on September 24, 1977 when it experienced an event that started out almost exactly like the event at Three Mile Island on March 28, 1979.
The event began with a loss of feed water to the steam generator. The rapid halt of heat removal resulted in a primary loop temperature increase, primary coolant expansion and primary system pressure exceeding the set point for a pilot operated relief valve in the steam space of the pressurizer. As at TMI, that relief valve stayed open after system pressure was lowered, resulting in a continuing loss of coolant. For the first twenty minutes, the plant and operator response at Davis Besse were virtually identical to those at TMI.
After that initial similarity, Derivan had an ah-ha moment and took actions that made the event at Davis Besse turn into an historical footnote instead of a multi-billion dollar accident.
When Three Mile Island happened and details of the event emerged from the fog of initial coverage, Mike was more personally struck than almost anyone else. He has spent a good deal of time during the past 35 years trying to answer questions about the event, some that nagged and others that burned more intensely.
In order to more fully understand the below narrative, please review Derivan’s presentation describing the events at Davis Besse, complete with annotated system drawings to show how the event progressed.
Warning – this story is a little longer and more technical than most of the posts on Atomic Insights. It is intended to be a significant contribution to historical understanding of an important event from a man with a unique perspective on that event. If you are intensely curious about nuclear energy and its history, this story is worth the effort it requires. Mike Derivan (mjd) will participate in the discussion, so please ask for clarification if you still have questions after you have read the story and the background material.
The rest of this post is his story and his analysis, told in his own words.
By Mike Derivan
My first real introduction to the Three Mile Island (TMI) accident happened on Saturday March 31, 1979. At Davis Besse Nuclear Power Plant (DBNPP) in Ohio we heard something serious had happened as early as the day of the event, March 28, and interest was high as it was a sister plant.
Actual details were sketchy for the next couple of days, and mainly by watching the nightly TV news it became clear to me that something serious was going on. It was also clear from the TV news reports that conflicting information was being reported. Some reports indicated there had been radiation releases, while the utility, Metropolitan Edison, the owner of TMI, reported no radiation releases.
I even remember hearing the words “core damage” first mentioned. It was that Saturday on a TV news report that I saw the first explanation, using pictures of the system, of the suspected sequence of events, and it became clear to me the Pilot Operated Relief Valve (PORV) had stuck open.
My reaction was gut-wrenching and I was also in disbelief that TMI did not know what had happened at Davis Besse. That evening I watched the Walter Cronkite news report. I sat there with total disbelief as he discussed potential core meltdown. Disbelief because if you were a trained Operator in those days it was pretty much embedded in your head that a core meltdown was not even possible; and here that possibility was staring me right in the face.
Cronkite’s report was also my first exposure to the infamous hydrogen bubble story. I had had enough Loss of Coolant Accident (LOCA) training to understand that some hydrogen (H2) could be generated during Loss of Coolant Accidents; after all we had Containment Vessel hydrogen concentration monitoring and control systems installed at our plant. But the actual described scenario at TMI seemed incredible, except it had apparently happened.
I would expect my reaction was the same as that of many nuclear plant Operators at that time, with the exception that the apparent initiating scenario had actually happened to me eighteen months earlier at Davis Besse, and I just couldn’t get the question out of my mind: “Why didn’t they know?”
The Real Root Cause of the TMI Accident
Since the time of the TMI accident virtually hundreds of people have stuck their nose into the root cause of the TMI accident. Both the Kemeny and Rogovin investigations identified a lot of programmatic “stuff” that needed to be fixed, and I agree with most of it.
However I feel both of them skirted one important issue by using different flavors of “weasel words” in the discussion of Operator Error. The two reports handled that specific topic a bit differently, but the discussions got couched with side topics of contributing factors. But the general consensus of all the current discussion summaries I read is TMI was caused by Operator Error.
The TMI Operators did make some Operator Errors and I am not denying that. But my contention is all the errors they made were after the fact that they got outside of the Design Basis understanding of PWRs at that time. It is no surprise to anyone that when a machine this complicated gets outside of its design basis anything might happen. You basically hope for the best, but you are going to have to take what you get.
Fukushima proves that, and everyone knows why/how Fukushima got outside of their design basis. The how/why TMI Operators got outside of their design basis is going to be the focus of my discussion. I will also discuss the fact that I think this was understood at the time of the investigations, but it was consciously decided not to pursue it.
My whole point of contention is that turning off the High Pressure Injection flow early in the event, in response to the increasing Pressurizer level, is the crux of the whole Operator Error argument. All discussions say if the Operators hadn’t done that, the TMI event would have been a no-never-mind. And I agree.
But nobody really wants to believe they were told to do that for the symptoms they saw.
In other words they were told to do that, by their training, compounded by tunnel vision bad procedure guidance. I have believed this since the day I understood what happened at TMI. Furthermore the TMI Operators were trying to defend their actions from a position of weakness; their core was melted, nobody wanted to believe them.
I am not in a position of weakness on this issue, my event came out OK at DBNPP, and so I have no reason to not be totally honest or objective on this issue. During the precursor event at DBNPP we also turned off High Pressure Injection early in the event in response to the symptoms we saw, and for the same reason the TMI Operators did it eighteen months later; we were told to do it that way.
This fact is apparently a hard pill to swallow. But if it is hard for you to accept, just imagine how I felt watching TMI unfold in real time.
And right there is the crux of the issue. Once those High Pressure Injection pumps were off, both plants were then outside the Design Basis understanding for that particular SBLOCA (small break loss of coolant accident).
So you hope for the best, but take what you get. But still obviously an error has been made if not taking that action would have made the event a no-never-mind.
So who exactly made the error? Both the Kemeny and Rogovin Reports discuss the problems with the B&W Simulator training for the Operators. The important point they both apparently missed (or didn’t want to deal with, which I prefer as the explanation) is that this is really an independent two-part problem.
I will refer to controlling High Pressure Injection during a SBLOCA as part A of the problem, and to the actual physical PWR plant response to a SBLOCA during a leak in the Pressurizer steam space as part B of the problem.
It really is that simple. B&W was training correctly for High Pressure Injection control (part A) for SBLOCAs in the water space of their PWR. But neither they nor Westinghouse correctly understood the correct plant response for a SBLOCA in the Pressurizer steam space.
By omission they were not training correctly for a SBLOCA in the Pressurizer steam space (part B). To make matters worse B&W was overstressing in training the importance of the part A “rules”, to the extent an Operator would fail a B&W administered Operator Certification Exam for failure to correctly implement the part A rules.
Thus when fate would have it and the two occurrences, part A and part B, combined in the real world, where the plant responds per the rules of Mother Nature, the B&W training and procedures ended up leading the Operators to actions that put them outside the actual Design Basis, not the falsely perceived (and trained upon) Design Basis.
Up until very recently my argument has been one using just simple logic and sheer numbers of Operators involved. In the Davis Besse September ’77 event there were 5 licensed Operators involved in that decision, either by direct action or complacent compliance. In other words all 5 agreed it was the right thing to do. Of course it wasn’t the right thing to do, but nobody objected because it was the correct part A thing to do and nobody understood the part B of the problem.
Eighteen months later at TMI, in March ’79, an additional number of Operators (just how many depends on the time line) repeated the same initial wrong actions. So we have about a dozen Operators, at 2 independent plants eighteen months apart, all doing the same thing, and all convinced they were doing the right thing.
Is it even conceivable to think they did not all believe they did the right thing according to part A? I just don’t believe so; of course we are all arguing from a position of weakness. It is the wrong thing to do for part A and part B combined, so nobody really wants to believe we were trained to do it.
But as I explained it is really the two-part problem that created the issue. My point can be further emphasized by the fact that NRC Region III had heartburn over the report DBNPP submitted for our event. They did not like the fact that the report did not say the Operators made an error turning off High Pressure Injection.
I know why that happened. The person most responsible for writing the report narrative was actually in the Control Room during the event. He did not believe the action was wrong based on his same training relative to part A of the problem. So why would he put that statement in the report? He was so convinced his own (complacent) agreement was correct, that saying otherwise would be a false statement.
Just recently new information came to my attention that absolutely confirms my belief that B&W was in fact totally emphasizing High Pressure Injection control in their training based solely on their understanding of the part A problem, with no understanding on B&W’s part of the part B problem or its effect when combined with the part A problem.
My understanding comes directly from seeing the whole infamous Walters’ response memo of November 10, 1977 to the original Kelly memo of November 1, 1977. It is absolutely remarkable to me that 35 plus years after the DBNPP event and almost the same amount of time after TMI that a totally unrelated Google search turns up a complete version of the Walters memo.
After half a life time of studying all the TMI reports I had only seen one “cherry picked” excerpt from the Walters’ memo, basically saying he agreed with the Operator’s response at DBNPP. The whole memo in context basically confirms that the Operator claims of “we were trained to do it” are correct.
The original Kelly memo also confirms that Kelly still didn’t grasp the significance of the part B problem, as related to the DBNPP event; or if he did, he didn’t relate it thoroughly and clearly in his memo. Both memos are presented and discussed below; draw your own conclusions. (The source document is here.)
The Kelly Memo
The referenced source document is basically a critique of these memos by textual communications experts. Here’s a summary. First, Kelly is talking “uphill” in the organization, so he couches his memo with that in mind. He asks no one for a decision, but basically asks for “thoughts”. And he makes a non-emphatic recommendation for “guidelines.”
My personal additional notations are he dilutes the importance of and possibly adds confusion to the recommendation by adding “LPI” to the discussion, but most importantly he totally misses any part B problem discussion. He does say “the operator stopped High Pressure Injection when Pressurizer level began to recover, without regard to primary pressure.”
But there is no mention about the fact that the system response was not as expected, e.g. the pressurizer level went up drastically in response to the RCS boiling. He never articulates that the Operator’s reluctance to re-initiate High Pressure Injection, even after we understood the cause of the off-scale Pressurizer level indication, was based solely on that indicated Pressurizer level and our training. Thus the memo totally misses addressing the part B problem point that the system response was not as expected by anybody, which was crucial to getting the guidance fixed.
The other thing I notice is the memo is not addressed to Walters. I’ve also “been there, done that” in a large organization. I can easily understand how the recipient (Walters’ boss) upon receiving this memo, with no specific articulation of a new problem (part B), would pass it to Walters with a “handle it, handle it… make it go away.” I also note the N.S. Elliott on the distribution. He was the B&W Training Department Manager, thus B&W training was directly in the loop on this issue also.
The Walters Response Memo
Note the original Walters’ response memo to Kelly was hand written, so it has been apparently typed someplace along the line. This is how it appears in the reference source, typos and all.
I’m omitting the communications expert’s comments; they are in the reference. Here are my comments. First, in simple Operator Lingo this response is a “smart ass slap down” to Kelly, including all the accompanying sarcasm. But there are some very important admissions revealed here. First, an admission, including Walters’ discussion with the B&W Training Department, that we responded in the correct manner considering how we were trained, and also including the bases behind our training.
This is what we Operators had been claiming all along, but nobody wanted to believe it. Second, Walters clearly states both as his personal assumption and the B&W Training Department assumption that RC pressure and Pressurizer level will trend in the same direction during a LOCA. Bingo. He has just admitted they don’t get it, still, the specific part B contribution to the problem.
So they are in fact training wrong for this event because they don’t understand part B. Further this discussion is happening after the DBNPP event, as a result of the Kelly concerns, and well before TMI. Third, the tone of Walters’ sarcastic comments about a “hydro” (hydrostatic pressure testing) of the RCS every time High Pressure Injection is initiated shows the disproportional emphasis that the B&W training was placing on “never let High Pressure Injection pump you solid.” Again something the Operators were claiming that nobody wanted to believe.
My conclusion, and it hasn’t changed in 35 years, is that the root cause of the TMI accident was that the B&W simulator training and inadequate procedures put the TMI Operators in a box, outside of their Design Basis understanding for that specific small break loss of coolant. And a contributing cause is that B&W itself didn’t understand the actual plant response to that steam space loss of coolant event because it was never analyzed correctly. Then, they also missed the warning the Davis Besse event provided.
For a long time I wondered why both the Kemeny and Rogovin investigations didn’t reach the same specific conclusion as I have. After all, both investigations had some very smart people involved in both processes, and they both looked at the same evidence. My thinking today is that they did reach that same conclusion. But I don’t actually know what they may have seen as the bottom line purpose for their investigations either.
If you consider that no investigation report was going to change the condition of TMI, it may have been as simple as there is enough wrong that needs fundamental changing, so let’s just get those changes done and move forward. So neither group saw a need to identify the actual bottom line root cause, rather they just gave recommendations for prevention of another TMI type accident.
Further, by the time those 2 reports were published, it was well understood there was going to be a law suit between GPU and B&W. If one of those reports had specifically identified B&W with partial liability for the root cause, that conclusion along with the report that made it, would be inherently dragged into the law suit.
I have no doubt this was actually discussed at the time. And I will further speculate it was actually decided there was no reason to identify the actual true single root cause in the reports because the law suit itself would decide that liability issue independently of the reports. My problem with that is the law suit, which started in ’82, never really settled the liability issue as it was mutually “settled” in ’83 before a conclusion was reached.
Another thing I think was actually discussed at that time was the fact that if the reports stated the root cause was that the B&W training put the Operators outside of the Design Basis understanding for that event, because the event wasn’t understood by B&W, it would open Pandora’s Box. They didn’t want to deal with “What else do you have wrong?”, and there was well over a hundred billion dollars worth of these NPPs still operating.
This conclusion is strongly reinforced for me by the Kemeny Report section “Causes of the Accident”. This section of the report lists a “fundamental cause” as Operator Error, and specifically lists turning off High Pressure Injection early in the event. And then the report lists several “Contributing Factors” including B&W missing the warning provided by the Davis Besse event.
If you read the list of contributing factors there is a screaming omission; it is never stated that B&W (actually the whole PWR industry if you consider the precursors) did not understand the actual plant response to a leak in the Pressurizer steam space (what I refer to here as part B of the problem). And that is why B&W and the NRC both missed the DBNPP warning. Virtually nothing will ever convince me that all those smart people did not put that truth together.
Thus it was both their fear of opening Pandora’s Box and a conscious decision that there was no need to implicate B&W with any partial liability that ruled the process. By doing that they collectively decided to throw the TMI Operators under the bus as the default position.
My conclusion for the missing Contributing Factor problem is an Occam’s razor solution; it is not “missing” in the sense that they didn’t “Get It”; it was a decision not to include it. After all, if that Contributing Factor had been included, who on earth would believe it is an Operator Error when they simply did what they were told to do in that situation? So they just simply did not want to deal with the real issue: who made the error?
A Simple Analogy
For years I struggled with finding a simple analogy to explain the position the TMI Operators were placed in by their training. One that could be understood by common everyday knowledge everyone was familiar with; not the technical detail that required understanding the complications of nuke plant operations. One of the reasons that was difficult was that it required a “phenomenon” that is commonly understood today, but was not understood at all at the time of the training. This is the best I can come up with.
Suppose in learning to drive a car you are being trained to respond to the car veering to the left. It’s simple enough, simply turn the steering wheel to the right to recover. It is also what your basic instinct would lead you to do, so there is no mental conflict in believing it.
It is also actually reinforced and practiced during actual driver training on a curvy road. That response is soon embedded as the right thing to do. Now suppose your driver training also includes training on a Car Simulator training machine. It is where you learn and practice emergency situation driving. After all, nobody is going to do those emergency things in an actual car on the road.
Here’s where it gets complicated. Assume virtually no one yet understands that when the car skids to the left on ice (because of loss of front wheel steering traction), the correct response is to turn the steering wheel into the skid direction, or to the left. This is just the opposite of the non-ice response. And to make matters worse, because no one understands that yet, including the guy who built the Car Simulator, the Car Simulator has been programmed to make this wrong response work correctly on the Simulator.
So in your emergency driver training you practice it this way; the Simulator responds wrong to the actual phenomenon, but it shows the successful result: you recover control. Since this probably also agrees with your instinct, and you see success on the Simulator, this action is also embedded as the right thing to do. One additional point: if you don’t do this wrong action, you will flunk your Simulator driver training test.
So you know where this is going, now you are out driving on an icy road for the first time and the car skids to the left. You respond exactly as you were instructed to do and exactly as the Simulator showed was successful, and you have an accident because the car responds to the real world rules of Mother Nature.
An investigation is obviously necessary because, I forgot to tell you, the car cost $4 billion and you don’t own it. During the subsequent investigation everything is uncovered; the unknown phenomenon is finally correctly understood, the Simulator’s incorrect programming is discovered, it is uncovered that the previously unknown phenomenon had been discovered before your accident, and your accident was even predicted as possible.
But the investigation results are published and the finding is the accident was caused by your error of turning the steering wheel the wrong way on the ice. Nobody else is found to have made an error in the stated conclusions but you; it is simply a case of Driver Error. Do you feel you have been screwed? This happened to the TMI Operators.
For everybody out there who doesn’t like my conclusions, I’ll just say that many of the principals of the investigations are still alive, but choose not to talk, so simply ask them. Especially the principals in the GPU vs. B&W law suit which should have determined any liability issues. Ask them why it didn’t happen. My idea of justice involves getting the truth, the whole truth, and nothing but the truth exposed. That process is still unfinished.