Root-cause analysis is a key technique used in this quality improvement methodology?

In medicine, it's easy to understand the difference between treating the symptoms and curing the condition. A broken wrist, for example, really hurts! But painkillers will only take away the symptoms; you'll need a different treatment to help your bones heal properly.



But what do you do when you have a problem at work? Do you jump straight in and treat the symptoms, or do you stop to consider whether there's actually a deeper problem that needs your attention? If you only fix the symptoms – what you see on the surface – the problem will almost certainly return and need fixing over and over again.


However, if you look deeper to figure out what's causing the problem, you can fix the underlying systems and processes so that it goes away for good.

Root Cause Analysis (RCA) is a widely used technique that helps people answer the question of why a problem occurred in the first place. It uses a specific set of steps, with associated tools, to find the primary cause of the problem, so that you can:

Determine what happened.
Determine why it happened.
Figure out what to do to reduce the likelihood that it will happen again.

RCA assumes that systems and events are interrelated. An action in one area triggers an action in another, and another, and so on. By tracing back these actions, you can discover where the problem started and how it grew into the symptom you're now facing.

You'll usually find three basic types of causes:

Physical causes – Tangible, material items failed in some way (for example, a car's brakes stopped working).
Human causes – People did something wrong, or did not do something that was needed. Human causes typically lead to physical causes (for example, no one filled the brake fluid, which led to the brakes failing).
Organizational causes – A system, process, or policy that people use to make decisions or do their work is faulty (for example, no one person was responsible for vehicle maintenance, and everyone assumed someone else had filled the brake fluid).

RCA looks at all three types of causes. It involves investigating the patterns of negative effects, finding hidden flaws in the system, and discovering specific actions that contributed to the problem. This often means that RCA reveals more than one root cause.

You can apply RCA to almost any situation. Determining how far to go in your investigation requires good judgment and common sense. Theoretically, you could continue to trace the root causes back to the Stone Age, but the effort would serve no useful purpose. Be careful to understand when you've found a significant cause that can, in fact, be changed.

The Root Cause Analysis Process

RCA has five identifiable steps.

Step One: Define the Problem

What do you see happening?
What are the specific symptoms?

Step Two: Collect Data

What proof do you have that the problem exists?
How long has the problem existed?
What is the impact of the problem?

You need to analyze a situation fully before you can move on to look at factors that contributed to the problem. To maximize the effectiveness of your RCA, get together everyone – experts and frontline staff – who understands the situation. People who are most familiar with the problem can help lead you to a better understanding of the issues.

A helpful tool at this stage is CATWOE. With this process, you look at the same situation from different perspectives: the Customers, the people (Actors) who implement the solutions, the Transformation process that's affected, the World view, the process Owner, and Environmental constraints.

Step Three: Identify Possible Causal Factors

What sequence of events leads to the problem?
What conditions allow the problem to occur?
What other problems surround the occurrence of the central problem?

During this stage, identify as many causal factors as possible. Too often, people identify one or two factors and then stop, but that's not sufficient. With RCA, you don't want to simply treat the most obvious causes – you want to dig deeper.

Use these tools to help identify causal factors:

Appreciation – Use the facts and ask "So what?" to determine all the possible consequences of a fact.
5 Whys – Ask "Why?" until you get to the root of the problem.
Drill Down – Break down a problem into small, detailed parts to better understand the big picture.
Cause and Effect Diagrams – Create a chart of all of the possible causal factors, to see where the trouble may have begun.

Step Four: Identify the Root Cause(s)

Why does the causal factor exist?
What is the real reason the problem occurred?

Use the same tools you used to identify the causal factors (in Step Three) to look at the roots of each factor. These tools are designed to encourage you to dig deeper at each level of cause and effect.

Step Five: Recommend and Implement Solutions

What can you do to prevent the problem from happening again?
How will the solution be implemented?
Who will be responsible for it?
What are the risks of implementing the solution?

Analyze your cause-and-effect process, and identify the changes needed for various systems. It's also important that you plan ahead to predict the effects of your solution. This way, you can spot potential failures before they happen.

One way of doing this is to use Failure Mode and Effects Analysis (FMEA). This tool builds on the idea of risk analysis to identify points where a solution could fail. FMEA is also a great system to implement across your organization; the more systems and processes that use FMEA at the start, the less likely you are to have problems that need RCA in the future.
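
As a rough illustration of how FMEA can rank the ways a solution might fail: a common convention (not described in this article) rates each failure mode from 1 to 10 for severity, occurrence, and difficulty of detection, and multiplies the three into a risk priority number (RPN). The Python sketch below applies that convention to the brake-fluid example. The proposed fix, the failure modes, and every rating are hypothetical placeholders, not data from a real analysis.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (almost certain)
    detection: int   # 1 (almost always caught) to 10 (almost never caught)

    @property
    def rpn(self):
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for a proposed fix:
# "one named person is responsible for vehicle maintenance".
modes = [
    FailureMode("Maintenance log is not filled in", 4, 5, 3),
    FailureMode("Responsible person is absent and no backup is named", 7, 4, 6),
    FailureMode("Brake fluid check is skipped under time pressure", 8, 3, 5),
]

# Highest RPN first: these are the points to strengthen before rollout.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.rpn:>4}  {mode.description}")
```

The ranking is only as good as the ratings, so in practice the scores would be agreed on by the people closest to the process.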

Impact Analysis is another useful tool here. This helps you explore possible positive and negative consequences of a change on different parts of a system or organization.

Another great strategy to adopt is Kaizen, or continuous improvement. This is the idea that continual small changes create better systems overall. Kaizen also emphasizes that the people closest to a process should identify places for improvement. Again, with Kaizen alive and well in your company, the root causes of problems can be identified and resolved quickly and effectively.

Root Cause Analysis is a useful process for understanding and solving a problem.

Figure out what negative events are occurring. Then, look at the complex systems around those problems, and identify key points of failure. Finally, determine solutions to address those key points, or root causes.

You can use many tools to support your RCA process. Cause and Effect Diagrams and 5 Whys are integral to the process itself, while FMEA and Kaizen help minimize the need for RCA in the future.

As an analytical tool, RCA is an essential way to perform a comprehensive, system-wide review of significant problems as well as the events and factors leading to them.



The Joint Commission on Accreditation of Healthcare Organizations has begun requiring root cause analyses for all sentinel events. These analyses can be of enormous value. They capture both the big-picture perspective and the details. They facilitate system evaluation, analysis of need for corrective action, and tracking and trending. Regarding trending, managers will be able to determine how often a particular error, such as an instrument error, occurs or how often a particular floor or unit of the hospital is involved. This information may provide clues to the problem. Root cause analysis is as useful and perhaps even more efficacious in the near-miss scenario. The technique is applicable not only to laboratory medicine but to all health care–associated disciplines.

A root cause analysis should be performed as soon as possible after the error or variance occurs. Otherwise, important details may be missed. All of the personnel involved in the error must be involved in the analysis. Without all parties present, the discussion may lead to fictionalization or speculation that will dilute the facts. Asking for this level of involvement may cause staff to feel hostile, defensive, or apprehensive. Managers must explain that the purpose of the root cause analysis process is to focus on the setting of the error and the systems involved. Managers should also stress that the purpose of the analysis is not to assign blame. The comfort level with the technique increases with use, but the analysis will always be somewhat subjective.

In this article, several different techniques for root cause analysis are applied to an employee safety event that occurred within the Department of Pathology.

A laboratory aide was cleaning one of the gross dissection rooms where the residents work. This aide was a relatively new employee who had transferred to the department just a few days prior to the event. When she was cleaning the sink in the dissection room, she accidentally ran her thumb along the length of a dissecting knife—an injury that required 10 to 15 stitches. Since there had been other less serious accidents in this room and several previous attempts to address the safety issues had not been effective, the department completed a root cause analysis.

The simplest way to perform a root cause analysis is to ask why 5 times. In the case above, the answers might read as follows:

The laboratory aide was cut by a dissection knife.
The knife was left by the sink.
The area was not cleared on the previous day.
Clearing is not a daily habit.
Standard operating procedures/documentation for clearing do not exist.
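
The chain of answers above can be written down as a simple ordered list, where each entry answers "Why?" for the entry before it. The Python sketch below is only an illustration of that structure; the function name `five_whys` is made up for this example.

```python
# Each entry answers "Why?" for the entry before it.
# The answers are the ones listed for the safety case above.
why_chain = [
    "The laboratory aide was cut by a dissection knife.",
    "The knife was left by the sink.",
    "The area was not cleared on the previous day.",
    "Clearing is not a daily habit.",
    "Standard operating procedures/documentation for clearing do not exist.",
]


def five_whys(chain):
    """Return (problem, candidate root cause, indented trace) for a chain."""
    if not chain:
        raise ValueError("the chain needs at least the problem statement")
    lines = [f"Problem: {chain[0]}"]
    for depth, answer in enumerate(chain[1:], start=1):
        lines.append(f"{'  ' * depth}Why? {answer}")
    return chain[0], chain[-1], "\n".join(lines)


problem, root_cause, trace = five_whys(why_chain)
print(trace)
print("Candidate root cause:", root_cause)
```

The last entry is only a candidate: the chain stops where the team stops asking, so it is worth checking whether one more "Why?" would still produce a cause the department can change.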

In a causal tree, the worst thing that happened or almost happened is placed at the top. In near-miss situations, a recovery or prevention side is added to capture how an error was prevented. This step is important in identifying the safety nets that exist, such as a person or piece of equipment that checks processes. Having a written record of these safety nets can be important if the department is reorganized or the budget is cut. Proven safety measures should not be eliminated.

If the error did occur, the causal tree does not have a prevention or recovery side, as the event happened and was not prevented.

In either the near-miss scenario or the full-blown event scenario, the team's next step is to provide the causes for the top event, followed by the causes for those secondary causes, and continuing on until the endpoints are reached. These endpoints are the root causes. The team may identify several root causes and will need to select the most important 2 or 3 for focused prevention efforts and possible corrective action.
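
One way to picture this procedure is as a tree whose leaves are the root causes. The Python sketch below assumes a small tree assembled from details of the safety case in this article; it is an illustration, not a reproduction of the article's Figure, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    description: str
    causes: List["Cause"] = field(default_factory=list)


def root_causes(node):
    """Collect the endpoints (nodes with no further causes), depth-first.

    A tree that has only a top event returns that event itself, which
    signals that no causes have been recorded yet.
    """
    if not node.causes:
        return [node.description]
    found = []
    for child in node.causes:
        found.extend(root_causes(child))
    return found


# Illustrative tree: the top event, then causes of causes.
top_event = Cause("Laboratory aide cut by dissection knife", [
    Cause("Knife was left by the sink", [
        Cause("Area was not cleared on the previous day", [
            Cause("Clearing is not a daily habit", [
                Cause("No standard operating procedure for clearing"),
            ]),
        ]),
    ]),
    Cause("Aide saw the knife but did not move it", [
        Cause("Past intimidation by physicians outside the department"),
    ]),
    Cause("Aide did not notice the blade faced the sink", [
        Cause("Cutting edge of the knife is not readily apparent"),
    ]),
])

for cause in root_causes(top_event):
    print("Root cause:", cause)
```

Selecting the 2 or 3 most important endpoints remains a judgment call for the team; the traversal only lists the candidates.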

Table 1 lists 20 codes frequently used in a medical environment. Assigned codes are useful for tracking and trending. The department or organization can plot the frequency of recurring codes to identify common threads that drive events. For example, if the work culture is listed as a significant organizational root cause in 15 of 20 analyzed events, then the organization must give high priority to adjusting its work culture.
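
Tracking and trending of this kind amounts to counting how many analyzed events carry each code. The Python sketch below uses invented data arranged to match the 15-of-20 example; the code names are placeholders, not the codes from Table 1.

```python
from collections import Counter

# Hypothetical data: one list of assigned root-cause codes per analyzed event.
events = (
    [["ORG-CULTURE", "ORG-PROTOCOL"]] * 9
    + [["ORG-CULTURE", "HUMAN-SKILL"]] * 6
    + [["TECH-DESIGN"]] * 3
    + [["ORG-PROTOCOL", "HUMAN-SKILL"]] * 2
)


def code_frequencies(events):
    """Count in how many events each code appears (at most once per event)."""
    counts = Counter()
    for codes in events:
        counts.update(set(codes))
    return counts


counts = code_frequencies(events)
total = len(events)

# A plain-text bar chart, most frequent code first.
for code, n in counts.most_common():
    print(f"{code:<14} {n:>2} of {total}  {'#' * n}")
```

Counting each code once per event keeps a single event with a repeated code from inflating the trend.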

The Figure shows an example of a causal tree with codes for the safety case. Determining why the laboratory aide saw the knife but did not move it required asking her some questions. At first, management team members debated among themselves whether or not she had seen the knife and, if she had, why she might have thought she couldn't move it. The team realized that such discussions were not helpful: facts were needed. By speaking with the employee, the team gained an important insight. The aide had a history of intimidation from physicians, not in the pathology department but elsewhere. Her experience had been that “doctors got mad if you moved their stuff.” The team also further investigated how the aide could have missed that the blade was facing the sink. When the group examined the knife in question, they found that it differed from a kitchen knife, and the cutting edge of the knife was not readily apparent.

In the end, this basic safety case had a number of different types of causes. Only one was a human error: the fact that the aide ran her hand into the knife. Seven were organizational factors: four related to protocols and procedures, one related to culture, one related to transfer of knowledge/training, and one external to the department. The factors related to the knife were technical and outside of the control of the department.

If desired, the team can go one step further and use a decision table to determine how best to respond to the root causes that were uncovered. Use of this tool helps prevent the knee-jerk reaction: the memo or procedure change resulting from each error, regardless of its severity. Often, when errors occur, the only thing required is to monitor for recurrence.

The decision table considers the severity levels of events: whether the event was potentially life threatening or involved a serious injury, had potential for minimal harm or temporary injury, or had no realistic potential for harm. The table also considers the probability of recurrence and the detectability of the event. In a transfusion service, the key detectability issue is whether the error was detected prior to “release to the patient.” If the error is caught within the transfusion service, then the danger to the patient is lessened. However, if the error passes through the system and is released to the patient, the chance that the error will result in harm is increased. If the error progresses to “given to the patient,” then the error is full blown and has progressed to an undesirable endpoint.
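
A decision table of this kind can be thought of as a lookup from the three factors to a suggested response. The Python sketch below is a simplified illustration only: the three severity levels come from the paragraph above, but the rule that maps them to actions is an assumption made for this example and does not reproduce the article's own table.

```python
from enum import IntEnum


class Severity(IntEnum):
    NO_HARM = 1       # no realistic potential for harm
    MINIMAL_HARM = 2  # potential for minimal harm or temporary injury
    SERIOUS = 3       # potentially life threatening or serious injury


def suggest_action(severity, likely_to_recur, detected_before_release):
    """Hypothetical rule: escalate only when the risk justifies a change."""
    if severity == Severity.SERIOUS:
        return "corrective action"
    if severity == Severity.MINIMAL_HARM and (
        likely_to_recur or not detected_before_release
    ):
        return "corrective action"
    return "monitor for recurrence"


# A low-severity error caught inside the service: no procedure change needed.
print(suggest_action(Severity.NO_HARM, likely_to_recur=False,
                     detected_before_release=True))

# A minimal-harm error that slipped through detection: act on it.
print(suggest_action(Severity.MINIMAL_HARM, likely_to_recur=False,
                     detected_before_release=False))
```

Writing the rule down, in whatever form, is what guards against changing a procedure after every error regardless of its severity.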

Suggested actions based on these factors are listed in Table 2. An external report would be required for events reportable to the Food and Drug Administration and for sentinel events, as defined by the Joint Commission. Another example of an external report is the Web-based reporting now used at Baylor University Medical Center. This meets the definition of an external report in that the occurrence or variance becomes known outside of the hospital department and becomes part of the institution's event database. Each department should have specific criteria for these reports. For example, within the Department of Pathology, all mislabeled (misidentified) laboratory specimens are reported through the Web-based event-reporting system. This criterion is used whether the sample was mislabeled by a pathology employee or by an employee working in one of the clinical areas.

Critical assessment of what happened and why must be balanced with caution in making changes. Frequent and poorly conceived changes to processes and procedures can be a significant root cause of future events. The decision table is a systematic approach to judge the need for change.

The Department of Pathology has used root cause analysis for a variety of variances and, through this tool, has identified system changes that were indicated. When reviewing why patient samples had been reversed during testing, the team discovered that those particular tests required more manipulation and concentration, yet were being performed in a part of the laboratory closest to the hall, where technologists were subject to frequent interruptions. Moving the activity to a quieter area reduced errors to a nonoccurrence level.

The use of root cause analysis also requires a cultural change. Baylor University Medical Center's move to occurrence reporting on the Web has generated more reports and more openness. However, there is still room for improvement. We encountered a situation in which a technician was late drawing blood, causing a delay in the administration of a drug. However, the next technician came at the originally scheduled time for the drug level test, without realizing that the timing had been delayed. When the team investigated this incident, the nurse involved asked, “Who's going to get in trouble for this? Will it be the first person who drew blood too late or the second who came too soon?” We had to explain that the purpose of the investigation and analysis was to look at the entire operation and determine how to improve the system—not to identify who was at fault. With root cause analysis, the focus is on the what (the event) and the why (the system), not the who.

Root cause analysis is a valuable management tool that can be readily learned by managers as well as frontline personnel. It can be conducted at several levels of depth and complexity. As shown in the safety case, this technique has moved us beyond comforting the employee after the event and reminding her to be careful—which is what we might have done in the past. The approach has helped us make meaningful changes. We have developed protocols related to cleaning and the handling of sharps. We have set the stage for reducing error.
