Reliability Assessment: A Guide to Aligning Expectations, Practices, and Performance - Daniel Daley

Chapter 3

Assessing What You Have a Right to Expect

One who asks a question is a fool for five minutes; one who does not ask a question remains a fool forever.

Chinese proverb

This chapter provides an introduction to the rest of the book. I hope the first two chapters whetted your interest in understanding the elements that affect reliability. More specifically, I hope you have begun to ask yourself the question, “What do I have a right to expect?” If so, the chapters were successful. If not, I hope that your curiosity will lead you at least a little further into this book.

Generally speaking, individuals who are responsible for managing complex equipment and systems cannot afford to be in a position where they do not know the answer to that question. If they do not know the answer:

•They do not understand the extent of the lost opportunity.

•They do not know how difficult or how easy it might be to capture that opportunity.

As a starting point for this chapter, I would like to create a term that is much easier to use than “What Do You Have a Right to Expect?” For the sake of simplicity, I will use the term Wide-Hart (WDYHARTE) as a shorthand notation for the comprehensive assessment of your reliability opportunity.


One of the unfortunate characteristics of reliability is that there are so many elements that determine the reliability of a system over its entire lifecycle. Dropping your guard with respect to any one of these elements can lead to poor reliability. It is not acceptable to be good for 90% of the elements and ignore the last 10%.

Consider, for example, the owners of a high-end car like a Mercedes-Benz or a BMW. They have purchased a product with good inherent reliability. Let’s assume that the owners drive their cars in a sensible, caring manner and perform all the required preventive maintenance using the highest-quality materials. The owners have done everything they should up to the point that the engine needs an overhaul. Rather than purchasing a “crate” engine that was assembled with the same care and sensitivity as the original car, they allow a local mechanic (who normally handles only oil changes) to perform the overhaul in a non-certified corner garage. After the “backyard mechanic” overhaul, the car is never again the same. The reliability suffers until the owners decide to replace the car.


In this example, the cause could have been a poor mechanic, overaggressive operation, or poor inherent reliability in the original product; a single lapse is enough to cause poor reliability. It is possible to recover from some of these situations by correcting the deterioration they caused or by eliminating the defects they introduced into the system. However, this approach is useful only in instances where the population is small and relatively few problems need corrective action. Individual owners can correct their own cars. But if a fleet manager allows an entire fleet to become run-down, it will be nearly impossible to flush out all of the defects.

Some situations involve hundreds or thousands of pieces of equipment. The likelihood of managing all the corrective actions needed to address all lapses is small. The only way to guarantee reliability is to prevent problems in the first place. This philosophy applies to each and every element that affects reliability. In maintenance, this philosophy is called Preventive Maintenance. There are other preventive approaches in each and every activity that affects reliability over the entire lifecycle of a system.

The following outline for a Wide-Hart assessment describes the elements that should be included in a comprehensive assessment of how you deal with all the choices and activities that affect reliability of your systems.

Outline for a Wide-Hart Assessment

Assess Cost of Unreliability

It is best to begin a reliability assessment with an evaluation of the overall cost of unreliability. In this context, I am using the term “cost of unreliability” to mean the overall cost resulting from all situations caused by reliability-related failures. This cost will include both the direct and indirect costs associated with all reliability issues that could have been prevented by adherence to good reliability practices.

These costs include the cost of repairing equipment after failure. They also include the lost value of the asset while it is unavailable to perform its intended function after a failure. In addition, they include the cost associated with off-spec product made while equipment was in the process of failing and the cost of energy consumed while shutting down and re-starting. If poor design practices have resulted in additional maintenance costs to support a system with inadequate inherent reliability, the cost of the added maintenance must be included. The cost of unreliability includes all costs resulting in any manner from poor reliability.

In fact, it makes the most sense to evaluate the overall cost of unreliability twice:

•Once before performing a detailed assessment of each element that contributes to reliability.

•Once after performing the detailed assessment of each element affecting reliability.

The first approach is a macro view from the outside-in of how much business is being lost because of lost production or cost being added. The second approach provides a micro view from the inside-out of how individual weaknesses in each element of reliability add up. The first approach provides the impetus to move forward with the assessment. The second approach highlights detailed areas of loss that you never knew existed.

There are two advantages in developing an accurate Cost of Unreliability from both the top-down and the bottom-up. They are:

1.It is important for senior managers to know the total Cost of Unreliability from a business perspective to accurately understand the value of the entire opportunity. Without this information, senior managers may think that corrective action is too expensive. With an accurate Cost of Unreliability, they will know it is a good investment.

2.It is important to know the total Cost of Unreliability from a detailed perspective to provide a basis for closure. If there is a significant difference between the total “top-down” Cost of Unreliability and the sum of the individual parts, the assessment will not pass the “smell test.” Either you have missed something or you have exaggerated the value of something.
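The comparison described in item 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method prescribed in this chapter: the function name `reconcile`, the 10% tolerance, the element names, and all dollar figures are hypothetical.

```python
def reconcile(top_down_total, bottom_up_items, tolerance=0.10):
    """Compare a top-down Cost of Unreliability with the sum of detailed items.

    tolerance is a hypothetical threshold, expressed as a fraction of the
    top-down total; the chapter does not specify what counts as "significant".
    """
    if top_down_total <= 0:
        raise ValueError("top_down_total must be positive")
    bottom_up_total = sum(bottom_up_items.values())
    gap = top_down_total - bottom_up_total        # positive: top-down is larger
    gap_fraction = abs(gap) / top_down_total
    return {
        "bottom_up_total": bottom_up_total,
        "gap": gap,
        "gap_fraction": gap_fraction,
        "passes": gap_fraction <= tolerance,      # the "smell test"
    }


# Hypothetical annual figures, for illustration only.
items = {
    "design practices": 400_000,
    "maintenance practices": 350_000,
    "operating practices": 150_000,
}
result = reconcile(1_200_000, items)
print(result)
```

With these made-up numbers the detailed items account for only part of the top-down total, so the check fails and the assessment would need another look for missed or overstated items.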

High Level Cost of Unreliability

In assessing the Cost of Unreliability from a top-down or outside-in perspective, we will be trying to understand the total loss of money that results from poor reliability. We want to assess the cost as senior managers, accountants, or investors would. They are not particularly interested in what is causing the loss of revenue. They are interested only in the bottom line.

The first category of Costs of Unreliability is direct costs, which are those factors that have a direct cause-and-effect relationship with a reliability event. These costs include:

•The value of lost production — or the income that could have been made if production had not been interrupted.

•The cost of maintenance needed to perform repairs and restore operation.

The second category of Costs of Unreliability is indirect costs. These costs frequently have no direct cause-and-effect relationship, but are the result of poor reliability nonetheless. These costs include:

•The cost of being a reactive organization — or the cost of having to be prepared to respond to failures. An organization that performs a great deal of reactive maintenance needs to be larger than a proactive organization. It needs people both to keep things running and to respond to failures. It needs a larger staff to manage all the problems. Managing problems keeps senior managers from focusing on future improvement and keeps them focused on the past.

•The costs of sloppiness — sloppiness is impossible to confine to one thing. It is impossible to confine a management philosophy that condones poor reliability to reliability only. Poor reliability tends to infect other areas like quality, safety, and environmental performance. In assessing the Cost of Unreliability, it is important to include the impact poor reliability has on those areas.

•The cost of lost business, or the impact on your business from missing deliveries or making poor products while affected by poor reliability. Companies that accept poor reliability have two choices. First, they can let their production and quality suffer. Second, if they want to keep poor reliability from affecting delivery schedules and quality, they must carry enough manufacturing capacity to both absorb the losses and meet customer demand, which means running an inefficient operation that ultimately affects product costs. In either case, the customer will ultimately be unhappy and look for another supplier.
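The direct and indirect categories listed above can be tallied in a small sketch. The class name `HighLevelCost`, its field names, and every figure below are assumptions made for illustration; in practice, the indirect items in particular would have to be estimated rather than read from the books.

```python
from dataclasses import dataclass


@dataclass
class HighLevelCost:
    # Direct costs (clear cause-and-effect link to a reliability event)
    lost_production: float
    repair_maintenance: float
    # Indirect costs (no direct link, but still a result of poor reliability)
    reactive_organization: float
    sloppiness: float
    lost_business: float

    def direct(self):
        return self.lost_production + self.repair_maintenance

    def indirect(self):
        return self.reactive_organization + self.sloppiness + self.lost_business

    def total(self):
        return self.direct() + self.indirect()


# Hypothetical annual figures, for illustration only.
cost = HighLevelCost(
    lost_production=500_000,
    repair_maintenance=200_000,
    reactive_organization=150_000,
    sloppiness=50_000,
    lost_business=300_000,
)
print(cost.direct(), cost.indirect(), cost.total())
```

Keeping the two categories separate makes it easy to see how much of the total would be invisible if only repair bills and lost production were counted.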

Detailed Cost of Unreliability

In assessing the Cost of Unreliability from a bottom-up or inside-out perspective, we will be trying to identify each and every issue that results in poor reliability and to quantify the relative value of that specific problem. Although the accountants and investors are not interested in this level of detail, this information is needed to build a plan of attack for corrective action. It is important to understand specifically what weakness is resulting in poor reliability and how large an impact is being produced. To be effective in making changes, we need to know what to attack and in which order we should attack each problem.
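One simple way to decide what to attack first is to rank the quantified problems by cost and track the cumulative share of the total. The sketch below assumes that ordering by cost alone is acceptable; the function name `rank_issues`, the issue names, and the figures are hypothetical.

```python
def rank_issues(costs):
    """Return (issue, cost, cumulative share of total), largest cost first."""
    total = sum(costs.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    ranked = sorted(costs.items(), key=lambda item: item[1], reverse=True)
    plan = []
    running = 0
    for name, cost in ranked:
        running += cost
        plan.append((name, cost, running / total))
    return plan


# Hypothetical issues and annual costs, for illustration only.
plan = rank_issues({
    "seal failures": 120_000,
    "bearing failures": 300_000,
    "instrument faults": 80_000,
})
for name, cost, share in plan:
    print(f"{name}: {cost} (cumulative {share:.0%})")
```

A real plan of attack would also weigh how difficult each problem is to correct, which this ordering ignores.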

The following sections go through each element in the lifecycle of a system and describe the issues that play a part in ensuring that the system is reliable. As the relative strengths or weaknesses of the individual elements are identified, it will be necessary for you to measure the impact by quantifying the cost of the fallout resulting from that problem.

Assess Basic New Unit Development Practices

How do you go about procuring a new system? How much effort goes into designing reliability into it? There are some items that would seem to be bulletproof and can be left to the “kindness of strangers.” By this I mean that if your design process adequately addresses integrity requirements, the reliability aspects are likely to take care of themselves. For example, when you purchase a compressor from a hardware store, you trust that the design requirements needed to ensure that the pressure vessel will not explode will also ensure that it will provide a long, reliable life. That paradigm may or may not be correct. “Ruggedness” may ensure the reliability of very unsophisticated components, but not components that are delicate or “intelligent.”

But for now, let’s get back to the basic question. How much attention is typically paid to reliability as a part of the basic system development process? An even more basic question is: How do you manage the reliability aspects of the design process? Are the reliability aspects of the design process even understood?

Let’s begin this discussion by answering these questions. How are they addressed as part of the normal design process for commercial products where “design” is a matter of selecting desired characteristics? For commercially available products, the design process is a matter of selecting characteristics that describe form, fit, and function. When ordinary people purchase a new car, they address reliability in a very limited way, if at all. Apart from form, fit, and function, there are a number of integrity-related and reliability-related issues that purchasers typically choose to trust to others.

A few examples involve features that particularly careful buyers may change after they purchase a new car because the design features are not readily available from manufacturers or dealerships. One example is tires. It is not uncommon for particularly careful people to go to a tire store immediately after leaving the dealership with a new car. By doing so, they are able to trade the almost new tires on the car for a set of new tires in which they have more confidence. Another example is based on personal experiences with car enthusiasts who choose to “blueprint” new cars as soon as they are delivered. The process of blueprinting a new car is typically reserved for high-value or collector cars. It entails disassembling a significant portion of the car to look for missing or loose connectors and for key settings that were misadjusted during manufacturing. These individuals have little trust in the typical factory worker.

In either case, if it were possible to specify the way cars are assembled, some individuals would demand:

•Different and better tires

•Different quality control practices

•A run-in procedure prior to delivery to eliminate components likely to experience infant mortality

In most cases, however, car buyers would typically pick the color, the number of doors, and the kind of transmission and trust everything else to the manufacturer and the dealer.

Moving beyond typical purchases made by individuals, examples of integrity-related issues may involve the adequacy of the structural design and assembly. The complexities of these issues are beyond the understanding of most non-engineers. Therefore, most people tend to trust that they are being handled in an appropriate manner, which is not always the case. Here are a few examples:

•One of the major locomotive manufacturers chose to use an unqualified manufacturer for the pressure vessels containing high pressure air for brakes and other pneumatic systems. After these vessels began to explode without warning, the manufacturer implemented a program to replace them.

•By now, you may be aware that several important elements of the design of the twin towers in New York were such that they jeopardized the integrity of the buildings’ structure in unusual situations.

•It is unusual but not unheard of for a bridge to collapse. In 2007 the I-35W bridge spanning the Mississippi River in Minneapolis collapsed. The NTSB (National Transportation Safety Board) reported that a flaw in the design combined with unusual loading at the time of the collapse contributed to the failure.

In each of these examples, the basic integrity of the system was taken for granted. Viewed purely from the ability of those systems to perform their intended function, they experienced reliability (as well as integrity) failures. From these examples we can see that integrity is a critical element of reliability.

Despite the counter-examples described above, many design processes contain elements that adequately address the integrity-related issues that ensure the safety and functionality of the system being designed. When these elements are adequately addressed, they are accompanied by a measure of reliability that goes hand-in-hand with the integrity.
