
Paper #108

Decoupling: A Systems Engineering Technique to Move Beyond Replicable Measurements

Michael S. McBeth

Joint Test Director Joint Methodology to Assess C4ISR Architecture Joint Test and Evaluation 7025 Harbour View Blvd. Suite 105 Suffolk, VA 23435 [email protected]

Abstract

Science is closely associated with replicable measurements. Replicable measurements allow engineers and scientists to uncover errors, discount results due entirely to chance, and discredit fraudulent claims. These reasons make it desirable to use replicable measurements to help assure that results are reliable. Unfortunately, many problems do not lend themselves to replicable measurements. Examples can be found in destructive testing, in-situ measurement of soil formations, social systems, and situations that are too complex and costly to replicate. This paper introduces the concept of decoupling as a systems engineering technique to move beyond replicable measurements. Decoupling refers to ways of achieving independence in measurement that allow results from non-replicable measurements to be used to solve problems and make decisions. The use of decoupling is illustrated using six examples.

Introduction

The Role of Replication. Replication has been identified as one of the defining characteristics of science and the scientific method (Christensen 1994; Popper 2002; Derry 1999). Replication refers to the ability

to reproduce experimental results and observations. Replicable measurements help establish the reliability of scientific results: they allow engineers and scientists to uncover errors, discount results due entirely to chance, and discredit fraudulent claims. These reasons make it desirable to use replicable measurements, when possible, to help assure that results are reliable.

Why Move Beyond Replicable Measurements? Unfortunately, many problems do not lend themselves to replicable measurements. Examples can be found in destructive testing, in-situ measurement of soil formations, social systems, and situations that are too complex and costly to replicate. In the context of non-replicable destructive testing, (Benham 2002) discusses the problem of conducting Gage Repeatability and Reproducibility (GRR) studies to determine how much of observed process variation is due to measurement system variation. Since the part is destroyed in the first test, the traditional GRR technique of having different appraisers measure and re-measure the same part cannot be used. Measurements of soil properties that disturb soil formations are non-replicable because the measurement cannot be repeated at precisely the same location.

(Baecher 1999) and (Jaksa et al. 1997) discuss how curve-fitting techniques can be used to estimate the measurement error of these non-replicable measurements of soil properties. Examples of social systems with non-replicable measurements can be drawn from the study of crime, real-estate transactions, and election campaigns. These examples are characterized by human behavior and economics. (Dubner 2003) describes the work of the University of Chicago economist Steven Levitt, who "devises a way to measure an effect that veteran economists had declared unmeasurable."

There are also many measurement situations that are too complex and costly to replicate. This situation often occurs within the Joint Test and Evaluation (JT&E) Program sponsored by the U.S. Office of the Secretary of Defense (OSD). In this program, Joint Tests are nominated to develop enhanced processes for tactics, techniques, and procedures (TTP) or improved test methodologies. Many of these test concepts depend on leveraging large military training exercises as test event venues. Traditionally, a Joint Test baselines a process in an initial test event and uses a later test event to measure the effect of enhancements. Unfortunately, each training exercise is different: different combat units participate, different scenarios are used, and combat units can have different systems and equipment. Depending on the nature of the test, these differences can make measurements associated with these training exercises essentially non-replicable.

Systems Perspective. The ideas behind the concept of decoupling are not new. The value of their exposition in this paper lies in collecting examples from several different disciplines and generalizing from them. The result is a way of thinking about system problems that gives analysts a viable option when the use of replicable measurements is not possible. However, before discussing

decoupling techniques, it is useful to briefly review the scientific method, in which the application of decoupling should be embedded.

Scope of Paper. The paper begins by reviewing the scientific method. This includes a discussion of the role of theory in the scientific method. Three characteristics of science, namely control, operational definition, and replication, are also discussed. Building on this background, the concept of decoupling, or achieving independence in measurement, is introduced and developed. The technique is discussed in terms of a generic process that follows the scientific method, along with the rationale for using decoupling as a technique to move beyond replicable measurements. The paper concludes by illustrating the use of decoupling with six examples, followed by a summary of key points and some recommendations for future work.

Scientific Method

Five-Step Process. The scientific method is a widely accepted way of attaining knowledge. It is a method for acquiring information that is objective, based on empirical evidence, and subject to public scrutiny. Although there are alternate ways of formulating and describing the scientific method, the description here follows the five-step process described in (Christensen 1994): 1) identifying the problem and forming a hypothesis, 2) designing the experiment, 3) conducting the experiment, 4) testing the hypothesis, and 5) communicating the research results. Although these steps can be followed sequentially, in practice they are often performed iteratively and in parallel. This is necessary because the steps are interrelated and a change in one step often results in changes to others.

Role of Theory. (Christensen 1994) describes the role of theory in science as a way to summarize, integrate, and explain large amounts of data in a compact form, as well as a way to predict future outcomes under the appropriate circumstances. Theory can also be used to guide experimentation, which, in turn, serves to lend credence to or refute the theory. See (McBeth 2003) for an example of using theory to narrow the focus and guide the collection of data. The ability of theory to explain cause and effect and predict outcomes is critical to the concept of decoupling.

Control, Operational Definition, and Replication. (Christensen 1994) discusses three characteristics of the scientific method: control, operational definition, and replication. Control is the way in which researchers reduce or eliminate the influence of extraneous variables that could affect experimental results. (Christensen 1994) identifies control as "the single most important element" in the scientific method, and it is at the heart of decoupling. In this case, it is not only extraneous variables that must be controlled but also variables that are intertwined and coupled together in complex ways. Operational definition simply refers to using variables that are defined by operations. For example, (Christensen 1994) explains that instead of referring to a state of "hunger" one would define hunger operationally as "eight hours of food deprivation." The main value of operational definitions lies in the ability to communicate results clearly. Replication has already been discussed as a way to ensure the reliability of results. The key point here is that when replicable measurements are not possible or feasible, other steps must be taken to assure the reliability of results. These steps can include control and operational definition.

Decoupling

Independence in Measurement. In this section, the concept of decoupling is introduced as a systems engineering technique to move beyond replicable measurements. Decoupling refers to ways of achieving independence in measurements. Since the value of replication lies primarily in establishing the reliability of results, decoupling attempts to leverage the other two characteristics, control and operational definition, to assure reliable results. This is not easy and usually calls for considerable creativity. Decoupling is a way of achieving control and independence in measurement through the sorting of causal factors and the use of normalization techniques to relate measures to a norm or standard.

A Generic Process. The process begins with observation and categorization. To arrive at a decoupling scheme for a non-replicable measurement, one must understand enough about the problem under study to form hypotheses about cause and effect. This often requires a level of sophistication at which plausible explanations about the structure and interactions of important variables can be discussed. A theory is needed to guide the control that achieves independence in measurement. Once some ideas about causes, effects, and interactions have been identified, the focus moves toward control in the form of designing and conducting the experiment. For some problems in economics and social systems there may not be an actual "experiment" to conduct, but relevant data will need to be collected or accessed on the system under study. It is here that a careful consideration of the "variables to be decoupled" is used to shape the experiment to provide a "reliable" answer. After conducting the experiment, the hypothesis is tested to determine the results.


The results either refute or lend credence to the hypothesis. When the results refute the hypothesis, the theory needs to be refined or discarded. When the results lend credence to the hypothesis, this provides additional assurance that the hypothesis is correct, but it does not "prove" the hypothesis, because future experiments or tests may end up disproving it.

Since non-replicable measurements cannot be repeated, a logic of design is needed to substantiate the correctness of the results. In the case of non-replicable measurements from economics, this could take the form of a plausible assertion such as "it is highly unlikely that election cycles that drive police hiring are directly related to street crime," or "real-estate agents are highly likely to insist on the highest price when it is their own house for sale." In other cases, it can take the form of a mathematical property or condition. For example, in soil property measurements the random measurement error is determined by extrapolating the observed autocorrelation function back to the origin. This technique works because the other components of the autocorrelation function "disappear when the shift is zero," leaving only the measurement error.

Finally, the results must be communicated to the larger community. This step involves not only the measurement results but also how they were arrived at and the rationale for their reliability.
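To make the extrapolation-to-the-origin idea concrete, the following sketch estimates measurement-error variance from a single synthetic profile. The signal model (an autoregressive spatial component), the noise level, and the number of lags used in the fit are illustrative assumptions, not values taken from the cited soil studies:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "profile": a spatially correlated AR(1) component plus
# independent random measurement noise (all parameters are illustrative).
n = 2000
signal = np.zeros(n)
innovations = rng.normal(0, 0.05, n)
for t in range(1, n):
    signal[t] = 0.95 * signal[t - 1] + innovations[t]   # correlated spatial variability
noise_sd = 0.4
observed = signal + rng.normal(0, noise_sd, n)          # one non-replicable measurement pass

def autocovariance(x, lag):
    """Sample autocovariance of x at the given lag."""
    x = x - x.mean()
    return np.dot(x[: len(x) - lag], x[lag:]) / len(x)

# At lag 0 the autocovariance contains signal variance plus noise variance;
# at nonzero lags the uncorrelated noise term vanishes. Fitting a line to
# the small-lag values and extrapolating back to lag 0 recovers the signal
# part, so the gap at lag 0 estimates the measurement-error variance.
lags = np.arange(1, 6)
acov = np.array([autocovariance(observed, int(k)) for k in lags])
intercept = np.polyval(np.polyfit(lags, acov, 1), 0.0)
noise_var_est = autocovariance(observed, 0) - intercept

print(f"true noise variance:      {noise_sd ** 2:.3f}")
print(f"estimated noise variance: {noise_var_est:.3f}")
```

In practice, (Jaksa et al. 1997) stress trend removal and sample spacing before any such fit; the sketch assumes the trend has already been removed.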

Illustrative Examples of Decoupling

In this section, the use of decoupling is illustrated with six examples drawn from the disciplines discussed in the introduction.

Destructive Testing. This example of decoupling is from the field of measurement systems analysis, where so-called gage studies are conducted to quantify how much of the

observed variation is due to measurement system error. The traditional technique for conducting a Gage Repeatability and Reproducibility (GRR) study calls for measuring and re-measuring the same part by different appraisers. However, in a destructive test the part is destroyed and is no longer available for subsequent measurements. (Benham 2002) cites a "destructive weld test where a weld nut is pushed off a part and the peak amount of pushout force before destruction is measured" as an example of a non-replicable destructive test.

(Benham 2002) describes a technique for determining measurement variation in a non-replicable destructive test by structuring the test so that a number of identical or "duplicate" parts are chosen to be used in trials that would be conducted with a single part in the traditional method. This group of "duplicate" parts is often taken in a consecutive sequence from a production process. The parts selected to represent a "single" part need to be "like" each other. (Benham 2002) notes that "the assumption must be made that all the parts sampled consecutively (within one batch) are identical enough that they can be treated as if they were the same. If the particular process of interest does not satisfy this assumption, this method will not work." [emphasis in original] However, the group of parts selected to represent part number 1 needs to be "unlike" the parts selected to represent part number 2. (Benham 2002) suggests considering part variation in the form of "part-to-part, shift-to-shift, day-to-day, lot-to-lot, batch-to-batch, and week-to-week" to achieve the desired "unlikeness" between groups of "duplicate" parts representing each part in the gage study. In this example the decoupling is achieved by selecting groups of "like" parts to represent a part measured by different


appraisers in the study that are decoupled (read "unlike") from another group of "like" parts. Test results are interpreted using standard analysis of variance techniques. Refer to (Benham 2002) for specific details on the conduct of these tests.

In-situ Measurement of Soil Formations. This example of decoupling is from the field of geotechnical engineering. The goal is to estimate the measurement error associated with in-situ measurement of soils. The objective of these measurements is to quantify the spatial variability of soils using electrical cone penetration tests, which produce plots of cone tip resistance (in megapascals) versus position. Cone penetration tests are non-replicable because it is not possible to repeat the measurement at precisely the same location. The reader is referred to (Jaksa et al. 1997) and their references for specific details on the techniques and methods of data manipulation and interpretation; the description here is at a high level and intended only to summarize the key points related to decoupling.

Basically, the method takes advantage of the mathematical property that the total uncertainty is the sum of the contributing uncertainties. The problem is that it is difficult to separate and quantify each contributing factor. When replicable measurements can be performed, it is a straightforward matter to separate the measurement uncertainty from the other contributing factors; in fact, this is the basis for the traditional GRR studies discussed in the destructive testing example. So the goal here is to figure out how to decouple the random measurement error from the other contributing factors. In this example the decoupling is achieved by extrapolating the autocorrelation function of a series of measurements back to the origin. This technique works because the other components of the autocorrelation function

disappear when the shift or lag is zero. The components that vanish at zero lag correspond to the other contributing factors, leaving only the measurement error. Care must be taken when using this technique to assure reliable results. However, (Baecher 1999) states that "If reasonable judgment is used in applying the method, curve extrapolation may yet provide insight."

Economics. Three examples from the field of economics are discussed here. They are taken from the work of Steven Levitt, a University of Chicago economist, and deal with human behavior and social systems.

The first example addresses how to measure the relationship between the number of police and the crime rate. (Dubner 2003) describes the problem: "Do more police translate into less crime? The answer would seem obvious -- yes -- but had never been proved: since the number of police officers tends to rise along with the number of crimes, the effectiveness of the police was tricky to measure." In this example Levitt achieved decoupling by realizing that police officers are sometimes hired in connection with political elections for reasons unrelated to the crime rate. By focusing only on police hiring associated with election promises, Levitt was able to unlink, or decouple, the crime rate from police hiring. His results show that adding more police officers did indeed reduce violent crime.

The second example addresses the question of whether real-estate agents represent the seller's best interest. Real-estate agents are supposed to get the highest price possible for the seller. The agent earns a higher commission from a higher selling price, so, on the surface, agents would seem to have a strong incentive to work for the highest sales price. Levitt noticed that sellers' agents sometimes provided buyers with subtle cues to


underbid. He reasoned that sellers' agents are often more interested in making a deal now, to collect commissions faster, than in maximizing the value of each commission. In this example Levitt achieved decoupling by understanding that real-estate agents would be highly likely to insist on a high selling price when it was their own home for sale. Levitt looked at records of over 50,000 home sales in Cook County, Illinois, to compare sales of homes owned by real-estate agents to sales where agents only acted on behalf of another seller. According to (Dubner 2003), Levitt reported that "the agents' homes stayed on the market about 10 days longer and sold for 2 percent more."

The third example addresses the relationship between election outcomes and campaign expenditures. The common idea is that money wins elections and that the candidate who spends the most money is more likely to win. Levitt realized that only candidates who are seen as having a chance of winning are able to raise a lot of money, and only incumbents who feel they have a chance of losing spend a lot of money. (Dubner 2003) relates that the problem was "that his data couldn't tell him who was a good candidate and who wasn't. It was therefore impossible to tease out the effect of the money. As with the police/crime rate puzzle, he had to trick the data." In this example Levitt achieved decoupling by realizing that many times the same two candidates competed in multiple races. By focusing only on these races, Levitt was able to decouple good candidates from bad candidates and then look at the amount of campaign money spent. According to (Dubner 2003), Levitt's results showed that "campaign money has about one-tenth the impact as was commonly accepted."

Joint Test and Evaluation. This example of decoupling is from the Joint Methodology to Assess Command, Control,

Communications, Computers, Intelligence, Reconnaissance, and Surveillance (C4ISR) Architecture (JMACA) project of the Joint Test and Evaluation (JT&E) Program sponsored by the U.S. Office of the Secretary of Defense (OSD). The problem facing the JMACA Joint Test is how to measure enhancements in an architecture assessment methodology intended to diagnose problems in Joint Task Force (JTF) information architectures. Traditionally, a Joint Test baselines a process in an initial test event, usually a large-scale training exercise, and uses a later test event to measure the effect of enhancements. However, each JTF architecture and training exercise is different. Different units and scenarios are used, and different units often have different systems and equipment. This can make these measurements essentially non-replicable.

(Swets et al. 2000) describes a technique called Relative Operating Characteristic (ROC) analysis that can be used to provide a measure of accuracy that is independent of fault event frequencies and decision bias. ROC analysis provides a way to measure enhancements in the architecture assessment methodology even when using different JTF architectures and exercises. In this example, the test team can achieve decoupling by virtue of the way in which a ROC curve is constructed. Two conditional probabilities of a positive decision, the probability of a true-positive decision and the probability of a false-positive decision, are used to represent diagnostic performance. (Swets et al. 2000) explains how the decoupling is achieved: "These two probabilities are independent of the prior probabilities (by virtue of using the priors in the denominators of their defining ratios). The significance of this fact is that ROC measures do not depend on the proportions of positive and negative instances in any test sample,


and hence, generalize across samples made up of different proportions."

A notional ROC curve showing results from two hypothetical test events is shown in Figure 1. Although each test event can be considered a non-replicable measurement of diagnostic accuracy, the ratios used to build the true-positive and false-positive probabilities allow the results to be compared.
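A small simulation illustrates the independence the quotation describes. The score distributions, threshold, and fault counts below are illustrative assumptions; the point is that the true-positive and false-positive rates agree across two events with very different fault proportions, while raw accuracy does not:

```python
import numpy as np

rng = np.random.default_rng(7)

def run_event(n_faults, n_ok, threshold, rng):
    """Simulate one test event: fault scores ~ N(1,1), no-fault scores ~ N(0,1)."""
    fault_scores = rng.normal(1.0, 1.0, n_faults)
    ok_scores = rng.normal(0.0, 1.0, n_ok)
    tpr = np.mean(fault_scores > threshold)            # P(positive | fault)
    fpr = np.mean(ok_scores > threshold)               # P(positive | no fault)
    correct = np.sum(fault_scores > threshold) + np.sum(ok_scores <= threshold)
    accuracy = correct / (n_faults + n_ok)             # depends on fault prevalence
    return tpr, fpr, accuracy

# Same diagnostic and decision threshold, very different fault proportions.
tpr1, fpr1, acc1 = run_event(5000, 5000, 0.0, rng)     # 50 percent faults
tpr2, fpr2, acc2 = run_event(1000, 9000, 0.0, rng)     # 10 percent faults

print(f"event 1: TPR={tpr1:.2f}  FPR={fpr1:.2f}  accuracy={acc1:.2f}")
print(f"event 2: TPR={tpr2:.2f}  FPR={fpr2:.2f}  accuracy={acc2:.2f}")
```

The (TPR, FPR) pair locates each event on the same ROC plane even though the events are non-replicable, which is what allows the before-and-after comparison described above.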

Figure 1. Notional ROC curve showing results from two hypothetical test events

Summary of Decoupling Examples. Table 1 summarizes the six decoupling examples discussed in the previous paragraphs. The reasons why the measurements in these examples are non-replicable can be categorized as either physical or economic. In the two examples where the reason for non-replicability is physical, the test sample is either destroyed or disturbed to the point that re-measurement is not possible. In the four other examples, although replicating the measurement may be theoretically possible, it is not feasible due to the cost and complexity of replicating the relevant measurement conditions. In the physical examples, mathematical properties and behavior are exploited to

achieve decoupling. In the destructive testing example, a controlled production process is required to provide "duplicate" parts that can be treated as "like" parts for measurement and analysis purposes. For the test to be valid, results from "like" parts must cluster in a recognizable way while results from "unlike" parts must separate in a recognizable way. In the in-situ soil property measurement example, a property of the autocorrelation function is used to extrapolate a curve to provide an estimate of the random measurement error.

In the economic examples, creativity is required to gain insights into theories of cause and effect that can be exploited to decouple key variables and factors that make establishing the relationship to be measured problematic. The three examples of Steven Levitt's work all involved decoupling within the analysis of existing data sets. The decoupling was not achieved by collecting data in a special way, but by interpreting it in special ways. First, the insight that police are sometimes hired as a result of politicians' campaign promises allowed Levitt to decouple police hiring from the crime rate. Second, the insight that real-estate agents sell their own homes provided a way to compare their performance with instances where agents were acting on another seller's behalf. Third, the insight that the same two candidates sometimes face off in multiple elections provided a basis to measure the effect of campaign expenditures on election outcomes.

The Joint Test and Evaluation example shows how ROC analysis can be used to represent a measure of diagnostic accuracy that allows meaningful comparisons even though the measurements were performed using different exercises with potentially different underlying fault populations.
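The "like"/"unlike" grouping in the destructive-testing example can be sketched as a simple nested variance decomposition on synthetic data. This is not the full GRR procedure of (Benham 2002), which also separates appraiser effects; the group sizes and standard deviations are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups, n_dup = 10, 6        # 10 "parts", each represented by 6 "duplicate" specimens
part_sd, gage_sd = 3.0, 0.5    # assumed process and measurement-system standard deviations

# Each group of "like" duplicates shares one true part value; groups are
# "unlike" one another. Every specimen yields a single destructive reading.
true_parts = rng.normal(50.0, part_sd, n_groups)
readings = true_parts[:, None] + rng.normal(0, gage_sd, (n_groups, n_dup))

# Given the "like enough" assumption, scatter within a group can only come
# from the measurement system, so the pooled within-group variance estimates
# gage variance; scatter between group means reflects part-to-part variation.
within_var = readings.var(axis=1, ddof=1).mean()
between_var = readings.mean(axis=1).var(ddof=1)
part_var_est = max(between_var - within_var / n_dup, 0.0)

print(f"estimated gage SD: {within_var ** 0.5:.2f} (true {gage_sd})")
print(f"estimated part-to-part SD: {part_var_est ** 0.5:.2f} (true {part_sd})")
```

If the "like enough" assumption fails, part-likeness variation inflates the within-group term and the gage variance is overestimated, which is why Benham stresses checking the prerequisite before using the method.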


Table 1. Summary of Decoupling Examples

1. Destructive Testing GRR Studies
   What's decoupled: "Like" parts from "unlike" parts.
   Reason for non-replicability: Physical; the part is destroyed in the test.
   Comments on decoupling mechanism: Extreme care is required to make sure the prerequisites for using the technique exist and in selecting the samples to be tested.

2. In-situ Soil Property Measurement
   What's decoupled: Random measurement error from the spatial variability of the soil.
   Reason for non-replicability: Physical; cone penetration tests can only be performed once at precisely the same location.
   Comments on decoupling mechanism: Random measurement error is determined by extrapolating the observed autocorrelation function back to the origin. Care must be taken in trend removal and sample spacing size.

3. Relationship between the number of police and the crime rate
   What's decoupled: Crime rate from police hiring.
   Reason for non-replicability: Economic; not feasible to replicate due to costs and complexity.
   Comments on decoupling mechanism: Creativity is required to notice that increases in police hiring are sometimes driven by political election cycles independent of the crime rate.

4. Whether real-estate agents represent the seller's best interest
   What's decoupled: Sales of homes owned by real-estate agents from sales of homes that agents did not own.
   Reason for non-replicability: Economic; not feasible to replicate due to costs and complexity.
   Comments on decoupling mechanism: Creativity is required to understand that the agent's incentive to get the highest selling price might shift when the agent's own home is for sale versus a client's home, and then to look at comparable data to see if there is a significant difference.

5. Relationship between election outcomes and campaign expenditures
   What's decoupled: Good candidates from bad candidates.
   Reason for non-replicability: Economic; not feasible to replicate due to costs and complexity.
   Comments on decoupling mechanism: Creativity is required to separate good candidates from bad by looking at the same two candidates in multiple races. Challengers only raise a lot of money when they have a chance of winning, and incumbents only spend a lot of money when they have a chance of losing.

6. Effect of enhancements to architecture assessment methodology on diagnostic performance
   What's decoupled: Measure of accuracy from underlying fault event frequencies and decision bias.
   Reason for non-replicability: Economic; not feasible to replicate due to costs and complexity.
   Comments on decoupling mechanism: Relative Operating Characteristic (ROC) analysis is used to show changes in diagnostic accuracy although different exercises and forces are used for test events. Fault event frequencies are normalized and decision thresholds are systematically varied.
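The repeat-matchup idea used to decouple candidate quality from campaign spending can also be sketched on synthetic data. The model below, in which spending tracks an unobserved quality gap and has only a small true effect on votes, is an illustrative assumption, not Levitt's actual data or method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic repeat matchups: the vote-share gap depends strongly on an
# unobserved candidate-quality gap and only weakly on the spending gap.
n_pairs = 500
true_spend_effect = 0.1
quality_gap = rng.normal(0, 5.0, n_pairs)                  # persistent across both races
spend_gap_1 = quality_gap + rng.normal(0, 1.0, n_pairs)    # spending tracks perceived quality
spend_gap_2 = quality_gap + rng.normal(0, 1.0, n_pairs)
vote_gap_1 = quality_gap + true_spend_effect * spend_gap_1 + rng.normal(0, 0.5, n_pairs)
vote_gap_2 = quality_gap + true_spend_effect * spend_gap_2 + rng.normal(0, 0.5, n_pairs)

# Naive cross-sectional estimate: spending looks enormously effective
# because it is confounded with candidate quality.
naive = np.polyfit(spend_gap_1, vote_gap_1, 1)[0]

# Decoupled estimate: within each repeat matchup the quality gap cancels,
# so regressing the change in vote gap on the change in spending gap
# isolates the spending effect.
decoupled = np.polyfit(spend_gap_2 - spend_gap_1, vote_gap_2 - vote_gap_1, 1)[0]

print(f"naive spending effect:     {naive:.2f}")
print(f"decoupled spending effect: {decoupled:.2f} (true {true_spend_effect})")
```

The differencing step is the decoupling mechanism: anything constant within a matchup, here candidate quality, drops out of the comparison.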


Conclusion

Summary of Key Points. In this paper, decoupling has been introduced as a systems engineering technique to move beyond replicable measurements. The first step in applying decoupling is to recognize when you are facing a non-replicable measurement situation. The next step is to apply the scientific method to develop theories that explain cause and effect and predict outcomes. These theories can provide ways of looking at relationships between variables and factors that can lead to identifying a mechanism to decouple them. Next, control and operational definition are used to implement the decoupling mechanism and obtain results. Six examples were used to illustrate decoupling.

Future Work. This paper has only scratched the surface of how to use decoupling for non-replicable measurements. In some cases, it may not be possible to decouple certain variables; they may be so highly coupled that no amount of ingenuity will result in a measurement approach that decouples them. Are there guidelines and heuristics that can be identified to help systems engineers apply decoupling in the non-replicable measurement situations they face? Can a theory of decoupling be developed that is useful for applying decoupling in different categories of non-replicable measurements and for predicting under which circumstances a decoupling strategy is likely to achieve reliable results? Future work is needed to answer these questions.

References

Baecher, Gregory B., "Discussion on Inaccuracies Associated with Estimating Random Measurement Errors," Journal of Geotechnical and Geoenvironmental Engineering, January 1999, pp. 79-81.

Benham, David, "Non-Replicable GRR Case Study," Case Study, Automotive Industry Action Group, June 5, 2002. (http://www.aiag.org/publications/quality/msa3.html) [Accessed December 30, 2003]

Christensen, Larry B., Experimental Methodology. Allyn and Bacon, Needham Heights, MA, 1994.

Derry, Gregory N., What Science Is and How It Works. Princeton University Press, Princeton, NJ, 1999.

Dubner, Stephen J., "The Probability That a Real-Estate Agent Is Cheating You (and Other Riddles of Modern Life)," The New York Times, August 3, 2003. (http://www.nytimes.com/2003/08/03/magazine/03LEVITT.html) [Accessed August 6, 2003]

Jaksa, Mark B., Peter I. Brooker, and William S. Kaggwa, "Inaccuracies Associated with Estimating Random Measurement Errors," Journal of Geotechnical and Geoenvironmental Engineering, May 1997, pp. 393-401.

McBeth, Michael S., "A Theory of Interoperability Failures," 8th International Command and Control Research and Technology Symposium, National Defense University, Washington, DC, June 17-19, 2003. (http://www.dodccrp.org/8th_ICCRTS/Tracks/track_1.htm) [Accessed November 3, 2003]

Popper, Karl R., The Logic of Scientific Discovery. Routledge Classics, New York, 2002.

Swets, John A., Robyn M. Dawes, and John Monahan, "Psychological Science Can Improve Diagnostic Decisions," Psychological Science in the Public Interest, Vol. 1, No. 1, May 2000.

Biography

Michael S. McBeth is the Joint Test Director for the Joint Methodology to Assess C4ISR Architecture (JMACA) Joint Test and

Evaluation project in Suffolk, Virginia. He is a senior member of the Institute of Electrical and Electronics Engineers and a member of the International Council on Systems Engineering. Mr. McBeth has a Master of Arts in national security and strategic studies from the Naval War College.

