Regularities in data from factorial experiments

XIANG LI,1 NANDAN SUDARSANAM,2 AND DANIEL D. FREY1,2

Massachusetts Institute of Technology, 1Department of Mechanical Engineering; and 2Engineering Systems Division, Cambridge, Massachusetts 02139

This paper was submitted as an invited paper resulting from the "Understanding Complex Systems" conference held at the University of Illinois–Urbana Champaign, May 2005

Received May 3, 2005; revised March 4, 2006; accepted March 6, 2006

Corresponding author: Daniel D. Frey, Department of Mechanical Engineering and Engineering Systems Division, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139 (e-mail: [email protected])

This article documents a meta-analysis of 113 data sets from published factorial experiments. The study quantifies regularities observed among factor effects and multifactor interactions. Such regularities are known to be critical to efficient planning and analysis of experiments and to robust design of engineering systems. Three previously observed properties are analyzed: effect sparsity, hierarchy, and heredity. A new regularity is introduced and shown to be statistically significant. It is shown that a preponderance of active two-factor interaction effects are synergistic, meaning that when main effects are used to increase the system response, the interaction provides an additional increase and that when main effects are used to decrease the response, the interactions generally counteract the main effects. © 2006 Wiley Periodicals, Inc. Complexity 11: 32–45, 2006

Key Words: design of experiments; robust design; response surface methodology

[...] cover regularities arising in natural, artificial, and social systems and to identify their underlying mechanisms. The authors have carried out meta-analysis of 113 [...] science and engineering disciplines. The goal was to identify and quantify regularities in the experimental data regarding the size of factor effects and interactions among factors. These regularities appear to arise from the interplay of the physical behavior of the systems and the knowledge of the experimenters. Therefore our results should be interesting to a broad range of investigators in complex systems including engineers, statisticians, physicists, cognitive scientists, and social scientists.
© 2006 Wiley Periodicals, Inc., Vol. 11, No. 5, DOI 10.1002/cplx.20123

This article is organized as follows: Section 2 presents the motivation for the study and provides some necessary background in the Design of Experiments; Section 3 describes the research methodology; Section 4 gives an example of the analysis using one of our data sets; Section 5 presents the results of the meta-analysis; Section 6 presents an investigation of nonlinear transformation of the responses and its influence on the regularities; and Section 7 presents conclusions and suggestions for future research.

2.1. What is Design of Experiments and Why Is It Important?

Experimentation is an important activity in design of systems. Almost every existing engineering system was shaped by a process of experimentation including preliminary investigation of phenomena, subsystem prototyping, and system verification tests. Major, complex systems typically require thousands of experiments [1]. Consequently, experimentation is a significant driver of development cost and time to market. There is pressure to drive down the resource requirements of experimentation, especially in commercially [...]

The mathematical and scientific discipline of Design of Experiments (DOE) seeks to provide a theoretical basis for experimentation across many domains of inquiry. Commonly articulated goals of DOE include: making scientific investigation more effective and reliable [2]; efficient process and product optimization [3]; and improvement of system robustness to variable or uncertain ambient conditions, internal degradation, manufacturing, or customer use profiles [4–6]. The use of DOE in engineering appears to be rising as it is frequently disseminated through industry "Six Sigma" programs, corporate training courses, and university [...]

This article relies on several concepts and terms from DOE. To make the discussion clear to a broad audience of investigators in complex systems, the following definitions [...]

Response: An output of the system to be measured in an experiment.

Factor: A variable that is controlled by the experimenter to determine its effect on the response.

Active factor: A factor that experiments reveal to have a significant effect on the system response.

Level: The discrete values a factor may take in an experiment.

Full factorial experiment: An experiment in which every possible combination of factor levels is tested. In a system with k factors, each having two levels, the full factorial experiment is denoted as the $2^k$ design.

Main effect: The individual effects of each factor in an experiment [7]. In the $2^k$ design, the main effect of a factor is computed by averaging all the responses at each level of that factor and taking the difference.

Interaction: The failure of a factor to produce the same effect at different levels of another factor [7]. An interaction that can be modeled as arising from the joint effect of two factors is called a two-factor interaction. Similarly, three-factor interactions and higher order interactions may be defined.

2.2. Why Are Regularities in Experimental Data Important?

Based on experience in planning and analyzing many experiments, practitioners and researchers in DOE have identified regularities in the interrelationships among factor effects and interactions. Such regularities are frequently used to justify experimental design and analysis strategies [8]. This section reviews three regularities noted in the DOE literature, describing their nature, origins, and influence on theory and practice. These regularities are effect sparsity, hierarchical ordering, and effect heredity.

Effect sparsity refers to the observation that the number of relatively important effects in a factorial experiment is generally small [9]. This is sometimes called the Pareto Principle in Experimental Design, based on analogy with the observations of the 19th century economist Vilfredo Pareto, who argued that, in all countries and times, the distribution of income and wealth follows a logarithmic pattern resulting in the concentration of resources in the hands of a small number of wealthy individuals.

Effect sparsity appears to be a phenomenon characterizing the knowledge of the experimenters more so than the physical or logical behavior of the system under investigation. Investigating an effect through experimentation requires an allocation of resources; resolving more effects typically requires more experiments. Therefore, effect sparsity is in some sense an indication of wasted resources. If the important factor effects could be identified during planning, then those effects might be investigated exclusively, resources might be saved, and only significant effects would be revealed in the analysis. But experimenters are not normally able to do this. Effect sparsity is therefore usually evident, but only after the experiment is complete and the data have been analyzed.

Researchers in DOE have devised means by which the sparsity of effects principle can be exploited to seek efficiencies. Many experiments are designed to have projective properties so that when dimensions of the experimental space are collapsed, the resulting experiment will have desired properties. For example, the fractional factorial $2^{3-1}$ design may be used to estimate the main effects of three factors A, B, and C. As Figure 1 illustrates, if any of the three dimensions associated with the factors is collapsed, the resulting design becomes a full factorial $2^2$ experiment in
the two remaining factors. Projection, in effect, removes a factor from the experimental design once it is known to have an insignificant effect on the response. Projective properties of fractional factorial experiments can enable an investigator to carry out a full factorial experiment in the few critical factors in a long list of factors without knowing a priori which of the many factors are the critical few. Similarly, Latin Hypercube Sampling enables an experimenter to sample an n-dimensional space so that, when n − 1 dimensions collapse, the resulting sampling is uniform in the remaining dimension [10]. Latin Hypercube Sampling has become popular for sampling computer simulations of engineering systems, suggesting that its projective properties provide substantial practical advantages for engineering design. Although effect sparsity is widely accepted as a useful regularity, better quantification seems to be needed. Reliance on effect sparsity has led to strong claims about single array methods of robust design, but field investigations have shown that crossed arrays give better results [11]. Degrees of reliance on effect sparsity may be the root cause of some disagreements about methodology in robust design.

Figure 1. The projective property of the fractional factorial $2^{3-1}$ design of an [...]

Hierarchical ordering (sometimes referred to as simply "hierarchy") is a term denoting the observation that main effects tend to be larger on average than two-factor interactions, two-factor interactions tend to be larger on average than three-factor interactions, and so on [12]. Effect hierarchy is illustrated in Figure 2 for a system with four factors A, B, C, and D. Figure 2 illustrates a case in which hierarchy is not strict: for example, some interactions (such as the two-factor interaction AC) are larger than some main effects (such as the main effect of B).

The phenomenon of hierarchical ordering is partly due to the range over which experimenters typically explore factors. In the limit that experimenters explore small changes in factors and to the degree that systems exhibit continuity of responses and their derivatives, linear effects of factors tend to dominate. Therefore, to the extent that hierarchical ordering is common in experimentation, it is due to the fact that many experiments are conducted for the purpose of minor refinement rather than broad-scale exploration.

The phenomenon of hierarchical ordering is also partly determined by the ability of experimenters to transform the inputs and outputs of the system to obtain a parsimonious description of system behavior [13]. For example, it is well known to aeronautical engineers that the lift and drag of wings are more simply described as a function of wing area and aspect ratio than by wing span and chord. Therefore, when conducting experiments to guide wing design, engineers are likely to use the product of span and chord (wing area) and the ratio of span and chord (the aspect ratio) as the independent variables. One might say that the experimenters have performed a nonlinear transformation of input variables (span and chord) before conducting the experiments. In addition, after conducting the experiments, further transformations might be conducted on the response variable. In aeronautics, lift and drag are often transformed into nondimensional lift and drag coefficients by dividing the measured force by dynamic pressure and wing area. It is also common in statistical analysis of data to apply transformations such as a logarithm as part of exploration of the data. A key aspect of hierarchical ordering is its dependence on the perspective and knowledge of the
Figure 2. The hierarchy and heredity among main effects and interactions in a system with four factors A, B, C, and D.
experimenter as well as conventions in reporting data. It is important in assessing regularities in published experimental data that we do not alter the data as they were presented in any way that affects their hierarchical structure. Section 4 will provide some exploration of this issue.

Effect hierarchy has a substantial effect on the resource requirements for experimentation. A full factorial $2^k$ experiment allows one to estimate every possible interaction in a system with k two-level factors, but the resource requirements grow exponentially as the number of factors rises. A saturated, resolution III fractional factorial design allows one to estimate main effects in a system with k two-level factors with only k + 1 experiments, but the analysis may be seriously compromised if there are large interaction effects in the system. Better quantification of effect hierarchy seems to be needed to guide choice between these alternatives and the many other options for experimental planning. For example, the degree to which systems exhibit hierarchy has been shown to strongly determine the effectiveness of robust design methodologies [14]. If such decisions among robust design methods can be based on empirical studies, further efficiencies may be possible.

Effect heredity (sometimes referred to as "inheritance") implies that, in order for an interaction to be significant, at least one of its parent factors should be significant [8]. This regularity can strongly influence sequential, iterative approaches to experimentation. For example, in response surface methodology, high-resolution experiments (e.g., central composite designs) are frequently used with a small number of factors only after screening and gradient-based search bring the response into the neighborhood where interactions among the active factors are likely. Effect heredity can also provide advantages in analyzing data from experiments with complex aliasing patterns, enabling experimenters to identify likely interactions without resorting to high-resolution designs [15].

The effect structures listed above have been identified through long experience by the DOE research community and by practitioners who plan, conduct, and analyze experiments. The effect structures figure prominently in discussion of DOE methods, including their theoretical underpinnings and practical advice on their use. However, effect structures have not been quantified by formal empirical methods. Further, there has been little effort to search for other regularities that may exist in experimental data across many domains. These gaps in the literature motivated the investigation described in the next sections.

3. RESEARCH METHODOLOGY

The present study was performed using a set of 46 published engineering experiments that includes 113 responses in all. A General Linear Model was used to estimate factor effects in each data set and the Lenth method was used to identify active effects. Then, across the set of 113 responses,
the model parameters and the relevant conditional probabilities were analyzed. Details of the approach are given in the following seven subsections.

3.1. The Set of Experimental Data

We assembled a set of 46 full factorial $2^k$ experiments published in academic journals or textbooks [16–60]. The experiments come from a variety of fields including biology, chemistry, materials, mechanical engineering, and manufacturing. The reason we used full factorial designs is that we did not want to assume the existence of any given effect structure in this investigation; we wanted to test it and quantify it. Full factorial experiments allow all the interactions in a system to be estimated. The reason that we used two-level experiments is that they are much more common in the literature than other full factorial experiments and we wanted a large sample size.

Many of the 46 experiments contain several different responses since a single set of treatments may affect many different observable variables. Our set of 46 experiments includes 113 responses in all. Table 1 provides a complete list of these responses. Table 2 summarizes some relevant facts about the overall set. For example, Table 2 reveals that the vast majority of the experiments had either 3 or 4 factors. The numbers of main effects and interactions are also listed; these are not based on analysis of the data, but only on the number of effects resolvable by the experimental design. It is notable that the data set includes 569 two-factor interactions and only 383 three-factor interactions because the 54 responses from $2^3$ designs each contribute only one potential three-factor interaction. Note that the one response from a $2^7$ experiment contributes 35 potential three-factor interactions that represent about 9% of the potential three-factor interactions in the entire set.

All the experimental data in this research were recorded in our database in the form they were originally reported in the literature. No nonlinear transformations were performed before entry into the database nor were nonlinear transformations conducted during the meta-analysis presented in Section 5; therefore the regularities we report in that section are regularities in data as they are presented by experimenters. As is widely known in the statistics community, nonlinear transformation of the response can sometimes lead to more parsimonious models and reduce active interactions. Therefore, to explore how nonlinear transformations affect regularities, we conducted a follow-up study using the same methods, but performing the analysis of the data after a log transform was applied (these results are in Section 6). This issue of transformation of data is also briefly explored via an example in Section 4.

3.2. The General Linear Model

The General Linear Model (GLM) is frequently used in statistics. The GLM represents the response of a system as a linear combination of functions of the experimental factors. In DOE, the GLM often takes the form of a polynomial. If the experiment uses only two levels of each factor, then an appropriate model should include only selected polynomial terms, resulting in the following equation:

$$y(x_1, x_2, \ldots, x_n) = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \sum_{i=1}^{n} \sum_{j>i}^{n} \beta_{ij} x_i x_j + \sum_{i=1}^{n} \sum_{j>i}^{n} \sum_{k>j}^{n} \beta_{ijk} x_i x_j x_k + \cdots + \varepsilon. \tag{1}$$

The term $\beta_0$ is a constant that represents the mean of the response. The terms $\beta_i$ quantify the main effects of the factors $x_i$ on the system response. The terms $\beta_{ij}$ determine the two-factor interactions involving factors $x_i$ and $x_j$. Similarly, terms $\beta_{ijk}$ quantify the three-factor interactions. In two-level designs, the input variables are frequently normalized into coded levels of −1 and +1. Given this normalization, the sizes of the coefficients $\beta$ can be compared directly to assess the relative influence of the factor effects.

3.3. The Lenth Method for Effect Analysis

An effect in an experiment is the observed influence of a factor or combination of factors on a response. An effect is sometimes said to be "active" if it is judged to be a significant effect by one of various proposed statistical tests. Among the commonly used tests for "active" effects are the Normal Plot (or Half-Normal Plot) method [61], the Box-Meyer method [9], and the Lenth method [62]. In this investigation, the Lenth method was selected because it is applicable to unreplicated factorial experiments, because it is computationally simple, and because it can be automated without applying many arbitrary assumptions. In the Lenth method, a plot is made of the numerical values of all effects and a threshold for separating active and inactive effects is calculated based on the standard error of effects. In the first step, a parameter $s_0$ is formed:

$$s_0 = 1.5 \times \operatorname{median}_j |\beta_j|,$$

where $\beta$ includes all estimated effects including main effects and interactions $\beta_1, \beta_2, \ldots, \beta_{ij}, \ldots$. Then the pseudo standard error (PSE) and margin of error of the effects are defined to be, respectively,

$$\mathrm{PSE} = 1.5 \times \operatorname{median}_{|\beta_j| < 2.5 s_0} |\beta_j|$$

$$\text{Margin of Error} = t_{0.975,\,df} \times \mathrm{PSE}$$
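The estimation and screening steps of Sections 3.2 and 3.3 can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the synthetic coefficients, and the choice to pass the t quantile in as a number (rather than compute it) are all assumptions made to keep the sketch dependency-free. It relies on the coded levels of −1 and +1, under which each GLM coefficient is the average of the response multiplied by the corresponding column (half the difference-of-averages effect).

```python
from itertools import combinations, product
from math import prod
from statistics import median


def full_factorial(k):
    """All 2**k runs in coded levels (-1, +1), one tuple per run."""
    return list(product((-1, 1), repeat=k))


def estimate_effects(runs, y):
    """GLM coefficients for a full factorial in coded levels.

    The coded columns are orthogonal, so each least-squares coefficient is
    the average of (column * response). Keys are tuples of factor indices:
    (0,) is a main effect, (0, 1) a two-factor interaction, and so on.
    """
    k, n = len(runs[0]), len(runs)
    beta = {}
    for order in range(1, k + 1):
        for term in combinations(range(k), order):
            column = [prod(run[i] for i in term) for run in runs]
            beta[term] = sum(c * yi for c, yi in zip(column, y)) / n
    return beta


def lenth_pse(beta):
    """Pseudo standard error: s0 first, then a trimmed median of |beta|."""
    sizes = [abs(b) for b in beta.values()]
    s0 = 1.5 * median(sizes)
    trimmed = [b for b in sizes if b < 2.5 * s0]
    return 1.5 * median(trimmed) if trimmed else 0.0


def active_effects(beta, t_quantile):
    """Terms whose coefficient size exceeds t_quantile * PSE."""
    margin = t_quantile * lenth_pse(beta)
    return {term for term, b in beta.items() if abs(b) > margin}


# Synthetic 2^4 example: made-up coefficients, not data from the paper.
true = {
    (0,): 4.0, (1,): 3.0, (2,): 0.2, (3,): -0.1,
    (0, 1): 2.0, (0, 2): 0.1, (0, 3): -0.2,
    (1, 2): 0.15, (1, 3): 0.05, (2, 3): -0.1,
    (0, 1, 2): 0.1, (0, 1, 3): -0.05, (0, 2, 3): 0.2, (1, 2, 3): -0.15,
    (0, 1, 2, 3): 0.1,
}
runs = full_factorial(4)
y = [10.0 + sum(c * prod(run[i] for i in term) for term, c in true.items())
     for run in runs]
beta = estimate_effects(runs, y)
# 15 effects give 5 degrees of freedom under Lenth's one-third rule;
# 2.571 is the 0.975 quantile of the t-distribution with 5 df.
active = active_effects(beta, t_quantile=2.571)
print(sorted(active))
```

Passing the quantile as an argument keeps the threshold choice explicit; swapping in the quantile for the simultaneous margin of error would reuse the same function.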
Table 1. List of the Responses Subjected to Meta-analysis
Engineering System [Ref.]
Remediating aqueous heavy metals [16] Finish turning [38] Epitaxial layer growth [8] Limestone effects [39] Processing of incandescent lamps [17] Lumens fluct.
Init. setting time Final setting time Cr toxicity and L. nimor [40] Power fluct.
Life fluct.
Wood sanding oper. [41] Cherry removal rate Glass fiber composites [18] Stiffness tans.
Maple removal rate Pine removal rate Strength trans.
Cherry surface rough Solvent extraction of cocaine [19] Maple surface rough Plasma spraying of ZrO2 [20] Oak surface rough Pine surface rough Grinding of silicon wafers [42] Post-exp. bake in x-ray mask fab. [21] Concrete mix hot clim. [43] Compressive strength EDM of carbide composites [22] Color-improved lamps [44] Polymerization of microspheres [23] Machinability study [45] Diffusion welding [46] Electrocoagulation [47] Fine grinding [48] Max grinding force Max motor current Ball burnishing of an ANSI 1045 [24] Grinding cycle time Abrasive wear of Zi-Al alloy [25] Surface roughness Leaching of manganese [49] Surface morphology of films [26] Aqueous SO2 leaching [50] Ident. of radionuclide [51] Crystal growth [52] Experimental scores Pilot plant filtration rate [28] Friction measurement machine [29] Frict coeff val.
Chl and tetracycline [54] Frict coeff fluct.
Detonation spray process [30] Erosion durability [55] Antifungal antibiotic [56] Antifungal antibio. act.
Production of surfactin [31] Xylitol production [57] Steam-exp. laser-printed paper [32] Thermal fatigue of PWBs [58] Hydrosilylation of polypropylene [33] Wire EDM process [59] Solid polymer electrolyte cells [34] Simulation of earth moving sys. [35] Fractionation of rapeseed lecithin [36] Wet clutch pack [60] Deter. of reinforced concrete [37]
*This experiment was not a full factorial design, but contained a full factorial design as a subset. Only the full factorial settings were used in the meta-analysis.
Table 2. A Summary of the Set of 113 Responses and the Potential Effects Therein

In the margin of error, $t_{0.975,\,df}$ is the 0.975th quantile of the t-distribution and df is the statistical degrees of freedom. Lenth [62] suggests that the degrees of freedom should be one third of the total number of effects.

The margin of error for effects is defined to provide approximately 95% confidence. A more conservative measure, the simultaneous margin of error (SME), is also defined:

$$\mathrm{SME} = t_{\gamma,\,df} \times \mathrm{PSE}$$

$$\gamma = \left(1 + 0.95^{1/m}\right)/2$$

where m is the total number of effects. In the Lenth method, it is common to construct a bar graph showing all effects with reference lines at both the margin of error and at the simultaneous margin of error. In this article, we needed to select one consistent criterion of demarcation between active and inactive effects. We judged it was more appropriate to use the margin of error as the criterion in a study of full factorial experiments and that the alternative simultaneous margin of error criterion is more appropriate for screening experiments.

3.4. Method for Quantifying Effect Sparsity

To quantify effect sparsity in the set of data, we used the following procedure:

1. For each experiment, estimate all the main effects and interactions as described in Section 3.2.
2. Apply the Lenth method and label each effect as either active or inactive as described in Section 3.3.
3. Categorize the effects into main effects, two-factor interactions, three-factor interactions, etc. Calculate the percentage of active effects within each category.
4. Calculate the confidence intervals ($\alpha = 0.05$) for the percentages of potential effects that are active. As some of the numbers of active interactions are very small, we construct exact two-sided confidence intervals based on the binomial distribution.

3.5. Method for Quantifying Hierarchy

To test and quantify effect hierarchy, we compared the size of main effects with that of two-factor interactions, and the size of two-factor interactions with that of three-factor interactions. As the responses in different data sets are in different units, we need to normalize them in order to make comparisons. We chose to make an affine transformation so that the minimum response and maximum response in each experiment were 0 and 100, respectively. This normalization was only required in our assessment of hierarchy and did not influence our assessment of other regularities discussed in this article. The following steps summarize the procedure we used to assess hierarchy:

1. Normalize the responses of each experiment by means of an affine transformation so that they all range over the same interval [0, 100].
2. For each experiment, estimate all the main effects and interactions as described in Section 3.2.
3. Use conventional statistical tools such as box-plots to analyze the absolute values of the main effects, two-factor interactions, and three-factor interactions.
4. Calculate the ratios between main effects and two-factor interactions, and between two-factor interactions and three-factor interactions.

3.6. Method for Quantifying Heredity

To quantify heredity in the set of data, we analyzed probabilities and conditional probabilities of effects being active. Following the definitions and terminology of Chipman et al. [15], we define p as the probability that a main effect is active and define a set of conditional probabilities for two-factor interactions:

$$\Pr(AB \text{ is active} \mid \text{neither } A \text{ nor } B \text{ is active}) \tag{7}$$

$$\Pr(AB \text{ is active} \mid \text{either } A \text{ or } B \text{ is active}) \tag{8}$$

$$\Pr(AB \text{ is active} \mid \text{both } A \text{ and } B \text{ are active}). \tag{9}$$
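As a rough illustration of how such conditional probabilities can be estimated as observed frequencies, the sketch below tallies two-factor interactions by the number of active parent main effects and pools the counts over several responses. The function names and the input layout (a set of index tuples per response) are hypothetical, and "either A or B is active" is read here as exactly one active parent, which is an assumption about the intended definition.

```python
from collections import Counter
from itertools import combinations


def heredity_counts(active, k):
    """Tally two-factor interactions by number of active parent main effects.

    `active` is a set of factor-index tuples, e.g. {(0,), (1,), (0, 1)};
    `k` is the number of factors in the experiment.
    Returns {parents_active: (n_active_interactions, n_interactions)}.
    """
    hits, totals = Counter(), Counter()
    for pair in combinations(range(k), 2):
        parents = sum((i,) in active for i in pair)  # 0, 1, or 2
        totals[parents] += 1
        hits[parents] += pair in active
    return {p: (hits[p], totals[p]) for p in totals}


def pooled_frequencies(responses):
    """Pool counts over many (active, k) responses; return frequencies."""
    hits, totals = Counter(), Counter()
    for active, k in responses:
        for p, (h, t) in heredity_counts(active, k).items():
            hits[p] += h
            totals[p] += t
    return {p: hits[p] / totals[p] for p in totals}


# Two made-up responses from 2^3 experiments (illustrative only).
responses = [
    ({(0,), (1,), (0, 1)}, 3),
    ({(0,), (0, 2), (1, 2)}, 3),
]
freq = pooled_frequencies(responses)
print(freq)
```

Keys 0, 1, and 2 of the result correspond to the three conditions above. The same tallying pattern extends to three-factor interactions by iterating over triples and counting zero to three active parents.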
Extending the terminology of Chipman et al. [15], we defined conditional probabilities for three-factor interactions as follows:

$$\Pr(ABC \text{ is active} \mid \text{none of } A, B, C \text{ are active}) \tag{10}$$

$$\Pr(ABC \text{ is active} \mid \text{one of } A, B, C \text{ is active}) \tag{11}$$

$$\Pr(ABC \text{ is active} \mid \text{two of } A, B, C \text{ are active}) \tag{12}$$

$$\Pr(ABC \text{ is active} \mid \text{all of } A, B, C \text{ are active}). \tag{13}$$

On the basis of these definitions, we estimate the conditional probabilities as the frequencies observed in our set of 113 responses and associated factor effects.

3.7. Method for Quantifying Asymmetric Synergistic Interaction Structure

We use the term "asymmetric synergistic interaction structure" (ASIS) to describe the degree to which the signs of main effects provide information about the likely signs of interaction effects. Given the GLM described in Section 3.2, a synergistic two-factor interaction will satisfy the inequality $\beta_i \beta_j \beta_{ij} > 0$ and an antisynergistic two-factor interaction will satisfy the inequality $\beta_i \beta_j \beta_{ij} < 0$. To evaluate the null hypothesis that synergistic two-factor interactions and antisynergistic two-factor interactions are equally likely, we followed these steps:

Step 1: For each response

1. Estimate the main effects and interactions for each response as described in Section 3.2.
2. Label each two-factor interaction as either synergistic or antisynergistic according to our definition.

Step 2: Carry out statistics on the set of 113 responses.

1. Calculate the percentage of all two-factor interactions that are synergistic and antisynergistic.
2. Use the Lenth method to discriminate between active effects and inactive effects.
3. Calculate the percentage of active two-factor interactions that are synergistic and antisynergistic.
4. Calculate the percentage of inactive two-factor interactions that are synergistic and antisynergistic.
5. Calculate 95% confidence intervals for the synergistic and antisynergistic percentages using the binomial distribution.

Figure 3. A wet clutch pack (adapted from Lloyd [60]).

4. AN ILLUSTRATIVE EXAMPLE FOR A SINGLE DATASET

Before presenting the meta-analysis of the complete database of 113 responses, it is helpful to observe how the method discussed in Section 3 reveals the effect structures evident in a single data set. Lloyd [60] published a full factorial ($2^7$) experiment regarding drag torque in disengaged wet clutches. A wet clutch, such as the one depicted in Figure 3, is a device designed to transmit torque from an input shaft that is normally connected to a motor or engine to an output (which in Figure 3 is connected to the outer case). When a wet clutch pack is disengaged, it should transmit no torque and thereby create no load on the motor. In practice, wet clutch packs produce a nonzero drag torque, resulting in power losses.

The study in [60] was conducted at Raybestos Manhattan Inc., a designer and manufacturer of clutches and clutch materials. The experiment was designed to assess the influence of various factors on power loss and was likely a part of a long-term effort to make improvements in the design of clutches. The factors in the study were oil flow (A), pack clearance (B), spacer plate flatness (C), friction material grooving (D), oil viscosity (E), friction material (F), and rotation speed (G). Most of these factors are normally under the control of the designer; however, some of these variables, such as oil viscosity, might vary substantially during operation and therefore were probably included in the study to assess their influence as noise factors. However, for the purpose of the experiment, it must have been the case that
all these factors were brought under the control of the experimenter to a substantial degree. Each factor was varied between two levels and the drag torque was measured as the response. The complete results of the full factorial experiment are too lengthy to present here, but the main effects and active two-factor interactions as determined by the Lenth method are presented in Tables 3 and 4. This is slightly different from Lloyd's analysis in the original article because there he simply assumed effects of order 4 or higher were all insignificant.

Table 3. The Main Effects from the Clutch Case Study. Response: Drag Torque (ft lbs)

Table 4. The Active Two-Factor Interactions from the Clutch Case Study. Response: Drag Torque (ft lbs)

Every major effect structure under investigation in this study can be observed in this data set:

Effect sparsity is strongly indicated in the sense that there are 127 effects estimable within this experiment, but only 21 were active: 5 main effects, 9 two-factor interactions, and 7 higher order interactions. Effect sparsity is only weakly indicated by the main effects since 5 out of 7 were active in the study, but is strongly indicated among interactions since only 16 of 120 possible interactions were active.

Effect hierarchy is strongly indicated because the proportion of potential effects that actually prove to be active is strongly a function of the number of factors involved. Among main effects, 5 of 7 are active. Among two-factor interactions, 9 of 21 are active. Among three-factor interactions, only 7 of 35 are active.

Effect inheritance is strongly indicated. The four largest two-factor interactions involved two factors both with active main effects. Of the remaining five two-factor interactions, all involved at least one active main effect.

The hypothesized regularity, ASIS, was strongly evident. Seven of nine active two-factor interactions meet the criterion because the sign of the interaction effect equals the sign of the product of the participating main effects.

This example raises an important point about ASIS. Many find the regularity to be surprising because, in their experience, a response becomes increasingly difficult to further improve as successive improvements are made. ASIS is not necessarily inconsistent with this general trend. In this example, to reduce drag torque, the main effects suggest that both oil flow (A) and grooving (D) should be set to coded levels of −1. However, the significant AD interaction would lead to far less reduction of drag torque than one would expect from the linear model. In fact, the interactions will most likely determine the preferred level of D rather than the main effect.

Table 5. The Main Effects from the Clutch Case Study Using a log Transform

Nonlinear transformations of responses can strongly affect regularities in data. To illustrate this, we applied a log transformation to the drag torque of the wet clutch pack and repeated our analysis of the data. The main effects and active two-factor interactions as determined by the Lenth method are presented in Tables 5 and 6. For this particular data set, the log transform failed to improve the hierarchical ordering of the data. The number of active two-factor interactions actually increased from 9 to 12. It is also important to note that in the original data, the synergistic interactions were more numerous, and in the transformed data the synergistic and antisynergistic interactions are equally represented. This motivated an effort to assess the influence of
Figure 4 depicts a box plot of the absolute values of factor effects for each of three categories: main effects, two-factorinteractions, and three-factor interactions. The median of The Active Two-Factor Interactions from the Clutch Case Study Using main effect strength is about four times larger than the median strength of two-factor interactions. The median strength oftwo-factor interactions is more than two times larger than the median strength of three-factor interactions. However, Figure 4 also reveals that many two- and three-factor interactions were observed that were larger than the median main effect.
Again, the trends in this study support the principle of hierar- chy, but suggest caution in its application.
Table 8 presents the conditional probabilities of observ- ing active effects. This data strongly support the effect he- redity principle. Whether the factors participating in an interaction have active main effects strongly determines the likelihood of an active interaction effect. It is noteworthy that, under some conditions, a two-factor interaction is about as likely to be active as a main effect. In addition, it isobserved that, under the right conditions, a three-factorinteraction can be fairly likely to be active, but still only half transformations on ASIS through a second meta-analysis as likely as a main effect.
reported in Section 6.
Table 9 presents the results of our investigation into ASIS. First, it is noteworthy that about two-thirds of all 5. RESULTS OF META-ANALYSIS OF 133 DATA SETS two-factor interaction are synergistic. The confidence inter- The methods described in Section 3 were applied to the set vals for that percentage do not include 50%, so we can reject of 113 responses from published experiments (Table 1).
the null hypothesis that the two percentages might be equal.
Some of the main results of this meta-analysis are summa- Further, it is of practical significance that the percentage of rized in Table 7. The main effects were not very sparse, with synergistic effects is much higher among active two-factor more than one third of main effects classified as active.
interactions than among all two-factor interactions.
However, only about 7.4% of all possible two-factor inter-actions were active. The percentage drops steadily as the 6. ADDITIONAL INVESTIGATION OF THE LOG number of factors participating in the interactions rise.
Thus, Table 7 tends to validate both the effect sparsity The analysis in Section 5 is based on the data from experi- principle (especially as applied to interactions) and also ments as originally published without any nonlinear trans- tends to validate the hierarchical ordering principle. How- formations. However, response transformations are com- ever, this study also supports a caution in applying effect mon in analysis of experimental data. For background on sparsity and hierarchy. For example, if about 2.2% of three- good practice, see Wu and Hamada [8] who describe eight factor interactions are active (as Table 7 indicates), then commonly used transformations. One motivation for trans- most experiments with seven factors will contain one or forming data is variance stabilization. Another is generation more active three-factor interactions.
of a more parsimonious model with fewer higher order Percentage of Potential Effects in 113 Experiments That Were Active as Determined by the Lenth Method No. of active effects Percentage of effects that were active (%) Confidence intervals (␣ ⫽ 0.05) on the percentage of effects that were active (%) 2006 Wiley Periodicals, Inc.

Box plot of absolute values for main effects, two-factor interactions, and three-factor interactions.
terms. To provide a rough sense of how such transforma- cantly different from 50%. An analysis of two-factor inter- tions affect the regularities reported here, we focused on action synergies on the log transformed data can be found just one commonly employed transformation, the loga- in Table 10. Therefore, we conclude that the newly reported rithm. Of the 107 data sets that could be subject to this regularity of ASIS is a property of data as they are reported transformation (those containing only positive response by their experimenters (usually in physical dimensions) and values), it was found that log transformation resulted in is not generally persistent under nonlinear transformations more parsimonious models for 13 responses (meaning that of the reported data. ASIS is a function of the physical the number of active effects were reduced), whereas theuntransformed data produced more parsimonious modelsin 28 cases. In the other 66 responses, the number of sig- nificant effects was unaffected by the use of this transfor-mation. In addition, we observed that in both the full set of Synergistic and Antisynergistic Two-Factor Interactions in 113 Exper- 107 transformed responses and in the smaller set of 13 more parsimonious transformed responses, the proportion ofsynergistic and antisynergistic responses was not signifi- All two-factor interactions Confidence interval The Conditional Probabilities of Observing Active Effects Based Meta- (␣ ⫽ 0.05) (%) analysis of 113 Experiments Active two-factor interactions Confidence interval (␣ ⫽ 0.05) (%) 2006 Wiley Periodicals, Inc.
systems and whatever transformations experimenters actually use before reporting the data, but may be altered by

[Table 10. Synergistic and Antisynergistic Interactions in 107 Experiments Whose Responses Were Transformed Using a Logarithm]

7. CONCLUSIONS AND FUTURE WORK
The results presented here must be interpreted carefully. It is important to acknowledge the many influences on the data that we subjected to meta-analysis. This investigation was entirely based on two-level full factorial experiments published in journals and textbooks. Full factorial experiments are most likely to be conducted for systems that have already been investigated using less resource intensive means. For example, it is common practice to use a screening experiment before using a higher resolution design. A specific consequence is that all the estimates of percentages of active effects in Table 2 may be inflated. If the screening stage has filtered out several inactive factors, then the experiments with the remaining factors are more likely to exhibit active effects of all kinds. In order to characterize the structure of a larger population of systems on which experiments have been conducted, responses could be selected at random from many engineering domains, and then full factorial experiments might be carried out specifically for the purpose of an extended study of system regularities and analyzed using the methods described here. Such an effort would be resource intensive, but it would guard against potential biases introduced by studying only those systems on which full factorial experiments have already been conducted.

One major outcome of this work is validation and quantification of previously known regularities. All three regularities commonly discussed in the DOE literature (effect sparsity, hierarchy, and heredity) were confirmed as statistically significant. However, many investigators will find that, according to this study, these regularities are not as strong as they previously supposed. Although effect sparsity and hierarchy are statistically significant trends, exceptions to these trends are not unlikely, especially given the large number of opportunities for such exceptions in complex systems. The data presented here suggest that a system with four factors is more likely than not to contain a significant interaction given that $7.4\%\binom{4}{2} + 2.2\%\binom{4}{3} > 50\%$. The data also suggest that a system with a dozen factors is likely to contain around 10 active interactions with roughly equal numbers of two-factor interactions and three-factor interactions since $7.4\%\binom{12}{2} \approx 2.2\%\binom{12}{3} \approx 5$. These observations may be important in robust parameter design. It is known that robust design relies on the existence of some two-factor interactions for its effectiveness. However, some three-factor interactions may interfere with robust design, depending on which method is used. For example, field comparisons of single array methods and crossed array methods have revealed that crossed arrays are more effective. This has led to the conjecture that single arrays rely too strongly on effect sparsity [11]. The meta-analysis in this article suggests that the problem may be more closely related to effect hierarchy. Depending on the number of factors, three-factor interactions may be more numerous than two-factor interactions. Any robust design method that relies on strong assumptions of effect hierarchy is likely to give disappointing results unless some effective steps are taken to reduce the likelihood of these interactions through system design, response definition, or factor transformations.

Another benefit may arise from this study because it quantifies effect heredity. Bayesian methods have been proposed for analyzing data from experiments with complex aliasing patterns [13]. These methods require prior probabilities for the parameters given in Table 8 (p , p , and so on). A hypothesis for future investigation is that using the results in Table 8 in concert with Bayesian methods will provide more accurate system models than the same methods using previously published parameter estimates.

Another major outcome of this study is identification and quantification of ASIS, a strong regularity not previously identified in the literature. It was shown that about 80% of active two-factor interactions are synergistic, meaning that $\beta_i \beta_j \beta_{ij} > 0$. The consequences of ASIS for engineering design require further discussion. In cases wherein
larger responses are preferred, procedures that exploit main effects are likely to enjoy additional increases due to active two-factor interactions even if those interactions have not been located or estimated. By contrast, in cases wherein smaller responses are preferred, procedures that exploit main effects to reduce the response are likely to be penalized by increases due to active two-factor interactions. The discussion of ASIS and its relationship to improvement efforts raises the question of why ASIS was defined as it was in this article. This definition was chosen because it revealed the new, statistically significant regularity in the data set. Other relationships among main effects and interactions were explored and found to be insignificant. However, any regularities associated with improvements rather than increases raise practical and conceptual difficulties. This study was based on meta-analysis of published data sets. If the authors of published data sets do not clearly state whether larger or smaller responses are preferred, how can one define "improvement" for that data set? Further, even if the authors express a preference, might not a different application of the same physical phenomenon reverse that preference? By contrast, regularities associated with the published values reflect regularities in physical phenomena as observed and interpreted by the experimenters. To the extent that such regularities exist and can be confirmed as stable and reliable, they can be helpful in interpreting data.

Some experienced practitioners will find ASIS surprising. It is common for experimenters to report that, if they use experimentation to attain some increases in a response, then any further increase will be harder to attain. We agree that this is the general trend in engineering quality improvements, but how our proposed synergy concept relates to this issue is not so simple. When engineers seek to improve a system, they move toward regions of improvement until locating local maxima or constraints. These maxima and constraints make additional improvements difficult to achieve. Our results are based on meta-analysis of $2^k$ experiments. It is an interesting question whether such experiments are typically conducted at local maxima or away from them. If $2^k$ experiments are typically conducted away from local maxima, there are at least two explanations: 1) the maximum has not yet been located, or 2) constraints on the design space are limiting the optimization of that engineering system. Determining the underlying reasons for ASIS is an interesting subject for future research. It is odd that such a strong regularity has not been discussed in either theoretical or practical discourse regarding DOE. The previously known regularities of effect sparsity, hierarchy, and heredity are intellectual cornerstones of DOE and many popular methods provide benefit by exploiting them. Perhaps future research will give rise to new DOE methods that exploit ASIS and thereby reduce resource demands and/or increase effectiveness of engineering experimentation.

ACKNOWLEDGMENTS
The financial support of the National Science Foundation (award 0448972) and the support of the Ford/MIT Alliance are greatly appreciated. The comments of an anonymous reviewer proved helpful in clarifying the presentation of this research.

REFERENCES
1. Thomke, S. Enlightened experimentation: The new imperative for innovation. Harvard Business Rev 2001, 67–75.
2. Box, G.E.P.; Hunter, W.G.; Hunter, J.S. Statistics for experimenters: An introduction to design, data analysis, and model building; John Wiley & Sons: New York, 1978.
3. Box, G.E.P.; Draper, N.R. Empirical model-building and response surfaces; John Wiley & Sons: New York, 1987.
4. Taguchi, G.; translated by Tung, L.W. System of experimental design: Engineering methods to optimize quality and minimize costs. Quality Resources: A division of the Kraus Organization Limited: White Plains, NY, American Supplier Institute: Dearborn, MI, 1987.
5. Phadke, M.S. Quality engineering using robust design; PTR Prentice-Hall: Englewood Cliffs, NJ, 1989.
6. Logothetis, N.; Wynn, H.P. Quality through design: Experimental design, off-line quality control and Taguchi's contributions; Clarendon Press: Oxford, 1994.
7. Montgomery, D.C. Design and analysis of experiments; John Wiley & Sons: New York, 2004.
8. Wu, C.F.J.; Hamada, M. Experiments: Planning, design, and parameter optimization; John Wiley & Sons: New York, 2000.
9. Box, G.E.P.; Meyer, R.D. An analysis for unreplicated fractional factorials. Technometrics 1986, 28, 11–18.
10. McKay, M.D.; Beckman, R.J.; Conover, W.J. Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 2000, 42, 1, 55–61.
11. Kunert, J.A. Comparison of Taguchi's product array and the combined array in robust-parameter-design. Spring Research Conference on Statistics in Industry and Technology, Gaithersburg, MD, 2004.
12. Hamada, M.; Wu, C.F.J. Analysis of designed experiments with complex aliasing. J Qual Technol 1992, 24, 130–137.
13. Box, G.E.P.; Liu, P.T.Y. Statistics as a catalyst to learning by scientific method. J Qual Technol 1999, 31-1, 1–29.
14. Frey, D.D.; Li, X. Validating robust parameter design methods. ASME Design Engineering Technical Conference 2004, DETC2004–57518, Salt Lake City, Utah.
15. Chipman, H.M.; Hamada, M.; Wu, C.F.J. Bayesian variable-selection approach for analyzing designed experiments with complex aliasing. Technometrics 1997, 39-4, 372–381.
16. Admassu, W.; Breese, T. Feasibility of using natural fishbone apatite as a substitute for hydroxyapatite in remediating aqueous heavy metals. J Hazardous Mater 1999, B69, 187–196.
17. Bergman, R.S.; Cox, C.W.; DePriest, D.J.; Faltin, F.W. Effect of process variations on incandescent lamp performance. J Illumination Engng Soc 1990, 19-2, 132–141.
18. Bogoeva-Gaceva, G.; Mader, E.; Queck, H. Properties of glass fiber polypropylene composites produced from split-warp-knit textile preforms. J Thermoplastic Composite Mater 2000, 13, 363–377.
19. Brachet, A.; Rudaz, S.; Mateus, L.; Christen, P.; Veuthey, J.L. Optimization of accelerated solvent extraction of cocaine and benzolecgonine from coca leaves. J Separation Sci 2001, 24, 865–873.
20. Friedman, M.; Savage, L.J. Planning experiments seeking maxima. Techniques Statistical Anal 1947, 365–372.
21. Grimm, J.; Chlebek, J.; Schulz, T.; Huber, H.L. The influence of post-exposure bake on line width control for the resist system RAY-PN (AZ PN 100) in x-ray mask fabrication. J Vac Sci Technol 1991, B9–6, 3392–3398.
22. Karthikeyan, R.; Lakshmi Narayanan, P.R.; Naagarazan, R.S. Mathematical modeling for electric discharge machining of aluminum-silicon carbide particulate composites. J Materi Processing Technol 1999, 87, 59–63.
23. Laus, M.; Lelli, M.; Casagrande, A. Polyepichlorohydrine stabilized core shell microspheres by dispersion polymerization. J Polymer Sci A Polymer Chem 1997, 35-4, 681–688.
24. Lee, S.S.G.; Tam, S.C.; Loh, N.H.; Miyazawa, S. An investigation into the ball burnishing of an ANSI 1045 free-form surface. J Materi Processing Technol 1992, 29, 203–211.
25. Modi, O.P.; Yadav, R.P.; Mondal, D.P.; Dasgupta, R.; Das, S.; Yegneswaran, A.H. Abrasive wear behavior of zinc-aluminum alloy–10% Al2O3 composite through factorial design of experiment. J Mater Sci 2001, 36, 1601–1607.
26. Moskowitz, I.L.; Babu, S.V. Surface morphology and quality of a-Si:C:H films. Thin Solid Films 2001, 385, 45–54.
27. Murugan, N.; Parmar, R.S. Effects of MIG process parameters on the geometry of the bead in the automatic surfacing of stainless steel. J Mater Processing Technol 1994, 41-4, 381–398.
28. Myers, R.H.; Montgomery, D.C. Response surface methodology; John Wiley & Sons: New York, 1995; p 105.
29. Olofsson, U.; Holmgren, M. Friction measurement at low sliding speed using a servohydraulic tension-torsion machine. Exp Mechanics 1994, 34-3, 202–207.
30. Saravanan, P.; Selvarajan, V.; Joshi, S.V.; Sundararajan, G. Experimental design and performance analysis of alumina coatings deposited by a detonation spray process. J Physics D Appl Phys 2001, 34,
31. Sen, R. Response surface optimization of the critical media components for the production of surfactin. J Chem Tech Biotechnol 1997, 68, 263–270.
32. Sharma, A.K.; Forester, W.K.; Shriver, E.H. Physical and optical properties of steam-exploded laser-printed paper. TAPPI J 1996, 79-5, 211–221.
33. Shearer, G.; Tzoganakis, C. Free radical hydrosilylation of polypropylene. J Appl Polymer Sci 1996, 65-3, 439–447.
34. Shulka, A.K.; Stevens, P.; Hamnett, A.; Goodenough, J.P. A nafion-bound platinized carbon electrode for oxygen reduction in solid polymer electrolyte cells. J Appl Electrochem 1989, 19, 383–386.
35. Smith, S.D.; Osbourne, J.R.; Forde, M.C. Analysis of earth moving systems using discrete event simulation. J Construction Engng Management 1995, 121-4, 388–396.
36. Sosada, M. Optimal conditions for fractionation of rapeseed lecithin with alcohols. JAOCS 1993, 70-4, 405–410.
37. Thompson, N.G.; Islam, M.; Lankard, D.A.; Virmani, Y.P. Environmental factors in the deterioration of reinforced concrete. Materials Performance 1995, 34-9, 43–47.
38. Wang, X.; Feng, C.X.; He, D.W. Regression analysis and neural networks applied to surface roughness study in turning. IIE Trans Design Manuf 2002.
39. Vuk, T.; Tinta, V.; Gabrovsek, R.; Kaucic, V. The effect of limestone addition, clinker type and fineness on properties of Portland cement. Cement Concrete Res 2001, 31, 135–139.
40. Dirilgen, N. Effects of pH and chelator EDTA on Cr toxicity and accumulation in lemna minor. Chemosphere 1998, 37-4, 771–783.
41. Taylor, J.B.; Carrano, A.L.; Lemaster, R.L. Quantification of process parameters in a wood sanding operation. Forest Products J 1999, 49-5, 41–46.
42. Pei, Z.J.; Xin, X.J.; Liu, W. Finite element analysis for grinding of wire-sawn silicon wafers: A designed experiment. Int J Machine Tools Manuf 2003, 43, 7–16.
43. Soudki, K.A.; El-Salakawy, E.F.; Elkum, N.B. Full factorial optimization of concrete mix design for hot climates. J Materials Civil Engng 2001, Nov/Dec, 427–433.
44. Brabham, D.E. Designing color performance into color-improved HPS lamps. J Illuminating Engng Soc 1991, Winter, 50–55.
45. Rahman, M.; Ramakrishna, S.; Thoo, H.C. Machinability study of carbon/peek composites. Machining Sci Technol 1998, 3-1, 49–59.
46. Dini, J.W.; Kelley, W.K.; Cowden, W.C.; Lopez, E.M. Use of electrodeposited silver as an aid in diffusion welding. Welding Research Suppl 1984, January, 26s–34s.
47. Gurses, A.; Yalcin, M.; Dogar, C. Electrocoagulation of some reactive dyes: A statistical investigation of some electrochemical variables. Waste Management 2002, 22, 491–499.
48. Pei, Z.J.; Strasbaugh, A. Fine grinding of silicon wafers: Designed experiments. Int J Machine Tools Manuf 2002, 42, 395–404.
49. Sahoo, R.N.; Naik, P.K.; Das, S.C. Leaching of manganese from low-grade manganese ore using oxalic acid as reductant in sulphuric acid solution. Hydrometallurgy 2001, 62, 157–163.
50. Naik, P.K.; Sukla, L.B.; Das, S.C. Aqueous SO2 leaching studies on Nishikhal manganese ore through factorial experiment. Hydrometallurgy 2000, 54, 217–228.
51. Schultz, M.K.; Inn, K.G.W.; Lin, Z.C.; Burnett, W.C.; Smith, G.; Biegalski, S.R.; Filliben, J. Identification of radionuclide partitioning in soils and sediments: determination of optimum conditions for the exchangeable fraction of the NIST standard sequential extraction protocol. Appl Radiat Isot 1998, 49(9–11), 1289–1293.
52. Carter, C.W., Jr.; Doublie, S.; Coleman, D.E. Quantitative analysis of crystal growth: Tryptophanyl-tRNA synthetase crystal polymorphism and its relationship to catalysis. J Mol Biol 1994, 238, 346–365.
53. Becerra, M.; Gonzalez Siso, M.I. Yeast β-galactosidase in solid-state fermentations. Enzyme Microbial Technol 1996, 19, 39–44.
54. Asanza Teruel, M.L.; Gontier, E.; Bienaime, C.; Nava Saucedo, J.E.; Barbotin, J.-N. Response surface analysis of chlortetracycline and tetracycline production with κ-carrageenan immobilized streptomyces aureofaciens. Enzyme Microbial Technol 1997, 21, 314–320.
55. Trezona, R.I.; Pickles, M.J.; Hutchings, I.M. A full factorial investigation of the erosion durability of automotive clearcoats. Tribology Int 2000, 33, 559–571.
56. Gupte, M.; Kulkarni, P. A study of antifungal antibiotic production by Thermomonospora sp MTCC 3340 using full factorial design. J Chem Technol Biotechnol 2003, 78, 605–610.
57. Mussatto, S.I.; Roberto, I.C. Optimal experimental condition for hemicellulosic hydrolyzate treatment with activated charcoal for xylitol production. Biotechnol Prog 2004, 20, 134–139.
58. Pan, T.-Y.; Cooper, R.R.; Blair, H.D.; Whalen, T.J.; Nicholson, J.M. Experimental analysis of thermal cycling fatigue of four-layered FR4 printed wiring boards. J Electronic Packaging 1994, 116, 76–78.
59. Spedding, T.A.; Wang, Z.Q. Study on modeling of wire EDM process. J Materials Processing Technol 1997, 69(1–3), 18–28.
60. Lloyd, F.A. Parameters contributing to power loss in disengaged wet clutches. Soc Automotive Engineers Trans 1974, 83, 2498–2507.
61. Daniel, C. Use of half-normal plots in interpreting factorial two-level experiments. Technometrics 1959, 1, 311–341.
62. Lenth, R.V. Quick and easy analysis of unreplicated factorials. Technometrics 1989, 31-4, 469–473.
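
As a rough check on the expected-count arithmetic in the conclusions, the sketch below multiplies the reported activity rates (about 7.4% for two-factor and 2.2% for three-factor interactions) by the number of possible interactions of each order for a system with k factors. The function name is hypothetical, and applying the pooled rates uniformly to any k is a simplifying assumption rather than a result of the meta-analysis. The code needs Python 3.8 or later for `math.comb`.

```python
from math import comb

# Shares of potential interactions classified as active, as quoted in the text.
P_TWO_FACTOR = 0.074
P_THREE_FACTOR = 0.022


def expected_active_interactions(k):
    """Expected counts of active two- and three-factor interactions for k factors.

    Assumes the pooled rates apply uniformly to every possible interaction.
    """
    return P_TWO_FACTOR * comb(k, 2), P_THREE_FACTOR * comb(k, 3)


for k in (4, 12):
    two, three = expected_active_interactions(k)
    print(k, round(two, 3), round(three, 3), round(two + three, 3))
# 4 0.444 0.088 0.532
# 12 4.884 4.84 9.724
```

For four factors the expected count is about 0.53, and for twelve factors it is close to 10, split almost evenly between two-factor and three-factor interactions, which matches the figures quoted in the text.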