DoE: Factorial designs

Factorial designs are used to determine which factors influence the response(s) of a system, and the magnitude of their effects. Before running an experiment, the subject matter experts decide which controllable and variable factors are expected to influence the response of the system. The total number of experiments is then calculated from the number of factors, the number of levels assigned to each of them, and the degree of uncertainty we are willing to accept in order to save experimental time. Two types of factorial designs, the Full Factorial and the Fractional Factorial, are discussed in this post.



Full Factorial Designs (2^k).

A Full Factorial design allows the determination of the effects of the main factors and of all their possible interactions. The most common designs use two levels for each factor, called "high" and "low", or "+1" and "-1". For a design with k factors at 2 levels each, the total number of experiments is 2^k. The number of experiments therefore doubles with every new factor, and in practice the design becomes very time consuming with 5 or more factors (32 runs or more). In these cases, a Fractional Factorial design (see later) is recommended.

The Design Matrix is the tool used for planning the experiments, and for calculating the effects of the main factors and their interactions. For a Full Factorial design with 2 factors at 2 levels, the Design Matrix is:


where the 4 experimental runs are given by the 4 possible combinations of the two main factors at their "high" (+1) and "low" (-1) levels. Notice that the interaction between factors A and B (AB) is positive when both factors are simultaneously at the "high" or at the "low" level, and negative when A and B are at opposite levels. Mathematically, the AB column in the Design Matrix (and, in general, every interaction column in designs of higher order) is obtained by multiplying, run by run, the columns of the main factors involved in that interaction.
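As a minimal sketch of how such a matrix can be built, the following Python snippet enumerates the runs of a two-level Full Factorial and derives the AB column by multiplication. The function names full_factorial and interaction are illustrative choices, not part of any DoE package, and the run order (first factor varying fastest) is an assumption, since any ordering of the runs is equally valid.

```python
from itertools import product
from math import prod


def full_factorial(factors):
    """Return the 2^k runs as dicts mapping factor name -> -1 or +1."""
    runs = []
    for levels in product((-1, +1), repeat=len(factors)):
        # product() varies the last position fastest; reversing the tuple
        # makes the first factor vary fastest instead.
        runs.append(dict(zip(factors, reversed(levels))))
    return runs


def interaction(run, names):
    """Interaction level for one run: product of the involved factor levels."""
    return prod(run[name] for name in names)


runs = full_factorial(["A", "B"])
print("Run   A   B  AB")
for number, run in enumerate(runs, start=1):
    ab = interaction(run, ["A", "B"])
    print(f"{number:>3}  {run['A']:+d}  {run['B']:+d}  {ab:+d}")
```

The same two functions work unchanged for more factors, for example full_factorial(["A", "B", "C"]) together with interaction(run, ["A", "B", "C"]) for the three-factor interaction.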

Let's consider as an example a researcher who wants to evaluate the effect of Pressure (P) and Temperature (T) on the yield (R) of a chemical reaction. The researcher decides that the "low" and "high" levels for the two factors are, respectively, T(K) = [300, 400] and P(atm) = [1, 2]. After running the experiments according to a 2^2 Design Matrix, the researcher obtains the following results:


The effect associated with each factor (or interaction) is obtained as the average of the responses when its column is at the high level, minus the average of the responses when its column is at the low level:

Effect(X) = mean(R at X = +1) - mean(R at X = -1)

in our example:

which shows that the factor with the largest effect on the yield of the reaction is the temperature (T). The effect of the pressure (P) is considerably smaller, and the effect of the T-P interaction is an order of magnitude lower than the effect of T.
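The calculation described above can be sketched in a few lines of Python. The yields used here are hypothetical values invented for illustration only; they are not the results obtained by the researcher in the example, and the helper name effect is likewise an illustrative choice.

```python
from statistics import mean

# Coded 2^2 design, one tuple (T, P) per run, T varying fastest.
design = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

# HYPOTHETICAL yields (%), one per run, made up for this sketch.
yields = [40.0, 70.0, 46.0, 80.0]


def effect(column, response):
    """Mean response at the +1 level minus mean response at the -1 level."""
    high = [y for x, y in zip(column, response) if x > 0]
    low = [y for x, y in zip(column, response) if x < 0]
    return mean(high) - mean(low)


T = [t for t, _ in design]
P = [p for _, p in design]
TP = [t * p for t, p in design]  # interaction column = product of T and P

effects = {
    "T": effect(T, yields),
    "P": effect(P, yields),
    "TP": effect(TP, yields),
}
print(effects)  # {'T': 32.0, 'P': 8.0, 'TP': 2.0}
```

With these made-up numbers the pattern is the same as the one discussed in the text: a dominant temperature effect, a smaller pressure effect, and a much smaller interaction.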


Fractional Factorial Designs (2^(k-n)).

When the number of factors is high, running a Full Factorial might not be efficient in terms of time and cost. For instance, a Full Factorial with 6 factors would require 64 runs. In these cases, Fractional Factorial designs significantly reduce the number of experiments. In a Fractional Factorial, we run only a subset of the experiments in the Full Factorial Design Matrix. The trade-off is a controlled amount of uncertainty in the model.

The subset of experiments is chosen so that the main factors are confounded with some of the interactions, meaning that their effects cannot be estimated separately. This strategy assumes that the main factors are more likely to have an effect on the system than their interactions. For instance, an experimental design with 3 factors at 2 levels, which would require 8 experimental runs in a Full Factorial design, can be studied with only 4 experiments in a Fractional Factorial. The Design Matrix in this case is chosen so that the effects of the main factors are confounded with the two-factor interactions (A=BC, B=AC, C=AB). It is the subset of grey rows of the Full Factorial indicated in the following table:
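A minimal Python sketch of this half fraction is given below, assuming the fraction is the one for which C = AB holds in every run. That choice follows from the confounding pattern A=BC, B=AC, C=AB stated above; the variable names full and half are illustrative.

```python
from itertools import product

# Full 2^3 factorial as (A, B, C) tuples; reversed() makes A vary fastest.
full = [tuple(reversed(levels)) for levels in product((-1, +1), repeat=3)]

# Half fraction: keep only the runs in which C equals the product A*B.
half = [(a, b, c) for a, b, c in full if c == a * b]

print("  A   B   C")
for a, b, c in half:
    # Inside the fraction every main-factor column is identical to a
    # two-factor interaction column, so their effects cannot be separated.
    assert a == b * c and b == a * c and c == a * b
    print(f" {a:+d}  {b:+d}  {c:+d}")
```

Only 4 of the 8 runs are kept, and the assertions inside the loop confirm that each main factor coincides with one two-factor interaction in every retained run.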

 


Selecting the significant factors.

Calculating the effects of the main factors and their interactions still leaves unanswered the question of whether those effects are significant. Using replicates and running an analysis of variance (ANOVA) makes it possible to distinguish the significant effects from those that cannot be told apart from the noise in the system. Additionally, some graphical techniques already implemented in commercial DoE software, such as the Normal and Half Normal Probability Plots, are also of great help in discriminating between significant and non-significant effects.
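The idea of using replicates can be sketched as follows for a replicated 2^2 design. Instead of a full ANOVA table, this sketch computes a t-ratio for each effect, which for single-degree-of-freedom effects is equivalent to the ANOVA F-test (F = t^2). The replicate values are hypothetical and invented for illustration, and the critical value 2.776 is the tabulated two-sided 95% Student t value for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Coded 2^2 design, one tuple (T, P) per run.
design = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

# HYPOTHETICAL yields (%), two replicates per run, made up for this sketch.
replicates = [[39.0, 41.0], [69.0, 71.0], [45.0, 47.0], [79.0, 81.0]]

run_means = [mean(r) for r in replicates]
n_total = sum(len(r) for r in replicates)  # 8 observations in total

# With equal replication, the pooled variance is the mean of the
# per-run sample variances; it has 4 * (2 - 1) = 4 degrees of freedom.
pooled_var = mean(variance(r) for r in replicates)

# Standard error of an effect in a two-level design: sqrt(4 * s^2 / N).
std_err = sqrt(4 * pooled_var / n_total)

columns = {
    "T": [t for t, _ in design],
    "P": [p for _, p in design],
    "TP": [t * p for t, p in design],
}


def effect(column, response):
    """Mean at +1 minus mean at -1, written as a contrast over the runs."""
    return sum(x * y for x, y in zip(column, response)) / (len(column) / 2)


T_CRIT = 2.776  # two-sided 95% Student t, 4 degrees of freedom

t_ratios = {name: effect(col, run_means) / std_err for name, col in columns.items()}
significant = {name: abs(t) > T_CRIT for name, t in t_ratios.items()}
print(t_ratios)
print(significant)
```

With these made-up replicates, T and P exceed the critical value while the TP interaction does not. A different replicate count changes the degrees of freedom and therefore the critical value to look up.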


Additional material.

For a detailed description of factorial designs, see Walpole's Probability and Statistics for Engineers and Scientists.

Proprietary Excel templates for a Full Factorial (2^3) and a Fractional Factorial (2^(4-1)) are included herewith as working examples.
