Empirical Investigation Partial Lamarckianism of the Benefits of
Transcription
Empirical Investigation Partial Lamarckianism of the Benefits of
Empirical Investigation of the Benefits of Partial Lamarckianism Christopher R. Houck Jeffery A. Joines Box 7906 Department of Industrial Engineering North Carolina State University Raleigh, NC 27695-7906 chouck@eos.ncsu.edu Box 7906 Department of Industrial Engineering North Carolina State University Raleigh, NC 27695-7906 jjoine@eos.ncsu.edu Michael G. Kay James R. Wilson Box 7906 Department of Industrial Engineering North Carolina State University Raleigh, NC 27695-7906 kay@eos.ncsu.edu Box 7906 Department of Industrial Engineering North Carolina State University Raleigh, NC 27695-7906 jwilson@eos.ncsu.edu Abstract Genetic algorithms (GAS) are very efficient at exploring the entire search space; however, they are relatively poor at finding the precise local optimal solution in the region in which the algorithm converges. Hybrid GAS are the combination of improvement procedures, which are good at finding local optima, and GAS. There are two basic strategies for using hybrid GAS. In the first, Lamarckian learning, the genetic representation is updated to match the solution found by the improvement procedure. In the second, Baldwinian learning, improvement procedures are used to change the fitness landscape, but the solution that is found is not encoded back into the genetic string. This paper examines the issue of using partial Lamarckianism (i.e., the updating of the genetic representation for only a percentage of the individuals), as compared to pure Lamarckian and pure Baldwinian learning in hybrid GAS. Multiple instances of five bounded nonlinear problems, the location-allocation problem, and the cell formation problem were used as test problems in an empirical investigation. Neither a pure Lamarckian nor a pure Baldwinian search strategy was found to consistently lead to quicker convergence of the GA to the best known solution for the series of test problems. Based on a minimax criterion (i.e., minimizing the worst case performance across all test problem instances), the 20% and 40% partial Lamarckianism search strategies yielded the best mixture of solution quality and computational efficiency. Keywords Lamarckian evolution, Baldwin effect, local improvement procedures, hybrid genetic algorithms. 1. Introduction Genetic algorithms (GAS) are a powerful set of global search techniques that have been shown to produce very good results for a wide class of problems. Genetic algorithms search the solution space of a function through the use of simulated “Darwinian” evolution. In general, the fittest individuals of any population tend to reproduce and survive to the next @ 1997 by the Massachusetts Institute of Technology Evolutionary Computation S(1): 3 1-60 C:. K. Houck,J. A. Joines, M.G. Kay, and J. R. Wilson generation, thus improving successive generations. However, inferior individuals can, probabilistically, survive and also reproduce. Genetic algorithms have been shown to solve linear and nonlinear problems by exploring all regions of the state space and exponentially exploiting promising areas through mutation, crossover, and selection operations applied to individuals in the population. Whereas traditional search techniques use characteristics of the problem (objective function) to determine the next sampling point (e.g., gradients, Hessians, linearity, and continuiq), GASmake no such assumptions. Instead, the next sampled points are determined based on stochastic sampling/decision rules, rather than a set of deterministic decision rules. Therefore, evaluation functions of many forms can be used, subject to the minimal requirement that the function can map the population into a totally ordered set. A more complete discussion of GAS, including extensions and related topics, can be found in the books by Davis (1901), Goldberg (1989), 12/litchell (1996), and Michalewicz (1906). Hybrid G k s use local improvement procedures as part of the evaluation of individuals. For many problems there exists a well-developed, efficient search strategy for local improvement (e.g., hill-climbers for nonlinear optimization). These local search strategies complement the global search strateg?: of the G I , yielding a more efficient overall search stratea. T h e issue o f whether or not to recode the genetic string to reflect the result of the local improvement has given rise to research into Lamarckian search strategies (i.e., recoding the string to reflect the improvement) and Baldwinian search strategies (ix., leaving the string unaltered). In one investigation, \.f%itley, Gordon, and Mathias (1994) found that, for three test problems, utilizing a Baldwinian search strategy was superior in terms of finding the optirnal solution for the Schwefel function, whereas Lamarckian search performed better on the (;riewank function, and both strategies worked equally well on the Rastrigin function. Although most approaches to hybrid GASuse either a pure Lamarckian or a pure Baldwinian search strategy, neither is optimal for all problems. Therefore, this work concentrates on a partial Lamarckian strategy, where neither a pure Lamarckian nor pure Baldwinian search s t r a t e g is used; rather, a mix of Lamarckian and Baldwinian search strategies are used. This paper prcscnts an empirical analysis of the use of (;As utilizing local improvement procedures to solve a \.ariety of test problems. Section 2 describes how GAScan be hybridized hy incorporating local improvement procedures into the algorithm. Incorporating local improvement procedures gives rise to the concepts of Lamarckian evolution, the Baldwin effect, and partial Lamarchanism, all of which are explained in Section 2 . Also, the concept of a one-to-one geno+-pe to phenotype mapping is reviewed, where genotype refers to the space that the G.4 searches, whereas the phenotype refers to the space of the actual problem. T h e concepts introduced in Section Z are applied to three different classes of problems described in Section 3 : five different multimodal nonlinear function optimization problems, the location-allocation problem, and the inanuhcturing cell formation problem. Section 4 presents the results of a series of empirical tests of the various search strategies described in Section 7 . 2. Hybridizing GAS with Local Improvement Procedures Ilany researchers (Chu ei Keasley, 1995; Davis, 1991; Joines, I n g , & Culbreth, 1095; I\.licl;alewicz, 1996; Renders & Flasse, 1996) have shown that (;A5 perform well for global searching because they are capable of quickly finding and exploiting promising regions of the search space, but they take a relatively long time to converge to a local optimum. Local improvement procedures (e.g., pairwise switching for permutation problems and gradient 32 Evolutionan Computation \.ihiiie .iNutnher , I Benefits of Partial Lamarckianism descent for unconstrained nonlinear problems) quickly find the local optimum of a small region of the search space but are typically poor global searchers. They have been incorporated into GAS in order to improve the algorithm’s performance through individual “learning.” Such hybrid GAS have been used successfully to solve a wide variety of problems (Davis, 1991; Joines et al. (1995); Renders & Flasse, 1996). There are two basic models of evolution that can be used to incorporate learning into a GA: the Baldwin effect and Lamarckian evolution. T h e Baldwin effect allows an individual’s fitness (phenotype) to be determined based on learning (i.e., the application of local improvement). Like natural evolution, the result of the improvement does not change the genetic structure (genotype) of the individual. Lamarcluan evolution, in addition to using learning to determine an individual’s fitness, changes the genetic structure of the individual to reflect the result of the learning. Both “Baldwinian learning” and “Lamarckian learning” have been investigated in conjunction with hybrid GAS. 2.1 Baldwin Effect T h e Baldwin effect, as utilized in GAS, was first investigated by Hinton and Nolan (1987) using a flat landscape with a single well representing the optimal solution. Individuals were allowed to improve by random search, which in effect transformed the landscape to include a funnel around the well. Hinton and Nolan (1987) showed that, without learning, the GA fails; however, when hybridized with the random search, the GA is capable of finding the optimum. Whitley et al. (1994) demonstrate that “exploiting the Baldwin effect need not require a needle in a haystack and improvements need not be probabilistic.” They show how using a local improvement procedure (a bitwise steepest ascent algorithm performed on a binary representation) can in effect change the landscape of the fitness function into flat landscapes around the basins of attraction. This transformation increases the likelihood of allocating more individuals to certain attraction basins containing local optima. In a comparison of Baldwinian and Lamarckian learning, Whitley et al. (1994) show that utilizing either form of learning is more effective than the standard GA approach without the local improvement procedure. They also argue that although Lamarckian learning is faster, it may be susceptible to premature convergence to a local optimum compared with Baldwinian learning. Three numerical optimization problems were used to test this conjecture; however, the results were inconclusive. They believe that these test functions were too easy and that harder problems may in fact show that a Baldwinian search strategy is preferred. 2.2 Lamarckian Evolution Lamarckian evolution forces the genotype to reflect the result of the local search. This results in the inheritance of acquired or learned characteristics that are well adapted to the environment. T h e improved individual is placed back into the population and allowed to compete for reproductive opportunities. However, Lamarckian learning inhibits the schema processing capabilities of GAS (Gruau & Whitley, 1993; Whitley et al., 1994). Changing the genetic information in the chromosomes results in a loss of inherited schema, altering the statistical information about hyperplane partitions implicitly contained in the population. Although schema processing may not be an issue given the higher order cardinality representations for the test problems used in this paper, the problem of premature convergence remains the same. Baldwinian learning, on the other hand, certainly aggravates the problem of multiple genotype to phenotype mappings. A genotype refers to the composition of the values in the Evolutionary Computation Volume 5, Numher 1 33 C. R. Ilouck, J. A . Joines, XI. G. Kay, and J. R. Wilson chromosome or individual in the population, whereas a phenotype refers to the solution that is constructed from a chromosome. In a direct mapping, there is no distinction between genotypes and phenotypes. For example, to optimize the function 5 cos ( X I ) - sin (2x2), a typical representation for the chroniosorne would be a vector of real numbers ( X I , xz), which provides a direct mapping to the phenotype. However, for some problems, a direct mapping is neither possible nor desired (Xakano, 1991). T h e most common example ofthis is the T S P problem with an ordinal representation. Here, the genotype is represented by an ordered list of iz cities to visit, for example, ( q 1 1 , cpl,. . . ,c(,,]). However, the phenotype of the TSP is a tour, and any rotation of the chromosome yields the same tour; thus, any rotation of a genotype results in the same phenotype. For example, the two tours ( I , 2,3,4) and (3,4,1,2) have different genotypes because their gene strings are different; but both strings represent the same tour and thus have the same phenotype. It has been noted that having multiple genotypes map to the same phenotype may confound the G h (Homaifar, &an, & Liepins, 1993; Nakano, 1991). This problem also occurs when a local improvement procedure (1,IP) is used in conjunction with a GA. Consider the exainple of optimizing sin (s).Suppose a simple gradient-based LIP is used to determine the fitness of a chromosome. Then, any genoqpe in the range [ - 7 r / 2 , 3 ~ / 2 ]will have the same phenotype value of 1 a t T/Z. 2.3 Partial Lamarckian Hybrid <;*Asneed not he restricted to operating in either a pure Baldwinian or pure Lamarckian manner. Instead, a mix of both strategies, or what is termed partial Lamarckianism, could be employed together. For example, a possible strategy is to update the genotype to reflect the resulting phenoqpe in 50% of the individuals. Clearly, a 50% partial Lamarckian strateg!. has no justification in natural evolution, but for simulated evolution it provides an additional search strateg. Oruosh and Davis (1994 advocated the use of a 5% rule for updating individuals when employing repair functions in GASto solve constrained optimization problems. All infeasible solutions senerated by the GX are repaired to the feasible domain to determine their fitness. The 5% rule dictates that 5% of the infeasible individuals have their genetic representation updated to reflect the repaired feasible solution. This partial updating was shown on a set of com hinatorial problems to be better than either no updating or always updating. However, Michalewicz and Nazhivath (1995) determined that a higher percentage, 20% to 25%, of update did better when using repair functions to solve continuous constrained nonlinear programming problems. Previous research has concentrated either o n the comparison of pure Lamarckian and pure Kaldwinian search or the effectiveness of partial repair for constrained optimization. This u o r k examines the use of partial Lamarchanism with regard to the use of local search on a set of hountled nonlinear and comhinatorial optimization test problems. 3. Experimentation 'li) iiiwstipte the trade-off between disrupted schema processing (or premature convergence to a local optima) in 1,amarckian learning and multiple genotype mapping to the same phenovpe i n Baldwinian learning, a series of experiments using local improvement procedures as ei.aluation functions were run on several different test problems. For each test problem, the (;.A was run with varying levels of Iamrckianisni (i.e., updating the genotype to reflect the phenotype), as well as a pure genetic approach (i.e., no local improvement). T h e GA was Benefits of Partial Lamarckianism run with no Lamarckianism (i.e., a pure Baldwinian search strategy), denoted as 0% update, as well as 100% Lamarckianism (i.e., a pure Lamarckian search strategy). Combinations of both strategies (i.e., partial Lamarckian search strategies) were used to determine whether there was a trend between the two extremes and whether combinations of Baldwinian and Lamarckian learning were beneficial. In these experiments, individuals were updated to match the resulting phenotype with a probability of 5%, lo%, 20%, 40%, 50%, 60%, 80%, YO%, and 95%. These values were chosen to examine the effects at a broad variety of levels, but focusing attention at each extreme. For each problem instance, the GA was replicated 30 times, with common random seeds across the different search strategies. T h e GA and parameters used in these experiments are described in Appendix A. T h e GA is of a form advocated by Michalewicz (1996) and uses a floating-point representation with three realvalued crossovers and five real-valued mutations. However, for the cell formation problems, the GA is modified to work with integer variables. The GA was terminated when either the optimal solution was found or after 1 million function evaluations were performed. Using function evaluations takes into account the cost of using a LIP (i.e., the number of function evaluations generated by the LIP), thus allowing a direct comparison between the hybrid GA methods and the pure GA (i.e., no local improvement). Multiple instances of seven different test problems were used in this investigation to ensure that any effect observed (I) held across different classes of problems as well as different sized instances of the problem, and (2) was not due to some unknown structure of the specific class of problem selected, specific instance chosen, or the particular local improvement procedure used. T h e first five test problems are bounded nonlinear optimization test problems taken from the literature, whereas the last two problems are the location-allocation and manufacturing cell formation problems. For each of the nonlinear optimization problems, three different sized instances (n = 2,10,20) were used in the study. For both the location-allocation and cell formation problems, two different sized instances were used. 3.1 Brown’s Almost Linear Function The first test problem comes from a standard test set of nonlinear optimization problems (More, Garbow, & Hillstrom, 1981) and is as follows: where T h e function is not linearly separable and has the basic form of a nonlinear least squares problem. Three different instances of this problem were used for this investigation: a 2dimensional instance (Brown-2), a 10-dimensional instance (Brown- lo), and a 20-dimensional instance (Brown-20). 3.2 Corana Function The next test problem consists of three instances of the modified Corana problem: a 2-, lo-, and 20-dimensional instance (Corana-2, Corana- 10, and Corana-20, respectively). This Evolutionary Computation Volume 5, Number 1 3s C. R . I-fouck,J. A. Joines, hl. G. Kay, and J. R. Wilson problem comes froin a family of continuous nonlinear multimodal functions developed by Corana, Marchesi, iMartini, and Ridella (1987) to test the efficiency and effectiveness of simulated annealing compared to that of stochastic hill-climbing and Nelder-Mead optimization methods. This parameterized fatnily of test functions contains a large number of local minima. T h e function can be described as an n-dimensional parabola with rectangular pockets removed and with the global optimum always occurring a t the origin. These functions are parameterized with the number of dimensions ( I T ) , the number of local optima, and the width and height of the rectangular pockets. As forinulated in Corana, Marchesi, Martini, and Ridella (1987), the function proved to be too easy: All the learning methods found the optimal solution every time with the same amount of computational effort. Therefore, the first condition in the definition of g ( x I )below was modified so that rectangular troughs, instead of pockets, are removed from the function. This was accomplished by eliminating the restriction that all, instead of any, x, satisfy k,s, - t, 5 x, 5 k,s; + t,, and all k, are not zero. T h e modified Corana function is as follows: 1, f ( s )= C t . , g ( x , ) , I= for c, E 7 and s,E 1- ~O,OOO,IO,OOO~ (2) I where i7 0.1 j-;, $f((S,) = : , / = 7 = .’, = .I-; I , if k,s, t, 5 otherwise ~ .I-,5 k;r, + t;, for any integer k k,s, + I,, if k, < 0 0, if k, = 0 k,si - t,, if k, > 0 (1,1000,10,100,1,10,100,1000,1,10, 1, 10, 100, 1000) 0.2, t,=0.05 ,100,1000,1,10,100,1000, 3.3 Griewank Function T h e Griewank function is the third test problem, again with three instances of 2, 10, and 2 0 dimensions, respectively (Griewank-2, (hiewink- 10, and Griewank-20). T h e Griewank function is a bounded optimization problem used by U’hitley et al. (1994) to compare the effectiveness of Lamarckian and Baldwinian search strategies. T h e Griewank function, like Brown’s almost linear function, is not linearly separable and is as follows: 3.4 Rastrigin Function ‘l‘hc next test problem, with three instances of 2, 10, and 20 dimensions (Rastrigin-2, Rastrigin- 10, and Rastrigin-20), is also a bounded minimization problem taken from Whitley et al. (1094). T h e functional form is as follows: Benefits of Partial Lamarckianism 3.5 Schwefel Function The last nonlinear test problem, with three instances of 2, 10, and 20 dimensions (Schwefel-2, Schwefel- 10, and Schwefel-20), is also a bounded minimization problem taken from Whitley et al. (1994). T h e functional form is as follows: where V=-min{-x sin(fi):xE [-512,511]} and V depends on system precision; for our experiments, V = 418.9828872721625. 3.6 Sequential Quadratic Programming For these five nonlinear optimization test problems, a sequential quadratic programming (SQP)' optimization method was used as the LIP. SQP is a technique used to solve nonlinear programming problems of the following form: Minimizef(x), s.t. x E X where, for n; inequality and n, equality constraints, X is the set of points X bl<x<bu, g,(x)LO, j = 1 , ..., n;; E iXnsatisfying hj(x)=O, j = l , . . . , ne with 61 E 8";bu E W ; f92' :+ R smooth; gj : Rn + R , j = 1,. . . ,n, nonlinear and smooth; and h, : R"+ R , j = 1,. . . ,n, nonlinear and smooth. SQP works by first computing the descent direction by solving a standard quadratic program using a positive definite estimate of the Hessian of the Lagrangian. A line search is then performed to find the feasible minimum of the objective along the search direction. This process is repeated until the algorithm converges to a local minimum or exceeds a certain number of iterations. For the purposes of this study, SQP was stopped after 2 5 iterations if it had not found a local optimum. A full description of the SQP used in this study can be found in Lawrence, Zhou, and Tits (1994). 3.7 Location-Allocation Problem The continuous location-allocation (LA) problem, as described in Houck, Joines, and Kay (1996), is used as the next test problem. T h e LA problem, a type of nonlinear integer program, is a multifacility location problem in which both the location of n new facilities (NFs) and the allocation of the flow requirements of m existing facilities (EFs) to the new facilities are determined so that the total transportation costs are minimized. T h e LA problem is a difficult optimization problem because its objective function is neither convex nor concave, resulting in multiple local minima (Cooper, 1972). Optimal solution techniques are limited to small problem instances (less than 25 EFs for general l,, distances [Love & Juel, 19821 and up to 35 EFs for rectilinear distances [Love & Morris, 19751). Two rectilinear distance instances of this problem were used in the investigation: a 200-EF by a 20-NF instance (LA-200) and a 250-EF by a 25-NF instance (LA-250), both taken from Houck et al. (1996). 1 SUP is available in the Marlah environment (Grace, 1992) and as Fortran and C code (Lawrence e t al., 1994). Evolutionary Computation Volume 5, Numher 1 37 C . K. lIouck,J.X Joines, hl. G. Kay. a n d J . K. MTlson <;iven ?it EFs located at known points li/,j = 1, . . . , ?PI, with associated flow requirements u./,.j= 1, . . . , m, and the number of N F s i i , from which the flow requirements of the EFs are satisfied, the LA problem can be formulated as the following nonlinear program: ii /=I 111 /=I subject to where the unknown variable locations of the N F s X/, i = 1 , . . . , T I , and the flows from each S F i to each EFj, r/,, i = 1 , . . . , I / , j = 1,. . . , 7 ~ ,are the decision variables to be determined, and d(&Y,,((,)is the distance between NF i and EFj. .hinfinite number of locations are possible for each NF i since each /Yl is a point in a continuous, typically two-diinensional, sp"". 'l'he G-4approach to solve this problem, described in detail in H o u c k e t a]. (1996) is used in these experiments. T h e individuals in the GX are a vector of real values representing the starting locations for the NFs. Based on these locations for the NFs, the optimal allocations are found by allocating each EF to the nearest NF. T h e total cost of the resulting network is then the weighted sum of the distances from each EF to its allocated h'F. ' f h e alternative LA method developed by Cooper (1072) is used as the LIP for this problem. 'I'his method quickly finds a local minimum solution given a set of starting locations for the S F s . T h i s procedure w-orks by starting with a set of NF locations; it then determines an optimal set o f allocations based o n those locations, which for this uncapacitated version of the problem reduces to finding the closest NF to each EF. T h e optimal NF locations for these new allocations are then determined by solving I I single facility location problems, which is in general a nonlinear convex optimization problem. T h i s method continues until n o further allocation changes can be made. 3.8 Cell Formation Problem T h e manufacturing cell formation probleni, as described in Joines, Culbreth, and K m g (1996), is the final test problem. 'The cell formation problem, a type of nonlinear integer programming problem. is concerned u.ith assigning a group of machines to a set of cells and assigning a series of parts needing processing by these machines into families. T h e goal is to create a set of autonomous manufacturing units that eliminate intercell movements (i.e., a part needing processing by a machine outside its assigned cell). Tuo instances of this problem were in the investigation: a 22 machine by a 148-part instance (Cell-I) and a 40 machine by a 100-part instance (<:ell-Z). Consider an ni machine by 11 part cell formation problem instance with a maximum of k cells. T h e problem is to assign each machine and each part to one and only o n e cell and fainily, respectively. Joines et al. (1996) developed an integer programming model with the following set variable declarations that eliminate the classical binary assignment variables anti constraints: .v; = y; = I, I, if machine i is assigned to cell I if p a r t j is assigned to family 1 T h e model was solved using a GA. In the (;A, the individuals are a vector of integer variables representing the machine and part assignments (.I-, ,.I-?,. . . ,.I-),,,-y l , y ? , . . . ,y12).T h i s representation has been shown to perform better than other cell formation methods utilizing the Benefits of Partial Lamarckianism same objective goines et al., 1996). T h e following “grouping efficacy” measure as the evaluation function: Maximize e - e, e + e, (r)is used (7) = - where k e = e, + e,, e, = C D, I= - di 1 Grouping efficacy tries to minimize a combination of both the number of exceptional elements e, (i.e., the number of part operations performed outside the cell) and the process variation e, (i.e., the discrepancy of the parts’ processing requirements and the processing capabilities of the cell). Process variation for each cell i is calculated by subtracting from the total number of operations a cell can perform Di, the number of part operations that are performed in the cell di. T h e total process variation is the sum of the process variation for each cell. Exceptional elements decrease r, since the material handling cost for transportation within a cell is usually less than the cost of the transportation between cells. Process variation degrades the performance of the cell due to increased scheduling, fixturing, tooling, and material handling costs. A greedy one-opt switching method, as described in Joines et al. (1995) was used as the LIP. Given an initial assignment of machines and parts to cells and families, respectively, the LIP uses a set of improvement rules developed by N g (1993) to determine the improvement in the r objective function if machine i is switched from its current cell to any of the remaining k - 1 cells. T h e procedure repeats this step for each of the m machines as well as each of the n parts, where each part is switched to k - 1 different families. 4. Results For the 30 replications of each problem instance, both the mean number of function evaluations and the mean final function values were determined. To ensure the validity of the subsequent analysis of variance (ANOVA) and multiple means tests, each data set was tested first for normality and second for homogeneity of the variances among the different search strategies. T h e Shapiro-Wilk test for normality was performed on the results at cv = 0.05. Since there is no established test for variance homogeneity (SAS Institute Inc., 1990), visual inspection was used to determine whether there were significant variance differences. If there were significant differences and there existed a strong linear relation between the means and standard deviations (i.e., a linear regression correlation greater than 0.8), a logarithmic transformation was used (SAS Institute Inc., 1990). For the number of functional evaluations, standard deviations equal to zero (which correspond to the search strategy terminating after 1 million functional evaluations) were removed from the linear regression, since these values were not due to randomness. For those results passing the Shapiro-Wilk test for normality, an ANOVAwas conducted for both the number of function evaluations and the final functional value. If the ANOVA showed a significant effect, cv > 0.01, the means of each of the 12 different search strategies considered, namely, no local improvement (referred to as -I), pure Baldwinian (referred to as 0), pure Lamarckian (loo), and nine levels of partial Lamarckian (referred to as 5 , 10, 20, 40,50,60,80,90, and 95, respectively), were compared using three different statistical methods. T h e three methods, each performing a multiple means comparison, were the Duncan Evolutionary Computation Volume 5, Number 1 39 C. K.Houck, J. A. Joines, M. C. Kay, and J. R. Wilson Table 1. Table 1: ,Multiple means comparison for Brown's almost linear function. Crir. #E\2I, A -\I1 B B B B B B F I Fitnr\s 211 I I r\ B I c I c I c c C xo so I I c c P I c F I F C c c IM) 95 90 I I c F c I c c approach to minimize the Bayesian loss Function (Duncan, 1975); Student-Newman-Keuls (SNK), a multiple-range test using a multiple-stage approach; and an approach developed by Ryan (1960), Einot and Gabriel (1975), and Welsch (1977) (REGMr), which is also a multiple-stage approach that controls the maximum experimentwise error rate under any complete o r partial hypothesis. Each of the tests was conducted using the general linear models routine in SAS v6.09 (SAS Institute Inc., 1990). All three of these tests were used because there is n o consensus test for the comparison of multiple means. 'i'he ranking of both the number of function evaluations to reach the optimal solution and the ranking of the final fitness are given in Tables 1 through 7 for each of the 1'9 test problem instances considered in this paper: the five nonlinear test problems (Brown, Corana, Griewank, Rastrigin, and Schwefel), each at 2 , 10, and 20 dimensions; the two LA problems (LA-200 and LA-250); and the two cell formation problems (Cell-1 and Cell-2). Each of the three multiple means comparisons, Duncan's, SNK, and REGVC', were performed o n each problem instance. In the tables, if all three of the multiple means comparison methods d o not agree, the results of each are presented; otherwise, the combined results for all the methods agreeing are presented. 'The means and standard deviations of the final fitness valuc and the number of functional evaluations for each search strategy are shown in Tables B l to R7 o f Appendix B for all test problem instances. I he multiple means composition methods are statistical tests that provide information on sets of means whose differences are statistically significant. For example, consider the results in Table 1 for the Brown-10 instance with respect to final fitne All three multiple means tests agree that the - 1 (no local improvement) search strategy yields significantly worse results than any of the other strategies. Similarly, the Baldwinian strategy (0% Lamarckian) is significantly better than the - 1 strategy but significantly worse than any of the other strategies. Finally, the other strategies (5% to 100% Lamarckian) yield significantly the same tirial fitness value. In some instances, the tests are unable to place a level into o n e group, and F > Benefits of Partial Lamarckianism Table 2. Multiple means comparison for thc Corana function. I l a range of groups is presented. For example, the 20% partial Lamarckian strategy for the Brown-20 instance (Table I) belongs to two both groups C and D (i.e., C-U). The strategies assigned to the best group are shaded. The rank indicates the rank of the various means, and moving from left to right, the means get better. Each of the test prohlems is discussed in detail in the following sections. 4.1 Brown's Almost Linear Function This problem benefits greatly from the use of SQP. As shown in Table 1 , the optimal solution was readily found by all partial Lamarckian search strategies employing SUP as their local improvement procedure. T h e GA employing no local improvement took considerably longer to find the optimal solution for the two-dimensional instance and failed to find the optimal for the other two instances. The pure Baldwinian search also terminated after 1 million function evaluations for the larger instances. For the two-dimensional instance, there is no statistically significant difference in the number of function evaluations required to obtain the optimal solution for all strategies employing SUP. However, the 10- and 20-dimensional instances show that although the strategies employing a lower percentage of partial Lamarckianism find the optimal, they require a significantly greater number of function evaluations. 4.2 Corana Function As shown in Table 2, the two-dimensional Corana function instance is easily solved by all variants of the GA. The 10-dimensional instance shows an interesting result. Only the higher levels of partial Lamarckianism were able to find the optimal solution, whereas Evolutionary Cornputadon \'olume 5, Number 1 41 C. K. Houck, J. A. Joines, hl. (2.Kay, a n d J . R. Wilson Table 3. .\lultiple iiieaiis comparison for the Griewank function. pure BaldM-inian search found the worst solutions. For the 20-ciimensional instance, n o strateR. was ahle to locate the optimal value every time; however, the final function value shows more statistically significant variation (i.e., the lower the percentage of Larnarckian learning, the worse the final functional value until 00% is reached, after which all strategies yield similar results). T h e 90%, 0 5 % , and 100% Lamarckian strategies found the optimal approximately half the time. It is interesting to note that the G A not employing local search (~ 1) outperforms the lower Lamarckian levels, including pure Baldwinian learning. A possible explanation for this is that the Corana function exhibits numerous local minima and extreme discontinuity wherever a rectangular local minimum has been placed in the hyperdimensional parabola, making this problem troublesome for SOP. Therefore, the other strategies waste valuable computational time mapping the same basin of attraction. 4.3 Griewank Function This problem benefits greatly from the use of SQP. r\S shown in Table 3 , the optimal solution was reatlily found hy all variants o f t h e GA. Although the quality of the solutions was similar, only Duncan's method found any difference for the two-dimensional instance, with the <;A employing no local improvement and several higher levels of 1,amarckianism being more efficient. For the 10- and 20-climensional instances, the optimal solution was readily found by all search strategies employing SQP as their local improvement procedure, whereas the no local improvement (;A terininated after 1 riiillion function evaluations. Ilowever, there is a statistically significant difference in the number of function evaluations required. Unlike Benefits of Partial Lamarckianisrn Table 4. Multiple means comparison for the Rastrigin function. Kawigin-20 FlIl,C\< 411 I A B B B B B B B B B B B all other problems, the greater amount of Lamarckianism, the greater number of function evaluations required to locate the optimal solution. 4.4 Rastrigin Function As shown in Table 4, all strategies found the optimal solution for the two-dimensional instance; however, no local improvement took significantly longer. Similarly, all strategies, except for pure Baldwinian search, found the optimal solution for the 10-dimensional instance. There is statistically significant evidence that moderate and high levels of Lamarckianism (greater than 50%) yield quicker convergence. For the 20-dimensional instance, a similar trend is present. All strategies except pure Baldwinian search yielded optimal results; for this size instance, a Lamarckian level of greater than 40% yields the quickest convergence. It should be noted that no local improvement outperformed several of the lower levels, including pure Baldwinian search. As the problem size increases, more computational time is used in the initialization of the population, not allowing Baldwinian learning to overcome the problem of a one-to-one genotype to phenotype mapping before it terminates. 4.5 Schwefel Function As shown in Table 5 , every strategy found the optimal solution for the two-dimensional instance; with no local improvement taking significantly longer. T h e 10- and 20-dimensional instances show faster convergence for greater degrees of Lamarckianism. Low levels of Lamarckianism do not consistently yield optimal solutions, whereas with 20% or greater Lamarckianism, the optimal solution is consistently located. 4.6 LAProblem As shown in Table 6, both LA problem instances showed the same trend seen in the second cell formation problem instance (Cell-2 in Table 7). With any use of local improvement, the Evolutionary Computation Volume 5, Number 1 43 C. R. Ilouck, J. A. Joines, 11.G. Kay, and J. R. Wilson Table 5. .Multiple means comparison for the Schwefel function. Prob. Inst. Crit #E\al5 Kunl. -I I(H1 YO Y? 80 50 40 60 20 10 0 5 Ail A A A A A A A A A A A A 40 50 60 80 YO 95 100 60 40 10 100 80 90 95 C-E C-E D-E D-E D-E €3 E Schwctcl-2 Rank I 5 0 10 20 ,411 A A A A A - Schuciel-lo r SchuelclLlO Table 6. Alultiple means comparison for the LA problem. G4 found the optimal solution before the 1 million function evaluation ceiling was reached. €Iowever, the Baldwinian strategy (0%) required a significantly greater number of function evaluations than any strategy employing any form of Lamarckianism. Benefits of Partial Lamarckianism Table 7. Multiple means comparison for the cell formation problem. Crit. Fitness 4.7 Cell Formation Problem As shown in Table 7 , the first cell formation problem instance, Cell-1, is a case where using the local improvement routine with any amount of Lamarckian learning resulted in finding the optimal solution; whereas, when using no local improvement or a strictIy Baldwinian search strategy, the algorithm did not find the optimal solution and terminated after 1 million function evaluations, with the Baldwinian strategy finding better solutions than the no local improvement strategy. The second cell formation instance, Cell-2, is more interesting. In terms of fitness, any strategy employing a moderate amount of Lamarckianism (>2O%) found the optimal solution, whereas, the strategies employing either a Baldwinian strategy or using no local improvement did not consistently find the optimal solution. This problem instance also shows the familiar pattern of increasing levels of Lamarcluan learning resulting in fewer function evaluations to reach the optimal solution. A possible explanation for the poor performance of a Baldwinian search strategy is that most of the 1 million function evaluations are used in the initialization of the population. Therefore, a strategy employing pure Baldwinian learning does not have enough functional evaluations left to overcome the multiple genotype-to-phenotype mapping problem. 4.8 Summary of Results To concisely present the main results of these statistical tests shown in detail in Tables 1 to 7 , Tables 8 and 9 show rankings of each search strategy with respect to the fitness of the final result and the number of function evaluations required, respectively. For each strategy, the rankings were determined by finding the group that yielded the best results for each test problem instance as determined by the SNK means analysis. If the strategy was placed into a single group, that group’s rank was used; however, if the method was placed into several Evolutionary Computation Volume 5, N u d e r 1 45 C. K. I-louck,J.A. Joines, 11. G. Kay, and]. R. Wilson Table 8. Rank of solutions (SNK). Dagger indicates failed Shapiro-M'ilk test, or ANOVA did not show signiticant effect a t (I = 0.01. Search Strategy ProbIem 1 Brown-10 I Brown-20 I3 I2 I 3 I2 L----kL Rastrigin-7' 1 Schwefel-2' ~ 1 ~ Schwefel- 10 I I LA-200 LA-250 ~ ~ Cell- 1 Cell-2 - overl:;pping groups, the method u.as placed into the group with the best rank. Therefore, these rankings represent how. many groups o f means are significantly better for each test pro1iei-n instance. T h e SNK multiple means test was arbitrarily chosen for these rankings. 71'ahle 8 shows the rankings for the final fitness of the solutions returned by each of the search strategies, u h e r e 1 represents the best rank and is shaded, 2 represents the next hest r;~nk,etc. - I l l the strategies empioying a t least 20% Lamarckian learning consistently founcl the optimal solution. Nowever, the pure Baldwinian (0%) and the S % and 1O'X1 Benefits of Partial Lamarckianism partial Lamarckian strategies did find significantly worse solutions for several test problem instances. T h e GA employing the no LIP (-1) is included to provide a comparison with how efficiently the local search procedure utilizes function evaluations compared with a pure genetic sampling approach. For most of these test problem instances, the use of the LIP significantly increases the efficiency of the GA. Table 9 shows the rankings for the number of function evaluations required to locate the final functional value returned by each of the search strategies. T h e table shows an interesting trend: Neither the pure Lamarckian (100) nor pure Baldwinian (0) strategies consistently yielded the best performance; this was also observed by Whitley et al. (1994). Using a binary representation and a bitwise steepest ascent LIP, Whitley et al. (1994) demonstrated that a Baldwinian search strategy was preferred for the 20-dimensional Schwefel function, whereas the results in Table 9 show a preference for a Lamarckian search strategy when using a floating point representation and SQP. Also, a Lamarckian strategy was preferred in Whitley et al. (1994) for the 20-dimensional Griewank instance, whereas in this study a Baldwinian search strategy is preferred. T h e results of this study and those of Whitley et al. (19941, together, show that a combination of the problem, the representation, and the LIP all can influence the effectiveness of either a pure Lamarckian or pure Baldwinian search strategy. However, with respect to the representations and LIPs used in this study, a partial Lamarckian strategy tends to perform well across all the problems considered. Taking a minimax approach (i.e., minimizing the worst possible outcome), the 20% and 40% partial Lamarchan search strategies provide the best results. T h e worst ranking of each search strategy, in terms of convergence speed, is shown as the last row of Table 9. Both the 20% and 40% strategies yield, at worst, the third best convergence to the optimal solution. T h e other search strategies, 0% and 5% partial Lamarckian, did not consistently find the optimal. Higher amounts of Lamarckianism, 50% to loo%, converge slower to the optimal solution. Since no single search strategy is dominant in terms of solution quality and computational efficiency, the worst case performance of the search strategy can be minimized across the entire set of test problems examined by employing a 20% to 40% partial Lamarckian strategy. T h e no local improvement search strategy ( - 1) is shown for comparison to demonstrate the effectiveness of hybrid GAS. 5. Conclusions LIPs were shown to enhance the performance of a GA across a wide range of problems. Even though Lamarckian learning disrupts the schema processing of the GA and may lead to premature convergence, it reduces the problem of not having a one-to-one genotype-tophenotype mapping. In contrast, although Baldwinian learning does not affect the schemaprocessing capabilities of the GA, it results in a large number of genotypes mapping to the same phenotype, Except for the Griewank-10 and Griewank-20 problem instances, this may explain why Baldwinian search was the worst strategy for several of the test problems. In the empirical investigation conducted, neither pure Baldwinian nor pure Lamarckian search strategies were consistently effective. For some problems, it appears that by forcing the genotype to reflect the phenotype, the GA converges more quickly to better solutions, as opposed to leaving the chromosome unchanged after it is evaluated. However, for Griewank10 and Griewank-20, Lamarckian strategies require more computational time to obtain the optimal. Therefore, a partial Lamarckian strategy is suggested. The results of this paper have opened up several promising areas for future research. This research has shown that local search improves the GAS’ ability to find good solutions. Evolutionary Computation Volume 5, Number 1 47 C. R. Houck, J. A. Joines, M. G. Kay, and J. R. Wilson Table 9. Rank of convergence speed (SNK). Dagger indicates failed Shdpiro-Wlk test, or ANOVA did not show significant effect ‘It (1 = 0.01. I I f nothing is known about the particular problem under consideration, a minimax approach is suggested to determine the amount of partial Lamarckianism. Future research could investigate the use of an adaptive approach to dcterniine the amount of partial Larnarckianism to use. Also, it is possible that, instead of a single percentage of Ianarckianism applied during the entire run, different amounts of Lamarcliianisin over the course of the run would be more txneticial. ICven though I a n a r c k i a n learning helps to alleviate the problem of one-to-one Benefits of Partial Lamarckianism ~ 1. 2. 3. 4. 5 6. 7. 8. Supply a population Po of N individuals and respective function values i t 1 Pi +- selection_function(P,- 1) P, c reproduction-function(P:) Evaluate(P,) i+i+l Repeat steps 3 - 6 until termination Print out best solution found Figure Al. The GA. genotype-to-phenotype mapping, it can still apply local searches to many points in the same basin of attraction. Another area of future research would involve determining whether a local search is even necessary for a new individual (i.e., has the current basin already been mapped). Therefore, computational time is wasted mapping out the same basin of attraction, which then can be used to explore new regions. We are currently investigating combining several global optimization techniques with GAS for the purpose of introducing memory into the search strategy. Acknowledgments T h e authors would like to thank the editor and the referees for their insightful comments and suggestions, which resulted in a much improved paper. This research was supported, in part, by the National Science Foundation under Grant DMI-9322834. Appendix A. Description of Genetic Algorithm T h e genetic algorithm (GA) used in the experiments in Section 3 is summarized in Figure Al. Each of its major components is discussed in detail below. T h e use of a GA requires the determination of six fundamental issues: chromosome representation, selection function, the genetic operators making up the reproduction function, the creation of the initial population, termination criteria, and the evaluation function. T h e rest of this appendix briefly describes each of these issues relating to the experiments in Section 3 . A.l Solution Representation For any GA, a chromosome representation is needed to describe each individual in the population of interest. T h e representation scheme determines how the problem is structured in the GA and also determines the genetic operators that are used. Each individual or chromosome is made up of a sequence of genes from a certain alphabet. An alphabet could consist of binary digits (0 and l),floating point numbers, integers, symbols (i.e., A, B, C, D), matrices, etc. In the original design of Holland (1975), the alphabet was limited to binary digits. Since then, problem representation has been the subject of much investigation. It has been shown that more natural representations are more efficient and produce better solutions (Michalewicz, 1996). In these experiments, a real-valued representation was used for the five nonlinear problems and the location-allocation problem, whereas an integer representation was used for the cell formation problem. Evolutionary Computation Volume 5, Number 1 49 C;. K. Houck, J. A. Joines, AT. G. Kay, and J. K.Wilson A.2 Selection Function T h e selection of individuals to produce successive generations plays an extremely important role in a GI. There are several schemes for the selection process: roulette wheel selection and its extensions. scaling techniques, tournament, elitist models, and ranking methods (Goldberg, 1989; Michalewicz, 1996). Ranking methods only require the evaluation function to map the solutions to a totally ordered set, thus allowing for minimization and negativity. Ranking methods assign PI based on the rank of solution i after all solutions are sorted. h normalized geometric ranking selection scheme Uoines €k Houck, 1994) employing an elitist model was used in these experiments. Normalized geometric ranking defines PI for each individual by l-’\Individual i is chosen] = q’(1 - q)’-’ (All where q = I. = AV = = probabiliy of selecting the best individual; rank of the individual, \$,here 1 is the best; population size; anti q l-(l-q)\’ A.3 Genetic Operators Genetic operators provide the basic search mechanism o f the GA. ‘The application of the two basic types of operators (i.e., mutation and crossover and their derivatives) depends on the chromosome representation used. Operators for real-valued representations (i.e., an alphabet of floats) are described in Michalewicz (1904). For real parents, = (.q,xz, . . . , x p , ) and 1- = ( y l , y!, . . . ,J,,), each of the operators used in the experiments in this paper are described tlriefly below. Xote, for the cell formation problem, the operators were modified to produce integer results uoines et a]., 1996). Let,n, and), be the lower and upper hound, respectively, for each variable s,(or and let F (or . I) represent the changed parent or offspring produced by the genetic operator. x )I,), Uniform Mutation Randomly select one variabiej and set it equal to a random number draun from a uniform distribution C(n,,1 7 , ) ranging from n, to b,: Multiuniform Mutation Apply Equation iv to all the variables in the parent x Boundary Mutation Randomly select one \ariahle/ and set it equal to either its lower or upper t)ound, where 1- = C(0,1): (1,. .r = b,, s,. so if i = J , if i =.J, otherwise 7- < 0.5 I’ 2 0.5 Benefits of Partial Lamarckianism Nonuniform Mutation Randomly select one variablej and set it equal to a nonuniform random number based on the following equation: (x;+ (bj - x;)f(G), - (xi+aj)f(G), if i = j , if i = j , otherwise YI < 0.5 yI 2 0.5 a uniform random number between (0,l); current generation; maximum number of generations; and a shape parameter. Multi-nonuniform Mutation Apply Equation A4 to all the variables in the parent Simple Crossover Generate a random number from 2 to (n - 1) and create two new individuals where n is the number of parameters: Y x. from a discrete uniform distribution (x’and r’)using Equations A5 and A6, x: y; = ;, if i < r otherwise yj, x,, if i < Y otherwise Arithmetic Crossover Produce two complementary linear combinations of the parents, where Y = U(0,l): --I X’=YX+(l-Y)T, Y =(I -Y)X+YY (A71 Heuristic Crossover Produce a linear extrapolation of the two individuals. This is the only operator that utilizes fitness information. Anew individual, is created using Equation A8, where Y = U(0,l), and X is better than F in terms of fitness. IfX’ is infeasible (i.e.,feasibiliry equals 0, as given by Equation A9), then generate a new random number Y and create a new solution using Equation A8; otherwise, stop. To ensure halting, after t failures, let the children equal the parents and stop: x’, X’=X+Y(X-r>, feasibility = { --I Y =x if x : > a , , otherwise (‘48) xisb, forall i = 1 , . . . , 72 649) A.4 Initialization of Population As indicated in step 1 of Figure 1, the GA must be provided with an initial population. The most common methods are to (1) randomly generate solutions for the entire population, or (2) seed some individuals in the population with “good” starting solutions. T h e experiments in Section 3 used a random initial population with common random numbers across all search strategies for each replication. Evolutionary Computation Volume 5, Number 1 51 C . K,Houck,J. A. Joines, M. G. Kay, and J. R. Wilson Parameter Value Parameter Uniform mutation 4 Nonuniform mutation Multi-nonuniform mutation Boundary mutation Simple crossover AArithnieticcrossover Heuristic crossover Probability of selecting the best, y Population size, ,li Maximum number of function evaluations Maxiinurn number of iterations for SOP 4 6 4 2 2 2 0.08 80 1,000,000 2s A.5 Termination Criteria T h e (;A moves from generation to generation selecting and reproducing parents until a termination criterion is met. Several stopping criteria currently exist and can be used in conjunction with another, for example, a specified maximum number of generations, convergence of the population to virtually a single solution, and lack of improvement in the best solution over a specified number of generations. T h e stopping criterion used in the experiments in Section 3 was to terminate when either the known optitnal solution value was found o r the specified maximum number of function evaluations was exceeded. A.6 Evaluation Functions As stated earlier, evaluation functions of many forms can be used in a GA, subject to the minimal requirement that the function can map the population into a totally ordered set. T h e GA employing no local improvement used the evaluation functions of Section 3 directly. T h e hybrid GA used in the pure Baldwinian, pure Lamarckian, and partial Lamarchan search strategies used the local iinprovement procedure (LIP) for each particular problem as its evaluation function. Each o f the LIPS then used the evaluation functions of Section 3 . A.7 GA Parameters X b l e A1 lists the values of the parameters of the GAS used in the experiments in Section 3 . T h e parameter values associated with the mutation operators represent the number of individuals that will undergo that particular mutation, whereas the values associated with crossover represent the pair5 of individuals. Appendix B. Means and Standard Deviations for Each Search Strategy ~-I he means and standard deviations of the final fitness value and the number of functional evaluations for each search strategy are shown in Tables B l to B7 for all test problem instances. T h e number of times (maximum 30) that the search strategy found the optimal solution is also +en in the tables. 52 Evolutionary Computation Volume 5, Nomt)er 1 Benefits of Partial Lamarckianism Table B1. Means and standard deviations for the Brown problem. Problem Instance Brown-2 I Search Strategy -1 0 S 10 20 40 so 60 80 90 9s 100 -1 0 5 Brown-I0 10 20 40 50 60 80 90 95 I00 -1 0 5 10 20 Brown-20 40 50 60 80 90 95 100 1 Final Fitness Value Standard Dev. Mean 9.33333e - 07 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 1.26602 5.03208e - 01 1.00000e - 07 3 . 3 3 3 3 3 e - 08 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 1.65499e + 01 1.10978e + 01 1.00000e - 07 3 . 3 3 3 3 3 e - 08 3.33333e - 08 3 . 3 3 3 3 3 ~- 08 6.66667e - 08 3.33333e - 08 0.00000 3 . 3 3 1 3 3 ~- 08 6.66667e - 08 6.66667~- 08 2.5371e - 07 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 1.9554 4.7555e - 01 3.0513~- 07 1.8257e - 07 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 4.3312 2.SFlle+Ol 3.OS13e - 07 1.8257e - 07 1.8257e - 07 1.8257e - 07 2.5371e - 07 1.8257e - 07 0.0000 1.8257e - 07 2.5371e - 07 2.5371e - 07 Evolutionary Computation Volume 5, Number 1 No. of Times Optimal 30 30 30 30 30 30 30 30 30 30 30 30 0 7 30 30 30 30 30 30 30 30 30 30 0 0 30 30 30 30 30 30 30 30 30 30 No. of Functional Evaluations Standard Dev. Mean 9.S529e + 04 3.1070e + 03 3.1026~+ 03 3.0946e + 03 3.0856~+ 03 3.0389e + 03 3.0193e + 03 2.9950~+ 03 2.9282e + 03 2.9092~+ 03 2.8798e + 03 2.8681e + 03 1.0000e + 06 8.5243~+ OS 2.8470~+ 04 1.856Se + 04 1.3169~+ 04 9.2201e + 03 8.9320e + 03 8.6745e + 03 8.148le + 03 7.8687~+ 03 7.7787e + 03 7.6381~+ 03 1.0000e + 06 1.0013e + 06 2.7049~+ 04 1.4136r + 04 1.0893~+ 04 6.7153e + 03 5.7384~+ 03 6.1421e + 03 5.7521~ + 03 5.3642e + 03 5.4270e + 03 5.5075~+ 0 3 6.72 13e + 04 1.522Oe + 02 1.6197e + 02 1.6261e + 02 1.7367e + 02 1.6929~ + 02 1.8582e + 02 1.80768 + 02 1.6989e + 02 1.6929e + 02 1.6445r + 02 1.5229e + 02 0.0000 3.0670e + 05 1.S704e + 04 1.0994~+ 04 4.234Se + 03 1.7223~+ 03 1.5899e + 03 1 . 6 3 1 2 ~ +03 9.7597e + 02 8.260Se + 02 7.7405e + 02 6.6600e + 02 0.0000 8.6148~+ 02 2.45 37e + 04 7.3162~+ 03 5.1028e + 03 2.7935~+ 03 1.3383r + 03 1.4S38e + 03 1.3410e + 03 I. 190+ + 03 I . 1 9 5 5 e + 03 1.3284~+ 03 53 C. R. Houck, J. A. Joines, *M. G. Kay, and J. R. Wilson Table B2. .\leans and standard deviations for the Corana problem. So. of Functional Evaluations Benefits of Partial Lamarchanism Table B3. Means and standard deviations for the Griewank problem. Problein Instance Chewank-2 Griewank-lO 1 Search Strategy -I 0 5 10 20 40 so 60 80 90 95 I00 90 100 -1 Griewank-20 20 40 50 60 80 90 95 100 Final Fitness Value Standard Dev. Mean 3.69823~- 03 3.7610e - 03 7.39600~- 04 2.2567e - 03 7.39600e - 04 2.2567~- 03 9.86133~- 04 2 . 5 5 7 1 ~- 03 4.93067~- 04 1.8764e - 03 4.93067~- 04 1.8764~- 03 2.2567e - 03 7.39600e - 04 4.93067~- 04 1.8764e - 03 9.86133~- 04 2 . 5 5 7 1 ~- 03 1 . 2 3 2 6 7 ~- 03 2.8034e - 03 7.39600~- 04 2.2567~- 03 9.86133e - 04 2.5571~- 03 5.25359~- 02 2.9763e - 02 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.0000 0.00000 4.20434e - 02 4.4794e - 02 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 Evolutionary Computation Volume 5 , Number 1 No. of 'Lines Optimal 15 27 27 26 28 28 27 28 26 25 27 26 1 30 30 30 30 30 30 30 30 30 30 30 4 30 30 30 30 30 30 30 30 30 30 30 No. of Functional Evaluations Mean Standard Dev. 5.0653~+ 05 5.0193~+ 05 3.2468e + 05 3.1 199e + 05 3.0965~+ 05 3.0753e + 05 2.9192e+OS 3.1061~+ 05 2.5194~t 05 2.4883~+ 05 2.92 17e + 05 2.6033e + 05 3.5088~t 05 3.1584e + 05 2.6093~+ 05 2.5242P + 05 3.3917e + 05 3.3349e + 0s 2.9896~+ 05 3.2910e + 05 2.2906e + 05 2.7131~+05 3.1562~+ 05 2.8729e t 05 9.7013e + 05 1.63681,+ 05 4.7135e+03 1.7993~+ 02 4.8088~+ 03 2.8770r + 02 5 . 0 W e + 03 3.7042~-+ 02 5.9641~+ 02 5.4296~+ 03 6.2652e + 03 6.4218~+ 02 6.8425e + 03 8.7743e + 02 7.2035e + 03 9.1656e + 02 8.2834~t 03 1.I45Oet02 8.8033e + 03 1.2008~+ 02 9.0910~t 03 I . 1866e + 02 9.3120e+03 1.1467e + 02 9.0575~+ 05 2.6030~-+ 05 7.8540~+ 03 I ,9988~+ 02 8.0354e + 03 2.8841P + 02 8.2087e + 03 3.8040e + 02 8.4880~+ 03 5.3300~+ 0 2 9.1453e + 03 6.8650e + 02 9.4776~+ 03 6 . 6 8 6 5 ~+ 02 9.7216~+ 03 7.1319Pt02 1.0219e+04 7.1239e+02 l.0371e+04 6.5058e + 02 1.0554~t 04 6.5540~+ 02 1.0655~+ 04 5.3251~+02 55 C. R. 1 louck, J. A. Joines, M. G. Kay, and J. R. Wilson Table B4. Means and standard deviations for the Rastrigin problem. Problem instance Search Strategy -1 0 5 I0 20 Rastrigin-2 40 50 60 80 90 95 I00 -1 0 5 10 20 Rastrigin-10 40 SO 0.OooOo -1 1 .nooooP - 06 1.21 S15e + 01 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 10 20 40 50 60 80 90 95 100 56 ~~~ 60 80 90 95 I00 0 5 Rastrigin-20 Final Fit Mean 5.00000e - 07 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 I .00000e-06 1.12762 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 No. of Times Optimal 30 is Value Standard Dev. 5.0855e - 07 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.0000 1.4006 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 7.3257 o.ooon 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 No. of Funct a1 Evaluations Mean Standard Dev. 6 . 3 1 3 6 ~ + 0 3 3.212lr + 03 2.1 390e + 03 2 . 3 5 7 2 ~+ 01 2.1376~+03 2.5566~+ 01 2.1391~+ 03 2.6483~+ 01 2.1376~+03 3.1725~+ 01 2.13Me + 03 4.3927~+ 01 2.1377e+03 4.7064~+ 01 2.13?6e+03 4.5057~+ 0 1 2 . 1 3 4 4 ~+ 03 5.3647e + 01 6.0398r + 01 2.1401~+ 03 2.1386~+ 03 6. I656r + 01 2.1426r t 03 6.429s~+ 0 1 30 30 30 30 30 30 30 30 30 30 9.6194~+ 03 7 163% + 05 1.5054~+ 05 I 30 30 I 30 30 30 30 30 30 30 30 30 30 I 6.8002~+ 04 5.2204~+ 04 4.1791% + 04 4.0310e + 04 3.8088~+ 04 3.7627~+ 04 1.5740~+ 04 3.7634~+ 04 2 . 1 5 6 1+~0 5 Y.9512P+05 4.5566~+ 05 3.7480e + 0.5 2.947% + 05 2.0678~+ 05 1.839% t 05 1.91 34P + 05 1.6438e t 05 1.499I P + 05 1.5426e + 05 1.6482~+ 05 1.7994~+ 05 1.0696~+ 05 6.005 1 u + 04 2.2303P + 04 1.1!29~+04 1.2086r + 04 1.0SRlc + 04 9.2368e + 01 8.8 I 2 1e + 03 8.81Ole + 03 1.0570~+ 04 6.0225~+ 04 3.78361. + o'+ 1.s220u + 05 I .4303e + 05 I.I280r+05 6.5623r + 04 3.689% + 04 5.1640~+ 04 4.0192c + 04 3.8368r + 04 3.8706~+ 04 3 . 5 2 5 ~+ ~04 Evolutionary Computation Volume 5, Number 1 Benefits of Partial Lamarchanism Table BS. Means and standard deviations for the Schwefel problem. Problem Instance Search Strategy -1 0 5 Schwefel-2 10 20 40 50 60 80 90 95 I00 -1 0 5 10 Schwefel-I0 20 40 50 60 80 90 95 100 -1 Schwefel-20 0 5 10 20 40 50 60 80 90 95 100 Final Fitness Value Standard Dev. Mean 5.0401e - 07 4.33333e - 07 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.0000 1.00000e - 06 1.0560 6.63306e - 01 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 1.1698e - 03 4.40933e - 04 8.14151 9.0376 0.00000 0.0000 0.0000 0.00000 0.0000 0.00000 0.00000 0.0000 0.0000 0.00000 0.0000 0.00000 0.00000 0.0000 0.0000 0.00000 0.0000 0.00000 0.0000 0.00000 Fhlutionary Computation Volume 5 , Number 1 1 No. ofTimes Optimal 30 30 30 30 30 30 30 30 30 30 30 30 30 19 30 30 30 30 30 30 30 30 30 30 23 11 30 30 30 30 30 30 30 30 30 30 No. of Functional Evaluations Mean Standard Dev. 6.782Ye + 03 3.4377e + 03 8.3627e + 03 8.9766~+ 03 8.3500e + 03 8.9631e + 03 8.3915e + 03 9.0084e + 03 9.0260e + 03 8.3996e + 03 8.4540e + 03 9.1 lO0e + 03 8.4480e + 03 9.0932e + 03 8.4446e + 03 9.0945e + 03 8.5156e + 03 9.2095~+ 03 8.529Ye + 03 9.2290e + 03 8.5241+ ~ 03 Y.2185e + 03 8.5262~+ 03 9.21 7 5 +~ 03 7.5922~+ 04 1.312Oe + 04 4.9627e + 05 4.257% + 05 7.8612~+ 04 1.198% + 05 1.1426~+ 05 6.1898~+ 04 7.9676e + 04 3.7094e + 04 4.6399~+ 04 7.7312e + 04 7.3009~-+ 04 4.8608e + 04 4.72 17e + 04 7.1344e + 04 7.3YSle + 04 5.4221r + 04 7.199Se + 04 5.3185e+04 6.8266~+ 04 4.8523e + 04 6.9889e + 04 4.7395, + 04 4.6027e + 05 3.6124~+05 7.0960e + 05 4.0809~+ 05 3.6296~+ 05 1.8691~ + 05 3.3455e + 05 1.500')P + 05 9.5471e + 04 2.8962~+ 05 2.447% + 05 8.721 1 P + 04 8.1234~+ 04 2.3312~+ 05 2.4751~+05 9.2.517e+0.1 2.3880e + 05 1.211 6 +~0i 2.2414~+ 05 1.1503e + 05 I.1259e + 0 s 2.2112e+OS 2.4558~+ 0 5 1.3543~+05 57 C. K. IIouck, J. A. Joines, At. G. Kay, and J. R. Wilson Table B6. .\leans and standard deviations for the Location-Allocation (LA) problem. Table B7. A\\Ieansand standard deviations for the cell formation problem. Search Straw? -I 0 10 20 4n ?I) (Ill 80 'ill C) i I00 Benefits of Partial Lamarckianism References Chu, P.,& Beasley, J. (1995). A genetic algorithm for the generalized assignment problem (Tech. Rep.). London: T h e Management School Imperial College. Cooper, L. (1972).The transportation-location problems. OperationsResearch, 20,94-108. Corana, A., Marchesi, M., Martini, C., & Ridella, S. (1987). Minimizing multimodal functions of continuous variables with the “simulated annealing” algorithm. ACM Transactions on Mathematical Software, 13(3), 262-2 80. Davis, L. (1991). Handbook ofgmetic algorithms. New York: Van Nostrand Reinhold. Duncan, D. (1 975). t-tests and intervals for comparisons suggested by the data. Biometrics, 31,3 39-3 59. Einot, I., & Gabriel, K. (1975). A study of the powers of several methods of multiple comparisons. 30burnalof the American Statistical Association, 70(3fI), 574-583. Goldberg, D. (1989). Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley. Grace, A. (1992). Optimization toolbox user’s gziide. Natick, MA: Mathworks. Gruau, F., & Whitley, D. (1993). Adding learning to the cellular development of neural networks. Evolutionary Computation, 1, 2 13-2 33. Hinton, G., & Nolan, S. (1987). How learning can guide evolution. Complex Systems, I , 495-502. Holland, J. (I 975). Adaptation in namral and artificial systems. PhD thesis, University of Michigan, Ann Arbor. Homaifar, A., Guan, S., & Liepins, G. E. (1993). A new approach on the traveling salesman problem by genetic algorithms. In S. Forrest (Ed.), Proceedings of the 5th International Conference on Genetic Algor-ithms(pp. 460-466). San Mateo, CA: Morgan Kaufmann. IJouck, C. R., Joines, J. A., & Kay, M. G. (1996). Comparison of genetic algorithms, random restart, and two-opt switching for solving large location-allocation problems. Cornputen and Operations Research, 23(6), 587-596. Joines, J., Culbreth, C., & King, R. (1996). Manufacturing cell design: An integer programming model employing genetic algorithms. IIE Transactions, 28(1), 69-85. Joines, J., & Houck, C. (1994). On the use of non-stationary penalty functions to solve constrained optimization problems with genetic algorithms. In 2.Michalowicz, J. D. Schaffer, H.-P. Schwefel, D. G. Fogel, & H. Kitano (Eds)., Pr-oceedingsof the First IEEE Conference on Evolutionary Cornputation (Vol. 2, pp. 579-584). Piscataway, NJ: IEEE Press. Joines, J., King, R., & Culbreth, C. (1995). The use ofa local improvement hezwistic with ageizeti~algorith~j for manufacturing cell design. (Tech. Rep. NCSU-IE 95-06). Department of Industrial Engineering. Raleigh, NC: North Carolina State University. Lawrence, C., Zhou, J. L. & Tits, A. L. (1994). User’s guide for CFSQP version 2.4: A C code fbr solving (large scale) constrained nonlinear (minimax) optimization problems, generating iterates siitisfiirigall inequaliq constraints (Tech. Rep. TR-940161-1). College Park, MD: University of Maryland, Electrical Engineering Department and Institute for Systems Research. Love, R., & Juel, H. (1982). Properties and solution methods for large location-allocation problems with rectangular distances. Jo11mal of the Operations Research Society, ??, 443-452. Love, R., & Morris, J. (1975). A computational procedure for the exact solution of location-allocation problems with rectangular distances. Naval Researzh Logisfirs Qmcarterlv, 22, 441-453. Michalewicz, Z. (1996). Genetic algorithms + data stmctw-es = evolutionary pi-og?*ams.(3rd ed.). New York: Springer-Verlag. Evolutionary Computation Volume 5, Number 1 59 C. K. Houck, J. A. Joines, M. G. Kay, and J. R. Wilson Michalewicz, Z., d Nazhiyath, G. (1995). Genocop 111: A co-evolutionary algorithm for numerical optimization problems with nonlinear constraints. In 1). B. Fogel (Ed.), A-omdingy ofthe Second IEEE Inmnhoi2nl Co7lfiw77re 017 Erolutionoiy Covipritnrion p o l . 1, pp. 647-65 1). I'iscataway, NJ: IFXZ Press. Mitchell, M. (1996). A 7 z ititmdiictioii to p i m i czlgorithm. Cambridge, MA: MIT Press. More, j.j.,Garbow, B., k Hillstrom, K. (1981). 7'esting unconstrained optimization software. ,4CM *, fi~clrZ.V7Ct?Oilr 011 L~rrtheirrnticnl . ~ < J f i N I . e , :(I), 17-41 . Nakano, R. (1991j. Conventional genetic algorithm for job shop problems. In Proceedings ofthe Fourrh Inteniotiond Cor!firunre 017 Cuirtic Algol-ithms (pp. 474-479). San Mateo, CA: MorKan Kaufniann. Xg, S. (1 903). Worst-case analysis of a n algorithm for cellular manufacturing. European ffovrnal of Opemtionnl Restw~ch,ri9(3), 386398. Orvosh, I)., k Davis, I,. (1094). Using a genetic algorithm to optimize prohlems with feasibility constraints. In Z.Xlichalewicz, I . D. Schaffer, F3.-I?.Schwefel, D. G. Foeel. & H . Kitano (Eds.). Renders, j.-hl.,LY Flasse, S. ( 1 996). IIybrid methods using genetic algorithms. IEEE E-ciiisnrtioizs o n Systtws. .Zlm iiiirl ~ , ~ y l m n f t i i,76(2), - ~ , 143-2 58. Ryan, '1'. (1960). Significance tests for inultiple comparison of proportions, variances, and other statistics. Py.ho/ogic.nl Biilletiii. K, 3 18-328. S.AS Institute Inc. (1900). S.4S/S'L4T riser:rgciide (4th ed.). Car); NC: SAS Institute Inc Ii'elsch, K . (1 Y77). Stepwise multiple comparison procedures. j o l i i 7 7 d of the Amei-icm Statisrid L4.rso&tzoii. 72(3F9/: .566-575. \\'hitley, 11.;Gorclon, S . , k hlathias, I(.(1994). Lamarckian evolution, the Kaldwin effect and function optimization. I n I-.Davidor, € I . Schwefel, & R. Ma'nner (Eds.), Pnrnllelp~oblevi~~oAitiyfr-om izdtwePP.SS III (pp. 6-1 5). Berlin: Springer-\'erlag.