Cooperative coevolutionary hybrid NSGA-II with Linkage Measurement Minimization for Large-scale Multi-objective optimization

Rui Zhong Graduate School of Information Science and Technology
Hokkaido University
Sapporo, Japan
rui.zhong.u5@elms.hokudai.ac.jp
   Masaharu Munetomo Information Initiative Center
Hokkaido University
Sapporo, Japan
munetomo@iic.hokudai.ac.jp
Abstract

In this paper, we propose a variable grouping method based on cooperative coevolution for large-scale multi-objective problems (LSMOPs), named Linkage Measurement Minimization (LMM). And for the sub-problem optimization stage, a hybrid NSGA-II with a Gaussian sampling operator based on an estimated convergence point is proposed. In the variable grouping stage, according to our previous research, we treat the variable grouping problem as a combinatorial optimization problem, and the linkage measurement function is designed based on linkage identification by the nonlinearity check on real code (LINC-R). We extend this variable grouping method to LSMOPs. In the sub-problem optimization stage, we hypothesize that there is a higher probability of existing better solutions around the Pareto Front (PF). Based on this hypothesis, we estimate a convergence point at every generation of optimization and perform Gaussian sampling around the convergence point. The samples with good objective value will participate in the optimization as elites. Numerical experiments show that our variable grouping method is better than some popular variable grouping methods, and hybrid NSGA-II has broad prospects for multi-objective problem optimization.

Cooperative Co-evolution (CC), Linkage Measurement Minimization (LMM), Large-Scale Multi-Objective Problems (LSMOPs), hybrid NSGA-II

I Introduction

Multi-objective evolutionary algorithms (MOEAs) have successfully solved various multi-objective problems[1]. However, the performance of the traditional MOEAs degrades dramatically as the dimension of the problem increases. When the scale of the problem reaches a certain dimension, this type of problem is called large-scale multi-objective optimization problems (LSMOPs). Solving LSMOPs is always tricky, and mainly due to the following aspects: (1). The complexity of optimization problems tends to increase with the increase of dimensions. (2). The search space of large-scale problems increases exponentially with the increase of the dimension, which is known as the curse of dimensionality[2]. Therefore, a number of generic MOEAs have also been developed for solving LSMOPs since 2013[3]. These methods are usually not aimed at a specific type of problem but want to develop a general framework to solve LSMOPs[4]. There are mainly three strategies currently. (1). Variable grouping method based on Cooperative Coevolution (CC)[3]. (2). Methods based on dimensionality reduction or problem transformation[5]. (3). Design new search strategies based on MOEAs[6]. These strategies succeed in solving many-objective optimization problems[7], constrained multi-objective optimization problems[8], and computationally expensive multi-objective optimization problems[9].

This paper mainly adopt CC framework to solve the LSMOPs. Inspired by divide and conquer, the CC framework decomposes the original problem into multiple non-separable sub-problems and applies evolutionary algorithms (EAs) to solve each sub-problem alternately[10]. This strategy has been a huge success in solving large-scale problems. However, several studies[11, 12] have shown that the CC framework is sensitive to problem decomposition strategies. In theory, a perfect variable grouping strategy can exponentially reduce the search space without losing optimization accuracy, while a poor grouping strategy will mislead the direction of optimization. Therefore, how to design the decomposition strategy has become a popular research topic.

In addition, unlike large-scale single-objective optimization problems (LSSOPs), in LSMOPs, the same variables in different objective functions may have different interactions. Currently, there are many variable grouping methods were extended from LSSMOPs to group the variables in LSMOPs, such as Fast interdependency identification (FII) for LSMOPs[13], Differential Grouping (DG) for LSMOPs[14], MOEA-DVA[15], and other methods. Although these methods show great performance in grouping accuracy, computational cost and local interaction identification make these algorithms limited. What is more, many studies[11, 12] in LSSOPs show that high-accuracy grouping is not equivalent to high-performance optimization, and ignoring some weak interactions can accelerate the optimization, especially under the fitness evaluation times (FEs) limitation.

Besides, In order to find the global optimum among the fitness landscape, the heuristic algorithm should be equipped with two major characteristics to ensure finding the global optimum. These two main characteristics are exploration and exploitation[16]. Exploration is the ability of an algorithm to search whole parts of a problem space whereas exploitation is the convergence ability to the best solution near a good solution. However, the No Free Lunch Theorems (NFLT)[17] has proved that there is no algorithm, which can perform general enough to solve all optimization problems, and a single optimizer cannot balance the exploitation and exploration well in optimization, thus, hybrid evolutionary algorithms and memetic algorithms have become a popular research topic.

In this paper, we propose a novel variable grouping method, named Linkage Measurement Minimization (LMM) and apply hybrid Nondominated Sorting Genetic Algorithm II (hNSGA-II) as the sub-problem optimizer (CC-hNSGA-LMM). In the grouping stage, our proposal allows an automatic decomposition that treats the variable grouping problem as a combinatorial optimization problem, and we design a linkage measure function to evaluate the variable grouping performance. In the sub-problem optimization stage, we introduce a Gaussian sampling operator based on a estimated convergence point combined with NSGA-II to balance the exploitation and exploration in optimization. Specifically, the main contributions of this paper are as follows.

(1) For variable grouping, we propose a novel automatic variable grouping strategy named LLM and explain the mathmatical mechanism of LLM in detial, which reveals the relationship between our proposal and LINC-R.

(2) For optimizer, we first hypothesize that the individuals around the PF have better fitness, and the participation of elite individuals in optimization can accelerate the convergence. Based on this hypothesis, we estimate a convergence point at every generation of optimization and apply the Gaussian sampling as a local search operator to exploit the potential individuals around the estimated convergence point.

(3) Theoretical analysis and numerical experiments show that our proposal has broad prospects in solving LSMOPs and can solve larger-scale and many-objective optimization problems through simple extension.

The rest of the paper is organized as follows, Section II covers preliminaries and a brief review of grouping methods. Section III introduces our proposal in detail. Section IV shows the experiments and analysis. Section V discusses about the direction of our research in the future, Section VI concludes the paper and shows future directions.

Ii Preliminaries and related works

Many grouping methods have been proposed based on CC for LSSOPs in recent years[11, 12], and meanwhile, some works have been expanded to LSMOPs[13, 4]. In this section, we firstly introduce some preliminaries including the concept of multi-objective problems, separability of functions, estimation of a convergence point, and NSGA-II. Then, we will provide a brief review of grouping methods both in LSSOPs and LSMOPs.

Ii-a Preliminaries

Ii-A1 Multi-objective Problems

Without loss of generality, a Multi-objective Problem (MOP) can be mathematically defined as Eq (1):

(1)

Where consists of M objectives, is the objective space, is the decision space, and is a solution consisting of decision variables. The dominance relation between two solutions can be defined as . If this formula is satisfied, we can say that x dominates y. A Pareto optimal solution is a solution that is not dominated by any solution in

Ii-A2 Separability of functions

The concept of separability is derived from a biological concept[18]. if a feature at the phenotype level is determined by two or more genes, then we say that these genes have interaction. Then we extend the concept to optimization problems. When , We call a partially separable function, and the variables in each sub-problem are called linkage sets[19]. The explicit or implicit interactions exist between every pair of variables in a linkage set, and we call these variables non-separable variables, and there is no interaction between variables in different linkage sets, and these variables are called separable variables. There are two extreme cases of the interaction, when there is no interaction between all variables, , we call is a fully separable function. On the contrary, when every pair of variables exist explicit or implicit interaction, we call a fully non-separable function. Based on CC framework, we want to develop a grouping method which can find the interactions and divide the variables into sub-problems properly.

Ii-A3 Estimation of a convergence point

Estimation of a convergence point was first proposed by Murata. The paper[20] hypothesize that: in a population-based optimization algorithm, all individuals are moving towards the global optimum. Fig.1 shows how the estimation of a convergence point works. Although the estimated point is not exactly on the global optimum in practice due to incorrect directions or inaccurate population movements. However, it has higher possibility that the estimated point is close to the global optimum.

Moving vector
Fig. 1: Moving vector is calculated from a parent (worse) individual and its offspring (better) . The is the estimated convergence point.

Let us derive how to estimate the convergence point mathematically. First, we define parent (worse) individual , offspring (better) , and moving vector as describe in Fig 1. The unit direction vector of is given as ,i.e., . denotes the estimated convergence point, and represents the expansion from parent individual with the direction . in Eq (2) becomes the minimum.

(2)

As the minimum line segment from the convergence point to the expansion line segments is the orthogonal projection from , we can apply the Eq (3) into Eq (2) to remote .

(3)

Finally, the convergence point can be calculated by Eq (4). See detail expansion of equations in paper [20].

(4)

Ii-A4 Nsga-Ii

Non-dominated Sorting Genetic Algorithm (NSGA-II) is proposed by Deb in 2002 to solve MOPs[21]. Now, NSGA-II is one of the most popular MOEAs with three special characteristics, the fast non-dominated sorting approach, the fast crowded distance estimation procedure, and the simple crowded comparison operator. Deb et al. simulated several test problems from previous studies using the NSGA-II optimization technique and showed exciting results. The pseudocode of NSGA-II is described in Algorithm 1

Input:
Output:
1 Function NSGA-II():
2       nonDominated() while not stop criterion do
3             fastNonDominateSorting() while  do
4                   crowdingDistanceAssigment()
5             end while
6            Sort() fill with the less crowded individuals of Selection() Crossover() Mutation() nonDominated()
7       end while
8      
9
Algorithm 1 NSGA-II

Ii-B A brief review of grouping methods

Based on the idea of divide and conquer, the CC framework was proposed in 1994[10] and has been popularized in large-scale optimization. CC requires decomposing the problem into a set of low-dimensional subproblems, each of which is optimized separately. Since the candidate solutions of each sub-component cannot form a complete solution, representative solutions of other sub-components are required to form a complete solution for evaluation. These representative solutions are known as the context vector[22]. The context vector is updated iteratively and acts as the context in which the cooperation occurs. In this part of the section, we briefly introduce the grouping principle on LSSOPs, which can be classified into three major groups: automatic, semi-automatic, and -strategy.

Ii-B1 Automatic

In the automatic grouping methods, the formation of sub-components completely depends on the logic of the algorithm. The representative algorithms include LINC-R[18] and LIMD[23], and are extended to DG[24], DG2[25], ERDG[26], and CCVIL[27], etc. respectively. These methods are mainly designed based on perturbations. Eq (5) defines a sample perturbs in -D and -D and both in -D and -D.

(5)

Eq (6) is employed by LINC-R to identify the interaction between and .

(6)

is the allowable error. LIMD applied Eq (7) to detect the interaction between and .

(7)

When and not satisfy the simultaneous increase or decrease on at least one individual of population, LIMD identifies and as nonseparable variables. Notice that we cannot apply LINC-R and LIMD to the whole fitness landscape in practice, and only finite individuals are checked. In other words, if the condition both in LINC-R and LIMD are not satisfied, then and are identified as separable variables.

Ii-B2 Semi-automatic

Semi-automatic methods decompose the problem depending on the both algorithm logic and parameters specified by users. Some studies apply statistical detection methods to form the sub-components by a threshold or a set of intervals defined on correlation coefficients with the participation of users, such as AVP2[28] and 4CDE[29]. The fuzzy c-mean algorithm proposal by Fan et al[30]. forms the sub-components with the number of components. This kind of algorithm often consumes fewer FEs to detect the interactions, which is an advantage compared with automatic methods.

Ii-B3 -strategy

This kind of algorithm requires fewer FEs to form the sub-components than automatic methods and semi-automatic methods. The number and the size of each component are necessary hyperparameters decided by users. And the representative detection principles include Random Grouping[31, 32], Delta Grouping[33], Fitness Difference Partitioning[34, 35], and so on.

CC framework is a well-studied technique for LSSOPs, and in LSMOPs, due to the multiple objective functions, and the variable interactions in different objective functions are often different, so the grouping methods developed for LSSOPs always need to be modified for LSMOPs. Currently, the popular grouping strategies on LSMOPs include Random Grouping[3], Differential Grouping[14], and Variables Analysis[15]. In paper [36], dynamic Random Grouping is applied. The variables are regrouped after each generation of optimization. As same as the Random Grouping in LSSOPs, the probability of two interacting variables being divided in the same sub-problem is quite high in multiple trial experiments. In DG for LSMOPs, the paper[13] applies Eq (6) to all objective functions, and when and satisfy Eq (6) in all objective functions, they are identified as separable variables. Both Random Grouping and DG were originally proposed to solve LSSOPs, and these methods focus on partitioning decision variables into sub-problems correctly while ignoring the population diversity in the objective space. Therefore, the variable grouping methods based on Random Grouping or DG can easily find some local or global optimal solutions but may not be able to diversify the population along the whole Pareto front. Variable Analysis (VA) is a grouping method for LSMOPs. MOEA/DVA[15] perturbs several times on a random sample. If all the perturbed samples are non-dominated with each other, this variable is considered a position variable, and if each perturbed sample is dominated or dominating the rest samples, this variable is regarded as a distance variable, otherwise, it is regarded as a mixed variable. MOEA/DVA optimizes the different types of variables in order and allocates different computational resources. Numerical experiments show that MOEA/DVA significantly outperforms many other MOEAs on LSMOPs in benchmarks.

Iii CC-hNSGA-LMM

In this section, we will introduce the details of our proposal and the techniques. The flowchart of our proposal is shown in Figure 2.

The flowchart of CC-hNSGA-LMM
Fig. 2: The flowchart of CC-hNSGA-LMM

Our proposal includes the decomposition stage and the optimization stage. Next, we will introduce the flow of our proposal in detail.

In decomposition stage, our previous research[37] regards the decomposition problem as an optimization problem and designed the linkage measurement function based on LINC-R. Next, we will give a simple mathematical explanation of our proposal.

In LINC-R, we can rewrite Eq (6) to Eq (8)

(8)

And we notice that LINC-R can be understood as the additive form of vector. Eq (8) can also be written to Eq (9)

(9)

Figure 3 shows how original LINC-R and variant LINC-R work on separable variables.

(a).The original LINC-R works on the separable variables. (b).The variant LINC-R works on the separable variables.
Fig. 3: (a).The original LINC-R works on the separable variables. (b).The variant LINC-R works on the separable variables.

Next, we derive the variant LINC-R to -D or higher dimensions. In -D space, the schematic diagram is shown in Figure 4.

The variant LINC-R works on 3-D space
Fig. 4: The variant LINC-R works on 3-D space

Similarly, we define the fitness difference in -D space in Eq (10)

(10)

And Eq (11) defines the variant LINC-R in -D space

(11)

Therefore, we can reasonably infer the variant LINC-R in -D space on Eq (12).

(12)

Notice that we only detect the interactions based on the finite individuals, which means when is not satisfied at least once in all individuals, then we consider this function is a fully separable function by default. Although this strategy is limited especially for trap functions, it is impossible to check the interactions on the whole fitness landscape, and in LSSOPs, perturbation-based methods often identify the interactions based on 1 sample such as DG, DG2, etc. due to the FEs. Thus, Eq (13) is approximately correct in LSSOPs.

(13)

However, when Eq (13) is not satisfied, we only know that interactions exist in some variable pairs, but we cannot know the interactions exist in which pairs, so we can actively to detect the interactions between variables. Taking the -D space as an example,

(14)

Therefore, Our target is to apply the heuristic algorithm to find the interactions between all variables as much as possible. According to the above explanation, in the -D problem, the linkage measurement function in our proposal is designed as Eq (15)

(15)

is the number of sub-problems. Eq (15) is the original linkage measurement function of our proposal[37]. Meanwhile, our further research notice that this linkage measurement function often contains multiple optima especially in separable functions and partially separable function. Therefore, we can attach a reasonable penalty to lead the direction of optimization. Eq (16) defines a linkage measurement function with a penalty term.

(16)

Then, we extend Eq (16) to LSMOPs with multiple samples. Eq (17) is our linkage measurement function in this paper.

(17)

is the weight of the objective function, and is the number of objective functions in LSMOPs. We apply averaging weight in Eq (17).

To optimize this linkage measurement function, we also apply Elitist Genetic Algorithm (EGA)[38]. The pseudocode of decomposition stage is shown in Algorithm 2

Input:
Output:
1 Function LMM():
2       (Initialization) for  do
3             for  do
4                  
5             end for
6            
7       end for
8       Evaluate() (Fully separable function) if  then
9            
10       end if
11       Evaluate() bestIndividual() (Optimization) while not stop criterion do
12             Selection() Crossover() Mutation() Evaluate() Replace() bestIndividual()
13       end while
14      
15
Algorithm 2 Decomposition Stage

The elitist reservation strategy directly replicates the best individual without crossover, mutation, and selection to the next generation. This strategy can prevent the optimal individual from destroying the superior gene and chromosome structure during crossover mutation.

In optimization stage, first we hypothesize that there is much possible for better individuals around the PF individuals. This hypothesis is extended from single objective optimization problems[20]. Based on this hypothesis, we find the PF in every generation of optimization by quick search. The pseudocode of quick search is shown in Algorithm 3.

Input:
Output:
1 Function QS():
2       (All solutions are PF in initialization) for  do
3             if  then
4                   continue
5             end if
6            for  do
7                   if  then
8                         continue
9                   end if
10                   if  then
11                         break
12                   end if
13                  else if  then
14                        
15                   end if
16                  else
17                         continue
18                   end if
19                  
20             end for
21            
22       end for
23       for  do
24             if  then
25                  
26             end if
27            
28       end for
29      
30
Algorithm 3 Quick Search

And the function Dominate is realized in Algorithm 4.

Input:
Output:
1 Function Dominate():
2       for  do
3             if  then
4                  
5             end if
6            if  then
7                  
8             end if
9            
10       end for
11      if  and  then
12            
13       end if
14      else if  and  then
15            
16       end if
17      else
18            
19       end if
20      
21
Algorithm 4 Domination identification

The target of our quick search is only to find the whole PF in the current population, thus the fast non-dominated sorting in this situation is wasteful and unnecessary. From Algorithm 3, the worst and best time complexity of quick search is and respectively. is the number of objective functions, and is the population size. After the PF is found, we estimate the convergence point with averaging strategy and apply the Gaussian sampling with the mean of the estimated convergence point. Figure 5 and Algorithm 5 shows how our proposal works.

The averaging strategy to estimate the convergence point approximately.
Fig. 5: The averaging strategy to estimate the convergence point approximately.
Input:
Output:
1 Function EGS():
2       (Estimate a convergence point with averaging) for  do
3             for  do
4                  
5             end for
6            
7       end for
8      
9
Algorithm 5 Estiamte a convergence point and apply Gaussian sampling

Finally, the whole process of our proposal in optimization stage is described on Algorithm 6

Input:
Output:
1 Function OPT():
2       // context population for  do
3             for  do
4                   // Population before non-Dominate sorting
5             end for
6            
7       end for
8      
9
Algorithm 6 Optimization Stage

Iv Experiment results and analysis

In this Section, we ran many experiments to evaluate our proposal, CC-hNSGA-LMM. In Section IV-A, we introduce the experiment settings, including benchmark functions, comparing methods, parameters of algorithms, and the performance indicators. In Section IV-B, we provide the experiment results. Finally, we analyze our proposal both in decomposition stage and optimization stage in Section IV-C.

Iv-a Experiment Settings

Iv-A1 Benchmark functions

We conduct our experiments on benchmark functions up to -D and -D. The details of benchmark functions are shown in Table I

Test Suite Benchmark Feature of PF Separability
ZDT[39] ZDT1 Convex Separable
ZDT2 Concave Separable
ZDT3 Convex, disconnected Separable
ZDT4 Convex Separable
ZDT5 Convex Separable
ZDT6 Concave Separable
DTLZ[40] DTLZ1 Linear Separable
DTLZ2 Concave Separable
DTLZ3 Concave Separable
DTLZ4 Concave Separable
DTLZ5 Concave, degenerate Separable
DTLZ6 Concave, degenerate Separable
DTLZ7 Disconnected Separable
UF[41] UF1 Concave Separable
UF2 Concave Separable
WFG[42] WFG1 Convex Separable
WFG2 Convex, disconnected Partially separable
WFG3 Linear, degenerate Partially separable
WFG4 Concave Separable
WFG5 Concave Separable
WFG7 Concave Separable
TABLE I: The benchmark functions of our experiment

We did not apply high-dimensional WFG6, WFG8, and WFG9 as our benchmark functions because these functions are not suitable for extending to high dimensions due to high computational cost. All benchmark functions are provide by geatpy[43] and pymoo[44].

Iv-A2 Compareing methods and parameters

We combine our proposal in decomposition with NSGA-II (CC-NSGA-LMM) and comparing with Random Grouping (CC-NSGA-G)[3], Differential Grouping (CC-NSGA-DG)[14], and Monotonicity Detection (CC-NSGA-LIMD)[45] with 30 trial runs. Besides, we compare our ultimate proposal (CC-hNSGA-LMM) with CC-NSGA-LMM to verify the effect of hybrid NSGA-II. Table II shows the parameters of our proposal in the grouping stage, and Table III shows the parameters of sub-problems optimization. The NSGA-II is also provided by geatpy.

Parameter value
Optimization direction Minimization
Optimizer Elitist GA
Population size 20
Max iteration 20
Gene length 6 and 7
TABLE II: The parameters of decomposition optimization
Parameter value
Dimension -D and -D
FEs 750,000 and 1,500,000
Optimization direction Minimization
Optimizer NSGA-II
Population size 50
Crossover rate 0.9
Mutation rate 0.2
TABLE III: The parameters of sub-problems optimization

Iv-A3 Performance indicators

To evaluate the performance of our proposal, we introduced Hypervolume (HV)[46] and Inverted Generational Distance (IGD)[47] as indicators to evaluate the performance of the algorithms. HV was first presented as the size of the space covered. Given a solution set and a reference point , HV can be calculated as Eq (18)

(18)

where denotes the Lebesgue measure. IGD is also the most commonly used indicator. given a solution set and a reference set , IGD can be defined as Eq (19)

(19)

where denotes the Euclidean distance between and , and a lower IGD value means better performance as same as HV.

Iv-B Performance of our proposal

In this section, the performance of CC-hNSGA-LMM is studied, both on our proposed decomposition method and the introduction of Gaussian sampling based on an estimated convergence point. Experiments are conducted on the benchmark functions presented in Section IV-A1 with 30 independent runs. Besides, we randomly choose one trial run result in 30 trial runs and draw the PF graph within comparing methods and reference sets. Due to space limitations, we select some representative PF graphs in Figure6. The mean of HV and IGD calculated in 30 trial runs are shown in Table IV and Table V. The best solution among CC-NSGA-G, CC-NSGA-DG, CC-NSGA-LIMD, and CC-NSGA-LMM these 4 methods is highlighted with red and orange in -D and -D respectively to show the performance of our proposed decomposition method. Besides, we mark the better solution between CC-NSGA-LMM and CC-hNSGA-LMM with to show the effect of the introduction of Gaussian sampling based on an estimated convergence point. The FEs consumed in the decomposition stage of DG, LIMD, and LMM are provided in Table VI.

The representive PF graphs within 5 methods and reference sets in
Fig. 6: The representive PF graphs within 5 methods and reference sets in -D and -D.
Func CC-NSGA-G CC-NSGA-DG CC-NSGA-LIMD CC-NSGA-LMM CC-hNSGA-LMM
-D -D -D -D -D -D -D -D -D -D
ZDT1 0.913 1.147 0.581 0.710 0.872 1.101 0.158 0.177 0.134 0.144
ZDT2 0.918 1.517 0.523 0.784 0.851 1.221 0.075 0.296 0.071 0.148
ZDT3 0.499 1.376 0.193 0.714 0.443 0.907 0.177 0.195 0.150 0.083
ZDT4 1690.193 1578.118 1006.923 773.871 1615.125 1360.387 149.293 265.902 112.492 168.336
ZDT5 1408.709 2175.905 963.814 1511.787 1395.165 2058.472 947.672 1506.865 885.373 1262.068
ZDT6 0.431 0.355 0.384 0.422 0.507 0.270 0.335 0.221 0.279 0.192
DTLZ1 8.261e10 6.956e11 30228.461 2.403e11 7.210e10 8.988e11 26869.899 2.111e5 1440.654 9060.585
DTLZ2 706.197 10266.957 0.073 4269.415 428.909 9674.223 0.077 0.067 0.063 0.056
DTLZ3 4.158e12 7767.971 1.780e6 8.773e12 3.251e12 4.632e13 1.545e6 9.983e6 1.271e5 8.328e5
DTLZ4 1011.450 14788.623 0.043 50.488 539.192 11120.653 0.059 0.099 0.060 0.091
DTLZ5 538.285 7767.971 0.161 4270.747 290.810 6144.751 0.175 0.251 0.148 0.211
DTLZ6 5.716e6 5.145e7 4.690e6 4.994e7 3.746e6 4.385e7 2.103e5 1.997e6 4.035e4 3.554e5
DTLZ7 1.184 1.886 0.174 0.445 0.970 1.660 0.153 0.364 0.093 0.317
UF1 1.232 1.580 0.101 0.388 0.741 1.274 0.095 0.155 0.067 0.107
UF2 0.136 1.197 0.105 0.645 0.121 0.849 0.094 0.280 0.089 0.277
WFG1 5.090 1.536 4.552 0.841 4.973 0.903 4.179 0.884 3.965 0.520
WFG2 0.335 0.625 0.317 0.657 0.269 0.542 0.309 0.604 0.289 0.587
WFG3 0.273 5.077 0.288 4.260 0.237 4.036 0.264 3.897 0.253 3.957
WFG4 3.698 4.961 2.855 3.579 3.592 3.680 3.032 3.150 3.057 2.652
WFG5 0.795 3.219 0.140 1.783 0.692 2.617 0.130 1.323 0.129 0.981
WFG7 2.785 6.560 2.233 6.264 2.733 6.283 2.136 6.114 2.010 5.842
TABLE IV: The mean of HV among 5 methods in 30 trial runs
Func CC-NSGA-G CC-NSGA-DG CC-NSGA-LIMD CC-NSGA-LMM CC-hNSGA-LMM
-D -D -D -D -D -D -D -D -D -D
ZDT1 1.171 1.474 0.627 0.760 1.097 1.379 0.025 0.016 0.011 0.007
ZDT2 2.492 3.012 1.533 1.800 2.306 2.842 0.280 0.273 0.205 0.190
ZDT3 0.816 0.981 0.448 0.511 0.731 0.899 0.014 0.015 0.008 0.007
ZDT4 6106.310 13626.344 3659.787 8258.761 5861.058 13467.469 539.775 1151.956 403.140 845.605
ZDT5 10.899 10.470 7.350 8.613 9.116 10.594 7.183 7.153 7.194 6.221
ZDT6 1.224 1.152 1.182 1.261 0.976 0.887 1.053 0.816 0.938 0.784
DTLZ1 10183.928 21115.183 77.196 3432.782 9639.649 22587.874 76.844 154.833 28.667 53.545
DTLZ2 19.094 45.582 0.065 7.940 16.762 45.225 0.067 0.133 0.006 0.002
DTLZ3 33931.424 73316.697 307.394 11939.406 31664.480 75494.883 312.767 602.455 131.318 244.015
DTLZ4 21.575 51.164 0.129 8.361 17.541 47.096 0.249 0.453 0.271 0.346
DTLZ5 19.490 45.704 0.071 7.851 16.962 45.825 0.069 0.127 0.002 0.003
DTLZ6 389.946 785.520 375.773 807.134 363.994 792.004 170.141 356.833 94.073 197.733
DTLZ7 5.528 6.859 0.146 1.235 4.308 5.981 0.125 0.133 0.036 0.027
UF1 1.419 1.651 0.491 0.658 1.043 1.444 0.367 0.378 0.384 0.323
UF2 0.445 0.557 0.217 0.278 0.332 0.454 0.076 0.081 0.082 0.091
WFG1 3.798 3.870 1.614 1.999 3.697 3.882 1.503 1.502 1.502 1.405
WFG2 0.719 0.732 0.709 0.720 0.590 0.586 0.630 0.621 0.610 0.612
WFG3 1.538 1.078 0.894 0.895 0.863 0.908 0.883 0.835 0.871 0.871
WFG4 0.561 0.496 0.009 0.089 0.739 0.747 0.012 0.015 0.015 0.009
WFG5 3.213 0.720 3.020 0.228 3.252 0.712 2.989 0.149 2.835 0.137
WFG7 0.656 0.714 0.590 0.649 0.639 0.676 0.587 0.595 0.568 0.605
TABLE V: The mean of IGD among 5 methods in 30 trial runs
Func DG LIMD LMM
-D -D -D -D -D -D
ZDT1 63250 251500 1974 3972 1503 3003
ZDT2 63250 251500 1974 3972 1503 3003
ZDT3 63250 251500 1974 3972 1503 3003
ZDT4 63250 251500 1974 3972 1503 3003
ZDT5 250500 1001000 1974 3972 1503 3003
ZDT6 244558 989058 1974 3972 1503 3003
DTLZ1 249502 999002 1974 3972 74649 149304
DTLZ2 249502 999002 1974 3972 1503 3003
DTLZ3 249502 999002 1974 3972 74652 149274
DTLZ4 249502 999002 1974 3972 1503 3003
DTLZ5 249502 999002 1974 3972 1503 3003
DTLZ6 1976 3974 1974 3972 74629 149271
DTLZ7 250500 1001000 1974 3972 1503 3003
UF1 250500 1001000 1974 3972 1503 3003
UF2 250500 1001000 1974 3972 1503 3003
WFG1 249502 999002 1974 3972 1503 3003
WFG2 247518 999002 63250 251500 74364 149325
WFG3 247518 999002 63250 251500 73253 148245
WFG4 249502 999002 1974 3972 1503 3003
WFG5 249502 999002 63250 251500 1503 3003
WFG7 248506 997006 1974 3972 1503 3003
TABLE VI: FEs consumed on decomposition stage of three variable grouping methods

Iv-C Analysis

In this Section, we will analyze the effect of the LMM in decomposition and the Gaussian sampling in optimization.

Iv-C1 LMM in decomposition

From Table IV and Table V, we can see our proposed CC-NSGA-LMM outperforms the compared three methods in the majority of test functions. This is mainly due to the following aspects. (1). We calculate the linkage measurement function based on multiple samples. Although this is a necessary condition for the employment of LINC-R and LIMD in low-dimensional space, the linkage is only possible to check the nonlinearity around a sample point in high-dimensional space due to the FEs limitation, such as DG. Therefore, our proposal is more robust for solving large-scale optimization problems based on CC. (2). Although our proposal is based on LINC-R, the existence of the penalty allows our proposal to ignore some weak interactions between variables. This process will increase the error in the optimization stage, it can accelerate the sub-problems optimization, especially under the limitation of FEs.

From Table VI, we notice that in our proposed decomposition method, and evaluation times appear in the -D and -D test functions frequently, This is because our proposal identifies these test functions as fully separable, which is consistent with the description of functions in Table I. According to the FEs consumed by DG in the decomposition stage, it can be seen that when FEs are approximately equal to and in -D and -D functions respectively, DG identifies the problem as a separable function. And the FEs saved in the decomposition stage allow more FEs to be allocated to optimize the sub-problems, which makes our proposal better than the compared methods.

Meanwhile, we notice that CC-NSGA-DG and CC-NSGA-LIMD outperform our proposal in some -D test functions, such as DTLZ2-5. But this competitiveness almost disappeared in the -D test function. This is because the time complexity of DG and LIMD is . is the population size, and is the dimension. DG sets in high-dimensional space. So as the dimension increases, more FEs are necessary to identify the interactions between variables. This is the main reason for the rapid degeneration of DG and LIMD when the dimension approaches to -D. And our proposal can control the time complexity by controlling some hyperparameters, which enables our proposal to have more feasibility to extend to larger-scale optimization problems.

However, we notice that in ZDT1-4, DG identifies these functions as partially separable functions, and DG identifies DTLZ6 as completely non-separable functions, which is inconsistent with the separability description in Table I. Let’s take ZDT1 as an example. The formula of ZDT1 is shown in Eq (20)

(20)

We can see that although in Eq (20) is a separable function at the monotonicity level in the limited search space, DG cannot detect this information, which reveals the limitation of DG in identifying the separability of such variables. At the same time, our proposal LMM identifies ZDT1-4 as fully separable functions. Due to the existence of the penalty, which weakens the effect of the fitness difference term in the linkage measurement function, so the penalty term dominates the direction of optimization.

Meanwhile, LMM recognizes DTLZ1, DTLZ3, and DTL6 as partially separable functions, which is because the fitness difference term still occupies a large proportion in the linkage measurement function. In future research, we can adaptively adjust the intensity of the penalty through some methods such as machine learning and reinforcement learning.

Iv-C2 Gaussian sampling based on an estimated convergence point

From Table IV and Table V, we can say the introduction of the Gaussian sampling based on an estimated convergence point can accelerate the convergence of optimization and find better PFs, which proves the hypothesis we proposed in Section I is true. Paper[48] has already proved that only an estimated convergence point may not accelerate the convergence with numerical experiments, especially in multimodal functions. However, our proposal does not rely on the estimated convergence point in excess but considers the area centered on the estimated convergence point as the trust region. This strategy gives more possibilities for exploitation, although it needs to consume some FEs.

V Discuss

The above analysis shows our proposal has broad prospects to solve LSMOPs, however, there are still many aspects for improvement. Here, we list a few open topics for potential and future research.

V-a The design of linkage measurement function

In our proposed variable grouping method, we design a linkage measurement function to lead the direction of decomposition solution search, and there are mainly three important constituent elements of this function: fitness difference term, penalty, and weights of the objective function. To simplify the design of the linkage measurement function, we apply constant penalty and averaging weights. In practice, the intensity of the penalty and the weights of objective functions should be changed based on the feature of LSMOPs. For example, in fully separable LSMOPs, the penalty will play a decisive role to lead the direction of optimization due to the fitness difference term being infinitely close to 0, and the intensity of the penalty will decrease with the increase of interactions between variables. And for weights of objective functions, the importance of different objective functions should be different in practice. Adaptively deciding these hyperparameters based on reinforcement learning or machine learning is our future research topic.

V-B More powerful local search operator

In this experiment, we design a Gaussian sampling operator with the mean of an estimated convergence point and a constant variance as our local search operator and achieved good experimental results. The search space of the optimization problem is often different, and the search space corresponding to the concept of ”local” is also different. Given a simple example, the search space of two problems are and respectively, so it is unreasonable to apply a constant as the Gaussian sampling variance for all kinds of optimization problems. Therefore, an adaptive variance determination strategy is necessary. In future research, it is promising research to introduce the CMA-ES[49] as our local search operator.

V-C The scalability of our proposal

This paper mainly contributes two different aspects to accelerate the convergence of LSMOPs. (1). Designing a novel decomposition method. (2). Developing a efficient Gaussian samlping operator based on the estimated convergence point combined with NSGA-II.

For decomposition method, our proposal performed well on -D and -D LSMOPs. In the face of higher dimensional problems, The computational cost of DG, especially on fully separable functions, is completely unacceptable[24]. However, our proposal can control the consumption of FEs by adjusting some hyperparameters, such as the length of genes (number of groups), the maximum iteration of decoposition optimization, etc. And the previous research shows that the proposed variable grouping method can identify the interactions between variables in noisy environments. Therefore, our proposed grouping method has stronger scalability than the current popular variable grouping methods theoretically. In the future, the introduction of our proposal to solve higher-dimensional LSMOPs, constraint LSMOPs, and real-world LSMOPs is our research direction.

For the Gaussian sampling operator, experimental results show that our proposal can accelerate the convergence and find better PF in solving LSMOPs. In theory, our designed Gaussian sampling operator has strong scalability to combine with GA, DE, and PSO in single-objective evolutionary algorithms (SOEAs), and MOEA/D, NSGA in MOEAs with simple modification. In the future, the combination of our proposal with novel EAs to solve more complex optimization problems is a promising research topics.

Vi Conclusion

In this paper, we extend our previous research on the variable grouping method for LSSOPs to LSMOPs and design a Gaussian sampling operator based on an estimated convergence point to accelerate the convergence of the optimizer to find PF. To evaluate our proposal, we conduct our experiments on -D and -D test functions and compare our proposal with popular methods. Experiments show that our proposed variable grouping method is better than the three compared methods, and our proposed hNSGA can significantly accelerate the convergence in optimization. At the end of this paper, we list some interesting topics which can improve our algorithm. Finally, our proposal has broad prospects for addressing LSMOPs.

Vii Acknowledgement

This work was supported by JSPS KAKENHI Grant Number JP20K11967.

References

  • [1] R. T. Marler and J. S. Arora, “Survey of multi-objective optimization methods for engineering,” Structural and multidisciplinary optimization, vol. 26, no. 6, pp. 369–395, 2004.
  • [2] M. Köppen, “The curse of dimensionality,” in 5th online world conference on soft computing in industrial applications (WSC5), vol. 1, 2000, pp. 4–8.
  • [3] L. M. Antonio and C. A. C. Coello, “Use of cooperative coevolution for solving large scale multiobjective optimization problems,” in 2013 IEEE Congress on Evolutionary Computation.   IEEE, 2013, pp. 2758–2765.
  • [4] Y. Tian, L. Si, X. Zhang, R. Cheng, C. He, K. C. Tan, and Y. Jin, “Evolutionary large-scale multi-objective optimization: A survey,” ACM Comput. Surv., vol. 54, no. 8, 2021.
  • [5] C. He, L. Li, Y. Tian, X. Zhang, R. Cheng, Y. Jin, and X. Yao, “Accelerating large-scale multiobjective optimization via problem reformulation,” IEEE Transactions on Evolutionary Computation, vol. 23, no. 6, pp. 949–961, 2019.
  • [6] J.-H. Yi, L.-N. Xing, G.-G. Wang, J. Dong, A. V. Vasilakos, A. H. Alavi, and L. Wang, “Behavior of crossover operators in nsga-iii for large-scale optimization problems,” Inf. Sci., vol. 509, no. C, p. 470–487, jan 2020.
  • [7] B. Li, J. Li, K. Tang, and X. Yao, “Many-objective evolutionary algorithms: A survey,” ACM Computing Surveys (CSUR), vol. 48, no. 1, pp. 1–35, 2015.
  • [8] Z. Fan, Y. Fang, W. Li, J. Lu, X. Cai, and C. Wei, “A comparative study of constrained multi-objective evolutionary algorithms on constrained multi-objective optimization problems,” in 2017 IEEE Congress on Evolutionary Computation (CEC), 2017, pp. 209–216.
  • [9] T. Chugh, K. Sindhya, J. Hakanen, and K. Miettinen, “A survey on handling computationally expensive multiobjective optimization problems with evolutionary algorithms,” Soft Computing, vol. 23, no. 9, pp. 3137–3166, 2019.
  • [10] M. Potter and K. De Jong, “A cooperative coevolutionary approach to function optimization,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 866 LNCS, pp. 249–257, 1994.
  • [11] M. N. Omidvar, X. Li, and X. Yao, “A review of population-based metaheuristics for large-scale black-box global optimization: Part a,” IEEE Transactions on Evolutionary Computation, pp. 1–1, 2021.
  • [12] Omidvar, Mohammad Nabi and Li, Xiaodong and Yao, Xin, “A review of population-based metaheuristics for large-scale black-box global optimization: Part b,” IEEE Transactions on Evolutionary Computation, pp. 1–1, 2021.
  • [13] M. Li and J. Wei, “A cooperative co-evolutionary algorithm for large-scale multi-objective optimization problems,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion, 2018, pp. 1716–1721.
  • [14] F. Sander, H. Zille, and S. Mostaghim, “Transfer strategies from single- to multi-objective grouping mechanisms,” in Proceedings of the Genetic and Evolutionary Computation Conference, ser. GECCO ’18.   Association for Computing Machinery, 2018, p. 729–736.
  • [15] X. Ma, F. Liu, Y. Qi, X. Wang, L. Li, L. Jiao, M. Yin, and M. Gong, “A multiobjective evolutionary algorithm based on decision variable analyses for multiobjective optimization problems with large-scale variables,” IEEE Transactions on Evolutionary Computation, vol. 20, no. 2, pp. 275–298, 2015.
  • [16] S. Mirjalili and S. Z. M. Hashim, “A new hybrid psogsa algorithm for function optimization,” in 2010 International Conference on Computer and Information Application, 2010, pp. 374–377.
  • [17] D. Wolpert and W. Macready, “No free lunch theorems for optimization,” IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 67–82, 1997.
  • [18] M. Tezuka, M. Munetomo, and K. Akama, “Linkage identification by nonlinearity check for real-coded genetic algorithms,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3103, pp. 222–233, 2004.
  • [19] H. Kargupta and D. E. Goldberg, “Blackbox optimization: Implications of search,” in In communication. Submitted in International Journal of Foundation of Computer Science, 1996, pp. 96–63.
  • [20] N. Murata, R. Nishii, H. Takagi, and Y. Pei, “Analytical estimation of the convergence point of populations,” in 2015 IEEE Congress on Evolutionary Computation, CEC 2015 - Proceedings.   Institute of Electrical and Electronics Engineers Inc., Sep. 2015, pp. 2619–2624.
  • [21] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: Nsga-ii,” IEEE transactions on evolutionary computation, vol. 6, no. 2, pp. 182–197, 2002.
  • [22] F. Van den Bergh and A. P. Engelbrecht, “A cooperative approach to particle swarm optimization,” IEEE transactions on evolutionary computation, vol. 8, no. 3, pp. 225–239, 2004.
  • [23] M. Munetomo and D. E. Goldberg, “Linkage identification by non-monotonicity detection for overlapping functions,” Evolutionary computation, vol. 7, no. 4, pp. 377–398, 1999.
  • [24] M. Omidvar, X. Li, Y. Mei, and X. Yao, “Cooperative co-evolution with differential grouping for large scale optimization,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 3, pp. 378–393, 2014.
  • [25] M. N. Omidvar, M. Yang, Y. Mei, X. Li, and X. Yao, “Dg2: A faster and more accurate differential grouping for large-scale black-box optimization,” IEEE Transactions on Evolutionary Computation, vol. 21, no. 6, pp. 929–942, 2017.
  • [26] M. Yang, A. Zhou, C. Li, and X. Yao, “An efficient recursive differential grouping for large-scale continuous problems,” IEEE Transactions on Evolutionary Computation, vol. 25, no. 1, pp. 159–171, 2021.
  • [27] W. Chen, T. Weise, Z. Yang, and K. Tang, “Large-scale global optimization using cooperative coevolution with variable interaction learning,” in International Conference on Parallel Problem Solving from Nature.   Springer, 2010, pp. 300–309.
  • [28] H. K. Singh and T. Ray, Divide and Conquer in Coevolution: A Difficult Balancing Act.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 117–138.
  • [29] Y. Rojas and R. Landa, “Towards the use of statistical information and differential evolution for large scale global optimization,” in 2011 8th International Conference on Electrical Engineering, Computing Science and Automatic Control, 2011, pp. 1–6.
  • [30] J. Fan, J. Wang, and M. Han, “Cooperative coevolution for large-scale optimization based on kernel fuzzy clustering and variable trust region methods,” IEEE Transactions on Fuzzy Systems, vol. 22, no. 4, pp. 829–839, 2014.
  • [31] Z. Yang, K. Tang, and X. Yao, “Large scale evolutionary optimization using cooperative coevolution,” Information Sciences, vol. 178, no. 15, pp. 2985–2999, 2008.
  • [32] M. N. Omidvar, X. Li, Z. Yang, and X. Yao, “Cooperative co-evolution for large scale optimization through more frequent random grouping,” in IEEE Congress on Evolutionary Computation, 2010, pp. 1–8.
  • [33] M. Omidvar, X. Li, and X. Yao, “Cooperative co-evolution with delta grouping for large scale non-separable function optimization,” 2010, pp. 1–8.
  • [34] E. Sayed, D. Essam, R. Sarker, and S. Elsayed, “Decomposition-based evolutionary algorithm for large scale constrained problems,” Information Sciences, vol. 316, pp. 457–486, 2015.
  • [35] G. Dai, X. Chen, L. Chen, M. Wang, and L. Peng, “Cooperative coevolution with dependency identification grouping for large scale global optimization,” in 2016 IEEE Congress on Evolutionary Computation (CEC), 2016, pp. 5201–5208.
  • [36] A. Song, Q. Yang, W.-N. Chen, and J. Zhang, “A random-based dynamic grouping strategy for large scale multi-objective optimization,” in 2016 IEEE congress on evolutionary computation (CEC).   IEEE, 2016, pp. 468–475.
  • [37] R. Zhong and M. Munetomo, “Random Population-based Decomposition Method by Linkage Identification with Non-linearity Minimization on Graph,” in Transactions on Computational Science & Computational Intelligence.   Springer, Accepted.
  • [38] K. A. De Jong, An analysis of the behavior of a class of genetic adaptive systems.   University of Michigan, 1975.
  • [39] E. Zitzler, K. Deb, and L. Thiele, “Comparison of multiobjective evolutionary algorithms: Empirical results,” Evol. Comput., vol. 8, no. 2, p. 173–195, 2000.
  • [40] K. Deb, L. Thiele, M. Laumanns, and E. Zitzler, Scalable Test Problems for Evolutionary Multiobjective Optimization.   London: Springer London, 2005, pp. 105–145.
  • [41] Q. Zhang, A. Zhou, S. Zhao, P. Suganthan, W. Liu, and S. Tiwari, “Multiobjective optimization test instances for the cec 2009 special session and competition,” Mechanical Engineering, 01 2008.
  • [42] S. Huband, P. Hingston, L. Barone, and L. While, “A review of multiobjective test problems and a scalable test problem toolkit,” IEEE Transactions on Evolutionary Computation, vol. 10, no. 5, pp. 477–506, 2006.
  • [43] e. Jazzbin, “geatpy: The genetic and evolutionary algorithm toolbox with high performance in python,” 2020.
  • [44] J. Blank and K. Deb, “pymoo: Multi-objective optimization in python,” IEEE Access, vol. 8, pp. 89 497–89 509, 2020.
  • [45] K. Izumiya and M. Munetomo, “Multi-objective Evolutionary optimization based on Decomposition with Linkage Identification considering monotonicity,” in 2017 IEEE Congress on Evolutionary Computation (CEC), 2017, pp. 905–912.
  • [46] E. Zitzler and L. Thiele, “Multiobjective Optimization Using Evolutionary Algorithms - A Comparative Case Study,” in Proceedings of the 5th International Conference on Parallel Problem Solving from Nature, ser. PPSN V.   Berlin, Heidelberg: Springer-Verlag, 1998, p. 292–304.
  • [47] M. Li and X. Yao, “Quality evaluation of solution sets in multiobjective optimisation: A survey,” ACM Comput. Surv., vol. 52, no. 2, mar 2019.
  • [48] Y. Pei, J. Yu, and H. Takagi, “Search acceleration of evolutionary multi-objective optimization using an estimated convergence point,” Mathematics, vol. 7, no. 2, 2019.
  • [49] N. Hansen, “The cma evolution strategy: A tutorial,” arXiv preprint arXiv:1604.00772, 2016.