HABCO: A Robust Agent on Hybrid Ant-Bee Colony Optimization

The purpose of this research is to generate a robust agent for solving the traveling salesman problem (TSP) by combining bee colony optimization (BCO) and ELU-Ants; the resulting algorithm is called HABCO. The agents, called ant-bees, are first grouped at each stage into three types: scout, follower, and recruiter. Then, in the earlier steps, bad agents are discarded with high probability while good agents are duplicated with high probability. These first two steps mimic the BCO algorithm. Tour construction, that is, choosing nodes and updating pheromone, follows the ELU-Ants method. To evaluate the proposed algorithm, HABCO is run on several benchmark datasets and compared with ACS and BCO. The experimental results show that HABCO achieves better solutions than both, with or without 2-opt.

In ant colony optimization (ACO), ants lay a pheromone trail on their paths while traveling from their nest to a food source [1], which lets the colony find the shortest route to the food. Since then, many researchers have extended this approach to obtain better results on various optimization problems [11][12][13]. An adaptive parameter control scheme was proposed in [12] to develop an adaptive ACS, namely AACS. The parameter values in AACS are adaptively controlled in each iteration based on the distribution of the pheromone trails, which is used to estimate the optimization state.
Naimi and Taherinejad proposed ELU-Ants and KCC-Ants, which modify ACS by introducing a new local update [13]. Their research emphasized that the pheromone update in earlier steps should be larger than in later steps. Other researchers combined ACO with other methods [14][15]. Lucic et al. [16][17] then proposed another intelligent algorithm, bee colony optimization (BCO), which is designed to create a multi-agent system (a colony of artificial bees) for solving combinatorial optimization problems [17][18][19][20][21][22][23][24].
This research adopts the ELU-Ants algorithm, which treats the beginning steps as the most important in constructing the tour. Logically, at the beginning steps, an agent that makes a bad choice when traveling through some cities (part of the tour) should be discarded, while an agent that makes a good choice should be duplicated. Discarding and duplicating agents in the earlier steps is intended to make the remaining agents more competitive. For the same reason, the pheromone update in earlier steps is larger than in later steps. Whether they start with a good or a bad choice, ants must continue selecting a path to complete their tour. Fig. 1 describes how M agents construct the tour, which is split into two important construction steps. Suppose that in the beginning step the agents construct a partial tour consisting of i cities, so that each agent travels through cities C_1, C_2, C_3, ..., C_i. While constructing the tour, an ant may make a bad choice in the first stage and then remain trapped in a bad construction, which will probably lead to a bad tour.
In BCO, after visiting some nectar sources (one stage), each bee goes back to the hive to communicate its path. In the hive, the bees are grouped into three types and perform a waggle dance to indicate the quality of their food source. A longer dance indicates better quality of the nectar found. Bees exchange information with each other for their next stage-tour. Bees that found a poor food source tend to follow others that found a good food source in the next stage. This step shows that BCO discards bees with a bad stage-tour in order to obtain a better tour for the next stage and, conversely, that bees with a good stage-tour are automatically duplicated.
Based on the discussion above, the proposed algorithm, HABCO, constructs robust agents as in the BCO algorithm, while the selection of nodes (cities) is done as in ELU-Ants. HABCO was first presented by Abba et al. in a conference proceeding [25]. This paper presents the related work and the agent judgment process in more detail.

Related Work

2.1. Ant Colony System
Ant colony system (ACS) is one of the most widely used and best-performing ACO algorithms. While ants search for food, they deposit pheromone on the paths they travel. Ants tend to choose the path with the highest amount of pheromone and avoid paths with low pheromone. ACS can be described briefly by three parts [1].
Tour construction: While the k-th ant is in city i, the next city j is selected from the set of unvisited cities J_k(i) according to one of two rules: (a) if q ≤ q_0 (exploitation), by Eq. (1); (b) otherwise, with probability P_k(i,j) given by Eq. (2). Here q is a random number, q_0 is a parameter, and τ represents the pheromone value.
Local pheromone updating: When an ant visits edge (i,j), it changes the pheromone level τ(i,j) of that edge by the local updating rule in Eq. (3).
Global pheromone updating: The pheromone on the edges of the globally best tour is updated by the global updating rule in Eq. (4) and Eq. (5), where 0 < α < 1 is the pheromone decay parameter and L_gb is the length of the globally best tour.
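The three ACS parts can be sketched in code. Eq. (1) to Eq. (5) are not reproduced in this text, so the sketch below follows the commonly cited ACS formulation and is only an assumption about their exact form. The exponent `beta`, the local decay `rho`, the initial pheromone `tau0`, the default values, and all function names are illustrative and not taken from this paper.

```python
import random

def choose_next_city(i, unvisited, tau, dist, q0=0.9, beta=2.0, rng=random):
    """Pick the next city from city i with the ACS state transition rule."""
    # Attractiveness of each candidate: pheromone times (1 / distance)^beta.
    scores = {j: tau[i][j] * (1.0 / dist[i][j]) ** beta for j in unvisited}
    if rng.random() <= q0:
        # Exploitation (q <= q0): take the most attractive edge.
        return max(scores, key=scores.get)
    # Biased exploration: sample a city proportionally to its attractiveness.
    cities = list(scores)
    return rng.choices(cities, weights=[scores[j] for j in cities], k=1)[0]

def local_update(tau, i, j, rho=0.1, tau0=0.01):
    """Local rule: pull the visited edge back toward the initial pheromone."""
    tau[i][j] = tau[j][i] = (1.0 - rho) * tau[i][j] + rho * tau0

def global_update(tau, best_tour, best_length, alpha=0.1):
    """Global rule: reinforce only the edges of the globally best tour."""
    for i, j in zip(best_tour, best_tour[1:] + best_tour[:1]):
        tau[i][j] = tau[j][i] = (1.0 - alpha) * tau[i][j] + alpha / best_length
```

Here `tau` and `dist` are symmetric square lists of lists indexed by city, and `best_tour` is a list of city indices. Because the local update lowers the pheromone on an edge that has just been used, later ants are nudged toward other edges.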

ELU-Ants
ELU-Ants [4] is a modification of the original ACS. Naimi and Taherinejad offered two equations to implement the idea, one named KCC-Ants and the other ELU-Ants. Although both use a similar approach, they express it in different equations, and the results of the two are nearly the same. This research uses the ELU-Ants approach. The main idea is that ants should add more pheromone to the early links of a tour than to the last links. In the early steps, the ant has many options for which city to visit next because it has passed only a few cities, so only a few edges are prohibited. The ant can therefore freely choose the most desirable link (more pheromone and shorter length) as its next link. In contrast, in the final steps of the tour the ant has already passed most of the cities, and the currently selected link may not have a significant effect on the quality of the tour, so it seems logical to reduce the ant's ability to change the pheromone of the last links. In other words, ants have more effect on the pheromone update in their initial steps and less effect as they approach the end of the tour. Based on this discussion, the original local update (Eq. (3)) is modified to give ELU-Ants (Eq. (6)) and KCC-Ants (Eq. (7)).
where cc is the number of cities passed so far; n is the total number of cities; K and ω are two given parameters; Cl is the current length of the path traveled by each ant; and τ_0 is the initial pheromone.

Bee Colony Optimization
Bee colony optimization (BCO) is designed to create a multi-agent system for solving combinatorial optimization problems. Lucic et al. [16][17] proposed BCO to solve the TSP. When BCO is applied to the TSP, the tour consists of several stages. A stage is a collection of a certain number of nectar sources; in the TSP model, nectar sources represent cities. In constructing the tour, a bee starts as an unemployed forager without knowledge of the food (nectar). Bees then explore, traveling from one nectar source to another. This step is called the forward pass (FP). Note that not all bees start foraging in the first stage; some start in the second stage, the third stage, and so on.

ISSN: 1693-6930
Once a bee becomes active, she stays active until the end of the iteration. After completing one stage, a bee returns to the hive to unload and store the nectar. This is called the backward pass (BP). In the BP, bees perform a waggle dance to show the quality of their tour and start exchanging information. This kind of communication between individual bees contributes to the formation of the bee colony. Before communication, BCO randomly selects bees with a particular probability (e.g., 10%) to be scouts. Scouts are bees that retain their previous stage and continue to the next stage without interacting with the others. The remaining bees are categorized as followers and recruiters. BCO uses Eq. (8) to determine the probability that bee k starts the next stage (u+1) with the same partial tour it had at stage u in iteration z, where L_k(u,z) is the length of the stage-tour discovered by bee k in stage u of iteration z. As a result of this interaction, in addition to scouts, the bees fall into two other types. Recruiters are bees that retain their previous stage and recruit follower bees to their path. Followers are bees that abandon their food source (previous stage) and follow a recruiter on its path. Since the probability for the bee with the best path is 1 (P_k = 1), she always keeps her partial tour and is categorized as a recruiter. BCO also defines the probability that a given partial tour is chosen by a bee that has decided to take a new route in a stage. This probability depends on two main attributes: the total tour length and the number of bees advertising the partial tour. The smaller the normalized total length, the better the partial tour; the larger the normalized number of advertising bees, the better the partial tour. The BCO procedure is described in Fig. 2.
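The loyalty decision can be illustrated with a short sketch. Eq. (8) is not reproduced in this text, so the exponential form below is an assumption, chosen because it gives the bee with the shortest stage-tour a probability of exactly 1, as stated above. The function name and the scaling by u·z are likewise illustrative.

```python
import math

def loyalty_probability(stage_lengths, u, z):
    """Probability that each bee keeps its own partial tour for stage u+1.

    Assumed form: exp(-(L_k - L_min) / (u * z)), so the best bee gets 1.
    """
    shortest = min(stage_lengths)
    return [math.exp(-(length - shortest) / (u * z)) for length in stage_lengths]
```

Under this assumed form, a bee whose stage-tour is only slightly longer than the best one is still likely to keep it. The gap also matters less as the stage index u and the iteration index z grow.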

Proposed Algorithm

3.1. The Concept
Because the beginning and final steps of tour construction are treated differently, the tour is split into several parts. Each part corresponds to a stage in BCO and consists of some cities. The stage-tour that an agent constructs is used to judge the quality of that agent. In BCO, after traveling one stage, bees go back to the hive and are grouped into three types, follower, recruiter, and scout, based on their quality. The proposed algorithm uses the "ant-bee" as its agent and mimics the behavior of bees in BCO. A follower ant-bee is an agent that made a bad choice in its stage-tour. A recruiter ant-bee is an agent that made a good choice in its stage-tour. A scout ant-bee is an agent that always looks for a new alternative tour. Followers change their stage-tour to follow the stage-tour of a recruiter. Scouts neither follow nor recruit other agents, but keep their own stage-tour. For example, at the beginning of the second stage, agents M_1, M_2, and M_3 continue the tour from cities C_31, C_31, and C_33, respectively. The interesting question is how to judge the quality of agents from the stage-tours they construct. Subsection 3.3 explains that the total amount of pheromone gathered in each stage can serve as a good criterion.

Construct Tour
To construct a tour for the TSP, each ant-bee chooses cities with the ACS rules (Eq. (1) and Eq. (2)). After traveling one stage under the ACS rules, each ant-bee goes back to the hive to interact with the others (the distance from the hive to every city is taken as 0) and to calculate the total amount of pheromone it gathered. As in BCO, HABCO categorizes ant-bees into three types, scout (S), follower (F), and recruiter (R), but based on the pheromone gathered. A scout retains her previous stage and continues to the next stage without interacting with the others; a follower abandons her previous stage and follows a recruiter; a recruiter retains her previous stage and recruits followers to join her previous stage-tour. In HABCO, the quality of an agent is judged by the amount of pheromone the agent gathers in each stage. This judgment is used to classify ant-bees into followers and recruiters. First, 10% of all agents are chosen as scouts. The scouts ensure that alternative tours are kept in the process. The remaining agents are categorized as followers and recruiters. The probability that an ant-bee becomes a follower after traveling one stage is given by Eq. (9).
In Eq. (9), i is the agent index; M is the total number of ant-bee agents; τ_i is the pheromone gathered by agent i in the stage; and τ_max is the maximum pheromone gathered by any agent in the stage. Eq. (9) shows that an agent that gathers a large amount of pheromone has a low probability of becoming a follower. The agent with the highest pheromone (τ_max) cannot be a follower (P_F = 0); in other words, it always becomes a recruiter. After the agents are categorized into followers and recruiters, each follower changes her tour and follows one of the recruiter agents chosen at random. After exchanging information in the hive, the ant-bees continue with the next stages until the tour is finished.
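The hive step can be sketched as follows. Eq. (9) is not reproduced in this text, so the follower probability 1 − τ_i/τ_max used below is an assumption; it was chosen because it is low for agents with high pheromone and exactly zero for the agent with τ_max. The function names are illustrative, and taking τ_max over the non-scout agents (so that at least one recruiter always exists) is a design choice of this sketch, which assumes at least two agents and positive pheromone totals.

```python
import random

def classify_agents(pheromone, scout_fraction=0.1, rng=random):
    """Split agent indices into scouts, followers, and recruiters."""
    agents = list(range(len(pheromone)))
    n_scouts = max(1, round(scout_fraction * len(agents)))
    scouts = rng.sample(agents, n_scouts)
    rest = [a for a in agents if a not in scouts]
    tau_max = max(pheromone[a] for a in rest)
    followers, recruiters = [], []
    for a in rest:
        # Assumed follower probability: 1 - tau_i / tau_max (zero for the best agent).
        p_follow = 1.0 - pheromone[a] / tau_max
        (followers if rng.random() < p_follow else recruiters).append(a)
    return scouts, followers, recruiters

def exchange_stage_tours(stage_tours, followers, recruiters, rng=random):
    """Each follower drops its stage-tour and copies a random recruiter's."""
    for f in followers:
        stage_tours[f] = list(stage_tours[rng.choice(recruiters)])
    return stage_tours
```

Copying a recruiter's stage-tour is what duplicates good agents and discards bad ones. Scouts are left untouched, so they keep alternative tours alive.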

Pheromone updating
HABCO uses three types of pheromone updating: local update, semi-global update, and global update. The local update follows the local updating rule of the ACS algorithm (Eq. (3)): as ant-bees choose their edges, they also modify the pheromone on them. After each stage, HABCO applies a further update called the semi-global update. This update is based on a modified ELU-Ants rule, in which ants have more effect on the pheromone update in their earlier steps and less effect as they approach the end of the tour; it is given by Eq. (10).
In Eq. (10), S is the current stage and |S| is the total number of stages. From Eq. (10), more pheromone is added after the first stage (S = 1) than after later stages, so the ant-bees have less impact on the pheromone update in the final part of the tour. In the last stage (S = |S|), the impact of the pheromone update goes to zero. Finally, the pheromone on the edges traveled by the best ant-bee is updated with the global updating rule of ACS (Eq. (4) and Eq. (5)).
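The stage dependence of the semi-global update can be illustrated with a small sketch. Eq. (10) is not reproduced in this text, so the linear weight 1 − S/|S| and the additive deposit below are purely hypothetical; they were chosen only because they match the stated behavior (largest after the first stage, zero after the last). The function names and the `deposit` parameter are not from the paper.

```python
def semi_global_weight(stage, n_stages):
    """Assumed stage weight: largest after stage 1, zero after the last stage."""
    return 1.0 - stage / n_stages

def semi_global_update(tau, stage_edges, stage, n_stages, deposit=1.0):
    """Add stage-weighted pheromone to the edges travelled in this stage."""
    w = semi_global_weight(stage, n_stages)
    for i, j in stage_edges:
        # Hypothetical additive form; the paper's Eq. (10) may differ.
        tau[i][j] = tau[j][i] = tau[i][j] + w * deposit
```

With three stages, this weight is 2/3 after the first stage, 1/3 after the second, and 0 after the third, so edges chosen early receive the most reinforcement.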

Ruin local optima
In our observations, the best tour length can stay the same for several consecutive iterations, especially on relatively small benchmarks. The best ant-bee probably travels the same edges in every iteration, so the search may become trapped in a local optimum. This sometimes causes the algorithm to miss the optimal solution. To escape this situation (ruin the local optimum), the pheromone on those edges is reduced (evaporated) when the best tour length has been the same for t consecutive iterations (t = 100).
The reduction is controlled by a variable x, where 0 < x < 1, which should be set to a moderate value. When x is set too high, the local optimum will not be ruined. Conversely, if x is set too small, the pheromone is decreased drastically, which affects tour construction. The HABCO algorithm is described in Fig. 4.
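A minimal sketch of this stagnation check is given below. The reduction equation itself is not reproduced in this text, so the multiplicative form τ ← x·τ is an assumption; it agrees with the remarks that a high x barely changes the pheromone while a small x reduces it drastically. The function names and the default `x=0.5` are illustrative.

```python
def update_stagnation(stagnant_iters, best_length, previous_best_length):
    """Count consecutive iterations in which the best tour length did not change."""
    return stagnant_iters + 1 if best_length == previous_best_length else 0

def ruin_if_stagnant(tau, best_tour, stagnant_iters, x=0.5, t=100):
    """Scale down pheromone on the best tour's edges after t stagnant iterations.

    Returns the updated stagnation counter (reset to 0 after a ruin step).
    """
    if stagnant_iters < t:
        return stagnant_iters
    for i, j in zip(best_tour, best_tour[1:] + best_tour[:1]):
        # Assumed multiplicative form: tau <- x * tau with 0 < x < 1.
        tau[i][j] = tau[j][i] = x * tau[i][j]
    return 0
```

In a main loop, `update_stagnation` would be called once per iteration and its result passed to `ruin_if_stagnant`. Only the edges of the current best tour are weakened, which leaves the rest of the pheromone matrix intact.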

Analysis Constructing the Robust Agent
As mentioned above, pheromone and the distance between nodes are two important variables in ACO. This section examines whether these two variables can be used to judge the quality of an agent. One TSP benchmark, st70, is used to investigate the effect of the distance and the amount of pheromone accumulated by ants in each stage. The experiment uses three stages and ten agents. First, each agent (ant) travels through all cities stage by stage, following the ACS rule in Eq. (2). We observe two kinds of agents: the best agent, which obtains the best complete tour, and the worst agent, which obtains the worst complete tour. While traveling, the distance and the amount of pheromone on each edge are accumulated over one stage. After traveling one stage, each agent continues to the next stage without interacting with the others. Finally, after every agent completes its tour, the complete tour length and the pheromone of each ant are calculated and ranked.
In these observations, the distance traveled in one stage is not a reliable criterion. The best ant, which obtains the shortest tour, takes varying rank positions in the first stage. Likewise, the worst ant, which travels the longest distance, does not always rank badly in the first stage. That is, traveling a relatively short distance in the first stage does not guarantee a relatively short complete tour, and traveling a relatively long distance in the first stage does not guarantee a relatively long complete tour. The reason is that an ant may choose several nearby cities at the beginning but then have no choice except to travel long distances at the end, so the total tour may still be relatively long.
The second observation relies on the total amount of pheromone on the stage-tour traveled by each ant. In this experiment, an ant that gathered a lot of pheromone in the first stage tends to be a good or the best ant in the complete tour. Conversely, the ant that obtains the shortest tour probably gathered a high amount of pheromone in its first stage. It is therefore more reasonable to use the total amount of pheromone in each stage as an indicator of sub-tour quality. The more pheromone an ant gathers in one stage, the higher its probability of obtaining a good complete tour. So if the amount of pheromone in the early stage-tour is higher, the ant tends to obtain the best complete tour; if it is lower, the ant tends to obtain the worst complete tour.
Note that this investigation was performed with three stages. With four or more stages the results were not reliable; in that case the pheromone gathered by agents in the first stage is not a good criterion for judging agent quality.
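The ranking comparison described in this section can be sketched as follows. The code only shows how first-stage ranks by distance and by pheromone could be compared for the agent with the best complete tour; the function names and the returned dictionary keys are illustrative, and no values from the st70 experiment are reproduced here.

```python
def ranks(values, reverse=False):
    """Rank position (1 = best) of each value; smaller is better unless reverse."""
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=reverse)
    position = [0] * len(values)
    for rank, k in enumerate(order, start=1):
        position[k] = rank
    return position

def first_stage_rank_of_best(tour_lengths, stage_pheromone, stage_distance):
    """First-stage ranks of the agent that ends with the shortest complete tour."""
    best = min(range(len(tour_lengths)), key=lambda k: tour_lengths[k])
    return {
        # More pheromone gathered in the first stage ranks higher.
        "by_pheromone": ranks(stage_pheromone, reverse=True)[best],
        # Shorter distance travelled in the first stage ranks higher.
        "by_distance": ranks(stage_distance)[best],
    }
```

If pheromone is the better criterion, the best agent should repeatedly sit near rank 1 in `by_pheromone` across runs, while its `by_distance` rank varies.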
Experiments are conducted to evaluate the performance of HABCO. Table 1 shows the parameter settings for the proposed algorithm. This paper details two comparisons: the first between HABCO and BCO, and the second between HABCO and ACS. All algorithms are executed 10 times on nine TSP benchmark datasets.
[...] (2)) to construct the tour. ACS has the advantage of making the edges more dynamic by performing the local update. With the local update, the pheromone on a visited edge is diminished, which makes the edge less desirable, so premature convergence can be avoided. As a consequence, the agent (ant) has many options for developing its tour. With only ten agents, ACS can obtain a good-quality tour. BCO, on the other hand, has its main advantage in the exchange of information among agents (bees) in the hive. Before going back to the hive, tour construction by a bee depends only on the rule in Eq. (7), which makes the selection of candidate nodes more static. BCO also has a slightly more complex formula than ACS and many parameters to set, and different settings sometimes produce significantly different results.
The second comparison is between HABCO and ACS. Table 4 shows that HABCO outperforms ACS both with and without 2-opt. HABCO is also slightly better in both the average and the best case. Although ACS and HABCO use the same rules to construct the tour (Eq. (1), Eq. (2)) in each stage, HABCO has the advantage of producing competitive tours by duplicating good sub-tours and discarding bad sub-tours in the hive. Therefore, in constructing the tour, HABCO has smarter agents than ACS.

Conclusion
HABCO, a combination of three algorithms, ACS, BCO, and ELU-Ants, is an effective algorithm for solving the TSP. HABCO divides the tour into stages as in the BCO algorithm. Within a stage, HABCO uses the ACS method for choosing nodes and for the local and global pheromone updates. The quality of each HABCO agent is judged by the pheromone gathered in each stage. Based on the pheromone gathered, agents are classified into followers, recruiters, and scouts, as in the BCO algorithm. A good judgment in the early steps means that agents that are probably good (recruiters) are duplicated and agents that are probably bad (followers) are discarded. Scouts exist to maintain new alternative tours. This process generates competitive agents in each iteration. The algorithm also introduces a semi-global pheromone update that adopts the ELU-Ants idea. Compared with ACS and BCO, HABCO solves the TSP better, as shown on several TSP benchmark datasets with and without local search.

Acknowledgement
This article is an extended version of the proceeding paper [25] entitled "A Hybrid Ant-Bee Colony Optimization for Solving Traveling Salesman Problem with Competitive Agents".