Topology Architecture and Routing Algorithms of Octagon-Connected Torus Interconnection Network

Two important issues in the design of interconnection networks for massively parallel computers are scalability and small diameter. A new interconnection network topology, called the octagon-connected torus (OCT), is proposed. The OCT network combines the small diameter of the Octagon topology with the scalability of the Torus topology. It has desirable properties such as small diameter, regularity, symmetry and scalability. The nodes of the OCT network adopt the Johnson coding scheme, which makes routing algorithms simple and efficient. Both unicast and broadcast routing algorithms, based on the Johnson coding scheme, are designed for the OCT network. A detailed analysis shows that the OCT network compares favorably with related networks in both topological properties and communication performance.


Introduction
Large-scale multiprocessor systems containing thousands of processors have become possible with the development of hardware technologies, especially the improvement of VLSI technology [1]-[3]. For example, there are 7168 computing nodes in Tianhe-1A [4], and the number exceeds 80,000 in the Fujitsu supercomputing system [5]. In the coming years, new applications and algorithms will push the number of processor cores on a single chip to match the number of nodes in a supercomputing system of the 1980s [3]. We are heading toward the exascale computing age, which is expected to arrive in 2018 with supercomputing systems delivering 1 exaFLOPS (10^18 FLOPS) [2], [3], [6].
In the exascale computing age, a multiprocessor system will have hundreds of millions of processor cores. The interconnection network has a great influence on the performance of such a massive system and also determines the computational and storage capability available to future massively parallel applications [2], [3], [6]. Thus, to improve the communication efficiency of parallel computation, researchers have studied interconnection networks with simple structure, low node degree, short diameter, easy routing strategy and good scalability [1], [7]-[14].
For a large-scale system, the topology has a major impact on the performance and cost of the interconnection network. The Torus topology has attractive features such as regularity, symmetry, fault tolerance, short diameter and embeddability; hence it is well received among researchers and practitioners [2], [5], [7], [8], [11]-[17]. Regarded as one of the most important and attractive topologies for parallel computing networks, it has been implemented in the IBM Blue Gene/Q network [15], the 3D Cray network [16] and the Fujitsu Tofu network [5]. However, for a network containing millions or even hundreds of billions of processor cores, the traditional Torus network is not suitable for future parallel systems because its diameter becomes too long. Because parallel programs communicate frequently within one set of nodes (local communication), a number of Torus-based hierarchical interconnection networks (HINs) [16]-[22] have been put forward. In these networks, low-level networks, consisting of computing nodes, carry local communication, while high-level networks, consisting of cluster groups, are responsible for remote communication. As the diameter of an HIN is the product of the network diameters of all levels, it still turns out to be relatively large. In contrast, the Torus embedded Hypercube [12], [13] combines the Torus network and the Hypercube network, and its diameter is the sum of the diameters of the two networks, which greatly reduces the diameter of the whole network.
The Octagon [23] interconnection network was applied to on-chip networks by F. Karim et al. Its topology is regular and symmetric and has a short diameter. To further reduce the diameter of the Torus network and improve its fault tolerance and local communication performance, this paper proposes a new interconnection network, the octagon-connected torus (OCT), which combines the scalability of the Torus network with the short diameter of the Octagon topology. OCT is a symmetric and regular interconnection network characterized by short diameter, good scalability and good local communication performance. Network extension and routing become easy when the Johnson coding scheme is adopted for the nodes of the OCT topology.
For a given topology, the performance of an interconnection network is determined by the routing algorithm [8], [24]. Therefore, unicast and broadcast routing algorithms are designed in this article based on the structure of the OCT interconnection network.
for 0, 0≤i≤m-1) is the code of integer k. This binary code is called the Johnson code.
Definition 3. Two nodes in the two-dimensional (2D) plane are adjacent if and only if their codes differ in exactly one bit.
Definition 4. If two nodes in the 2D plane are adjacent according to Definition 3, a direct link exists between them.
Definition 5. The 2k×2m 2D Torus (abbreviated as T(k, m)) interconnection network is a network topology with the following properties: 1) It consists of 2k×2m nodes and 8k×m direct links. 2) The horizontal coordinate of a node is labeled with an m-bit Johnson code, and the vertical coordinate with a k-bit Johnson code. Taking the vertical coordinate as the high-order part and the horizontal coordinate as the low-order part and concatenating them gives the node code, so every node is identified by a binary code of k+m bits. 3) Node coding rule: two nodes of T(k, m) are adjacent, i.e., a direct link exists between them, if and only if their codes differ in exactly one bit.
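The node coding of Definition 5 can be illustrated with a minimal Python sketch. It assumes the usual twisted-ring construction of the Johnson code (shift by one position and feed back the inverted end bit) and k, m ≥ 2; the function names `johnson_code`, `hamming`, `torus_codes` and `neighbours` are hypothetical and chosen only for this illustration.

```python
def johnson_code(n):
    """Return the 2n codewords of an n-bit Johnson code as bit strings."""
    codes = []
    word = [0] * n
    for _ in range(2 * n):
        codes.append("".join(map(str, word)))
        # Shift left and feed the inverted leftmost bit back in on the right.
        word = word[1:] + [1 - word[0]]
    return codes


def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def torus_codes(k, m):
    """Node codes of T(k, m): k-bit vertical part (high order) + m-bit horizontal part."""
    return [v + h for v in johnson_code(k) for h in johnson_code(m)]


def neighbours(code, codes):
    """Nodes adjacent to `code` under the rule of Definition 5, item 3."""
    return [c for c in codes if hamming(code, c) == 1]


if __name__ == "__main__":
    codes = torus_codes(2, 3)            # the 4x6 torus T(2, 3)
    print(johnson_code(3))               # the six 3-bit codewords
    print(len(codes), "nodes")
    print(sum(len(neighbours(c, codes)) for c in codes) // 2, "links")
```

For T(2, 3) the sketch yields 24 nodes, four neighbours per node and 48 links, which agrees with the 2k×2m and 8k×m counts stated in the definition.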
Figure 1 shows the topology of a 4×6 2D Torus interconnection network. The interconnection network T(k, m) has the following good properties. ① The node codes in each row and each column form a binary unit-distance cyclic code. ② Every node has exactly four adjacent nodes, which naturally form the Torus structure (a two-dimensional Gray code does not have this property when the number of code bits is greater than or equal to 5). ③ When k or m increases by one, the number of nodes increases by only 4m or 4k, respectively (with a two-dimensional Gray code the number of nodes is 2^k×2^m, so it doubles whenever k or m increases by one). ④ The number of 1s in the XOR of any two node codes is the minimum distance between the two nodes. ⑤ The network has a simple, Hypercube-like routing mechanism.
Definition 6. The Octagon interconnection network is a network topology with the following properties: 1) It consists of 8 nodes and 12 direct links. 2) Its nodes are identified by a 4-bit Johnson code. 3) A direct link exists between two nodes if and only if their codes differ in exactly one bit or every bit of their XOR is 1.
Figure 2 presents the topology of the Octagon interconnection network, which has the following good properties. 1) Every node has connectivity 3 and the diameter is 2; the Octagon is regular and symmetric, with short diameter and low connectivity. 2) There are 3 disjoint paths between any two nodes of the Octagon; their lengths are 1, 4, 4 if the two nodes are directly linked, and 2, 3, 3 otherwise. Therefore, the network offers high fault tolerance and parallelism. 3) When the XOR of two node codes contains one 1 or four 1s, the nodes are adjacent; when it contains two or three 1s, the distance between the nodes is 2. However, the Octagon lacks scalability.
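The XOR rule for Octagon distances can be checked against a breadth-first search with a short Python sketch. The list `OCTAGON` is the 4-bit Johnson code written out by hand, and the helper names are hypothetical; the sketch only illustrates the rule stated above and is not taken from the paper.

```python
from collections import deque

# The eight codewords of the 4-bit Johnson code, in ring order.
OCTAGON = ["0000", "0001", "0011", "0111", "1111", "1110", "1100", "1000"]


def hamming(a, b):
    """Number of 1s in the XOR of two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def octagon_adjacent(a, b):
    """Definition 6, item 3: codes differ in exactly one bit or in all four bits."""
    return hamming(a, b) in (1, 4)


def octagon_distance(a, b):
    """Distance read directly from the XOR weight (Octagon property 3)."""
    return {0: 0, 1: 1, 2: 2, 3: 2, 4: 1}[hamming(a, b)]


def bfs_distance(src, dst):
    """Reference distance obtained by breadth-first search over the links."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in OCTAGON:
            if octagon_adjacent(u, v) and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist[dst]


if __name__ == "__main__":
    links = sum(octagon_adjacent(a, b) for a in OCTAGON for b in OCTAGON) // 2
    diameter = max(bfs_distance(a, b) for a in OCTAGON for b in OCTAGON)
    print("links:", links, "diameter:", diameter)
```

Under these assumptions the search reproduces the 12 links and diameter 2 of Definition 6, and the table lookup agrees with the search for every pair of nodes.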
Most parallel programs communicate frequently within a set of nodes. This communication has a major impact on the efficiency of a parallel program, and the principal factor is the distance between nodes within the group [8], [10]. Liu et al. [10] therefore proposed optimal grouping as a parameter for evaluating the performance of hierarchical interconnection networks.
Definition 7. Distance of a node group [10], [11]: the distance of a node group G in an interconnection network N is the maximum distance between any two nodes of G.
Definition 8. Optimal grouping [10], [11]: for a given positive integer λ, among all groups of λ nodes in the interconnection network N, the group with the minimum distance is the optimal group of λ nodes, denoted G_λ(N).
Definition 9. Group divisible performance [10], [11]: for given interconnection networks N_1 and N_2, if the distance of G_λ(N_1) is less than or equal to the distance of G_λ(N_2) for every positive integer λ, then the group divisible performance of N_1 is superior to that of N_2.
If the group divisible performance of N_1 is superior to that of N_2, the communication cost within a set of computing nodes G_λ(N_1) is no greater than that within G_λ(N_2); this is what makes group divisible performance significant.
Under limited hardware resources, the network should occupy as few resources as possible so that the computing performance of the whole parallel system can be improved; packing density is a measure of the hardware resources used by an interconnection network [18].
Definition 10. The packing density of an interconnection network is the ratio of the number of nodes in the network to the product of the network diameter and the node degree [18].
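Definition 10 can be written as a formula. The symbols ρ (packing density), |V| (number of nodes), D (diameter) and δ (node degree) are notation introduced here for illustration and do not appear in the original text. As a worked case, T(k, m) with k, m ≥ 2 has 4km nodes, node degree 4 and, being built from rings of 2k and 2m nodes, diameter k+m:

```latex
\[
  \rho(N) \;=\; \frac{|V(N)|}{D(N)\,\delta(N)},
  \qquad
  \rho\bigl(T(k,m)\bigr) \;=\; \frac{4km}{(k+m)\cdot 4} \;=\; \frac{km}{k+m}.
\]
```

For the 4×6 torus T(2, 3), this gives 24 / (5 × 4) = 1.2.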

Octagon-Connected Torus Interconnection Network
With 8×2k×2m nodes, the octagon-connected torus interconnection network OCT(k,m), where k and m are the parameters of the network scale, is constructed by combining the short diameter of the Octagon with the scalability of the Torus as follows: 1) Based on Definition 6, every eight nodes form an Octagon network. This yields 2k×2m Octagon networks, each of which is referred to as a slice.
2) Form the 2k×2m slices into a Torus network: encode the nodes of each slice with a 4-bit Johnson code and, following Definition 5, connect the nodes with the same code across the slices to form a T(k, m) network.
3) Encode OCT(k, m): the code of each node consists of two parts, A_t and A_o. A_o (a 4-bit Johnson code) is the code of the node within its Octagon. A_t (a Johnson code of k+m bits) is the code of the Octagon slice, i.e., the code of the node within its T(k, m) network.
The topology of the interconnection network OCT(k, m) is shown in Figure 3, where the solid lines represent Octagon links, the dashed lines represent T(k, m) links and the circles are the nodes of the network. Direct links exist between endpoints that carry the same mark. OCT(k,m) can be expanded to OCT(k,m+1) or OCT(k+1,m) by adding one bit to the m-part or the k-part of the code, which adds two columns or two rows (4k or 4m nodes) to T(k, m) and 8×4k or 8×4m nodes to OCT(k,m). In the original OCT(k,m), neither the connections inside each Octagon slice nor the connectivity of the nodes change. In T(k, m+1) or T(k+1, m), apart from the nodes connected to the added ones, the other nodes and connections remain unchanged. For example, in Figure 3 the interconnection network OCT(2,3) is formed by adding eight Octagon slices on the right side of OCT(2,2).
Theorem. In the interconnection network OCT(k, m), with two nodes A (A m+k+3 …A m+k , A m+k-
Proof. The node codes within an Octagon slice and the node codes within T(k, m) are both Johnson codes. From the construction of OCT(k, m), the distance between any two nodes equals their distance in T(k, m) plus their distance in the Octagon. In an Octagon slice, two nodes are adjacent if their codes differ in exactly one bit or if every bit of their XOR is 1; hence the distance between two nodes is the smaller of the number of differing bits and the number of equal bits plus one. In T(k, m), two nodes are adjacent if their codes differ in exactly one bit; hence the distance between two nodes is the number of differing bits. Therefore, the theorem holds.
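The distance computation described in the proof can be sketched in a few lines of Python. The sketch assumes that a node code is a bit string whose last four bits are the Octagon part A_o and whose leading k+m bits are the torus part A_t; the function name `oct_distance` is hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def oct_distance(a, b):
    """Distance in OCT(k, m) as the sum of the torus and Octagon distances."""
    torus_part = hamming(a[:-4], b[:-4])      # differing bits of A_t and B_t
    h = hamming(a[-4:], b[-4:])               # differing bits of A_o and B_o
    octagon_part = min(h, (4 - h) + 1)        # differing bits, or equal bits plus one
    return torus_part + octagon_part


if __name__ == "__main__":
    # Two nodes of OCT(2, 2): torus parts 0000 and 1111, Octagon parts 0000 and 0011.
    print(oct_distance("00000000", "11110011"))
```

In this example the torus parts contribute 4 and the Octagon parts contribute 2, for a total of 6.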

The Properties of OCT (k, m)
Character 1. The OCT(k, m) interconnection network is regular, and the connectivity of every node is 7.
Every Octagon is a regular interconnection network in which each node has connectivity 3. By the construction of OCT(k, m), if each Octagon is regarded as a single node, the resulting network is T(k, m), in which each node has connectivity 4. Thus OCT(k,m) is a regular interconnection network with node connectivity 3+4=7.
Character 2. The maximum distance between any two nodes (the diameter) of the OCT(k,m) interconnection network is k+m+2.
The diameter of T(k, m) is the sum of the diameters of rings with 2m and 2k nodes, which is m+k. By the construction of OCT(k,m), if each T(k, m) is regarded as a single node, the resulting network is the Octagon, whose diameter is 2. Thus the diameter of the network is the sum of the diameters of the Torus and the Octagon, which is m+k+2.
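Characters 1 and 2 can be checked by brute force on small instances. The following Python sketch builds OCT(k, m) from Johnson codes and measures node count, degree and diameter with breadth-first search. It assumes k, m ≥ 2 and the twisted-ring construction of the Johnson code; all function names are hypothetical.

```python
from collections import deque
from itertools import product


def johnson_code(n):
    """Return the 2n codewords of an n-bit Johnson code as bit strings."""
    codes, word = [], [0] * n
    for _ in range(2 * n):
        codes.append("".join(map(str, word)))
        word = word[1:] + [1 - word[0]]
    return codes


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def build_oct(k, m):
    """Adjacency lists of OCT(k, m); node code = A_t (k+m bits) + A_o (4 bits)."""
    torus = [v + h for v, h in product(johnson_code(k), johnson_code(m))]
    nodes = [t + o for t, o in product(torus, johnson_code(4))]
    adj = {u: [] for u in nodes}
    for u, v in product(nodes, nodes):
        if u[:-4] == v[:-4] and hamming(u[-4:], v[-4:]) in (1, 4):
            adj[u].append(v)                  # Octagon link inside one slice
        elif u[-4:] == v[-4:] and hamming(u[:-4], v[:-4]) == 1:
            adj[u].append(v)                  # torus link between two slices
    return adj


def eccentricity(adj, src):
    """Largest breadth-first-search distance from src to any node."""
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    return max(eccentricity(adj, u) for u in adj)


if __name__ == "__main__":
    for k, m in [(2, 2), (2, 3)]:
        adj = build_oct(k, m)
        degrees = {len(neigh) for neigh in adj.values()}
        print((k, m), len(adj), degrees, diameter(adj))
```

For OCT(2, 2) and OCT(2, 3) the sketch reports 128 and 192 nodes, a uniform degree of 7, and diameters 6 and 7, in line with 8×2k×2m nodes and a diameter of k+m+2.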
Character 3. The OCT(k,m) interconnection network is symmetric.
According to the construction of OCT(k,m), the network looks the same from every node: taking any node as the starting point leads to the same view of the network. This simplifies routing, because the routing algorithm is independent of the node position.
Character 5. The bisection width of the OCT(k,m) interconnection network is 24×k×m. The bisection width of an interconnection network is the minimum number of links that must be removed to divide the network into two equal subnetworks. OCT(k,m) is bisected by dividing each of its 2k×2m Octagon slices in half; the cut width of one Octagon is 6, so the bisection width is 6×2k×2m = 24×k×m.
To further illustrate the good properties of the interconnection network, Table 1 compares three kinds of static networks.
Scalability is the ability of an interconnection network to grow in node count while its topological properties are preserved; it also influences routing efficiency. When OCT(k,m) is expanded, the configuration of the existing nodes stays the same, so the network has good scalability.
Table 1. Performance characteristics of three kinds of static networks [12], [13]
The diameter of an interconnection network determines the maximum number of hops a message may traverse. The node degree determines the hardware complexity. Since a higher node degree generally gives a smaller diameter, both diameter and node degree must be taken into account when evaluating the cost of an interconnection network [13], [18]. Based on Definition 10 and Table 1, the packing densities of the three interconnection networks OCT(k,m), (2k,2m,3)-OMMH and T(2k,4m) are shown in Figure 4. The higher the packing density of a network, the smaller the chip area required for its VLSI layout. The figure shows that OCT(k,m) has the highest packing density while T(2k,4m) has the lowest.
The ideal throughput of an interconnection network is directly proportional to its bisection width [8], [14]. Table 1 shows that the bisection width of OCT(k,m) is larger than those of (2k,2m,3)-OMMH and T(2k,4m); that is, the ideal throughput of OCT(k,m) is higher than that of (2k,2m,3)-OMMH and T(2k,4m). Moreover, the bisection width of OCT(k,m) grows in proportion to the network size, which gives it good bandwidth scalability. Characters 1 to 5 and the related discussion show that OCT(k,m) has good scalability.
According to the construction of OCT(k,m) and Definitions 3, 4 and 5, the optimal group distances of OCT(k,m), (2k,2m,3)-OMMH and T(2k,4m) are, respectively,
As the number of network nodes grows, so does the probability of node or link failures, so the interconnection network should have a certain fault-tolerant ability. Because OCT(k,m) contains both T(k, m) and Octagon subnetworks, messages can be rerouted after a single node or link failure by a simple modification of the non-fault-tolerant routing algorithm. The failure of a single node or link, other than the source or destination node, can be bypassed by adding two hops to the path. When the source node, the destination node and the faulty node or link are in the same T(k,m) subnetwork, one hop along an Octagon link forwards the message to an adjacent T(k,m) subnetwork, and another hop brings it back to the original T(k,m) subnetwork. In the same way, when the source node, the destination node and the faulty node or link are in the same Octagon subnetwork, one hop along a T(k,m) link forwards the message to an adjacent Octagon subnetwork, and another hop brings it back to the original Octagon subnetwork.
The above analysis shows that the OCT(k,m) interconnection network has good scalability, good local communication performance and high fault tolerance.

Routing algorithms in OCT
The routing algorithm is a key factor affecting the communication efficiency of a network. This section analyzes the unicast and broadcast routing algorithms and their performance.

Unicast Routing Algorithm on OCT(k,m)

Unicast routing algorithm
Assuming that Node
② If A and B are in the same T(k,m), then, as mentioned in Section 2.2, their Octagon codes are the same, that is, Hamming(A_3…A_0 ⊕ B_3…B_0) = 0, and the packet is routed only within T(k,m), from A_t to B_t. In T(k, m), adjacent nodes differ in either the horizontal or the vertical coordinate only. The Hamming distances H_l, H_r, H_u and H_d from the left, right, upper and lower neighbors of the current node to B_t are computed and H_min = min{H_l, H_r, H_u, H_d} is selected. After the data packet is sent to the neighbor corresponding to H_min and A_t is updated to the code of that neighbor, H = Hamming(A_t ⊕ B_t) is computed; if H = 0, the current node A_t is the destination node, otherwise the process is repeated.
③ If A and B are neither in the same Octagon nor in the same T(k,m), i.e., they are two arbitrary nodes, the data packet is first routed to A'

Performance analysis of algorithm
The advantage of the OCT(k,m) routing algorithm lies in the adoption of the Johnson code in T(k,m), which makes Hamming(A⊕B) of any two node codes equal to the minimum distance between the two nodes; the coding also implies the routing information and the relation between adjacent codes. The algorithm obtains a correct route by storing only the current code and the destination code while data is transmitted. The Octagon network also adopts Johnson coding, so routing within it is simple: when the XOR result contains one 1 or four 1s the two nodes are adjacent, and when it contains two or three 1s the distance between the nodes is 2.
According to the unicast routing algorithm on OCT(k,m), a packet needs at most 2 communication rounds within an Octagon and at most k+m rounds within a T(k,m); therefore it needs at most k+m+2 rounds in total. An algorithm that sends data from the source node to the destination node along a shortest path has high communication efficiency. All of the unicast routing cases above forward data along a shortest path, so in the worst case the routing path does not exceed the network diameter k+m+2, and the communication efficiency is 1/(k+m+2).
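The unicast procedure can be sketched in Python as a two-phase walk: first inside the Octagon slice, then greedily inside T(k, m) toward the neighbor with the smallest Hamming distance to the destination. This is a minimal sketch under stated assumptions (twisted-ring Johnson codes, node code = A_t followed by the 4-bit A_o, Octagon phase first); the function names are hypothetical and the tie-breaking among equally good neighbors is arbitrary.

```python
def johnson_code(n):
    """Return the 2n codewords of an n-bit Johnson code as bit strings."""
    codes, word = [], [0] * n
    for _ in range(2 * n):
        codes.append("".join(map(str, word)))
        word = word[1:] + [1 - word[0]]
    return codes


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def ring_neighbours(code):
    """Previous and next codeword on the Johnson-code ring."""
    ring = johnson_code(len(code))
    i = ring.index(code)
    return [ring[(i - 1) % len(ring)], ring[(i + 1) % len(ring)]]


def octagon_neighbours(code):
    """Two ring neighbours plus the bitwise complement (the diagonal link)."""
    complement = "".join("1" if c == "0" else "0" for c in code)
    return ring_neighbours(code) + [complement]


def torus_neighbours(code, k):
    """Left/right and up/down neighbours of a torus code with a k-bit vertical part."""
    v, h = code[:k], code[k:]
    return [v + x for x in ring_neighbours(h)] + [y + h for y in ring_neighbours(v)]


def route(src, dst, k):
    """Return the list of node codes visited on the way from src to dst."""
    t, o = src[:-4], src[-4:]
    dst_t, dst_o = dst[:-4], dst[-4:]
    path = [src]
    # Octagon phase: go directly if adjacent, else via a neighbour adjacent to dst_o.
    while o != dst_o:
        if hamming(o, dst_o) in (1, 4):
            o = dst_o
        else:
            o = next(n for n in octagon_neighbours(o) if hamming(n, dst_o) in (1, 4))
        path.append(t + o)
    # Torus phase: step to the neighbour with the smallest Hamming distance to dst_t.
    while t != dst_t:
        t = min(torus_neighbours(t, k), key=lambda n: hamming(n, dst_t))
        path.append(t + o)
    return path


if __name__ == "__main__":
    path = route("00000000", "11110011", k=2)   # two nodes of OCT(2, 2)
    print(len(path) - 1, "hops:", " -> ".join(path))
```

In the example the walk takes 6 hops, which equals k+m+2 for OCT(2, 2). Only the current code and the destination code are needed at each step, as noted in the analysis above.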

Broadcasting Routing Algorithm on OCT(k,m)

Broadcast routing algorithm
Assume that node A sends data to all other nodes. Node A first sends the data to all nodes in its own Octagon; each node that has received the data then sends it to all nodes in its own T(k, m) using the recursive doubling method.

Performance analysis of algorithm
With this broadcast routing, sending the data from node A to all nodes in the Octagon takes 2 rounds of communication, and broadcasting the data within T(k, m) then takes m+k rounds. Therefore, the broadcast needs k+m+2 rounds of communication, and the communication efficiency of the algorithm is 1/(k+m+2).
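The round count can be reproduced with a small Python simulation. Note that the sketch does not implement a recursive doubling schedule; it substitutes a simpler flooding model in which every informed node forwards the data to all of its neighbors in each round (an all-port assumption), first over Octagon links only and then over torus links only. The function name `broadcast_rounds` is hypothetical.

```python
def johnson_code(n):
    """Return the 2n codewords of an n-bit Johnson code as bit strings."""
    codes, word = [], [0] * n
    for _ in range(2 * n):
        codes.append("".join(map(str, word)))
        word = word[1:] + [1 - word[0]]
    return codes


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def broadcast_rounds(src, k, m):
    """Rounds needed to inform every node of OCT(k, m) under the flooding model."""
    torus = [v + h for v in johnson_code(k) for h in johnson_code(m)]
    octagon = johnson_code(4)
    informed = {src}
    rounds = 0
    # Phase 1: spread inside the source's Octagon slice (Octagon links only).
    slice_nodes = {src[:-4] + o for o in octagon}
    while not slice_nodes <= informed:
        informed |= {u[:-4] + o for u in informed for o in octagon
                     if hamming(u[-4:], o) in (1, 4)}
        rounds += 1
    # Phase 2: every informed node spreads across slices (torus links only).
    total = len(torus) * len(octagon)
    while len(informed) < total:
        informed |= {t + u[-4:] for u in informed for t in torus
                     if hamming(u[:-4], t) == 1}
        rounds += 1
    return rounds


if __name__ == "__main__":
    print(broadcast_rounds("00000000", 2, 2))    # OCT(2, 2)
    print(broadcast_rounds("000000000", 2, 3))   # OCT(2, 3)
```

Under this model the simulation needs 2 rounds for the Octagon phase and k+m rounds for the torus phase, i.e., 6 rounds for OCT(2, 2) and 7 rounds for OCT(2, 3).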

Conclusion
The Torus and the Octagon are among the most important and attractive interconnection network topologies. This paper therefore combines the scalability of the Torus with the short diameter of the Octagon to present a simple and scalable interconnection network structure, OCT(k,m). It is a regular, symmetric and extensible interconnection network with node degree 7; it can be expanded while the node degree stays constant, and the adoption of the Johnson code makes its routing algorithms simple and efficient. Analysis and experimental results show that the network has good communication performance, fault tolerance and scalability, and that it is an interconnection network topology suitable for large-scale parallel computing.

TELKOMNIKA
ISSN: 1693-6930 Topology Architecture and Routing Algorithms of Octagon-Connected Torus .... (Youyao Liu)
interconnect network when these two codes differ in exactly one bit or every bit of their XOR is 1.
the Hamming distance between Node A and Node B is H(A,B) = Hamming(A⊕B), where "⊕" denotes the bitwise XOR of A and B and the "Hamming" function counts the 1s in the XOR result. From the encoding of the nodes of T(k,m) and the Octagon and the construction of OCT(k,m), the shortest route in OCT(k,m) is obtained as follows. ① If A and B are in the same Octagon, then, as mentioned in Section 2.2, their T(k,m) codes are the same, that is, Hamming(A_{m+k+3}…A_m A_{m-1}…A_5 A_4 ⊕ B_{m+k+3}…B_m B_{m-1}…B_5 B_4) = 0; the distance between source node A and destination node B is 1 or 2, and routing takes place only within the Octagon on A_o = A_3…A_0 and B_o = B_3…B_0. If Hamming(A_3…A_0 ⊕ B_3…B_0) = 1 or 4, node A sends the message to node B directly. Otherwise, A_o first computes the distances between its adjacent nodes A_o1, A_o2, A_o3 and B_o, sends the message to an adjacent node that is at distance 1 from the destination node, and that node forwards it to node B.


ISSN: 1693-6930 TELKOMNIKA Vol. 13, No. 1, March 2015: 305-313
Octagon as in ①, so that A' and B are in the same T(k,m), and the data packet is then routed to the destination node B within T(k,m) as in ②.

Preliminaries
Definition 1. A binary unit-distance cyclic code is a binary code in which any two adjacent codewords differ in exactly one bit (unit-distance property), and the first and last codewords also differ in exactly one bit (cyclic property).