APP下载

Cost-Aware Multi-Domain Virtual Data Center Embedding

2018-12-26XiaoMaZhongbaoZhangSenSu

China Communications 2018年12期

Xiao Ma, Zhongbao Zhang, Sen Su

State Key Laboratory of Networking and Switching Technology,Beijing University of Posts and Telecommunications, Beijing 100876, China

Abstract: Virtual data center is a new form of cloud computing concept applied to data center. As one of the most important challenges,virtual data center embedding problem has attracted much attention from researchers. In data centers, energy issue is very important for the reality that data center energy consumption has increased by dozens of times in the last decade. In this paper, we are concerned about the cost-aware multi-domain virtual data center embedding problem. In order to solve this problem, this paper first addresses the energy consumption model. The model includes the energy consumption model of the virtual machine node and the virtual switch node, to quantify the energy consumption in the virtual data center embedding process. Based on the energy consumption model above, this paper presents a heuristic algorithm for cost-aware multi-domain virtual data center embedding.The algorithm consists of two steps: inter-domain embedding and intra-domain embedding.Inter-domain virtual data center embedding refers to dividing virtual data center requests into several slices to select the appropriate single data center. Intra-domain virtual data center refers to embedding virtual data center requests in each data center. We first propose an inter-domain virtual data center embedding algorithm based on label propagation to select the appropriate single data center. We then propose a cost-aware virtual data center embedding algorithm to perform the intra-domain data center embedding. Extensive simulation results show that our proposed algorithm in this paper can effectively reduce the energy consumption while ensuring the success ratio of embedding.

Keywords: virtual data center embedding;multi-domain; cost-aware; label propagation

I. INTRODUCTION

1.1 Background

Over the past 30 years, modern data center was born, and it has experienced the process of rapid development. Virtual data center is a new kind of data center form when the cloud computing concept is applied to the data center. It abstracts physical resources through virtualization technology, and makes resources allocation and scheduling dynamically, thus realizes multi-user sharing physical data center in order to reduce the operation cost of data center [1].

In the virtual data center, there are two roles: the infrastructure provider (InP) and the service provider (SP). The former controls the infrastructure of the entire data center,which creates respective virtual data centers on the infrastructure and deploys correspond-ing services and applications for end users.Each virtual data center embedding request has a network topology consisting of multiple virtual nodes and virtual links with limited capabilities. Virtual nodes can be divided into virtual machine nodes and virtual switch nodes, including CPU, memory, hard disk and port capacity requirements. Virtual links typically represent the demand for communication bandwidth and delays. When the InP receives the virtual data center embedding request service from the SP, the InP needs to find the nodes and links that meet these capability requirements in the physical data center network and allocate the corresponding resources for the deployment protocol or service, which is termed as the problem of virtual data center embedding.

1.2 Motivation

The problem of virtual center embedding has received significant attention from researchers.However, prior studies [14-46] have the following two limitations.

The first limitation is that they focus on finding efficient virtual data center embedding solutions for mapping virtual data center requests onto a single domain data center while ignoring the fact that physical data centers are usually deployed in multiple graphical places.For example, Google data centers are deployed in US, Chile, Taiwan, Ireland and Singapore,etc. [2]. Therefore, prior algorithms cannot be applied in such a context.

The second limitation is that they ignore the energy issue in the process of virtual data center embedding. Energy issue is very important.In Google, for example, in the past decade, the data center power consumption has increased by 20 times. It is expected that data center energy consumption will account for 8% of global energy consumption by 2020. Energy consumption will account for 41.6% of operating costs and become the largest expenditure for data center operations [3-5].

Although for the problem of virtual network embedding, some existing studies have addressed these two limitations, the problems of virtual data center embedding and virtual network embedding are different. Compared with virtual network, virtual data center has more kinds of virtual nodes (virtual switch nodes and virtual machine nodes), virtual links and more complicated network topology. The existing virtual network embedding algorithm cannot be applied in our problem. Thus, it calls for an energy-efficient virtual data center embedding algorithm.

In this paper, the authors present the energy cost model and then design a heuristic efficient algorithm.

1.3 Challenge

There are mainly two challenges in addressing the problem of cost-aware multi-domain virtual data center embedding.

1) How to model and quantify the energy consumption of physical data centers. The problem mainly includes two parts: the first part is how to model the electricity price for each physical data center and the second part is how to model the energy consumption for each physical data center. For the first part,physical data centers are distributed in different regions. Not only the price of each region is different, but also the electricity price of each region will change over time. For the second part, for each physical data center,there are often many types of nodes, leading to different types of energy consumption models.Therefore, quantifying the energy consumption of a physical data center for embedding a virtual data center is a key issue in solving the cost-aware multi-domain virtual data center embedding problem.

2) How to design an energy cost efficient multi-domain virtual data center embedding algorithm. Energy cost optimization and energy consumption optimization are two closely related but not identical issues. There are two factors for affecting the energy cost in virtual data center embedding: the first one is to select the physical data center with low electricity price and the second one is to select the physical data center with low energy consumption.However, these two factors may have conflicts with each other. For example, when you select a physical data center with low electricity price, it may cause high energy consumption in the embedding process within this physical data center. Therefore, it is important to take these two factors together into consideration.

1.4 Solution

To solve the above challenges, we propose the following solutions.

1) We propose the modeling and quantification of energy consumption in physical data centers. First, we investigate the transmission organization in many regions of the North American electricity markets, and obtain a lot of real electricity price data. In order to model these data, we make an in-depth analysis of the data by using a discrete time model in which each time the time axis is discretized into multiple equidistant intervals at intervals of changes in the price of electricity. Second,according to different functions, physical data center nodes are divided into two categories:the switch node and the server node. Switch node is responsible for packet forwarding and communication in the data center, and server node is responsible for computing tasks,storing data, and deploying applications. For the above two types of nodes, the energy consumption model is abstracted according to the latest research results of the current energy consumption model in the industry and academia. Finally, we formulate the problem of virtual data center embedding as an Integer Linear Programming with minimizing the energy cost of embedding a virtual data center embedding request as the goal, and the node and link demands as the constraints.

2) We design an efficient cost-aware multi-domain virtual data center embedding algorithm. The algorithm consists of two steps: inter-domain virtual data center embedding and intra-domain virtual data center embedding. Inter-domain virtual data center embedding refers to dividing virtual data center requests into several slices to select the appropriate single data center. Intra-domain virtual data center refers to embedding virtual data center requests in each data center. In the inter-domain embedding step, the virtual data center embedding request topology is clustered by label propagation method according to the topology and location requests, and then each subclass is embedded to a single data center. In the intra-domain embedding step,it is divided into two sub-steps: virtual node embedding and virtual link embedding. In the virtual node embedding sub-step, first add the available resource labels for each node, and then apply the best-fit strategy and the worstfit strategy for virtual node embedding. In the virtual link embedding sub-step, an energy-aware virtual link embedding is applied to minimize the number of newly opened nodes on the path.

Through a large number of simulation experiments, we show that our proposed energy-saving multi-domain virtual data center embedding algorithm can save the physical data center power cost by 15% to 30% compared with the existing algorithm.

1.5 Major contributions

The main contributions of this paper are listed as follows:

1) To the best of our knowledge, we study the problem of cost-aware multi-domain virtual data center embedding for the first time.We also formulate the problem of virtual data center embedding as an Integer Linear Programming.

2) To address this problem, we propose an efficient multi-domain virtual data center embedding algorithm. This algorithm has two steps: inter-domain and intra-domain virtual data center embedding.

3) In order to evaluate this algorithm, we have conducted extensive experiments. The experiment results show that our proposed algorithm significantly reduces the energy cost while ensuring the successful acceptance ratio of embedding.

1.6 Organization

The organizational structure of this paper is as follows. In Section II, we propose the problem model of the cost-aware multi-domain virtual data center embedding. In Section III, we give the virtual data center embedding algorithm based on label propagation. In Section IV, the simulation experiment is carried out, and the experimental results are analyzed. In Section V, the relevant work is introduced. Finally, the Section IV gives a summary of this paper.

II. PROBLEM MODEL

In this section, we first introduce the network model, and then give the formal problem description of the multi-domain virtual data center embedding. We next present the energy cost model of the physical data center, and finally give the evaluation performance.

2.1 Network model

A multi-domain data center is made up of multiple physical data centers. The definition of a physical data center in a multi-domain environment is given below. A multi-domain physical data center can be represented by Gc=(G1,G2,… Gu… Gn), where Gudenotes the u th single physical data centers. Each single physical data center can also be represented by the weighted undirected graph Gu=(Nu,Lu), where Nurepresents the set of all physical nodes in the physical data center and Lurepresents the set of all physical links in the physical data center. The physical nodes can be further divided into two categories:the switch node and the server node, that is,Nu=(Ru,Su), where Rurepresents the set of all switch nodes in the physical data center, and Surepresents the collection of all the server nodes in the physical data center.

For the physical switch, we consider the number of virtual switch which it can support as its capability. For the physical server, there are specific resource capabilities such as CPU,memory, and storage space, but in this paper we mainly consider the CPU as its capability.

Regarding the physical links, they are also divided into two categories: inter-domain links and intra-domain links. The inter-domain links refer to the backbone physical links connecting each pair of physical data centers, and the intra-domain links refer to the local physical links within each single domain physical data center. It has many attributes on the physical links, such as bandwidth, jitter and delay, but in this paper we consider the bandwidth as the main attribute on the physical links.

As shown in figure 1(b), the numbers in parentheses near the box indicate the number of available resources on the physical server. And the numbers in square brackets around the connection indicate the bandwidth available of the link. What’s more, the physical nodes are generally geographically distributed in figure 1(b). For example, SAmay be located in New York, NY, USA; SBin Chicago, IL, USA; SCin Los Angeles, CA, USA. The power markets of different locations are managed by different independent system operators (ISOs), and the ISOs are under competitive electricity market structure. Therefore, different locations often have different electricity prices. Even for the same location, the electricity price may vary frequently overtime [6,7]. We use Pru(t ) to denote the electricity price for the u th single physical data centers at the time of t.

Similar to the physical data center, the virtual data center is composed of virtual nodes and virtual links. It can be represented by the weighted undirected graph Gv=(Nv,Lv),where Nvrepresents the set of all virtual nodes in the virtual data center and Lvrepresents the set of all virtual links in the virtual data center.Regarding the virtual nodes, they are also divided into two types: virtual switch nodes and virtual machine nodes, namely Nv=(Rv,Sv),where Rvrepresents the set of all virtual switch nodes in the virtual data center, and Svrepresents the set of all virtual machine nodes in the virtual data center. If end users want their VDCs to be more reliable, they can add more virtual nodes or more links between virtual nodes in order to prevent the virtual nodes from breaking and losing data. This just modifies the input and thus does not affect how our proposed algorithm embeds such nodes and links.

And the whole data center can also be rep-resented by the weighted undirected graph Gc=(Nc,Lc), where Ncrepresents the set of all physical nodes in Gcand Lcrepresents the set of all physical links of Gc. The physical nodes can be further divided into two categories: the switch node and the server node, that is, Nc=(Rc,Sc), where Rcrepresents the set of all switch nodes in the physical data center,and Screpresents the collection of all the server nodes in the physical data center.

As shown in figure 1 (a), the numbers in the brackets near the box indicate the number of requirements for the resource of the virtual machine. And the numbers in square brackets around the connections indicate the bandwidth requirements of the link.

2.2 Problem description

Based on the definition of physical data centers and virtual data centers, the definition of multi-domain virtual data center embedding is given below. Given a virtual data center Gvand a multi-domain physical data center Gc, in the mapping M :Gv→ Gc, there are two mappings: Mnand Ml. The details are as follows:

Fig. 1. Virtual data center embedding example.

1) Mnis a one-to-one mapping function from the set Nvto the subset of set Nc. There are two sub-mapping in Mn: MRand MS,where MRis a one-to-one mapping from the set Rvto the subset of Rc, and MSrepresents the one-to-one mapping from the set Svto the subset of Sc. In the MR:Rv→ Rcmapping,for any virtual switch node, the rv∈ Rvand the physical switch node rc∈ Rc, rcneeds to satisfy the location of rvand the number of switch ports. In the MS:Sv→ Sumapping, for any virtual machine node sv∈ Svand the physical server node sc∈ Sc, scneeds to satisfy the constraints of the virtual machine node svfor the number of resources on the node.

2) Mlrepresents a one-to-one mapping function from the set Lvto the set Pc, where Pcrepresents a set of acyclic loops of all physical links Lc. The map also contains two sub-maps,Mlrand Mls, where Mlris from Lvrto a subset of Pcr, denoting all loop-free paths between any two switches, and Mlsis from Lvsto a subset of Pcs, denoting all loop-free paths between switches and servers. For both Mlrand Mls, they need to satisfy the bandwidth constraint for the virtual links.

2.3 Energy consumption model

The node energy cost mainly includes two parts: the power cost of the switch node and the server node. Before calculating these two parts of the energy costs, we first need to give the switch node and server node power consumption model.

Power consumption of the physical switch node: After embedding the virtual switch node r∈Rvto the physical switch node i∈Ru, it is necessary to calculate the extra energy consumed by the physical switch node ruPower (represented by ∆PNir). Typically,switches include the following three parts:chassis, switch fabric, and port. Among them,the chassis mainly includes fan cooling equipment; the exchange structure is responsible for the storage and forwarding of information.The energy consumption of these two parts is fixed, denoted by Pf. The port is responsible for the communication of packets within and outside the switch, and the energy consump-tion of each port is represented by Pport. Based on the above analysis, the total energy consumption of a switch node i is:

where P indicates the number of ports used by the switch.

Thus, for the embedding of the virtual switch r, the additional power consumed by the physical switch node i can be calculated as follows:

Power consumption of the physical server node: After the virtual machine node h∈Svis embedded to the physical server node j∈Su,the extra power consumed by the physical server node j is denoted by. According to research [8-10], it shows that the total energy consumption on the server, CPU energy consumption changes often occupy the vast majority of total energy consumption changes.Moreover, the total energy consumption of the server and the CPU load follows the following linear relationship:

where P represents the power of the server node, Pbrepresents the power of the node at the time of the empty load, which is called the baseline energy consumption; Pmrepresents the power at the node full load; µ represents the current CPU load. In this paper, we use Pl= Pm−Pbto represent the linear parameter of the server node’s power change with the CPU.Then, based on the above energy consumption model, for the virtual machine node h, the additional power required by the server node j is:

where Chrepresents the demand of the CPU for the virtual machine node h.

Based on the energy cost model of the switch node and the server node, the total node power consumption consumed by the physical data center is:

2.4 Evaluation performance

In the process of multi-domain virtual data center embedding, the physical data center operator’s optimization goal is to improve the acceptance ratio of the virtual data center embedding request and to minimize the long-term average power cost of the physical data center.First, define the virtual data center embedding request acceptance ratio as follows:

where Nsrepresents the number of virtual data center embedding requests that are successfully embedded, and Nallrepresents the number of incoming virtual data center embedding.

Then, define the long-term average energy cost of the physical data center as follows:

where N is the number of virtual data center requests that can be successfully accepted at time T, Ei(Gv) is the energy cost consumed to embed the i th virtual data center request, and it can be calculated by the Formula (5) in the previous section inferred.

III. COST-AWARE MULTI-DOMAIN VIRTUAL DATA CENTER EMBEDDING ALGORITHM

In order to reduce the energy consumption of data center for operators, we present an efficient cost-aware multi-domain virtual data center embedding algorithm. The algorithm consists of two steps: inter-domain virtual data center embedding and intra-domain virtual data center embedding. Inter-domain virtual data center embedding refers to dividing virtual data center requests into several slices to select an appropriate single data center, and intra-domain virtual data center embedding re-fers to embedding the sliced virtual data center request within each data center. The details are as follows.

3.1 Inter-domain embedding algorithm

In the inter-domain embedding step, we need to slice the virtual data center request into several clusters. Regarding how to make such slices, we would like to apply existing clustering method. However, the following challenges exist in applying clustering methods.

Fig. 2. Virtual data center network topology.

Fig. 3. Virtual data center node similarity matrix.

Fig. 4. Virtual data center node similarity complete graph.

1) In the multi-domain virtual data center embedding problem, there is a corresponding location constraint for each server node, so that the clustering method cannot be used directly.

2) In the multi-domain virtual data center embedding problem, there is no initial clustering information needed by the clustering method.

3) How to choose the appropriate clustering method to cluster the virtual data center topology embedding request is also missing.

To address these challenges, we design the following solutions.

1) We propose the virtual data center topology similarity matrix construction method to measure the similarity between any two nodes in a virtual data center.

2) We propose the candidate physical node construction method of the important nodes of the virtual data center to provide the initial clustering information for the subsequent clustering.

3) We propose the virtual data center topology clustering method based on label propagation to achieve the inter-domain embedding.

Next, we will give these solutions one by one.

3.1.1 Candidate physical node construction strategy for important nodes of virtual data center

To provide initial clustering information, it is necessary to find the most important virtual node for each physical data center. We first sort the price of electricity in a certain period of time for each physical data center with an increasing order, denoted by G1, G2… Gn. For each Gi, it may satisfy the location constraint for several virtual nodes. In such a scenario,we compute the degree for each virtual nodes and select the virtual one with highest degree as the important virtual node for Gi. We label this virtual node with cluster Giinitially for the subsequent clustering (in Section 3.1.3).The benefits of this strategy lie in that it con-

siders the importance of the virtual nodes and thus may speed up the convergence of the subsequent clustering. We give the pseudocode of this strategy in Algorithm 1 (Line 1-14).

3.1.2 Construction strategy of topological similarity matrix of virtual data center

In the process of measuring the similarity of the virtual data center nodes, the following factors need to be considered: the distance between nodes and the location request of the nodes. Among them, the closer the distance between the two nodes, the higher the similarity between the two points. The higher the degree of coincidence between the two positions,the higher the similarity between the two nodes. Assuming that for two virtual data center nodes v1and v2, the similarity ω between the two can be expressed as:Nv1and Nv2represent the virtual data center nodes v1and v2, respectively, where α and β represent the distance between the nodes and the node’s position request respectively (in this paper, they are set to 0.5) . The number of positions, Hv1v2represents the number of logical hops between two nodes in v1and v2in the virtual data center topology

represents highest ranking of the common physical data center candidate for two nodes v1and v2. Here, if a physical data center has the highest ranking means that is has the lowest electricity price.

As shown in the following example, figure 2 shows a virtual data center embedding request for a three-layer binary tree topology containing four virtual machine nodes (leaf nodes) v1, v2, v3, and v4, and the parentheses next to the node indicate the virtual machine node’s physical location of the demand, while the four physical locations A, B, C, D price order set to PriA<PriB<PriC<PriD. Based on the above information, the similarity between the four virtual data center nodes is calculated by using the Formula (8), and a similarity matrix is constructed, as shown in figure 3.Finally, we draw the similarity graph according to the similar matrix in figure 4. We give the pseudocode of this strategy in Algorithm 1(Line 15).

3.1.3 Virtual data center topology clustering method based on label propagation

Algorithm 1. Inter-domain virtual data center embedding.Input: virtual data center embedding request Gv, physical data center cluster Gc Output: virtual data center slicing scheme 1. For (each physical data center G G i∈ c)do 2. Rank these data center according to the electricity price with an increasing order 3. End for 4. Construct the initialized virtual node label list for physical data center 5. While (there is G G i∈ c which is not occupied )6. For (each virtual node n G v∈ v)do 7. If the virtual node nv has no label then 8. Check the location requirements of node nv, and rank these locations with an increasing order according to the electricity price 9. If the first location Gj is not occupied then 10. Add a label Gj to the node nv and Gj is occupied 11. End if 12. End if 13. End for 14. End while 15. Calculate the similarity between virtual nodes 16. While (isChange == true)17. isChange = false 18. For(each virtual node n G v∈ v)19. Get the pair of (label, weight) for nv 20. Merge all the pairs (add the pairs with the same label to one pair)21. Rank all the pairs for the updated weight with a descending order 22 If (the label of nv is not equal to the maximum of latest weight)23. The label of nv=the label of largest pair 24. isChange=true 25. End if 26. End for 27. End while 28. Return the label vector

Clustering methods include k-means clustering algorithm, hierarchical clustering algorithm, SOM clustering algorithm and FCM clustering algorithm [11,12]. Among them, the label propagation is simple yet efficient, so in this section we propose a virtual data center topology clustering method based on label propagation, as follows.

First, the clustering information is initialized. In this stage, representative virtual data center node is selected for each physical data center area, taking into account the location requirements of all the virtual data center nodes and the price of the physical data center. Then,according to the Formula (8) proposed in the previous sections of this paper, we calculate the similarity between nodes and construct similar matrices for these nodes. At the same time, the similarity is calculated by the iterative way for the remaining virtual data center nodes, and the similarity matrix is expanded to carry out label propagation. Finally, when no new virtual data center nodes are added to the matrix, it indicates that the similarity calculation between any two different virtual data center nodes has been completed and the label propagation process is completed. We give the pseudocode of this strategy in Algorithm 1(Line 16-27).

3.2 Intra-domain embedding algorithm

In order to solve the energy consumption problem in the virtual data center embedding process, this paper proposes an energy-saving embedding algorithm, which is divided into the following two steps:

1) Cost-aware virtual node embedding method. According to the network topology of the virtual data center embedding request,the corresponding nodes which need to be embedded and the corresponding level in the physical data center network are analyzed, the best-fit strategy is used in the node resource selection, and the worst-fit strategy is used in the bandwidth selection [13]. The above two strategies find the most suitable embedding target for the root node, and then design a recursive algorithm to complete the embedding of the remaining nodes;

2) Cost-aware virtual link embedding method. After the virtual node embedding is completed, the shortest path method of energy saving is used to embed the virtual link to ensure the link demand of the virtual data center.

3.2.1 Intra-domain node embedding algorithm

Based on the node tag method of available resources, we need to design the corresponding node embedding algorithm to achieve the goal of energy saving. According to the topology of the virtual data center embedding request,the corresponding network level is found to ensure that the nodes in the embedding request correspond to the nodes in the physical data center network. When the virtual data center request arrives, the resource requirements of each node in the request and the bandwidth requirements of the link between the nodes are defined. Referring to the previous step, the demand-based node tag method is also used for the virtual data center request. Each virtual switch node adds two attribute variables to represent the number of virtual machine node resources required for its underlying layer and the amount of bandwidth resource requirements.

Here, this paper designs a recursive embedding algorithm to complete the embedding task. First, by querying the boxing algorithm,all the switch nodes at the corresponding level of the physical data center network are sorted by the best-fit strategy according to the number of node resource requirements of the root node in the request to find the most suitable embedding location, and improve resource utilization.

Then, according to the number of bandwidth resource requirements in the lower node of the request, we use the worst-fit strategy to sort all the switch nodes at the corresponding level of the physical data center network to find the most suitable embedding location to ensure the success of the subsequent link embedding ratio.

According to the resource requirements of the virtual node and the size of the available resources on all the physical nodes, we first

calculate the physical node if the virtual node is embedded and the number of resources that can be used by the physical node. And then,we sort the physical node and select the physical node with the most remaining resources.For the worst-fit strategy, compared with the

best-fit strategy, its resource utilization is lower than the best-fit strategy, but the success ratio of subsequent embedding will be higher.Finally, the above two sets of values are added to be sorted. According to the sorting results will be ranked the top of the node as the root node will be mapped to the location.In this way, we not only can achieve the goal of energy conservation, but also facilitate other follow-up embedding request. For other nodes in the embedding request, the embedding is

done in the same way. The specific algorithm

is as follows:

3.2.2 Intra-domain link embedding

algorithm

Link embedding refers to finding an acyclic path between nodes in the corresponding physical data center network after the node embedding is completed, and all links on that path are able to meet with the bandwidth requirements of the corresponding virtual link.In this process, research has been done by using the shortest path method, but it ignores the on/off state of the physical data center network node, and it is easy to increase the number of open nodes and increase the energy consumption. Inspired by this, this paper designs a virtual link embedding algorithm for energy consumption perception. First, in the topology of the physical data center network, the physical links that are not required to embed the virtual link bandwidth resource are deleted, and the remaining graphs are called residual graphs.Then, in the residual graph, we calculate the shortest path. There may be multiple shortest paths here. Finally, in the shortest path, we select the path to be opened at the least number of nodes. The algorithm not only chooses the shortest path, avoids the long path problem of the link embedding, reduces the bandwidth resource cost of the virtual network embedding,but also reduces the number of newly opened nodes, which is beneficial to the goal of saving energy. The specific algorithm is as follows:

3.3 Time complexity

We give the time complexity of each algo-rithm as follows. For the Algorithm 1, inter-domain virtual data center embedding can be finished within the time complexitywhere |Nv| is the number of virtual nodes of VDC. For the Algorithm 2, intra-domain virtual node embedding can be finished within the time complexity ofwhere |Nu| is the number of physical nodes in a single domain physical data center. For the Algorithm 3, inter-domain and intra-domain virtual link embedding can be finished within the time complexity of is, wherethe number of nodes in the physical data center, according to the shortest path algorithm of Floyd.

IV. EXPERIMENT

In this section, we aim to evaluate our proposed cost-aware multi-domain virtual data center embedding algorithm, CA-VDCE. This section is organized as follows: First, the experimental setup is introduced in Section 4.1,and then the experiment results and analysis of cost-aware multi-domain virtual data center embedding algorithm are introduced in Section 4.2.

4.1 Experimental setup

Multi-domain physical data center: it composes of five physical data centers, and each of them is distributed in the (100 × 100) two-dimensional plane. The two-dimensional plane can be divided into five regions, respectively,representing the five different data centers.Different data centers are affected by different electricity markets. For these electricity markets, we have collected real data in terms of the electricity prices in five regions in the USA[14]. For each electricity market, the price changes in every hour. In this paper, one time unit in our experiment represents one minute.Regarding the data center topology, we use three layers of fat-tree structure to define its network topology, based on the real-world data center topology. We also use a typical setting for the switch: 16-port. In three layers of fat-tree structure, there are a total of 320 16-port switches and 1024 physical servers, and 3072 physical links to connect the switches and servers. Considering the isomorphism of the physical data center nodes, the number of available physical resources on each physical server node is evenly distributed within the [50,100] interval, and all physical server nodes are initially in powering off state. Similarly, the link resources on each physical link are also evenly distributed within the [50,100]interval, and the backbone links connecting the five physical data centers are evenly distributed within the [500, 1000] range to ensure the backbone link resources of the domain can meet the high link load and traffic.

Virtual data center embedding request:In this paper, the virtual data center embedding request scale is divided into three categories:small scale, medium scale and large scale, corresponding to one layer, two layers and three layers binary tree structure, respectively. For a small virtual data center embedding request,it consists of one virtual switch node and two virtual machine nodes. For a medium virtual data center embedding request, it consists of three virtual switch nodes and four virtual machine nodes. For a large virtual data center embedding request, it consists of seven virtual switch nodes and eight virtual machine nodes.For each virtual data center scale, the number of demand for node resources is evenly distributed in the range of [0,50], and the number of link bandwidth resource requirements between virtual nodes is also evenly distributed in the range of [0,50]. In each virtual data center embedding instance, there are 2000 virtual data center embedding requests, each of which arrives at the Poisson distribution of five virtual data center embedding requests in 100 time units. The duration of the virtual data center embedding request in the physical data center is subordinate to the negative exponential distribution of 1000 time units. In this paper,10 virtual data center embedding instances are run in each evaluation, and the mean of 10 instances is calculated and recorded as the final result.

Other parameters set: According to the reference [15, 16] research, Pf, Pportin Formula(1) were set to 300W and 5W, and Pb, Plin Formula (4) were set to 150W and 2W/CPU unit.

Comparison algorithms: In this paper, we use the C++ language to implement the costaware multi-domain virtual data center embedding algorithm. We compare our proposed algorithm to three other algorithms, which are random, worst-fit and best-fit VDC embedding algorithms (the algorithms in [37]). Random VDC embedding algorithm is the baseline algorithm, in which the data center and physical nodes on condition that these nodes can satisfy the virtual node constraint are randomly selected. The best-fit algorithm focuses on utilize resources in physical data center, and worst-fit algorithm does well in success ratio of each VDC request. Since meta-heuristics are not the focus of this paper, we don’t compare our proposed algorithms to the PSO based algorithms in [37]. In this section, we mainly compare the two aspects of the request acceptance ratio and the energy cost from the virtual data center. Firstly, it evaluates the effect of different virtual data center embedding request scales on the performance of the algorithm. Then, it evaluates the effect of different virtual data center embedding request arrival densities on the performance of the algorithm.Finally, it evaluates the effect of different virtual data center embedding request durations on the performance of the algorithms. The above two algorithms are implemented with the following hardware configuration: 4GHz 8-core CPU, 6GB memory, 500GB hard drive.The operating system kernel is Linux 4.4.

● Comparison to other algorithms:

We report the comparison result in terms of acceptance ratio and energy cost in figure 5 and figure 6, respectively. From the figures, we observe that, CA-VDCE achieves 34% higher acceptance embedding ratio than R-VDCE,and CA-VDCE reduces average 15% lower energy cost than R-VDCE on average. This is expected. The reasons are as follows. Firstly,we use clustering and label propagation strategy during inter-domain VDC embedding, which helps to consider the electricity price and energy consumption in a joint way and thus results in lower energy cost. Secondly,we use node labeling strategy and both best-fit and worst-fit strategies in intra-domain VDC embedding, which help to find the appropriate physical nodes to save more resources for the incoming data center requests and make our algorithm well behaved in acceptance ratio of VDC embedding.

Fig. 5. Comparison between our algorithm and other embedding algorithms on different VDC scales (acceptance ratio of embedding).

Fig. 6. Comparison between our algorithm and other embedding algorithms on different VDC scales (energy cost).

To quantify the optimality of EA-VDCE,we compare EA-VDCE with the standard ILP solver GNU Linear Programming Kit(GLPK). We fix the number of physical nodes at 50 nodes and the size of virtual data center at small level. The results are shown in figure 8. We observe that with the number of physical nodes increasing, the approximation ratio of EA-VDCE decreases and it achieves more optimal solutions. This is because that, with more number of physical nodes, EA-VDCE can explores larger space for finding the optimal solution of virtual data center embedding.

● The impact of virtual data center embedding request scales on algorithm performance

We next evaluate the impact of VDC scales on algorithm performance. We vary the input of request into three scales: small VDC size(1 virtual switch, 2 virtual machines), medium VDC size (3 virtual switches, 4 virtual machines) and large VDC (7 virtual switches,8 virtual machines). We give the compassion in terms of acceptance embedding ratio and energy cost in figure 7. From this figure, we find that for the small VDC size, our algorithm achieves about 12% lower energy cost than R-VDCE. While our algorithm performs better on medium VDCs and achieves about 21% lower energy cost. It is as expected. Because small requests have fewer nodes and less complexity, at the same time the resources in physical data center are abundant, there will be various embedding schemes, and our algorithm has less advantage during the VDC embedding compared with R-VDCE. Then we figure out that for large VDCs, our algorithm also achieves about 13% lower energy cost compared to R-VDCE. It is because resources are not enough and the physical data centers are even over loaded when facing large VDCs.The nodes in the physical data center are almost power-on, little optimization space for our algorithm has been left. So our algorithm performs the best when dealing with medium VDCs.

● The impact of virtual data center embedding request durations on algorithm performance

We then illustrate the impact of VDC duration on algorithm performance. We vary VDC request duration into three levels: CA-VDCE(D) has the normal duration time, CA-VDCE(2D) has 2X of the normal duration time,and CA-VDCE (4D) has 4X of the normal duration time. We report the experiment results in terms of energy cost in figure 9. From the figure, we observe that shorter duration makes lower energy cost, but the duration time and energy cost do not show a linear relationship. It is expected, for the reason that VDC requests consume less energy in shorter durations. And the number of physical nodes,which are power-on, does not increase when VDC request duration time grows up, so the total energy cost does not increase linearly with VDC request duration time.

● The impact of virtual data center embedding request arrival rates on algorithm performance

Finally, we figure out the impact of VDC request arrival rate on algorithm performance.We also vary VDC request arrival rate into three levels: CA-VDCE(T) has the normal arrival rate, CA-VDCE(0.5T) has the half arrival rate and CA-VDCE(2T) has 2X of the normal arrival rate. We report the comparison result in terms of energy cost in figure 10. From this figure which is similar to the figure 9, we find that lower arrival rate leads to lower energy cost, but the arrival rate and energy cost do not show a linear relationship. It is in our expectations. As a result of fewer requests and fewer virtual nodes, the resources are small, and the energy cost becomes lower. The number of physical nodes which are newly opened does not increase at the same speed as VDCs arrival rate. Thus the total energy cost does not grow up linearly with the VDCs arrival rate.

V. RELATED WORK

5.1 Virtual data center embedding

Fig. 7. Comparison between our algorithm and other embedding algorithms.

Fig. 8. Comparison between our algorithm and GLPK

Fig. 9. The energy cost over different settings of durations.

Fig. 10. The energy cost over different settings of request rate.

The current research on virtual data center network is mainly focused on packet forwarding strategy, bandwidth control, multipath transmission, bandwidth sharing and other major aspects. Diverter [17] applies the software called VNET to facilitate the process of forwarding packets for switches and routers, while ignoring the bandwidth issue during node communications. VICTOR [18]leverages the network virtualization architecture to achieve the migration virtual machines across multiple networks without losing service continuity. To make a tradeoff among the performance guarantees offered to tenants,their costs and the provider revenue, Oktopus[19] presents the design of virtual network abstractions and define two kinds of virtual data center mapping request types. It is used for certain data center topology but it isn’t applied to all kinds of network topologies. SecondNet[20] introduces a centralized VDC allocation algorithm for bandwidth guaranteed virtual to physical embedding and provides three levels of embedding schemes to meet the needs of different VDC requests, but it isn’t applied to many types of topology-based data center networks. For example, for its algorithm, it achieves a higher utilization of network resources in BCube [21] topology, but a lower utilization for VL2 [22] and in fat-tree topologies. CloudNaaS [23] leverages techniques such as software defined networking (SDN) to provide flexible and fine-grained control of the network. The advantage lies in that it has the potential to save cost and complexity while its disadvantage lies in that it has difficult operation and higher network communication overhead. For the above research, they do not consider the energy problem in the process of scheduling resources in the multi-domain data center. This is the difference between these studies and this paper.

Recently, to make a tradeoff between the operator revenue, energy consumption and carbon emissions, Greenhead [24] first divides the virtual data center request into slices by Louvain algorithm and then solves the multi-domain physical data center embedding problem. The difference between this paper and ours is that its data center is across multiple regions and is connected to each other through the backbone network of the NSFNet topology. They pay more attention on the backbone network energy saving while our focus is on the data center energy saving. Their work is orthogonal to our work. After the combination of their work and ours, the energy saving can be much more optimized.

5.2 Virtual network embedding

The reports [25-27] show that energy consumption has gradually become the main operating costs (40-50%) of physical network operators. Therefore, reducing energy consumption has become the main part for physical network operators to reduce operating costs and improve net profit. Studies [28-35]have only taken into account how to maximize the effectiveness of the physical network,while ignoring the physical network energy consumption. Therefore, we study the costaware multi-domain virtual network embedding problem [36,37], and then put forward the corresponding heuristic and meta-heuristic approach to optimize the energy consumption of physical network operators. However, this method does not take into account the dual factors of power price and energy consumption in a unified way, and still has a large optimization space in terms of energy saving. In the study [38], a high-cost and energy-efficient dual-objective virtual network embedding problem is proposed, and the corresponding mathematical model is given. However, it is solved by using the classical linear mathematical programming, which suffers from the poor performance and ignores the dynamics of electricity prices. Zhang et al. [39] design a heuristic cost-aware virtual network migration algorithm called EA-VNM, which re-optimizes the energy cost by leveraging the migration technique, in order to minimizes the energy cost while maximizing the revenue of the physical network at the same time. Mano et al.[40] analytically and empirically investigate the trade-off between the embedding time and cost for the virtual network reduction. Then they define two simple reduction algorithms and find out that the embedding cost increases only linearly with exponential decay of embedding time.

Hou et al. [41] propose a novel and efficient rick-aware VNE heuristic algorithm to perform physical isolation between risky and security VMs to evaluating VM threat and vulnerability. This algorithm performs very well in maintaining ODCN (optical datacenter network) security and earning rental revenue.Hou et al. [42] propose the energy efficient survivable virtual network embedding algorithm for the collaborative edge computing in smart cities, and the algorithm shows great effectiveness in network survivability. These two studies point out the great potential future work for the researchers in the VN embedding problem.

Recently, in other infrastructures, there are also some interesting studies to improve the resource utilization and reduce energy consumption. Li et al. [43] propose a novel ECIoT architecture, which employs cross-layer dynamic stochastic network optimization to maximize the system utility to improve throughput,reduce end-to-end delay, achieve an average throughput and delay trade-off. There are also some related studies about virtual network embedding in 4G [44] and 5G network [45-46].They also aim to provide higher resource utilization and energy efficiency.

VI. CONCLUSION

In this paper, to study the cost-aware multi-domain virtual data center embedding problem,we first present the energy cost model and then design a heuristic yet efficient algorithm.In the energy cost model, we abstract two kinds of physical nodes and give their energy cost modelling. The proposed algorithm is a two-step algorithm. In the first step, we design a label propagation based on inter-domain virtual data center algorithm to consider the electricity price and energy consumption in a unified way. In the second step, we design two efficient virtual algorithms for the virtual node embedding and virtual link embedding,respectively. The experiment results show that our proposed algorithm can reduce the power cost while ensuring the success ratio of embedding across the virtual data center.

ACKNOWLEDGEMENT

This work was supported in part by the following funding agencies of China: National Natural Science Foundation under Grant 61602050 and U1534201, and the National Key Research and Development Program of China under Grant 2016QY01W0200.