Improved Min-Min grid scheduling algorithm based on segmentation


Abstract: Building on the classic Min-Min scheduling algorithm, an improved strategy based on the idea of "segmentation" is proposed, and the algorithm is simulated using the HyperSim grid simulator. The improved algorithm addresses the unbalanced load produced by the traditional Min-Min algorithm. Simulation results show that the improved algorithm is reasonable and performs better than Min-Min.

Keywords: Scheduling; Min-Min; Divided-Min-Min; Simulation; HyperSim

Grid technology is one of today's research hotspots in computing. With its rapid development, task scheduling in grid computing has become increasingly important. In a grid environment, the running speed of each processor, the load of each host, and the network communication time all change dynamically, which makes resource management and scheduling very complicated. Task scheduling in a multi-machine environment is a well-known NP-complete problem. A great deal of research has been done on task scheduling algorithms for grid computing, and various static and dynamic scheduling algorithms have been proposed. This article analyzes the classic Min-Min scheduling algorithm, points out its defects, and on this basis proposes an improved algorithm based on the idea of "segmentation", which is then evaluated by simulation experiments in a grid simulator environment. The simulation results verify the rationality and effectiveness of the improved algorithm.


1 Min-Min algorithm overview
In a grid environment, an efficient scheduling strategy or algorithm can make full use of the processing power of the grid system and thereby improve application performance. The essence of the task scheduling problem is this: in a grid environment with m tasks to be scheduled and n available execution units (hosts or clusters), map the m tasks T = {t1, t2, ..., tm} onto the n hosts H = {h1, h2, ..., hn} in a reasonable way so that the total execution time (makespan) is as small as possible. The predicted execution times ETC (Expected Time to Compute) of the m tasks on the n machines form an m × n matrix. ETC(i, j) is the predicted execution time of the i-th task on the j-th machine; each row gives the execution times of one task on the n machines, and each column gives the execution times of the m tasks on one machine.
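
To make the ETC matrix and the makespan concrete, the following minimal sketch computes the makespan of a given task-to-host mapping. The 3 × 2 matrix and the mapping are hypothetical values chosen for illustration, and the sketch assumes that every host starts idle and runs its assigned tasks one after another.

```python
from typing import List


def makespan(etc: List[List[float]], assignment: List[int], n_hosts: int) -> float:
    """Return the largest per-host sum of predicted execution times.

    etc[i][j] is the predicted time of task i on host j;
    assignment[i] is the index of the host that task i is mapped to.
    """
    load = [0.0] * n_hosts
    for task, host in enumerate(assignment):
        load[host] += etc[task][host]
    return max(load)


# Hypothetical ETC matrix: rows are tasks t1..t3, columns are hosts h1..h2.
etc = [
    [4.0, 6.0],
    [3.0, 2.0],
    [5.0, 9.0],
]
assignment = [0, 1, 0]  # t1 -> h1, t2 -> h2, t3 -> h1

print(makespan(etc, assignment, n_hosts=2))  # 9.0 (h1 runs 4 + 5, h2 runs 2)
```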


Task-resource mapping with completion time as the optimization goal is an NP-complete problem, so heuristic algorithms are needed. Tracy D. Braun et al. have studied the traditional static heuristics in detail, including Min-Min, Max-Min, A* and GA. Their results show that the GA algorithm performs best across different ETC matrices, followed by Min-Min and A*. Their research also shows that, for each ETC matrix, the average execution time of the GA algorithm is 60 seconds, while that of the A* algorithm is 20 minutes. The parameters used by these algorithms are predicted values obtained through services such as NWS, and in a grid environment they change rapidly. Because GA and A* take so long to compute, the schedule they produce may no longer fit the actual state of the grid.


The Min-Min algorithm is still one of the foundations of current research on grid scheduling algorithms. Its main idea is: while the set of tasks to be scheduled is not empty, repeat the following operations until the set is empty:


(1) For each task ti in the set of tasks waiting to be allocated, calculate the minimum completion time obtained by allocating the task to each of the n machines, denoted MinTime(i). This yields a one-dimensional array MinTime with one element per waiting task (m elements initially);


(2) Let the k-th element be the smallest in the MinTime array and let b be its corresponding host; assign task k to machine b;


(3) Delete task k from the task set.
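
The three steps above can be sketched as follows. This is an illustrative implementation rather than the authors' code: the function name `min_min`, the `ready` list of host-available times, the tie-breaking by lowest index, and the example matrix are all assumptions made for the sketch.

```python
from typing import Dict, List, Tuple


def min_min(etc: List[List[float]]) -> Tuple[List[Tuple[int, int]], List[float]]:
    """Return (schedule, ready): (task, host) pairs in assignment order and
    the time at which each host finishes the tasks assigned to it."""
    n = len(etc[0])
    ready = [0.0] * n               # assumption: every host starts idle
    pending = set(range(len(etc)))
    schedule: List[Tuple[int, int]] = []
    while pending:
        # Step 1: MinTime(i) and the host that achieves it, per pending task.
        min_time: Dict[int, Tuple[float, int]] = {}
        for t in pending:
            min_time[t] = min((ready[j] + etc[t][j], j) for j in range(n))
        # Step 2: the task k with the smallest MinTime goes to its host b.
        k = min(pending, key=lambda t: (min_time[t][0], t))
        finish, b = min_time[k]
        ready[b] = finish
        schedule.append((k, b))
        # Step 3: delete task k from the task set.
        pending.remove(k)
    return schedule, ready


# Hypothetical ETC matrix: rows are tasks, columns are hosts.
etc = [
    [4.0, 6.0],
    [3.0, 2.0],
    [5.0, 9.0],
]
schedule, ready = min_min(etc)
print(schedule)    # [(1, 1), (0, 0), (2, 0)]
print(max(ready))  # 9.0
```

In this small example the shortest task is placed first and the longest task is placed last, on a host that is already busy.
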
Because the Min-Min algorithm always schedules short tasks first, tasks with longer execution times are left until last and can run only once a machine becomes idle, which leads to unbalanced host load and reduced utilization. To address this, this paper proposes an improved algorithm that gives priority to tasks with long execution times: the segmented Min-Min algorithm, Dmm (Divided-Min-Min).


2 Improved Min-Min algorithm based on segmentation
The Dmm algorithm first sorts the tasks by ETC, that is, by the average, minimum, or maximum ETC of each task, into a sequence ordered from largest to smallest. The sequence is then divided into segments of equal size; the segment of long tasks is scheduled first, followed by the segments of shorter tasks. Within each segment, the Min-Min algorithm is still used for scheduling. The Dmm algorithm is described as follows:


(1) Calculate the sorting value key_i for each task. In a heterogeneous environment, the execution time of the same task differs from machine to machine; this is called grid task heterogeneity. To account for it, three sub-strategies were tested for determining the sorting value: the average, minimum, and maximum expected execution time.


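Sub-strategy 1: Dmm-avg calculates the average value of each row in the ETC matrix:

$key_i = \frac{1}{n} \sum_{j=1}^{n} ETC(i, j)$


Sub-strategy 2: Dmm-min calculates the minimum value of each row in the ETC matrix:

$key_i = \min_{1 \le j \le n} ETC(i, j)$


Sub-strategy 3: Dmm-max calculates the maximum value of each row in the ETC matrix:

$key_i = \max_{1 \le j \le n} ETC(i, j)$
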

(2) According to the sorting value, the task set is arranged in descending order to form an ordered sequence.
(3) The task sequence is evenly divided into N segments.
(4) The Min-Min algorithm is used to schedule each task segment in turn.
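
Steps (1) to (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `dmm`, `min_min_segment` and `mean`, the way uneven segments are split, the tie-breaking by lowest index, and the homogeneous example matrix are all hypothetical. Passing the built-in `min` or `max` as `key` would give the Dmm-min and Dmm-max sub-strategies.

```python
from typing import Callable, List, Sequence, Tuple


def mean(row: Sequence[float]) -> float:
    """Sorting value for the Dmm-avg sub-strategy."""
    return sum(row) / len(row)


def min_min_segment(etc: List[List[float]], tasks: Sequence[int],
                    ready: List[float]) -> List[Tuple[int, int]]:
    """Schedule `tasks` with Min-Min, updating host-ready times in place."""
    pending = list(tasks)
    placed: List[Tuple[int, int]] = []
    while pending:
        # Smallest completion time over all (pending task, host) pairs.
        finish, k, b = min((ready[j] + etc[t][j], t, j)
                           for t in pending for j in range(len(ready)))
        ready[b] = finish
        placed.append((k, b))
        pending.remove(k)
    return placed


def dmm(etc: List[List[float]], n_segments: int = 4,
        key: Callable[[Sequence[float]], float] = mean
        ) -> Tuple[List[Tuple[int, int]], float]:
    """Return (schedule, makespan) for a segmented Min-Min run."""
    m, n = len(etc), len(etc[0])
    # Steps (1) and (2): compute key_i and sort tasks in descending order.
    order = sorted(range(m), key=lambda t: key(etc[t]), reverse=True)
    ready = [0.0] * n  # assumption: every host starts idle
    schedule: List[Tuple[int, int]] = []
    # Steps (3) and (4): split into N near-equal segments, Min-Min on each.
    for s in range(n_segments):
        segment = order[s * m // n_segments:(s + 1) * m // n_segments]
        schedule += min_min_segment(etc, segment, ready)
    return schedule, max(ready)


# Hypothetical example: three tasks of length 1, 2 and 3 on two equal hosts.
etc = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
print(dmm(etc, n_segments=1))  # plain Min-Min: makespan 4.0
print(dmm(etc, n_segments=2))  # long task first: makespan 3.0
```

With a single segment the function reduces to plain Min-Min. On this toy matrix, scheduling the longest task first lowers the makespan from 4.0 to 3.0; other matrices may behave differently.
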


Unlike the Min-Min algorithm, the Dmm algorithm orders the tasks before scheduling so that tasks with long execution times are scheduled earlier, and then applies Min-Min locally within each task segment. The key to the algorithm is how to determine the sorting value of each task so that long tasks are scheduled first.


The third step of the Dmm algorithm divides the task sequence into N segments, and the main question is how to choose the best value of N. On the one hand, the larger N is, the more balanced the load tends to be; on the other hand, an excessively large N reduces the efficiency of the Min-Min algorithm. The curves in Figure 1 show the improvement of the Dmm algorithm with the Dmm-avg sub-strategy over the Min-Min algorithm for different values of N. When c = m / n is small, that is, when the average number of tasks allocated to each machine is small, the Dmm algorithm performs well. Regardless of the value of c, the curves in Figure 1 always reach the greatest improvement when N is 4 or 5. Therefore N is set to 4, and the task sequence is normally divided into 4 segments.



3 Simulation experiment and result analysis
Here we use the HyperSim simulator to evaluate the improved algorithm. HyperSim is a general-purpose discrete event simulation library written in C++, which provides a set of library functions for building a simulator for a specific computing environment or application domain. HyperSim provides a rich set of classes, such as event generators, statistical analyzers, automatic trajectory simulators and event manipulators, for constructing a simulation environment. It follows the event graph model, which helps to optimize simulation speed and increase scalability. Compared with other simulators, the biggest advantage of HyperSim is that it runs fast and can be used to simulate large-scale grid environments. It is also versatile, scalable and easy to configure.


HyperSim provides a set of library functions that make it easy to randomly generate parameters such as host processing capability, network bandwidth and communication delay. This paper designs a simulation program to test the performance of the improved algorithm. The value of N is 4, the number of computing resources is 10, and the predicted execution time of each task is given. Figure 2 shows the simulation results of scheduling onto 10 processors for grid jobs containing different numbers of tasks. Each point in the figure is the average of 5 simulation runs. The figure shows that all three sub-strategies of the Dmm algorithm perform better than the Min-Min algorithm; Dmm-min performs better than Dmm-avg in some cases, but in many cases it is not as good as Dmm-avg; and Dmm-max always performs worse than Dmm-avg. Therefore, the Dmm-avg sub-strategy is adopted as the Dmm algorithm. Simulation results show that the Dmm-avg algorithm improves on the Min-Min algorithm by 4.2% to 6.1%.



Figure 3 compares the load balance of the four algorithms. The load of the processors under the Min-Min algorithm is extremely unbalanced, mainly because Min-Min allocates the tasks with the shortest execution times to the machines with the smallest load first, so that the tasks with long execution times end up on machines that are already heavily loaded. The load balance of the Dmm algorithm is better than that of the Min-Min algorithm. The Dmm-avg curve is the smoothest, indicating that this sub-strategy achieves the best load balance.


The experimental results also show that the Dmm algorithm itself runs faster than the Min-Min algorithm: Min-Min has to search the entire ETC matrix for the task with the smallest completion time, whereas Dmm, being segmented, only needs to search within each segment. Segmentation therefore not only reduces the makespan but also shortens the running time of the scheduler.
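
A rough operation count is consistent with this observation. It rests on an assumption not stated in the source: that Min-Min is implemented by rescanning every pending task against every host in each round. With m tasks, n hosts and N equal segments:

$$
T_{\text{Min-Min}} = \sum_{r=1}^{m} r \cdot n = \frac{m(m+1)}{2}\, n = O(m^2 n)
$$

$$
T_{\text{Dmm}} = \underbrace{O(mn + m \log m)}_{\text{keys and sorting}} + N \cdot O\!\left(\left(\frac{m}{N}\right)^{2} n\right) = O\!\left(mn + m \log m + \frac{m^2 n}{N}\right)
$$

Under this assumption the scanning cost for large m falls by roughly a factor of N, that is, about 4 when N = 4. A Min-Min implementation that caches per-task minima would change the constants.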



This paper analyzes the widely used Min-Min algorithm and, on this basis, proposes the Dmm algorithm, which offers better performance and better load balance. The HyperSim simulator was used to verify the Dmm algorithm, and the simulation results show that it has several advantages over the Min-Min algorithm. Since Min-Min is a very basic algorithm in grid scheduling, the improved algorithm should also benefit existing scheduling algorithms, especially those based on Min-Min, by further improving their efficiency.
