
A Multi-UCAV cooperative occupation method based on weapon engagement zones for beyond-visual-range air combat

2022-06-27

Defence Technology, 2022, Issue 6

Wei-hu Li, Jing-ping Shi b,*, Yun-yan Wu, Yue-ping Wang, Yong-xi Lyu b

a School of Automation, Northwestern Polytechnical University, Xi'an, 710129, China

b Shaanxi Province Key Laboratory of Flight Control and Simulation Technology, Xi'an, 710129, China

c Science and Technology on Aircraft Control Laboratory, FACRI, Xi'an, 710065, China

Keywords: Unmanned combat aerial vehicle; Cooperative occupation; Beyond-visual-range air combat; Weapon engagement zone; Discrete particle swarm optimization; Formation switching

ABSTRACT Recent advances in on-board radar and missile capabilities, combined with individual payload limitations, have led to increased interest in the use of unmanned combat aerial vehicles (UCAVs) for cooperative occupation during beyond-visual-range (BVR) air combat. However, prior research on occupational decision-making in BVR air combat has mostly been limited to one-on-one scenarios. As such, this study presents a practical cooperative occupation decision-making methodology for use with multiple UCAVs. The weapon engagement zone (WEZ) and combat geometry were first used to develop an advantage function for situational assessment of one-on-one engagement. An encircling advantage function was then designed to represent the cooperation of UCAVs, thereby establishing a cooperative occupation model. The corresponding objective function was derived from the one-on-one engagement advantage function and the encircling advantage function. The resulting model exhibited similarities to a mixed-integer nonlinear programming (MINLP) problem. As such, an improved discrete particle swarm optimization (DPSO) algorithm was used to identify a solution. The occupation process was then converted into a formation switching task as part of the cooperative occupation model. A series of simulations were conducted to verify occupational solutions in varying situations, including two-on-two engagement. Simulated results showed these solutions varied with initial conditions and weighting coefficients. This occupation process, based on formation switching, effectively demonstrates the viability of the proposed technique. These cooperative occupation results could provide a theoretical framework for subsequent research in cooperative BVR air combat.

1. Introduction

Recent technological advances have made UCAVs increasingly indispensable in modern warfare. UCAVs offer several inherent advantages over manned combat aerial vehicles, including low cost, zero casualties, high maneuverability [1,2], and long-term sustainability. Conventionally, UCAV combat has primarily involved air-to-ground attacks under the remote monitoring or operation of a ground commander. However, improvements in autonomous capabilities are an inevitable trend in future air combat.

Autonomous air combat with UCAVs has been investigated in multiple studies, initially being proposed as a pursuit-evasion game [3-5], which only partly represents the nature of these confrontations. Modeling this problem first requires identifying the pursuer and the evader, which ignores the frequent interchange of roles that occurs during actual combat. Two-target games have been proposed to solve this problem, using representative models from the matrix game [6] and the influence diagram game [7-9]. Autonomous air combat decision-making systems for UCAVs have also been constructed using rule-based expert systems to mimic the behavior of human pilots [10,11]. However, it is difficult to design an expert system for every given scenario. In recent years, reinforcement learning has introduced new possibilities for the development of autonomous air combat systems [12-14]. For example, a virtual F-16 system controlled by reinforcement learning recently defeated a human pilot by a score of 5-0 in a virtual dogfight hosted by the Defense Advanced Research Projects Agency (DARPA) [15].

Load capabilities typically limit single UCAVs to simple tasks, but multiple UCAVs can execute more complex tasks through cooperation. For example, Shin et al. proposed an autonomous aerial combat framework for two-on-two engagement, based on basic fighter maneuvers [16]. Mean-field multi-agent reinforcement learning has also shown promise for multi-UCAV cooperative air combat [17].

Each of these studies involved within-visual-range (WVR) air combat, commonly referred to as dogfighting. However, the development of detection and missile technology has led to beyond-visual-range (BVR) air combat. Recent studies on one-on-one BVR air combat have primarily focused on situational assessment [18], weapon engagement zone (WEZ) calculations [19,20], occupational position acquisition [21], and guidance laws [22-25]. Existing studies on many-on-many BVR air combat primarily involve multi-UCAV cooperative target assignments [26-30], which are based on the assumption that both sides have entered the opponent's attack range. Thus, accurate occupational positions can significantly improve the performance of a target assignment. Though UCAVs on both sides are in a constant state of adjusting their relative attack positions, little has been reported on cooperative occupation. Ma et al. proposed a cooperative occupation decision-making model for multi-UCAV BVR air combat based on a zero-sum matrix game, which was solved using a double oracle combined with a neighborhood search (DO-NS) algorithm [31]. However, the WEZ was not included in this technique, limiting the practicality of occupation area choices. In this study, a new decision-making methodology based on the WEZ is proposed to fill existing gaps in the study of cooperative occupation problems in BVR air combat.

Cooperative occupation of multi-UCAV systems in BVR air combat can be considered a two-stage problem, as shown in Fig. 1. In the first stage, a cooperative occupation model and its corresponding solution algorithm are used to calculate occupation information. In the second stage, a guidance law can help drive the UCAVs to their corresponding occupation positions.

The primary contributions of this paper are as follows. 1) A cooperative occupation framework is presented, which involves a combination of occupation acquisition and a path planning algorithm or guidance law. 2) A cooperative occupation model and its corresponding solution algorithm are proposed, which not only enable cooperation among UCAVs but also provide detailed destination locations and heading information for the path planning model.

The remainder of this paper is organized as follows. Section 2 demonstrates the process used to establish a cooperative occupation model for multi-UCAV systems in BVR air combat. Section 3 presents an improved DPSO algorithm for solving the cooperative occupation model. Section 4 presents a formation switching algorithm utilizing information acquired in Sections 2 and 3. Section 5 presents and analyzes simulation results. Section 6 concludes the paper and discusses the potential for future research.

Fig. 1. The cooperative occupation framework.

2. BVR cooperative air combat problem formulation

In multi-UCAV cooperative BVR air combat, the primary objective for both sides is to occupy advantageous attack positions. The following assumptions are made in this paper. 1) UCAVs from both sides are in a head-on situation, the primary mode of BVR air combat. 2) All hostile UCAVs are traveling west. The second assumption also means that the hostile UCAVs are traveling in one direction. BVR air combat typically involves multiple waves of mutually fired long-range air-to-air missiles. The algorithm in this paper is proposed to solve the multi-UCAV cooperative occupation problem prior to the first wave of missiles. In other words, UCAVs are entering the battlefield in this application scenario. As stated in Ref. [32], combat aircraft typically enter the battlefield in formation. Therefore, we can assume that heading differences between hostile UCAVs can be ignored, and the assumption that all hostile UCAVs are traveling in one direction is realistic to a certain extent. If the hostile UCAVs are not flying west, the coordinate system can be rotated such that they fly westward in the new coordinates.
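The coordinate rotation mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `rotate_to_westbound` and the (east, north) position layout are assumptions, and headings follow the convention used later in the paper (0° points north and the angle increases clockwise, so west is 270°).

```python
import math


def rotate_to_westbound(points, headings_deg, hostile_heading_deg):
    """Rotate the scene so that the hostile formation flies due west (270 deg).

    points: list of (east, north) positions in metres (assumed layout).
    headings_deg: headings in degrees, 0 = north, increasing clockwise.
    hostile_heading_deg: common heading of the hostile formation.
    """
    delta_deg = 270.0 - hostile_heading_deg  # clockwise rotation to apply
    delta = math.radians(delta_deg)
    c, s = math.cos(delta), math.sin(delta)
    # Clockwise rotation of an (east, north) vector by delta.
    rotated_points = [(e * c + n * s, -e * s + n * c) for e, n in points]
    rotated_headings = [(h + delta_deg) % 360.0 for h in headings_deg]
    return rotated_points, rotated_headings


# Hostile formation flying south (180 deg), located 1 km north of the origin.
pts, hds = rotate_to_westbound([(0.0, 1000.0)], [180.0, 0.0], 180.0)
```

After the rotation the hostile UCAV lies to the east and flies west, while a red UCAV that was flying north (head-on to the hostile) now flies east.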

In actual BVR air combat, aircraft cannot be too close together, or they are easily tracked and attacked by a single hostile aircraft. Large inter-aircraft distances and a large number of aircraft produce a larger distribution space. The distribution range of a formation should not be too large in practical situations, as it is limited by the communication range of the data link. The number of aircraft in a single formation should also not be too large because of the limited communication bandwidth. As described in Ref. [32], when there are more than two combat aircraft on one side, they are typically divided into several two-aircraft groups. In this study, we therefore only consider scenarios where the number of UCAVs in a single group is no more than 3.

This section focuses on the problem of cooperative occupation modeling in BVR air combat. Section 2.1 presents a one-on-one situation assessment methodology, based on the WEZ, in which three geometric factors (azimuth angle, entry angle, and distance) are used to construct a one-on-one engagement advantage function. In Section 2.2, an encircling advantage function is proposed to represent the cooperative encirclement of a single target by multiple UCAVs. Section 2.3 presents a cooperative occupation model for BVR air combat, in which an objective function is derived from the one-on-one engagement and encircling advantage functions. For simplicity, the UCAVs operated using the proposed method are collectively referred to as the red (attacker) side, while the others are denoted the blue (target) side.

2.1. One-on-one engagement advantage function

The complete one-on-one BVR air combat attack process can be described as follows. The attacker enters the target's WEZ and launches missiles from an optimal position. When the attacker is outside the target's WEZ, it is assumed the attacker does not pose a threat to the target and vice versa.

The WEZ defines a vulnerable region around a target. When the attacker launches a missile from inside this space, the missile can destroy a target that continues moving along a linear path with a constant velocity. The WEZ can also be calculated using a pattern search algorithm for higher precision [21].

The assumption that a target will move linearly with a constant speed is unreasonable in actual air combat. As such, the non-escape zone (NEZ) can be used to define a narrower region around the target. When the attacker launches a missile from this zone, it will destroy the target regardless of how it moves. Thus, the probability of destroying a target is significantly higher in the NEZ than in the WEZ.

Fig. 2. The geometric relationship between the attacker and the target.

The WEZ and NEZ were identified using a pattern search algorithm, as shown in Fig. 3 [21]. The target's threat zone is also displayed in the figure. The ranges of these zones are largest when φ = 0° and q = 180° (i.e., the attacker and target are traveling in head-on trajectories). The target's threat zone is primarily affected by the maximum off-axis launch angle φ_max. In addition, the target's azimuth φ, entry angle q, and line-of-sight (LOS) range D are key factors affecting situational assessment in one-on-one BVR engagement.

The azimuthal advantage function can be defined as follows:

where A_φ is the azimuthal advantage of the attacker relative to the target. It is easier to satisfy missile launching conditions at smaller azimuthal angles (φ). If a missile's off-axis launch angle exceeds the maximum allowable off-axis launch angle φ_max, launch conditions will no longer be satisfied.

Table 1. The notations and variables shown in Fig. 2.

Fig. 3. The weapon engagement, non-escape, and threat zones.

The entry angle advantage function can be expressed as:

where A_q is the entry angle advantage for the attacker relative to the target. The attacker has the largest threat range when the entry angle q = 180° and the azimuth angle φ = 0°.

The distance advantage function can be expressed as:

where A_D is the distance advantage of the attacker relative to the target, and D_NEZ and D_WEZ are the far boundaries of the NEZ and WEZ, respectively. These far boundaries are correlated with the current entry angle. If the distance D between the attacker and the target is smaller than D_NEZ, the attacker has entered the NEZ and A_D is equal to one. If D is between D_NEZ and D_WEZ, the attacker is outside the NEZ but inside the WEZ. As D increases, the attacker's advantage over the target gradually decreases. If D is larger than D_WEZ, the attacker is outside the WEZ and no longer poses a threat to the target.
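The three distance regimes described above can be sketched as a piecewise function. The exact decay between the NEZ and WEZ far boundaries is not reproduced in this text, so the linear ramp below is purely an assumption made for illustration, as are the names `distance_advantage`, `d_nez`, and `d_wez`.

```python
def distance_advantage(d, d_nez, d_wez):
    """Piecewise distance advantage (illustrative sketch).

    d:     attacker-target distance.
    d_nez: far boundary of the NEZ at the current entry angle.
    d_wez: far boundary of the WEZ at the current entry angle.
    """
    if d_nez > d_wez:
        raise ValueError("the NEZ far boundary cannot exceed the WEZ far boundary")
    if d <= d_nez:
        return 1.0  # inside the NEZ: full advantage
    if d >= d_wez:
        return 0.0  # outside the WEZ: no threat to the target
    # Between the two boundaries: ASSUMED linear decrease from 1 to 0.
    return (d_wez - d) / (d_wez - d_nez)


print(distance_advantage(40_000.0, 20_000.0, 60_000.0))
```

Any monotonically decreasing ramp would match the qualitative description; the linear form is only the simplest choice.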

Fig. 4 illustrates how the entry angle advantage function varies when φ = 0° (b) and how the distance advantage function varies when φ = 0° and q = 180° (c). The terms φ, q, and D are three coupled factors affecting situational assessment in one-on-one air combat, for which the integrated advantage function can be expressed as:

where A is the integrated advantage of the attacker relative to the target and γ_1 and γ_2 are weight coefficients satisfying γ_1 + γ_2 = 1.

The profit of the attacker over the target is given by:

where T is the threat of the target to the attacker. It is assumed the attacker always points toward the target, while the target's heading angle is maintained at -90°. The variation of the attacker's advantage relative to the target, the threat of the target to the attacker, and the attacker's profit can be seen in Figs. 5-7, respectively. In each figure, (b) shows a top view of the subfigure in (a).

2.2. Encircling advantage function

The cooperative encirclement of a single target by multiple UCAVs can be represented using an encircling score function. A typical evasion strategy for a target in BVR air combat can be described as follows. The target performs a maneuvering turn until its heading meets the evasion requirement. The current speed and direction are then maintained in a straight-line flight. The evasion requirement dictates that if the target maintains this heading, it can escape from the threat zone in the shortest possible time.

As shown in Fig. 8, the WEZ can be rotated around the target, such that the line between the attacker and the target forms a line of symmetry for the WEZ. The black dotted line is referred to as the marker line for the attacker-target distance. It forms a circle of radius D with the target as the center and intersects the rotated WEZ at two points. Connections from the target to the two intersection points are then denoted by the two red dotted lines, which divide the rotated WEZ into two distinct regions. The area containing the attacker is called the target's maneuvering threatened zone. If the target moves into this zone after discovering the attacker's intention, there is a relatively high probability it will be destroyed before it can escape. The angle θ between the two red dotted lines in the maneuvering threatened zone is called the target's maneuvering threatened angle.

The angle θ can be calculated by first expanding the WEZ and the marker line along the entry angle dimension, as shown in Fig. 9(a). Since the intersections of the expanded WEZ and the expanded marker line are symmetric about the 180° line, only one of the entry angles corresponding to the intersection points needs to be considered. Here, the left intersection was selected, and its corresponding entry angle was calculated using the golden section method, as shown in Fig. 9(b).

Fig. 4. Changing regularities for each advantage function.

The initial search range is defined as [a, b], where a = 0° and b = 180°. The golden section point is then calculated as R = a + 0.618(b - a). If the far boundary distance of the WEZ corresponding to R is longer than the attacker-target distance D, we set b = R; otherwise, we set a = R. These steps are repeated until |b - a| < ε, where ε is a sufficiently small positive value. The entry angle corresponding to the left intersection is then denoted by a, and θ can be expressed as:
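The interval search for the left intersection can be sketched as follows, under stated assumptions. The callable `wez_far_boundary` is a hypothetical stand-in for the real WEZ far-boundary curve, which must increase with the entry angle on [0°, 180°] for the update rule to be valid. `toy_wez_far_boundary` is an invented linear curve used only to exercise the loop.

```python
def left_intersection_angle(wez_far_boundary, distance, eps=1e-6):
    """Entry angle (deg) where the WEZ far boundary equals `distance`.

    wez_far_boundary: callable q_deg -> far-boundary range, assumed to
        increase monotonically with q on [0, 180].
    distance: attacker-target distance D.
    """
    a, b = 0.0, 180.0
    while abs(b - a) >= eps:
        r = a + 0.618 * (b - a)            # golden section point
        if wez_far_boundary(r) > distance:
            b = r                          # intersection lies left of r
        else:
            a = r                          # intersection lies right of r
    return a


def toy_wez_far_boundary(q_deg):
    # Hypothetical boundary: 20 km at q = 0 deg, 80 km at q = 180 deg.
    return 20_000.0 + 60_000.0 * q_deg / 180.0


angle = left_intersection_angle(toy_wez_far_boundary, 50_000.0)
```

With the toy curve, a distance of 50 km is reached halfway along the interval, so the search converges toward 90°.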

When multiple attackers cooperatively engage a target, all maneuvering threatened angles can be combined into a single combined maneuvering threatened angle θ_c, as shown in Fig. 10. Effective cooperative occupation, and the existence of a combined maneuvering threatened angle, can significantly increase the probability of destroying a target.

The angle θ_c can be divided into a left section (θ_L) and a right section (θ_R) by the target velocity vector V. Since the target is more likely to escape from regions with smaller combined maneuvering threatened angles, the encircling advantage function can be expressed as:

2.3. Cooperative occupation model

An N-on-M scenario was modeled in which N denotes the number of red side UCAVs and M denotes the number of blue side UCAVs. The sets R = {r_1, r_2, …, r_N} and B = {b_1, b_2, …, b_M} denote the red side and blue side UCAVs, respectively, where r_i (i = 1, 2, …, N) is the i-th UCAV on the red side and b_j (j = 1, 2, …, M) is the j-th UCAV on the blue side.

The advantage of r_i over the blue side can be represented as:

where A_ij is the advantage of r_i to b_j, calculated using Eq. (4), and PD_ij is a penalty term related to the depth to which r_i enters the NEZ of b_j. It can be expressed as:

The threat of the blue side to r_i can be expressed as:

Fig. 5. The advantage of the attacker relative to the target.

Fig. 6. The threat of the target relative to the attacker.

Fig. 7. The profit of the attacker relative to the target.

Fig. 8. The target's maneuvering threatened zone.

where T_ij is the threat of b_j to r_i, calculated using Eq. (4). In this expression, T_ij is considered the probability of b_j destroying r_i and T_i is the probability of r_i being destroyed by at least one UCAV on the blue side.
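Reading each pairwise threat as a kill probability, the probability that a red UCAV is destroyed by at least one blue UCAV follows from the complement rule. The sketch below additionally assumes that the blue UCAVs act independently, which the text does not state explicitly; the function name `combined_threat` is hypothetical.

```python
from math import prod


def combined_threat(pairwise_threats):
    """Probability of being destroyed by at least one blue UCAV.

    pairwise_threats: iterable of per-opponent kill probabilities in [0, 1]
        for one red UCAV (one value per blue UCAV).
    ASSUMPTION: the individual engagements are independent events.
    """
    threats = list(pairwise_threats)
    if any(t < 0.0 or t > 1.0 for t in threats):
        raise ValueError("each threat must be a probability in [0, 1]")
    # Survive all engagements with probability prod(1 - T_ij);
    # the complement is the overall threat T_i.
    return 1.0 - prod(1.0 - t for t in threats)


print(combined_threat([0.5, 0.5]))
```

Two opponents that each pose a 0.5 threat therefore combine to 0.75 rather than 1.0, which keeps the overall threat bounded by one however many blue UCAVs are present.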

The static profit of the red side over the blue side is defined as:

The encircling profit of the red side relative to the blue side is given by:

where ω_1 and ω_2 are weighting coefficients satisfying ω_1 + ω_2 = 1. The number of red side UCAVs that threaten b_j is denoted num_threat. The cooperative occupation problem for the red side can then be expressed as:

Fig. 9. Calculation of the maneuvering threatened angle for the target.

The primary objective of cooperative occupation is to achieve a situation that maximizes total profit for the team. Occupation schemes are composed of a series of position-heading combinations, each corresponding to a UCAV on the red side. The term ψ_i in the first constraint denotes the heading of the i-th UCAV on the red side. It is equal to 0 when the i-th red UCAV points north and increases as the heading rotates clockwise. The first constraint indicates that each red UCAV has an eastbound velocity component, such that red and blue counterparts can form a head-on configuration. The second constraint requires that the number of red UCAVs threatening b_j be no more than 2. Here, r_i is threatening b_j if r_i is in the WEZ of b_j and b_j is within the maximum off-axis missile launching angle of r_i. The third constraint requires each red UCAV's occupation position to be accessible. The methodology used for judging occupation position accessibility is shown in Fig. 11.

As seen in the figure, judgment windows are defined on the west side of r_i. The east-west line on which r_i is located forms the symmetry axis of the judgment window. The length of a window represents the accessibility judgment depth, and its width represents the path adjustment space for r_i during penetration of a threat zone. The threat zone regions to be penetrated are represented by the light red patches in Fig. 11; they are formed by the overlap between the threat zones of the two targets and the judgment window. The minimum penetration depth is defined as the shortest east-west length of the threat zone. If the minimum penetration depth is less than the penetration depth threshold, then the occupation position of r_i is accessible (accessibility = 1). Otherwise, it is not accessible (accessibility = 0).

3. Improved DPSO algorithm

The cooperative occupation problem expressed by Eq. (14) involves finding a series of occupation position-heading combinations, each of which corresponds to a red side UCAV, such that the profit of the red side over the blue side is maximized. This is a combinatorial optimization problem and its objective and constraint functions involve complex nonlinearity and discreteness. It can also be expressed as a mixed-integer nonlinear programming (MINLP) problem, for which it is difficult to obtain an optimal solution.

Fig. 10. The combined maneuvering threatened angle.

Fig. 11. Accessibility judgment.

The particle swarm optimization (PSO) algorithm is widely used to solve practical optimization problems [33,34] because of its simple search mechanisms, ease of programming, and natural parallel search capabilities. The original algorithm was first proposed by Eberhart and Kennedy in 1995 [35], but it could only solve continuous space problems. In 1997, a binary particle swarm optimization (BPSO) algorithm was proposed by the same authors to operate on discrete binary variables [36]. Various discrete particle swarm optimization (DPSO) algorithms have since been proposed for corresponding problems. However, there is currently no general DPSO framework that can solve every type of MINLP problem. In this paper, an improved DPSO algorithm is proposed to solve the problem discussed above.

This improved DPSO algorithm comprises a particle definition, a swarm definition and update, an extremum update, and a termination condition judgment. A flowchart for the algorithm is provided in Fig. 12. The components of the improved DPSO algorithm are introduced individually throughout this section.

3.1. Particle definition

Fig. 12. A flowchart for the improved DPSO algorithm.

The problem represented by Eq. (14) was addressed by selecting optimal positions for red UCAVs in several WEZs. In this paper, a rectangular region Ω, containing all blue side WEZs, was defined as the occupation area for the red side. The occupation area was then divided into squares of uniform size w, each being assigned a binary variable. This occupation position value was 1 if the corresponding square was occupied and 0 otherwise. The set of binary variables formed a K × L occupation matrix for the red side (denoted by X), as shown in Fig. 13. Another variable set, representing the red side headings, defined an N × 1 heading matrix H, where N is the number of red UCAVs.

Particles in the algorithm could then be defined as a combination of X and H. Each element of X with a value of 1 corresponds one-to-one with an element of H. The elements of X with a value of 1 were first ranked in ascending order of their columns and rows. For example, suppose there are three elements equal to 1 in X, located in column 2-row 5, column 3-row 6, and column 3-row 3. The indices corresponding to these elements are 1, 3, and 2, respectively. The i-th element in H corresponds to the i-th element with a value of 1 in X.
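The ranking rule above can be sketched in a few lines. The helper names `occupied_cells_in_order` and `decode_particle`, the nested-list representation of X, and the heading values in the usage example are all assumptions made for illustration.

```python
def occupied_cells_in_order(X):
    """Return 1-based (column, row) pairs of the cells equal to 1,
    ranked in ascending order of column, then row."""
    n_rows, n_cols = len(X), len(X[0])
    return [
        (c + 1, r + 1)
        for c in range(n_cols)      # outer loop: columns
        for r in range(n_rows)      # inner loop: rows within a column
        if X[r][c] == 1
    ]


def decode_particle(X, H):
    """Pair the i-th occupied cell with the i-th heading in H."""
    cells = occupied_cells_in_order(X)
    if len(cells) != len(H):
        raise ValueError("H needs one heading per occupied cell")
    return [(col, row, heading) for (col, row), heading in zip(cells, H)]


# The example from the text: column 2-row 5, column 3-row 6, column 3-row 3.
X = [[0] * 4 for _ in range(6)]
X[4][1] = X[5][2] = X[2][2] = 1
print(decode_particle(X, [80.0, 90.0, 100.0]))  # headings are made up
```

The cell in column 3-row 3 is ranked second and the cell in column 3-row 6 third, matching the indices 1, 3, and 2 quoted for the three cells in the order they are listed in the text.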

3.2. Fitness evaluation

The definition of particles in Section 3.1 can be used to convert the cooperative occupation problem from the expression in Eq. (14) to the following:

Incorporating the constraints in Eq. (15) into the fitness evaluation then produces:

where F is the fitness function for the improved DPSO algorithm proposed in this paper.

3.3. Swarm definition and update

Three sub-swarms are defined in the improved DPSO algorithm. The first is used to randomly generate particles in each iteration step, thereby ensuring particle diversity. The second sub-swarm is updated in each iteration on the basis of the previous generation of particles, making full use of the experience acquired in previous iteration steps. Particles in the first sub-swarm are also selected to replace poorly performing particles in the second. The third sub-swarm is updated on the basis of the global optimal particle, which can improve the local search capabilities of the algorithm. The numbers of particles in the three sub-swarms are denoted N_1, N_2, and N_3, respectively.

Fig. 13. The occupation matrix.

The three sub-swarms were randomly initialized. In this process, all particles were required to satisfy the constraints defined in Eq. (15). The same requirements were also imposed in each iteration of the first sub-swarm.

The specific update strategy for the second sub-swarm can be described as follows. First, a reference particle is defined that serves as the basis for an update. The n-th particle then uses the n-th particle in the second sub-swarm of the previous generation as its reference particle. Second, the matrix X for each particle is updated on the basis of its reference particle as described below:

· If the i-th updated location exceeds the boundary of the matrix, it becomes necessary to limit the location within the matrix boundary. This can be expressed as:

where limit(a, b, c) is a function that limits the value of a to between b and c. It can be expressed as follows:

limit(a, b, c) = b if a < b; a if b ≤ a ≤ c; c if a > c.

· If the i-th updated location, represented by the red square in Fig. 14, conflicts with one of the i-1 previously updated element locations, it must be modified in the following way. With the red square serving as the center, unoccupied locations (filled with 0 in Fig. 14) are searched outward in a circular pattern. Unoccupied positions in the inner circle are given selection priority, while unoccupied positions in the same circle are selected at random. It should be noted that the locations filled with 1 in Fig. 14 include not only occupied locations but also those outside the boundary of the matrix.
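The boundary limiting and conflict handling described in the two bullets above can be sketched as follows. This is an illustrative reading, not the authors' code: the "circles" are interpreted as square rings of increasing Chebyshev distance around the conflicting cell, indices are 0-based, and the names `limit` (taken from the text) and `resolve_conflict` (hypothetical) are used for the two helpers.

```python
import random


def limit(a, b, c):
    """Clamp a to the closed interval [b, c]."""
    return max(b, min(a, c))


def resolve_conflict(center, occupied, n_rows, n_cols, rng=random):
    """Pick a free cell as close as possible to `center`.

    center:   (row, col), 0-based, assumed already inside the matrix.
    occupied: set of (row, col) cells taken by previously updated elements.
    Rings are searched from the inside out; ties within a ring are random.
    """
    if center not in occupied:
        return center
    r0, c0 = center
    for radius in range(1, max(n_rows, n_cols)):
        ring = [
            (r, c)
            for r in range(r0 - radius, r0 + radius + 1)
            for c in range(c0 - radius, c0 + radius + 1)
            if max(abs(r - r0), abs(c - c0)) == radius  # cells on this ring
            and 0 <= r < n_rows and 0 <= c < n_cols     # inside the matrix
            and (r, c) not in occupied                  # still free
        ]
        if ring:
            return rng.choice(ring)
    raise ValueError("no unoccupied location left in the matrix")


row = limit(7, 0, 5)  # an out-of-range row index is pulled back to 5
cell = resolve_conflict((0, 0), {(0, 0)}, 3, 3, random.Random(0))
```

Cells outside the matrix are simply filtered out of each ring, which has the same effect as treating them as occupied.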

Particles in the second sub-swarm were updated individually using the three steps described above. Updated particles with a higher fitness value than their corresponding reference particles were selected as new members of the second sub-swarm, and their reference particles were removed. Particles with low fitness in the second sub-swarm were replaced if particles in the first sub-swarm had higher fitness values.
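The selection rule for the second sub-swarm can be sketched as below. The text does not specify how many low-fitness particles are replaced or in what order, so the greedy pairing used here (best newcomer against worst survivor) is an assumption, as is the function name `select_second_subswarm`.

```python
def select_second_subswarm(references, updated, first_subswarm, fitness):
    """Build the next generation of the second sub-swarm.

    references:     particles of the previous generation.
    updated:        particles derived from them (same order and length).
    first_subswarm: freshly generated random particles.
    fitness:        callable particle -> float (larger is better).
    """
    # Step 1: an updated particle survives only if it beats its reference.
    survivors = [
        u if fitness(u) > fitness(r) else r
        for r, u in zip(references, updated)
    ]
    # Step 2 (ASSUMED greedy rule): the best random particles replace the
    # worst survivors for as long as they are strictly better.
    survivors.sort(key=fitness)                                   # worst first
    newcomers = sorted(first_subswarm, key=fitness, reverse=True)  # best first
    for k, newcomer in enumerate(newcomers[:len(survivors)]):
        if fitness(newcomer) > fitness(survivors[k]):
            survivors[k] = newcomer
        else:
            break
    return survivors


# Toy run in which each particle is just a number and fitness is its value.
result = select_second_subswarm([1, 5, 3], [2, 4, 9], [7, 1], lambda p: p)
```

In the toy run, step 1 keeps 2, 5, and 9, and step 2 replaces the weakest survivor (2) with the random particle 7.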

The update technique used in the third sub-swarm is similar to that of the second sub-swarm, with the following differences. 1) All particles use the global optimal particle in the previous generation as their reference particle. 2) The swarm composed of updated particles will completely replace the third sub-swarm in the previous generation.

3.4. Extremum update

In each iteration, the particle with the maximum fitness among all particles in the three sub-swarms will be selected for comparison with the global optimal particle from the previous generation. If its fitness value is larger than the global optimal fitness in the previous generation, it will be selected as the new global optimal particle and the corresponding fitness will be the global optimal fitness.
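The extremum update amounts to a single comparison per iteration. A minimal sketch, with the hypothetical helper name `update_global_best` and particles represented by arbitrary Python objects:

```python
def update_global_best(particles, fitness, best_particle, best_fitness):
    """Return the (possibly new) global optimal particle and its fitness.

    particles: all particles from the three sub-swarms in this iteration.
    fitness:   callable particle -> float (larger is better).
    """
    candidate = max(particles, key=fitness)
    candidate_fitness = fitness(candidate)
    if candidate_fitness > best_fitness:
        return candidate, candidate_fitness
    return best_particle, best_fitness


# Toy run in which each particle is just a number and fitness is its value.
best, best_f = update_global_best([3, 8, 5], lambda p: p, 6, 6.0)
```

Because the previous optimum is kept unless it is strictly beaten, the global optimal fitness never decreases from one iteration to the next.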

3.5. Termination condition judgment

The improved DPSO algorithm terminates when the number of iterations reaches the maximum number of iterations N_max.

Fig. 14. The boundary protection mechanism for location updates.

4. Formation switching algorithm

Once the desired occupation position-heading combinations have been acquired, red UCAVs must adjust their flight plans to improve their advantage. This requires the UCAVs to fly cooperatively and arrive at their corresponding destinations simultaneously. In this section, the flight plan adjustment process is represented as a formation switching problem and an applicable formation switching algorithm is briefly described.

During formation switching, the number of red UCAV flight path intersections should be as small as possible, to reduce the possibility of collision. As such, each UCAV in the formation attempts to maintain its original relative position relationship with all remaining UCAVs (to the highest degree possible) through position allocation. This process can be described as follows.

First, a mapping function Λ(z): R → {-1, 0, 1} is defined as:

This expression can also be considered a relative distance mapping function.

Let Δx_ij denote the distance from UCAV r_i to r_j in the direction of the overall formation velocity and Δy_ij the distance from UCAV r_i to r_j in the direction perpendicular to the overall formation velocity, where i, j = 1, 2, …, N. In addition, Δx_ij > 0 if UCAV r_j is in front of UCAV r_i and Δy_ij > 0 if UCAV r_j is on the right side of UCAV r_i. The corresponding relative distance matrix for the current formation can then be expressed as:

A relative distance matrix for the desired formation can be represented by:

The relative distance mapping function in Eq. (21) can be used to map the matrix ρ to a relative position description matrix as follows:

The i-th (i = 1, 2, …, N) row of the relative position description matrix can be extracted to form a set containing all position relationship information for position i, relative to the remaining positions.

The Jaccard similarity coefficient [37] inspired a new definition for the degree of similarity between sets A and B, defined as:

Here, A and B are nonempty sets and |A ∩ B| represents the number of successful element pairs in sets A and B. The elements a ∈ A and b ∈ B are considered a successful pair if they are equal. If this is the case, the two elements no longer participate in subsequent pairings. For example, assume A = {1, 2, 3} and B = {3, 3, 3}. The degree of similarity for sets A and B is then θ(A, B) = 1.
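The pairing rule above treats A and B as multisets: each element can be matched at most once. A minimal sketch of the pair count |A ∩ B| is given below; the function name `successful_pairs` is hypothetical, and the code computes only the pair count, which equals the value 1 quoted for the example, without assuming any further normalization.

```python
from collections import Counter


def successful_pairs(A, B):
    """Number of one-to-one matches between equal elements of A and B.

    A, B: iterables that may contain repeated values (multisets).
    """
    # Counter & Counter keeps the minimum multiplicity of each value,
    # so a value matched once cannot be reused in a later pairing.
    return sum((Counter(A) & Counter(B)).values())


print(successful_pairs([1, 2, 3], [3, 3, 3]))
```

Only one of the three 3s in B finds a partner in A, so the count is 1. An ordinary set intersection would give the same result here, but the two differ when both collections repeat a value.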

The position allocation problem can then be described as:

where z_ij is the element in row i and column j of the position allocation matrix Z. It is equal to 1 if UCAV r_i is allocated to occupy position j in the desired formation. These constraints imply that exactly one element in each row and each column is equal to 1. This problem can be easily solved using a classic genetic algorithm, which will not be discussed here.

After position allocation, consensus tracking control protocols [38-40] were used to drive each UCAV in the formation to its own position. The UCAV model used in this section includes two degrees of freedom and can be represented as:

where V is the velocity of a UCAV (limited to [0.5 Ma, 1.5 Ma]) and ψ is the heading angle. The load factor and the turning load factor are each limited to [-5, 5]; the turning load factor is negative when the UCAV turns left.

5. Results and discussion

The effectiveness of the proposed cooperative occupation model was validated using three experimental analysis techniques. Section 5.1 presents occupation results in varying situations. Section 5.2 analyzes the influence of weighting coefficients on the results of cooperative occupation, specifically in two-on-two environments. Section 5.3 describes formation switching-based cooperative occupation.

The parameters used in these experiments were set as follows: γ_1 = 0.25, γ_2 = 0.75, w = 1000 m, N_1 = 100, N_2 = 100, N_3 = 100, V = 10, V = 10, and N_max = 500. All UCAVs were assumed to fly at an altitude of 10 km at Mach 1.2. Three key factors affecting accessibility judgment were set to 18 km ≈ WEZ(180° - φ)·sin(φ), 47 km ≈ 1.5·WEZ(180°), and 5000 m. In these expressions, WEZ(q) is the range of the far WEZ boundary corresponding to the entry angle q. The penetration depth threshold should not be too large but should allow red UCAVs to briefly enter blue threat zones. All experiments were conducted using a 64-bit PC with 16.0 GB RAM and a 1.10 GHz processor.

5.1.Solutions in different situations

Varying simulation conditions produced different occupation results in each situation.The number of participating UCAVs and the formation of the blue side were the two primary factors affecting the outcome.The following experiments focused on these two factors and assumed ω=0.8 and ω=0.2.Corresponding results were as follows:

· One-on-one situations

Fig.15(a) shows occupation results for a one-on-one situation.This result is consistent with prior studies [21],as the optimal occupation position is located at one of the intersection points on the far boundary of the NEZ and the threat zone.Fig.15(b) demonstrates that the improved DPSO algorithm can converge within 100 iterations for one-on-one situations.

Fig.15.Occupation results for one-on-one situations.

Fig.16.Cooperative occupation results for two-on-one situations.

· Two-on-one situations

Fig.16(a) shows the cooperative occupation results for a two-onone situation.It is evident that both rand rare located at the optimal occupation positions for the one-on-one situation.In addition,rand rare located on the left and right sides of b,respectively,forming an encircling situation represented by Pin Eq.(13).Fig.16(b) also suggests the improved DPSO algorithm,proposed in this paper,can converge within 100 iterations for two-on-one situations.

· Two-on-two situations

Fig.17 shows cooperative occupation results for two-on-two engagements, in which each subfigure corresponds to an enemy formation. In Fig.17(a), the positions of b1 and b2 coincide and the cooperative occupation result is similar to that of a two-on-one engagement. In Fig.17(b), b1 and b2 fly in parallel at a lateral distance of 8000 m. Both r1 and r2 must increase their WEZ penetration depth to threaten b1 and b2 simultaneously. In Fig.17(c), b1 and b2 fly in parallel at a lateral distance of 18000 m. In this case, even if the WEZ penetration depth is increased, b1 and b2 cannot be threatened simultaneously. As such, r1 and r2 avoid the threat zones and attack b1 and b2, respectively. In Fig.17(d), b1 and b2 fly in parallel at a lateral distance of 40000 m. The profit is higher if r1 and r2 cooperatively attack either b1 or b2. In Fig.17(e), one blue UCAV flies 10000 m ahead of the other. Here, r1 and r2 tend to select positions closer to the central line of b1 and b2, to attack b1 and b2 cooperatively. In Fig.17(f), one blue UCAV flies 30000 m ahead of the other and there is no occupation position capable of threatening b1 and b2 at the same time. Thus, r1 and r2 cooperatively attack either b1 or b2.

The following conclusions can be drawn from Fig.17. 1) When b1 and b2 are close to each other, they are treated as a single object and r1 and r2 attack cooperatively. 2) As the distance between b1 and b2 increases, a large-scale overlap exists between their threat zones. The eastern region of this overlap is not accessible, so r1 and r2 tend to avoid the threat zones and attack b1 and b2, respectively. 3) If the distance between b1 and b2 is sufficiently large, two-on-two engagements can be converted into a combination of two-on-one engagements.

· Three-on-two situations

Fig.18 shows cooperative occupation results for three-on-two engagements, in which each subfigure corresponds to an enemy formation. The positions of b1 and b2 in Fig.18(a) coincide with each other. In Fig.18(b), b1 and b2 fly in parallel at a lateral distance of 8000 m. In Fig.18(c), b1 and b2 fly in parallel at a lateral distance of 24000 m. In Fig.18(d), b1 and b2 fly in parallel at a lateral distance of 40000 m. Fig.18(a) and (b) are similar to Fig.17(a) and (b), respectively, though they both include a randomly placed r3.

The following conclusions are evident from Fig.18. 1) If b1 and b2 are close to each other, they are treated as a single object. It is then sufficient to assign two red UCAVs to attack the combined target cooperatively, with a third red UCAV included as a backup. 2) As the distance between b1 and b2 increases, a large-scale overlap forms between the threat zones of b1 and b2. If the eastern region of this overlap is not accessible, two red UCAVs may play the same roles as in the two-on-two situation shown in Fig.17(c). A third red UCAV then enters the overlap region briefly, forming a cooperative attack configuration (against b1 and b2) with the other two UCAVs. 3) If the distance between b1 and b2 is sufficiently large, two red UCAVs cooperatively attack either b1 or b2, while the remaining red UCAV attacks the remaining blue UCAV.

5.2.Solutions with varying weight coefficients

Different weighting coefficients may produce different occupation results. In this section, ω1=0.5 and ω2=0.5. Corresponding cooperative occupation results for two-on-two situations are shown in Fig.19.

Fig.17.Cooperative occupation results for two-on-two situations.

A comparison of the four subgraphs in Fig.19 with the first four subgraphs in Fig.17 suggests that when two red UCAVs attack a target cooperatively, their occupation positions shift farther east. This is because increasing the ratio ω2/ω1 raises the importance of P in Eq.(13), and the only way to increase P is to improve the degree of encirclement of the targets.
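The effect of the weighting coefficients can be illustrated with a minimal sketch. It assumes the objective is a simple weighted sum of a one-on-one advantage and an encircling advantage; the exact form of Eq.(13) is not reproduced here, and the candidate labels and advantage values below are hypothetical numbers chosen only for illustration.

```python
def combined_advantage(one_on_one, encircling, w1, w2):
    """Weighted sum of two advantage terms (assumed form, not Eq.(13) itself)."""
    return w1 * one_on_one + w2 * encircling


# Hypothetical candidates: (label, one-on-one advantage, encircling advantage).
# "east" encircles the target better but is slightly worse in one-on-one terms.
CANDIDATES = [
    ("west", 0.90, 0.30),
    ("east", 0.70, 0.80),
]


def best_candidate(w1, w2):
    """Return the label of the candidate with the highest weighted advantage."""
    best = max(CANDIDATES, key=lambda c: combined_advantage(c[1], c[2], w1, w2))
    return best[0]


if __name__ == "__main__":
    for w1, w2 in [(0.8, 0.2), (0.5, 0.5)]:
        print(f"w1={w1}, w2={w2}: best candidate is {best_candidate(w1, w2)}")
```

With these illustrative values, the better-encircling candidate is preferred only once the two weights are equal, which mirrors the qualitative shift described above.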

Combined with the above simulation results,the proposed DPSO algorithm provides the following advantages for solving the cooperative occupation problem:

(1) The one-to-one correspondence between the elements of X and H, in both the particle definition and the particle update step, provides a natural way of describing the one-to-one correspondence between the occupation position and the direction of each UCAV.

Fig.18.Cooperative occupation results for three-on-two situations.

(2) The definition of the matrix X for each particle transforms the continuous optimization space of occupation positions into a discrete space,allowing a trade-off between the accuracy of the optimization and the size of the optimization space.
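The trade-off in point (2) can be sketched as follows. This is an illustrative discretization only, not the particle encoding used in the paper: the square region size, the cell widths, and the helper names `grid_size`, `to_cell`, and `to_position` are all assumptions introduced for this example.

```python
import math


def grid_size(extent_m, cell_m):
    """Number of candidate cells along one axis for a given cell width."""
    return math.ceil(extent_m / cell_m)


def to_cell(x_m, y_m, x_min_m, y_min_m, cell_m):
    """Map a continuous position to integer grid indices."""
    return (int((x_m - x_min_m) // cell_m), int((y_m - y_min_m) // cell_m))


def to_position(ix, iy, x_min_m, y_min_m, cell_m):
    """Return the center of a grid cell as a continuous position."""
    return (x_min_m + (ix + 0.5) * cell_m, y_min_m + (iy + 0.5) * cell_m)


if __name__ == "__main__":
    extent = 100_000.0  # hypothetical 100 km x 100 km search region
    for cell in (500.0, 1000.0, 2000.0):
        n = grid_size(extent, cell)
        # Worst-case distance from any point to its cell center (half diagonal).
        max_error = cell * math.sqrt(2.0) / 2.0
        print(f"cell={cell:.0f} m: {n * n} candidates, max error {max_error:.0f} m")
```

Halving the cell width quadruples the number of candidate positions per UCAV while halving the worst-case position error, which is the accuracy versus search-space trade-off noted above.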

5.3.Occupation simulation

In this section, ω1=0.8 and ω2=0.2. Fig.20 demonstrates the formation switching-based cooperative occupation process. In the initial stage, the two red UCAVs fly eastward from x=-10m. The lateral distance between the two UCAVs is 2.6× 10m, while the two blue UCAVs fly westward from x=0 m, separated by a lateral distance of 10m. The centers of the red and blue side formations are both located on the x-axis. All blue UCAVs are assumed to maintain uniform linear motion. The red side formation must then be continuously adjusted, based on the cooperative occupation solution, until all red UCAVs reach their corresponding optimal occupation positions simultaneously.
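The two assumptions in this setup, uniform linear motion for the blue side and simultaneous arrival for the red side, can be made concrete with a simplified sketch. It is not the formation switching method of the paper: it assumes constant-speed flight to fixed target points and only computes how much extra path length each red UCAV would need in order to arrive at the same time as the slowest one. All positions, the speed value, and the function names are hypothetical.

```python
import math


def blue_position(p0, v, t):
    """Position after t seconds of uniform linear motion from p0 with velocity v."""
    return (p0[0] + v[0] * t, p0[1] + v[1] * t)


def straight_line_times(reds, targets, speed):
    """Time for each red UCAV to fly straight to its target at constant speed."""
    return [math.dist(r, g) / speed for r, g in zip(reds, targets)]


def path_stretch(reds, targets, speed):
    """Common arrival time and the extra path length each UCAV must fly.

    The common time is the longest straight-line time; every other UCAV
    must lengthen its path by speed * (common time - own time).
    """
    times = straight_line_times(reds, targets, speed)
    t_common = max(times)
    return t_common, [speed * (t_common - t) for t in times]


if __name__ == "__main__":
    speed = 400.0  # m/s, illustrative value only
    reds = [(0.0, 0.0), (0.0, 3000.0)]
    targets = [(4000.0, 3000.0), (4000.0, 3000.0)]
    t_common, extra = path_stretch(reds, targets, speed)
    print("common arrival time [s]:", t_common)
    print("extra path length per UCAV [m]:", extra)
    print("blue position after t_common:", blue_position((10000.0, 0.0), (-speed, 0.0), t_common))
```

In practice the target positions move with the blue formation, so such a computation would have to be repeated as the occupation solution is updated.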

6.Conclusion

In this paper,a novel two-stage cooperative occupation decision-making framework for multi-UCAV BVR air combat was proposed.Combat geometries were first used to determine an optimal cooperative occupation solution and corresponding algorithm,which were critical in the first stage.Once this solution was acquired,a formation switching process was performed to satisfy the requirements of simultaneous arrival.

Fig.19.Cooperative occupation results for two-on-two situations.

Fig.20.The cooperative occupation process.

A series of simulations was conducted to verify the various cooperative occupation results in differing situations,including one-on-one,two-on-one,two-on-two with varying enemy formations,and three-on-two with varying enemy formations.Simulated results showed the solution produced by the proposed model for one-on-one situations is consistent with prior studies.These occupation solutions exhibited a reasonable and regular trend with changing conditions.Common multi-UCAV head-on BVR air combat scenarios can be represented as a combination of the basic situations discussed above.As such,the conclusions drawn in this paper could be extended to any head-on BVR air combat scenario.

The proposed formation switching-based cooperative occupation process was also verified via simulation,satisfying the requirements of simultaneous arrival.However,pointing requirements for UCAVs arriving at their respective occupation positions,and the requirements for effectively avoiding threat zones during occupation,could not be met.The development of a path planning algorithm to meet these two requirements simultaneously will be the subject of a future study.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

This work was supported by the National Natural Science Foundation of China (No.61573286),the Aeronautical Science Foundation of China(No.20180753006),the Fundamental Research Funds for the Central Universities (3102019ZDHKY07),the Natural Science Foundation of Shaanxi Province (2020JQ-218),and the Shaanxi Province Key Laboratory of Flight Control and Simulation Technology.We thank LetPub (www.letpub.com) for its linguistic assistance and scientific consultation during the preparation of this manuscript.