Multi-objective Particle Swarm Optimization Algorithm Based on Performance and Reliability of Discrete System Resources Configuration

2014-08-12

ZHOU Guo-Cai (周国财)1*, GAO Xiang (高翔)2

1 School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
2 China Academy of Space Technology, Xi’an 710000, China

Abstract: To address the multi-objective optimization of reliability and performance under cost constraints in digital circuits, this paper proposes an improved multi-objective optimization algorithm, based on performance and reliability, for configuring the resources of a discrete system. The algorithm uses particle swarm optimization (PSO) to evaluate the trade-off between reliability and performance in the configuration of system resources, and its feasibility is demonstrated by simulation. The resource configuration produced by the algorithm can then guide system design so as to mitigate soft errors caused by single event effects (SEE).

Key words: multi-objective optimization; function module; soft error; triple modular redundancy (TMR)

Introduction

As the demand for reliable digital circuits in space applications grows, more fault-tolerant designs are being introduced so that digital circuit systems can meet their reliability requirements. Compared with an ordinary system, a fault-tolerant design consumes more resources to achieve higher reliability. Current fault-tolerant systems mainly use spatial or temporal redundancy to meet the reliability design index[1-2], and the software-implemented hardware fault tolerance (SIHFT) method is also effective in mitigating soft errors[3-4]. The ideas of SIHFT are therefore drawn on here to assess soft errors caused by single event effects (SEE). At the same time, improving system reliability also affects system performance and cost. This paper therefore aims to provide optimal resource configuration information for fault-tolerant design, so as to achieve the maximum reliability, the best performance, and the lowest cost.

We study a resource configuration optimization algorithm for improving system reliability design, and analyze the quantitative relationship between reliability and performance under a cost constraint in order to establish a balanced assessment model of the two. Finally, a given example is simulated with this algorithm to obtain the system design with the best reliability for the application requirements.

1 Evaluation Methods on Reliability Design, Performance, and Cost in System

1.1 Evaluation of reliability design

Definition 1 The reliability evaluation of a system calculates its ability to complete a specified function in a given space environment within a predetermined time.

As Definition 1 shows, under the same conditions the reliability depends on time. If the digital circuit is treated as a whole system, its reliability can be expressed in terms of the working time of the system. Besides the entire digital circuit system, the reliability of each independent subsystem or circuit module can also be computed separately[5-6]. In this paper, the probability that the system executes correctly is used to assess reliability, defined as the function R[7]:

$$R = e^{-\lambda t}, \tag{1}$$

where λ is the probability that the system fails per unit working time after time t, namely, the probability of circuit failure caused by SEE in the space radiation environment.
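As a quick numerical illustration of Eq. (1), the following minimal sketch evaluates R for a constant failure rate. The function name `reliability` and the numbers used (1e-5 failures per hour, a 1000-hour mission) are hypothetical and are not taken from the paper.

```python
import math


def reliability(failure_rate: float, t: float) -> float:
    """Eq. (1): R = exp(-lambda * t) for a constant failure rate lambda."""
    if failure_rate < 0 or t < 0:
        raise ValueError("failure_rate and t must be non-negative")
    return math.exp(-failure_rate * t)


# Hypothetical values: 1e-5 failures per hour over a 1000-hour mission.
r = reliability(1e-5, 1000.0)
print(f"R = {r:.6f}")
```

The failure rate and the time must use the same time unit; R falls from 1 towards 0 as either of them grows.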

1.2 Evaluation of performance and cost

Definition 2 The evaluation of the performance and cost of a system measures how the technologies or methods adopted change the resources consumed.

The measure of performance may vary with the application background. System performance is usually taken as the time the system needs to complete the whole task, given the cost of redundant resources. Here, the system response time for a successfully completed task is selected as the measure of system performance, denoted by the function T, and the occupancy rate of hardware resources in the system is defined as the measure of cost.

2 Optimization Algorithm of Balanced Resources Configuration

2.1 Evaluation model of reliability

Since different system designs yield different reliability assessments, the whole system is decomposed into a number of function modules according to the definitions of the function circuits and the connections between the modules. Firstly, in a series circuit system, a soft error caused by SEE in any function module may propagate through the entire system and lead to failure. The reliability of this type of system is evaluated as:

(2)

Secondly, reliability analysis of a parallel circuit system can be used to evaluate redundancy designs against soft errors. The comprehensive reliability is then defined as:

(3)
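Equations (2) and (3) are not reproduced in this text. As a hedged illustration only, the sketch below uses the standard textbook forms for independent modules: a series system works only if every module works, and a parallel system fails only if every module fails. The function names and the sample reliabilities are hypothetical, and the paper's own notation may differ.

```python
from math import prod
from typing import Sequence


def series_reliability(rs: Sequence[float]) -> float:
    """Series structure: the system works only if every module works."""
    return prod(rs)


def parallel_reliability(rs: Sequence[float]) -> float:
    """Parallel structure: the system fails only if every module fails."""
    return 1.0 - prod(1.0 - r for r in rs)


# Hypothetical module reliabilities, assumed statistically independent.
modules = [0.99, 0.98, 0.97]
print(series_reliability(modules))
print(parallel_reliability(modules))
```

For the same modules, the series value is never higher than the weakest module and the parallel value is never lower than the strongest one, which is why redundancy is modeled as a parallel structure.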

Based on the above definitions, a digital circuit system can be modeled as a complex series-parallel structure to obtain a comprehensive reliability assessment of the system. Assume that the circuit system S is made up of a series of function modules M = (m1, m2, …, ms), where s is the number of function modules, and that n of these modules adopt an m-module redundancy design (n ≤ s). When k of the m redundant copies function correctly, the reliability of one function module with the m-module redundancy design can be computed according to Ref. [8]:

(4)
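Equation (4) is not reproduced in this text. The sketch below assumes the common k-out-of-m form, in which a redundant module works when at least k of its m identical, independent copies work and the voter is perfect. This is an assumption about the formula, not a transcription of it; `k_out_of_m` and `tmr` are hypothetical names.

```python
from math import comb


def k_out_of_m(r: float, k: int, m: int) -> float:
    """Probability that at least k of m independent copies, each with
    reliability r, work correctly (binomial sum; perfect voter assumed)."""
    if not 0 <= k <= m:
        raise ValueError("require 0 <= k <= m")
    return sum(comb(m, i) * r**i * (1.0 - r) ** (m - i) for i in range(k, m + 1))


def tmr(r: float) -> float:
    """Triple modular redundancy is the 2-out-of-3 special case."""
    return k_out_of_m(r, 2, 3)


# Hypothetical copy reliability of 0.9.
print(tmr(0.9))
```

Under these assumptions the TMR value simplifies to 3r² − 2r³, which exceeds r only when r > 0.5, so redundancy helps only for modules that are already more likely to work than to fail.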

Therefore, the comprehensive reliability of the system is deduced from Eqs. (2)-(4) as:

(5)
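Equation (5) is likewise missing here. One way to read the surrounding text is that the s function modules form a series chain and each module is either left as a single copy or replaced by a TMR group. The sketch below implements that reading; the names `system_reliability` and `voter_r`, the 0/1 configuration vector, and the sample values are all hypothetical.

```python
from math import prod
from typing import Sequence


def tmr(r: float) -> float:
    """2-out-of-3 reliability of three identical, independent copies."""
    return 3.0 * r**2 - 2.0 * r**3


def system_reliability(module_r: Sequence[float],
                       redundant: Sequence[bool],
                       voter_r: float = 1.0) -> float:
    """Series chain of modules; modules flagged in `redundant` use TMR and
    are multiplied by the (assumed) voter reliability `voter_r`."""
    if len(module_r) != len(redundant):
        raise ValueError("module_r and redundant must have the same length")
    return prod(tmr(r) * voter_r if use else r
                for r, use in zip(module_r, redundant))


# Hypothetical two-module system: only the first module is triplicated.
print(system_reliability([0.9, 0.9], [True, False]))
```

Setting `voter_r` to 1 matches the simulation section, where the reliability of the judgment circuit is set to 1.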

2.2 Evaluation model of performance

The above analysis shows that the more fault-tolerant modules a system has, the longer its execution time becomes, which lowers the rated frequency and the performance of the system and increases resource and power consumption.

2.3 Configuration model of multi-objective particle optimization (CM-MPO)

(6)

Therefore, the CM-MPO of the system can be formulated as the following discrete multi-objective problem, and Eq. (6) is rewritten as:

(7)

where Amax and Pmax denote the upper limits of the hardware resources and the power that the system can tolerate, respectively.

Fig.1 The flow chart of algorithm
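Since Eqs. (6)-(7) and the flow chart are not reproduced in this text, the following is only a minimal sketch of how a discrete configuration search of this kind could look. It swaps in a plain binary PSO (sigmoid transfer of velocities) with a weighted-sum fitness and a penalty for exceeding Amax, rather than the authors' CM-MPO. All data (module reliabilities, areas, voter delay, weights) and all names are hypothetical.

```python
import math
import random

# Hypothetical inputs, not taken from the paper's tables.
MODULE_R = [0.90, 0.95, 0.92, 0.97, 0.93, 0.96, 0.91, 0.94]  # per-module reliability
AREA = [5, 4, 6, 3, 5, 4, 6, 7]   # resource occupancy in percent
A_MAX = 2 * sum(AREA)             # resource limit: twice the original design
BASE_TIME = 10.0                  # ns, one cycle at 100 MHz
VOTER_DELAY = 0.5                 # ns added per TMR module (assumed)
WEIGHT_TIME = 0.1                 # assumed weight of the time objective


def tmr(r):
    """2-out-of-3 reliability with a perfect voter."""
    return 3 * r**2 - 2 * r**3


def evaluate(x):
    """Return (reliability, response time, area) for a 0/1 configuration x."""
    rel = math.prod(tmr(r) if b else r for r, b in zip(MODULE_R, x))
    time = BASE_TIME + VOTER_DELAY * sum(x)
    area = sum(3 * a if b else a for a, b in zip(AREA, x))
    return rel, time, area


def fitness(x):
    """Scalarised stand-in objective; infeasible configurations score below zero."""
    rel, time, area = evaluate(x)
    if area > A_MAX:
        return -float(area - A_MAX)
    return rel - WEIGHT_TIME * (time / BASE_TIME - 1.0)


def binary_pso(n_particles=20, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    dim = len(MODULE_R)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-4.0, min(4.0, v))           # clamp velocity
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))    # sigmoid transfer
                pos[i][d] = 1 if rng.random() < prob else 0  # resample the bit
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


best, best_fit = binary_pso()
print(best, best_fit, evaluate(best))
```

A 1 in the returned vector marks a module selected for TMR. A weighted sum yields a single compromise point; a Pareto-based variant would instead keep an archive of non-dominated configurations.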

3 Simulation Results

An example system implemented on an FPGA is shown in Fig. 2; it is made up of eight function modules. The working frequency of the system is set to 100 MHz, so the working cycle is 10 ns. Furthermore, the redundancy judgment circuit is considered to affect the system performance T, so the frequency of the redundancy judgment circuit is set to 5% of the original working frequency.

Fig.2 The structure of system

According to the allocation of the reliability design index among the function modules, the predicted reliabilities of the function modules designed with triple modular redundancy (TMR) are calculated through Eq. (4), with the reliability of the judgment circuit set to 1. These values are shown in Table 1. The predicted reliabilities in Table 1 show that the modules implemented with TMR are considerably more reliable.

Table 1 Reliability index and reliability prediction of the redundancy function modules

Redundant module design occupies additional hardware resources and adds a small amount of power consumption, so the limit on hardware resources is considered first during fault-tolerant design. The resource occupancy rates of the function modules are set as shown in Table 2. Since a redundant design increases the hardware resources by at least two times the original design, the upper limit Amax of the available system resources is also set to two times the original system design.
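The resource limit described above can be checked with a few lines of code. The sketch below assumes that a TMR module occupies three times its original area (voter overhead ignored) and that Amax is twice the total area of the original design. The occupancy values are hypothetical, since Table 2 is not reproduced in this text.

```python
from typing import Sequence


def total_area(occupancy: Sequence[int], redundant: Sequence[bool],
               copies: int = 3) -> int:
    """Area of a configuration: redundant modules are instantiated `copies` times."""
    return sum(a * copies if use else a for a, use in zip(occupancy, redundant))


def is_feasible(occupancy: Sequence[int], redundant: Sequence[bool],
                limit_factor: float = 2.0) -> bool:
    """True if the configuration fits within A_max = limit_factor * original area."""
    a_max = limit_factor * sum(occupancy)
    return total_area(occupancy, redundant) <= a_max


# Hypothetical occupancy rates (percent) for eight function modules.
occupancy = [5, 4, 6, 3, 5, 4, 6, 7]
first_four = [True] * 4 + [False] * 4
print(total_area(occupancy, first_four), is_feasible(occupancy, first_four))
print(is_feasible(occupancy, [True] * 8))
```

Under these assumptions, triplicating every module always exceeds the limit, so the optimizer has to choose which subset of modules receives redundancy.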

Table 2 Resources occupancy rates of the function modules in system before redundant design

Fig.3 System resources occupancy rate and reliability of optimal configuration

Figure 4 shows how TMTBS and T vary with Rs for the corresponding function modules. As the number of redundant modules increases, the rate of change with Rs in Fig. 3 is greater than that with T in Fig. 4, so the relative value of TMTBS derived from Eq. (6) decreases.

Fig.4 The curves of TMTBS and T with the optimal configuration

The results in Fig. 4 show that redundancy design affects system performance as the working frequency increases. However, if the gain in system reliability is more pronounced, the relative time interval between two correctly completed tasks is reduced, which means that the work efficiency and overall performance of the system can be improved indirectly.

4 Conclusions

To address the trade-off between reliability and performance in digital circuit systems, this paper proposes a multi-objective optimization algorithm, based on performance and reliability, for discrete system resource configuration, and establishes a balanced assessment model for resource optimization. With this model, the optimal system reliability and the optimized allocation scheme of the function modules are obtained under the cost constraints that the system must meet. The simulation results show that the model is feasible and can guide system design to mitigate soft errors caused by SEE.

[1] Luca S. Electronics System Design Techniques for Safety Critical Applications [M]. Berlin, Germany: Springer, 2008: 1-15.

[2] Pignol M. COTS-Based Applications in Space Avionics [C]. Proceeding of Design Automation and Test in Europe (DATE), Dresden, Germany, 2010: 1213-1219.

[3] Reis G A. Software Modulated Fault Tolerance [D]. Princeton University, USA: Ph.D. Dissertation, 2008: 6-20.

[4] Li J L, Tang Q P, Xu J J. A Software-Implemented Configurable Control Flow Checking Method [C]. Proceeding of Parallel Architectures, Algorithms and Programming (PAAP), Dalian, China, 2010: 199-205.

[5] Littlewood B, Strigini L. Software Reliability and Dependability: a Roadmap [C]. Proceeding of the 22nd International Conference on Software Engineering (ICSE’ 2000), Limerick, Ireland, 2000: 177-188.

[6] Mitra S, Seifert N, Zhang M, et al. Robust System Design with Built-in Soft-Error Resilience [J]. IEEE Computer, 2005, 38(2): 43-52.

[7] Dodd P E, Massengill L W. Basic Mechanisms and Modeling of Single-Event Upset in Digital Microelectronics [J]. IEEE Transactions on Nuclear Science, 2003, 50(3): 583-602.

[8] Tang S D, Feng P H. Reliability of Consecutive-k-out-of-n:F System [J]. Journal of Anhui University of Technology: Natural Science, 2010, 27(3): 331-335.

[9] Michalewicz Z. A Survey of Constraint Handling Techniques in Evolutionary Computation Methods [C]. Proceeding of the Fourth Annual Conference on Evolutionary Programming, San Diego, California, USA, 1995: 135-155.

1672-5220(2014)06-0850-03

Received date: 2014-08-08

* Correspondence should be addressed to ZHOU Guo-cai, E-mail: zguocai882@nwpu.edu.cn

CLC number: TP391 Document code: A