An FPGA-based LDPC decoder with optimized scale factor of NMS decoding algorithm

2022-11-28LIJinmingZHAGNPingpingWANGLanzhuWANGGuodong

Journal of Measurement Science and Instrumentation 2022年4期

LI Jinming, ZHAGN Pingping, WANG Lanzhu, WANG Guodong

(School of Instruments and Electronics, North University of China, Taiyuan 030051, China)

Abstract： Considering that the hardware implementation of the normalized minimum sum ( NMS ) decoding algorithm for low-density parity-check ( LDPC ) code is difficult due to the uncertainty of scale factor, an NMS decoding algorithm with variable scale factor is proposed for the near-earth space LDPC codes (8 177,7 154) in the consultative committee for space data systems ( CCSDS ) standard. The shift characteristics of field programmable gate array (FPGA) is used to optimize the quantization data of check nodes, and finally the function of LDPC decoder is realized. The simulation and experimental results show that the designed FPGA-based LDPC decoder adopts the scaling factor in the NMS decoding algorithm to improve the decoding performance, simplify the hardware structure, accelerate the convergence speed and improve the error correction ability.

Key words： LDPC code； NMS decoding algorithm； variable scale factor； quantization

0 Introduction

Low-density parity-check[1](LDPC) code is a kind of linear block code in essence. Because its error correction performance is close to the Shannon limit[2], it can effectively, precisely and reliably detect whether the data transmitted among devices are correct or missing. Since the 1990s, LDPC codes has been widely used in the field of satellite communication due to its excellent performance[3-5]. The LDPC code is recommended as the channel coding standard for deep-space communication and near-ground communication by International Consultative Committee for Space Data System in 2010[6]. In 2011, the Change-2 probe also tries to use LDPC code instead of Turbo code as the data link coding scheme. In 2016, the proposal is passed by Qualcomm that LDPC should be used as a channel code long code at the 5G standard voting conference held by 3GPP, which is a big win for LDPC code in 5G times. The excellent performance of LDPC code in space and 5G field lays the foundation for subsequent research on channel coding[7-8]. It has been used in communication standards such as DVB-S2[9], IEEE802.16e[10]and IEEE802.11[11], etc. Although LDPC code has good anti-jamming performance, its decoding complexity has always been the biggest obstacle to its application in communication systems.

Most traditional LDPC decoding algorithms use high-precision floating-point numbers for operation. However, it is difficult for these high-precision floating-point numbers to implement based on hardware. Therefore, low complexity and hardware-friendly LDPC decoder has been the research hotspot. A minimum sum-product (MSP) decoding algorithm is proposed based on normalization correction, which makes the signal-to-noise ratio (SNR) of the input signal of LDPC decoder reduce by 10 dB at the same output bit error rate whereas decoding performance is declined[12]. An extended minimum sum (EMS) decoding algorithm is shown based on the minimum sum (MS) decoding algorithm[13]. In the process of information iteration, the EMS algorithm reduces the amount of computation by shortening the information vector to be stored, which improves the decoding complexity and reduces the hardware storage resources needed whereas the hardware system throughput is low. The quantization complexity of the sum-product (SP) decoding algorithm and the MS decoding algorithm under three different measures are presented as well as the quantization accuracy of the algorithms[14]. The results show that the MS decoding algorithm is suitable for integer quantization, which makes all messages of the quantized MS algorithm be integers and takes fewer hardware resources. An MS decoding algorithm is presented based on integer operation[15], in which the variables are expressed as integers first, and then quantified to binary data for decoding operation. This method provides a new idea to reduce quantization complexity while reducing quantization accuracy. A normalized minimum sum (NMS) algorithm is proposed with better decoding performance when the scale factor is 0.8, the bit width of check node message is 8 bit and variable node message is 10 bit. Since the normalization factor of the algorithm is fixed, the error correction performance of the codeword needs to be improved[16].

By analysis of the advantages and disadvantages of these algorithms, such as the scale factor is fixed and error correction performance of the decoding algorithms is lower, an improved NMS algorithm is proposed based on the optimization of decoding algorithms[17-18]and the hardware implementation of the decoder[19]. Furthermore, taking (8 176,7 154) LDPC code in CCSDS standard[20]as a research object, a decoder is achieved based on field-programmable gate array (FPGA) corresponding to the proposed algorithm. In this design, the scale factor changes with the number of iterations during check node information processing, which has better decoding performance without increasing the hardware complexity. When the maximum number of iterations is 20 and theSNRis no less than 1.1, the decoder works very well.

1 Theoretical analysis of LDPC decoding

1.1 LDPC code structure

This study focuses on (8 176,7 154) LDPC code. Check matrixHis another form of LDPC code, which plays a decisive role in the encoding and decoding process. The size of matrixHin this study is 1 022×8 176, and the matrix elements are composed of “0” and “1”, with the first row of non-zero elements being known and the rest of the elements being obtained by shifting the first row of data circularly right. For convenience of research, matrixHis divided into 32 circulant submatricesAi, j(i=1,2；j=1, …,16) with the size of 511×511, and the division structure is expressed as

(1)

For any circulant submatrix of matrixH, there are two non-zero elements in each row and two non-zero elements in each column. Both the row weight and column weight of each circulant submatrix are 2. Thus, the row weight of the whole check matrix is 32 and the column weight is 4, which means that matrixHis an irregular LDPC code.

1.2 Decoding algorithm for LDPC code

There mainly are two kinds of LDPC decoding algorithms. One is called hard-decision decoding algorithm, including bit-flipping (BF) decoding algorithm, weighted bit-flipping (WBF) decoding algorithm, etc. The other is called soft-decision algorithm, including belief propagation (BP) algorithm[21-22], log-likelihood ratio-belief propagation LLR-BP algorithm, MS algorithm[23], etc. The hard-decision decoding algorithm is easy to implement in hardware and has low complexity, but its decoding performance is not ideal. The soft-decision decoding algorithm is hard to implement in hardware, but it can get the decoding performance close to the Shannon limit. Therefore, the future application of soft-decision decoding algorithms has broad prospects.

a) Initialization to calculates the channel message value received by the variable nodes.

(2)

whereL(Pi) is the initial probability likelihood ratio message of channel information transmitted to variable nodes；σis the noise variance of channel transmission；L(0)(qij) is the information passed by variable nodeito check nodejat iteration 0.

b) Update check node information row by row using variable nodes.

(3)

whereL(l)(rij) is the set of information passed by check nodeito variable nodejat iterationl；Rjis the set of variable nodes connected with check nodej；Rjiis the set of variable nodes connected to check nodejexcept nodei;i′∈Rjirepresents variable nodei′ in the set； andL(l-1)(qi′j) is the information passed by variable nodei′ to check nodejat iterationl-1.

c) Update variable node information column by column using check nodes.

(4)

whereL(l)(qij) is the set of information passed by variable nodeito check nodejat iterationl；Ciis the set of check nodes connected with variable nodei； andCijis the set of check nodes connected with variable nodeiexcept nodej；j′∈Cijrepresents check nodej′ in the set； andL(l)(rj′i) is the information passed by check nodej′ to variable nodeiat iterationl.

d) Codeword judgment.

(5)

whereL(l)(qi) is the hard decision information of the variable nodesiat iterationl.

e) Decoding verification.

1.3 Determination of scale factor

Since the LLR-BP algorithm contains tanhxand arctanhxfunctions, its information initialization requires channel estimation. Therefore, MS algorithm without channel estimation is proposed to further reduce the algorithm complexity. In the process of check node information processing, the hyperbolic tangent complex operation is transformed into comparison and addition operation[24], which greatly reduces the amount of operation and facilitate hardware implementation. The differences between the MS algorithm and the LLR-BP algorithm are shown as a)-b).

a) Initialization is different.

Since this algorithm does not require channel estimation, the initial message from the variable nodeito the check nodejbecomes

L(0)(qij)=L(Pi)=y.

(6)

b) Check node information processing is different.

(7)

The MS algorithm is an approximate calculation by using LLR-BP algorithm when processing check node information. Eq.(3) is processed according to the properties of tanhxand arctanhxfunctions. Thus it can be gotten that

(8)

Then

(9)

It can be found that the value of the check node information calculated by the MS algorithm is larger than that of the LLR-BP algorithm. To compensate for this error, an NMS algorithm based on the MS algorithm is proposed[25-26]. The essence of the algorithm is to multiply a scale factor of (0,1) by the check node formula in the MS algorithm, which optimizes the decoding performance without increasing the complexity of the algorithm. The scale factor is represented byα, and the improvement is expressed as

Blacky was a good, nice little pig, neither dirty nor greedy. He had nice dainty ways (for a pig), and his skin was always as smooth and shining as black satin. He was much cleverer than Browny and Whitey, and his mother s heart used to swell6 with pride when she heard the farmer s friends say to each other that some day the little black fellow would be a prize pig.

(10)

(11)

1.4 Algorithm simulation

To verify the decoding performance of the improved NMS algorithm with the variable scale factor proposed, Matlab2018b is used to carry out simulation experiments. Under differentSNRs of 1.1, 1.3, 1.5 and 1.8, the coded data is quantized into 8-bit binary data, and then the corresponding program is designed. The decoding performance of the NMS decoding algorithm with a variable scale factor is verified by comparing it with the NMS decoding algorithm with a fixed scale factor of 0.5 and 0.75, respectively. The simulation results of three algorithms under differentSNRs are shown as Fig.1. The abscissa represents the number of iterations, and the ordinate represents the number of bit errors.

(a) SNR=1.1

It can be seen that when theSNRis less than 1.5, the number of iterations can be reduced by the proposed NMS decoding algorithm with variable scale factor, and the decoding speed is faster. The advantage of this algorithm is that the convergence speed is faster when theSNRis greater than 1.5.

2 Hardware implementation of LDPC decoder

2.1 Data quantification

For an FPGA-based LDPC decoder, its storage and operation information is binary. By observing the noise signal processed by the additive white Gaussian noise (AWGN) channel, it can be found that the distribution of signal value is in the range of (-4,4). Each signal value is quantified as 8-bit binary data to facilitate FPGA-based hardware implementation, including one symbol bit, three integer bit and four decimal bit, as shown in Fig.2.

2.2 Design of check node processing circuit

To achieve the data processing of check nodes based on FPGA, the check node information processing circuit is designed according to Eq.(10) as shown in Fig.3. Check node information processing is based on the check matrixHto compute the messages transmitted to the check nodes by the variable nodes. It is horizontal processing of the confidence message matrix, involving 32 non-zero data per row in the confidence message matrix, each of which is in 8-bit binary form. Since the storage structure of the check matrixHand that of the confidence message matrix are the same, it is easy to design a check node information processing module with a parallelism of 2 according to the block characteristics of the matrixH, and each parallel processing module completes the update of 32 confidence message data register with the same storage address.

Fig.3 Circuit block diagram of check node information processing

2.3 FPGA-based LDPC decoder

The design of the (8 176,7 154) LDPC decoder based on FPGA is implemented with ZYNQ7020 development board. Fig.4 shows the block diagram of the decoder, including data storage module, initialization module, check node information processing module, variable node information processing and decoding decision module, address update module, verification module and control module. Among them, the data storage module consists of receiving information memory, non-zero element position information memory of check matrix and confidence message memory. In addition, for the convenience of testing, the serial port receiving/sending module and receiving control module are designed.

Fig.4 Overall block diagram of FPGA-based decoder

First, the FPGA receives 8 176 bit of signals processed by the upper computer through the serial port. The upper computer processing is divided into two steps. One is to add noise to the codeword sequence after encoding, and the other is to quantify the noisy sequence. Each signal is composed of 8 bit of binary data. Then, the confidence message is initialized according to the non-zero element position of the check matrix. After that, the check node information is processed row by row to obtain the updated confidence message. Finally, the variable node information is processed, including updating the confidence message and the decoding judgment. If the verification is successful, the decoding is successful, and then the decoded information sequence is sent out via the serial port. If the verification fails, the iteration operation between the check node and the variable node continues until reaching the maximum number of iterations. That is to say, the decoding fails, and the final result of decoding is output via the serial port.

3 Test and discussion

3.1 Simulation test

The simulation test is to verify the function of the decoder designed. The clock frequency is 200 MHz, theSNRis 1.5, and the maximum number of iterations is 15. The simulation diagram of the decoder designed is shown as Fig.5. It can be seen that different modes correspond to different scale factors, and the mode changes with the number of iterations. After five iterations, the mode remains unchanged, and after nine iterations, the decoding is successful. In Fig.5, code is a sequence of codewords after successful decoding, and JUD_ Done indicates that the verification is correct and the decoding is successful. The codeword sequence after successful decoding is compared with the codeword sequence before coding, and it can be seen that the decoding is correct. Therefore, the decoder function is realized successfully.

Fig.5 Simulation results of LDPC decoder designed

3.2 Board-level test

The board-level test process is that the 7 154 bit of original information codes are encoded to generate 1 022 bit of check codes, totaling 8 176 bit of data, and then the data are transmitted over the AWGN channel by binary phase-shift keying (BPSK) modulation. Receiver completes BPSK demodulation and quantifies each datum into 8-bit binary data, then feeds it into the decoder via the serial port for iterative decoding. After decoding, the result is sent to the upper computer for display.

To achieve the board-level test of the decoder, VB6.0 is used for the design and implementation of the upper computer. An FPGA development board is ZYNQ7020, and the development environment is VIVADO2018.3. The main function of the upper computer is to superimpose a set of noises on the encoded codeword sequence, and these additive noises are normally distributed random noises with zero mean and unit variance. Then the noise sequence is quantified and sent to the FPGA for decoding. Finally, the decoding result can be compared with the codeword sequence before encoding to verify whether the decoding is successful or not. If successful, the serial port of the upper computer displays “Data comparison complete without error” (see Fig.6). If the decoding fails, the upper computer counters the error position and the number and displays them (as shown in Fig.7). The test results show that the (8 176,7 154) LDPC decoder can run smoothly when the maximum number of iterations is 15 and theSNRis no less than 1.1.

Fig.6 Display of successful decoding serial port

Fig.7 Display of failed decoding serial port

To test the algorithm performance on FPGA board, the encoded data is taken with noise after quantization when theSNRis fixed at 1.1 and 1.5, respectively, and then carried out an FPGA board-level test under different iterations. Table 1 shows the code error numbers of different scale factors under different iterations whenSNRis 1.1.

Table 2 shows the code error numbers of different scale factors under different iterations whenSNRis 1.5. The value marked in black italics in Table 1 indicates the code error numbers after 6 iterations when the scale factor is 0.75, andαvariable is the variable scale factor of NMS decoding algorithm proposed in this paper.

Table 1 Code error numbers of different scale factors under different iterations (SNR=1.1)

Table 2 Code error numbers of different scale factors under different iterations (SNR=1.5)

By comparing the data in Table 1, it can be found that the NMS algorithm with variable scale factor has better decoding performance owing to its fewer number of code error and its fewer number of iterations required for successful decoding.

4 Conclusions

An improved NMS decoding algorithm with variable scale factor is proposed for near-ground space communication (8 176,7 154) LDPC codes under the CCSDS standard. The scale factor can vary with the number of iterations. To facilitate FPGA-based hardware implementation, an improved quantization method is proposed, which replaces the multiplication of floating-point numbers with the addition via the quantized binary data shift to the right. Based on this, an LDPC decoder with the variable scale factor is designed and implemented based on FPGA. The core of this decoder is the iterative processing of check node information and variable node information. It proves that the decoder designed has a simplified FPGA structure, and improves the error correction ability, lower error rate, better decoding performance, and can reduce the number of iterations to a certain extent by simulation and board-level tests.