APP下载

Approximate Iteration Detection and Precoding in Massive MIMO

2018-06-07ChuanTangYerongTaoYancangChenCangLiuLuechaoYuanZuochengXing

China Communications 2018年5期

Chuan Tang*, Yerong Tao Yancang Chen Cang Liu, Luechao Yuan, Zuocheng Xing

1 Luoyang Electronic Equipment Test Center, LuoYang 471000, China

2 National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

I. INTRODUCTION

Massive multiple-input multiple-output(MIMO) [1,2] is one of the promising wireless data transmission technique for 5G [3], which provides high spectral efficiency and energy efficiency. Compared with small scale MIMO(4×4, 8×8), massive MIMO equips with hundreds of antennas at the base stations (BS) and communicates with a few tens of users. More antennas can improve multiplexing gain and diversity gain, which provide higher data rate and more reliability respectively. Moreover,massive MIMO is insensitive to noise which makes linear methods (minimum mean square error, MMSE) of signal processing (e.g. detection, precoding) achieve approximate optimal performance with perfect channel state information (CSI) [2].

However, more antennas bring about larger scale matrix to be handled by signal processing, which requires better algorithms and more powerful hardware chips in the physical layer,especially the ability of handling large scale matrix. Matrix inversion is one of the critical tasks for massive MIMO data detection and precoding. For exact matrix inversion algorithms such as Cholesky decomposition, the computation complexity is O(M3), where M is the dimensions of the matrix, so it is unacceptable to process exact matrix inversion for massive MIMO systems.

In this paper, the authors compare some approximate iteration methods in precision and complexity, and then improve these methods with iteration re finement at the cost of little complexity and no extra hardware resource.

Recently, some researches about data detection and precoding using approximate algorithms [4,5,6,7,8,9] have been proposed based on the diagonal-approximate and Hermitian positive definite feature of the MMSE filtering matrix in massive MIMO systems.Matrix inversion based on polynomial expansion [4] (PE) uses few orders (1-order mostly)Neumann series approximating matrix inversion, which meets the demand of bit-error rate(BER) with O(M2) complexity. In addition,reference [8,9] adopt the PE method in precoding to reduce complexity, and reference[10,11] also use the PE method for channel estimation. Due to the Hermitian positive definite feature, detection and precoding method based on Conjugate Gradient (CG)[5] is proposed with enough precision and low complexity, and it does not have the problem of divergence. CG method can directly obtain the detection signal vector or precoding vector without explicit matrix inversion, however the inversion of MMSE filtering matrix is used in the soft-output detection. Similarly, Gauss-Seidel (GS) [6] method directly computes each element of detection signal vector in a certain order, which conducts fine-grain iteration in serial. Newton iteration method has the best iteration efficiency whose error has quadratic convergence, but its complexity also increases fast. So a further approximate method with lower complexity named diagonal band Newton iteration (DBNI) is proposed in [7]. All these methods can work in iteration mode, and the complexity and precision is determined by the number of iterations.

In this paper, we focus on detection and precoding methods based on low complexity approximate algorithms which conduct few iterations (1 or 2) to obtain enough precision.We discuss about these approximate iteration detection and precoding methods and analysis their computation complexity. Then we combine these approximate iteration methods with the iteration re finement (IR), which improves precision with little complexity cost and no extra hardware resource. These modified methods could be called by a joint name as IR-based methods which are a combination of three approximate iteration methods in essence. The results show that the IR-based methods have 27%-83% normalized meansquared error (NMSE) improvement on the detection symbol vector and precoding symbol vector. However, combining the detection with soft-input soft-output (SISO) Viterbi decoding, the precision improvement of detection signal vector is covered (in other words, unobvious) because of the SISO decoding. What is more, when using SISO Viterbi decoding, the inversion of MMSE filtering matrix is used for calculating log-likelihood ratio (LLR), and we find that the BER performance is not improved as the precision of the estimate inversion of MMSE filtering matrix (or the number of iterations for approximate iteration detection)increases monotonously, in other words, we could calculate LLR with a rough estimate inversion of MMSE filtering matrix to provide enough good BER performance.

Notation: Lower case and upper case boldface represent column vectors and matrices respectively, x and X. X(i,j) represents the (i,j)element of matrix X and x(i) represents the i-th element of vector x. (•)T, (•)†, (•)-1denote transpose, conjugate transpose and inverse.‖•‖ and ‖•‖Fdenote the Euclidean norm and Frobenius norm respectively. E{·} denotes the expectation. O(Mx) denotes the boundary of complexity is CMxfor some C>0. XXXomeans octal number.

II. BACKGROUND

In this section, we would introduce the system model and MMSE linear detection and precoding, then we would show how to calculate LLR with MMSE filtering matrix.

2.1 System model for uplink and downlink

In this paper, we consider the multi-user(MU) massive MIMO system. Focusing on the uplink, we assume that there are B receive antennas at base station (BS) and U users with single antenna. Bit streams from different users are encoded by the channel encoder independently, and then mapped to symbols from an energy-normalized modulation constellation Ω. Vectorandrepresent the complex transmitted symbol vector and received symbol vector, the Rayleigh fading channel matrix H ∈CB×Ucan be represented as, where theand hm,nmeans the complex channel gain from transmit antenna n to receive antenna m. The system can be modeled as

whereis the complex additive white Gaussian noise (AWGN)vector whose elements are mutually independent and zero mean with variance N0.We assume that the transmitted signals are i.i.d Gaussian distribution and satisfythen the signal noise ratio(SNR) of uplink is U/N0.

In the downlink, the channel matrix is H†due to the reciprocity. The bit streams, which are also mapped to symbols from energy-normalized modulation constellation Ω, are encoded for each user at the BS. The transmitted vector after precoding q ∈CB×1propagates through the channel with AWGN noise can be presented as

where yd∈CU×1is the received symbol vector for U users, and the nd∈CU×1models downlink noise with variance Ndper complex entry.

2.2 MMSE detection and precoding for massive MIMO

In the uplink of massive MIMO systems,nonlinear detection method induces incredible complexity, so linear detection method such as MMSE [12] is adopted mostly. The MMSE estimate symbol vector s for the transmitted symbol vector s~ can be represented as

whereand I∈CU×Uis the identity matrix. Because the different hiare independent, Gram matrix G is a diagonally dominant matrix [1] when B→∞ and U is constant. Obviously, A is also diagonally dominant and Hermitian positive definite. The inversion of A is the main complex computation because it requires O(M3) number of operations for traditional methods, where M=U means the dimensions of the matrix A.

Given the downlink, the BS encodes the bit streams for each users independently, then U independent bits streams are mapped to symbols in constellation Ω composing symbol vector t∈CU×1for all users. To eliminate the interference of other users, the vector t is preprocessed with precoding matrix P ∈ CB×Uas:

where the vector q is the downlink transmitted symbol vector after precoding (or named as precoding vector) mentioned in (2). To maximize the signal-to-interference-and-noise-ratio (SINR) at the user side, we adopt linear MMSE precoding matrix [13]

Combining (5) with (4),whereand v is called intermediate precoding vector for convenience. To obtain precoding vector q, we mainly need to calculate v. Similar to detection in (3), Adis diagonally dominant and Hermitian positive definite, and we should calculate the inversion of Ador directly solve linear system of equations to obtain v. Moreover, if Nd= N0, then. Because of the similar form of detection and precoding, we do not distinguish Adand A for simplicity in the following.

2.3 Log-likelihood ratio computation

Given the uplink, we utilize SISO decoding to obtain the bit streams in BS.To this end, the detector should output LLR as the input of SISO decoder. We setwhere wiis the column vector of W and the i-th row vector of W isdue to the Hermitian feature of W.Then combining (1) and (3), the i-th element(transmitted symbol of i-th user) of MMSE estimate symbol vector s can also be presented as

whereis the equalized channel gain for i-th user, andis the noise plus interference (NPI) of i-th user with the variance. Then the post-equalization SINR ρiis known as.The max-log approximated LLR of bit b for i-th user can be calculated as [14]:

where φ and φ′ are the symbols belong to modulation constellation Ω, and the setandinclude all the constellation symbols whose b-th bit equals to 0 and 1 in Ω respectively. To obtain the LLR information, we require W, i.e. the inversion of A, however some algorithms solving linear system of equations are adopted in detection to get estimate vector s directly, and when calculating ρi,µiand νi,the matrix A is replaced by an approximate matrix which only have the diagonal elements of A [5,6].

III. APPROXIMATE ITERATION METHODS FOR DETECTION AND PRECODING

In this section, we introduce some approximate iteration methods for data detection and precoding, and discuss about their relationship, advantages and deficiencies. Due to the similar form, we mainly introduce these methods used for detection, and the method for precoding could simply replace the estimate vector s and yMFwith intermediate precoding vector v and t respectively.

3.1 Polynomial expansion method

Matrix inversion based on polynomial expansion (PE method for short) can reduce the computing complexity obviously, which is based on the Neumann series. PE method transforms the matrix inversion of A to L-order matrix polynomial approximately as

where the X is an invertible matrix close to A−1, which must satisfy the condition

We represent A=D+E, where D and E own the main diagonal elements and the off diagonal elements of A respectively. Because A is a diagonally dominant matrix, X can be set as D−1. Thus (8) can be rewritten as

and it can be written in an iteration form,

where= D−1, and the subscript k means the number of iterations which is equal to the order L in (10). Given the estimate symbol vector, the k-th iteration estimation of detection vectorand intermediate precoding vectorcan be presented as

whereandare the initial estimate vectors of s and v, and adopted as initial estimate vectors in the other methods below for fair comparison.

3.2 Conjugate gradient method

Conjugate gradient method (CG method) is used for obtaining extreme value of a quadratic function, and is also a method of solving linear system of equations. A quadratic function can be presented as

where ∇F(s) is the gradient of F(s), and A must be positive definite. When ∇F(s)=0,i.e. As=yMF, we can obtain the minimum value of F(s), and the s corresponding to the minimum value is also the solution of As=yMF, so we can get the estimate symbol vector directly by searching the s which corresponds to the minimum value of (14). The CG method can approach the minimum value by iterations, and it can converge to the exact result after M iterations. CG method is detailed in Algorithm 1. Being different from [5], our initial input isinstead of zero vector for the sake of fair comparison.

However, we do not have the inversion of matrix A in the intermediate process of CG method. Assumingwhere i andnare the n-th column of the identity matrix I and the approximate inversionof A after k times iteration respectively, we could get the n-th column of the approximate inversionby calculatingusing CG method. After M times calculations for every column, we obtain the approximate matrix.

3.3 Gauss-seidel method and jacobi method

Algorithm 1. Conjugate gradient method.Input: A and yMF Initialization: s D y r y As p r cg cg−1 0=MF MF,0= − =0 0 0,1: for k=1, … ,M do 2: e Ap k−1= ,k−1αk k k k− − − −1 1 1 1=r p e 2/( )H ;3: s s pα ;4: r r e cg cg k k k k= +− − −1 1 1= +β −1;5: end for k k k k= −− − −1 1 1 α ,βk k k=r 2 r 2/ −1 , p r p k k k k Output: scg M is the estimation of symbol vector.(PS: For precoding, replace si cg and yMF with vi cg and t respectively.)

Gauss-Seidel method (GS method) and Jacobi method (JA method) are the most common methods for solving linear system of equations. Assuming A = E + D = U + L + D , U and L are strictly upper triangular matrix and strictly lower triangular matrix respectively.Then the equation As=yMFcan be rewritten as

Concerning the Jacobi method, we can transform (15) into

Seeing from (16), Jacobi method is essentially the same as PE method with initial input, so we only discuss about PE method in the following.

Concerning the GS method, we can transform (15) into

and its representation for each component showed in (18) is more executable.

Seeing from (18), GS method is a method of fine-grain iteration, in which each element of estimate symbol vectoris calculated in a certain order serially.

For precoding, the intermediate precoding vectorandcan be presented as

However, compared with JA method, GS method sacrifices the parallelism, and its precision depends on the computation sequence because of the problem of error propagation,and is not better than the precision of JA method necessarily. So some researches combine JA method and GS method using a weight coefficient, and improve the precision by adjusting the weight and the order of computation,such as SOR-based [15,16] and SSOR-based[17] methods.

Similar to CG method, we cannot obtain the inversion matrix of A explicitly. So the inversion matrix of A should also be acquired by calculatingfor M times like CG method.

3.4 Newton iteration method

Newton iteration method [18,19] is based on the Taylor series which only considers the 1-order, and can improve the precision by iterations. Ifis the rough and original estimation of A−1, then the (k+1)-th iteration estimation is

For convergence,must satisfy the condition

then (21) converges to A−1quadratically.Similar to PE method,can be set as D−1which can satisfy the condition (22), and then we can get the relationship [7,19] between NI method and PE method that

Seeing from (23), NI method converges much faster than PE method. However, from the second iteration, it requires twice matrix-by-matrix multiplications (O(M3)) for each iteration which induces enormous complexity. So DBNI method [7] is proposed. DBNI method only conducts two iterations (k=2), its precision is higher than theand its complexity is O(M2). Regarding the feature of diagonal dominance for, we can neglect the elements far away from the main diagonal by setting them to be zero to get the approximate inputfor next iteration, which is the main principle of DBNI method. The estimation of matrix A using DBNI method can be presented as

where

In (25), E is the band width which means we only consider E elements adjacent with the diagonal element in a row for both right and left side.

Considering the estimate symbol vector based on NI method,can be represented as

and the derivation of relationship between Rk−1and R0refers to [19]. For DBNI method, the estimate symbol vectorcan be presented aswhich can be unfold further for lower complexity [7].

Similarly, the intermediate precoding vectorandcan be presented as

IV. ITERATION REFINEMENT BASED METHODS AND COMPLEXITY ANALYSIS

In this section, we introduce the iteration re-finement technique and IR-based methods.For the simplicity, we just introduce these algorithms for detection in this section, and these algorithms could be applied to precoding simply by replacing s and yMFwith v and t respectively. Then we would detail the complexity of different approximate methods.

4.1 Iteration re finement

If skis an approximate solution of As=yMF,then the remnant r is

where e=sk−s is the error of the estimate symbol vector sk. Then we can obtain an estimationof the error e by calculating Ae=r, and=s +ˆ can be regarded as ankimproved estimation of symbol vector s. Although calculating eˆ also induces some error,iteration refinement can effectively improve the estimation of s, especially when the method for calculating s has low complexity and low precision [20], such as the approximate iteration methods mentioned above. Given that the complexity of calculating Ae=r is the same as that of As=yMF, we can also adopt the approximate iteration methods mentioned above to deal with the process of iteration refinement. On the one hand, we could share the resource of the hardware by using the same method in both data detection (or precoding)and iteration refinement. On the other hand,we could acquire better precision by using the method with high precision to implement iteration refinement at the cost of hardware resource.

4.2 Iteration re finement based approximate iteration methods

We define IR-based approximate iteration methods (IR-based methods for short) as follows: first we adopt approximate iteration methods to obtain the intermediate estimate vector, then we refine the error of estimate vector with iteration re finement using approximate iteration methods again, and we obtain a higher precision estimated vector finally.We assumeis the final estimated vector

which conducts i iterations for calculating the intermediate estimate vector using α method and j iterations for iteration re finement using β method (α and β are one of approximate iteration methods mentioned in Section III) as showed in (29).

whereandare the intermediate estimate vector obtained by α method and the estimate error vector obtained by β method respectively, and,jandare the equivalent estimate inversion of A corresponding to α, β and IR-based methods respectively.

Seeing from (29), the equivalent estimate inversion of A using IR-based methodis a tradeoff between the inversion of α and β in the way of NI method, i.e. the IR-based method is a method combining α, β and NI method. The estimate inversion of matrix A using IR-based method is equal to the inversion estimated by the NI methods with a special initial input. The special initial input is a tradeoff ofand. Because the NI method is a non-linear iteration method (quadratic iteration, which has the fastest convergence speed), the tradeoff ofandis not a simple mean value, which makes the IR-based methods have better performance under the situation of the same number of iterations.Specially, if α and β are both NI method and

For the sake of low complexity, we usually adopt the approximate iteration methods mentioned above with 1 or 2 iterations in the massive MIMO systems, thus there is an obvious error between the estimation and true value,and we can acquire an effective improvement on precision by using iteration refinement.Especially, the precision ofis better(showed in the following) than 2 iterations without iteration re finement for most cases.

4.3 Complexity analysis

In traditional MIMO systems, data detection and precoding usually adopt exact inversion based on Cholesky factorization, whose complexity is O(M3). However, in massive MIMO systems, the complexity of O(M3) is unacceptable as the number of antennas increases markedly. So we use approximate methods to limit the complexity on O(M2). In this paper,we evaluate the complexity of the approximate iteration methods in terms of the number of complex-value multiplications, and we assume that matrix A and yMFare known because they are required for all methods. Then the complexity of different approximate iteration methods are showed in the table I, where E is the band width of DBNI method. To show clearly, the complexity of matrix inversion Xkand symbol vector skfor 2 iterations are showed in figure 1. The x-axis is M which is the dimensions of the matrix A, and the y-axis is the number of complex multiplications.

Table I. The complexity of different algorithms.

Fig. 1. The complexity of matrix inversion Xk and symbol vector sk for 2 iterations.

Concerning the iteration refinement, the calculation modes and complexity are different depending on whether the matrix inversion estimation of A is known. If the matrix inversion estimation of A is unknown or, known but α and β in (29) are not the same method,we can re fine the error according to (28), and it requires computing r (M2complexity) and an initial estimation error(M complexity), then a better estimation errorcan be obtained by approximate iteration methods mentioned above. The complexity of the iteration refinement iswhere C() is the complexity of calculating(n > 0) which is the same as the complexity of calculating skin table I for k=n. Otherwise, if the matrix inversion estimation Xkis known and α and β in (29) are the same method, we also require r (M2complexity),but without calculatingwe can directly obtain(M2complexity). Furthermore, we can continue the iteration for better error estimation by calculating (28) withas initial input. If we require iteration re finement with n (n>k) iterations, the complexity is, whereis the complexity of calculating. So if we adopt GS method to re fine the error and we have known k iterations estimation of matrix inversion based on GS method, we can finish the iteration re finement of n (n>k) iterations with just(n − k) M2+ 2M2complexity.

Seeing from the table I, the estimation of symbol vector skis always O(M2) complexity,because it can convert matrix-by-matrix multiplication into matrix-by-vector multiplication[7,22], and when combining with iteration re finement the complexity is still O(M2). However, the inversion of A, which usually induces O(M3) complexity, is useful in the soft decision decoding, only when k=1 for PE and NI methods or k=2 for DBNI method, the complexity of calculating the inversion is O(M2).If we use IR-based methods, we convert once approximate estimation with 2 iterations into two approximate estimation with 1 iterations.For example, when we use PE method with 2 iteration to compute, then(M2complexity), the whole complexity is M3+3M2. However, if we use, the whole complexity is 4M2. Moreover, the precision ofis higher than that of, which is showed in Section V. As a brief summary, concerning the same total number of iterations, if the inversion of A is not necessary, IR-based methods and approximate iteration methods are all on the complexity of O(M2), but the former mostly have higher precision; otherwise, if the inversion of A is necessary, IR-based methods could provide similar or higher precision with much lower complexity.

V. NUMERICAL SIMULATION AND ANALYSIS

In this section, we evaluate the performance for different approximate methods. If sˆ represents the approximation of the MMSE estimate symbol vector s, then the mean-squared error (MSE) of the approximate method is, and we use the normalized MSE(NMSE) which is equal to MSE/M to represent the precision of methods for detection and precoding. All results about NMSE are average of 1000 times simulation. We also conduct an experiment on detection with 16-QAM demapper and rate-1/2 SISO Viterbi decoder with [133o171o] generator polynomial, and we show the bit error-rate (BER) of different algorithms for different signal noise ratios (SNR).

Firstly, we show the relationship between the precision and the number of iterations for different methods of detection and precoding in figure 2. The configure of users and BS antennas U ×B is 16× 160. We see that the NI method has the fastest speed of convergence,which obtains a high precision estimation with no more than 10 iterations. CG method reaches the highest precision at the 16-th iteration,the error is due to the round error instead of the algorithm. Further, observing the first three iterations in the partial enlarged drawing, the GS method has the best NMSE performance and we mainly focus on the first few iterations for the sake of low complexity. Besides, we do not discuss about Jacobi method any more due to the equivalency of Jacobi and PE methods.

Fig. 2. The NMSE of MMSE detection or intermediate precoding vector for different approximate methods.

Figure 3 shows the effect of iteration re finement for different methods. The configuration of users and BS antennas U×B is 16× 160.(i,j)_XXX means the vector that we use i times iterations for obtaining the estimation of detection symbol vector or intermediate precoding vector, and adopt XXX (SELF means that α and β in (29) are the same method) method to refine the estimation error for j times.Considering the cost and precision, we discuss about two schemes, one is to refine the error using the same method (SELF mode) which do not need extra hardware resource, the other is using the GS method (which has the best precision for first few iterations showed in figure 2) for better precision at the cost of extra hardware and parallelism. We only conduct 1 iteration for iteration refinement which is of high efficiency and low complexity.

Given all methods except for NI method, as shown in figure 3(a), when the total number of iterations is the same (both are i+1 for (i+1,0)and (i,1)_SELF), the ΔNMSE is always positive number which means the precision of (i,1)_SELF is better than the precision of(i+1,0) for iterations<5. Moreover, the effect of iteration re finement is more obvious when the estimations of detection symbol vector and intermediate precoding vector have lower precision. Further, as the number of iterations increases, the ΔNMSE is smaller and smaller, i.e. the effect of the iteration re finement is more unobvious, so we mainly concern about the case that the estimations of detection symbol vector and intermediate precoding vector conduct 1 iteration. Compared to the (2,0) of these methods, the NMSE of (1,1)_SELF decreases 27%, 44%, 67% and 83% for GS, CG,PE and DBNI methods respectively.

Fig. 3. The comparison among methods with and without iteration re finement.

Given the NI method, the NMSE ofis the same as the, as proved in Section IV,and for i>1 the NMSE ofis lower thanwhich can be easily derived from (29), so no curve about NI method is showed in figure 3(a).

Similarly, if we use GS method to conduct iteration re finement as showed in figure 3(b),the trend is the same, and the NMSE of (1,1)_GS decreases 27%, 63%, 72%, 88% and 97%,for GS, NI, CG, PE and DBNI methods compared with (2,0) of these methods. Specially,NI method converges more and more faster as the number of iteration increases as showed in figure 2. For i>1,and theis showed in (29) whereand. Because the precision ofis better thanfor i>1, so the NMSE ofis lower than. So in figure 3(b) there is no curve when i>1 for NI method.

Figure 3(c) shows the ΔNMSE of (i,1)_GS and (i,1)_SELF which presents obvious better precision for (i,1)_GS methods. In brief summary for figure 3, IR-based methods provide better precision under the case of the same total number of iterations for most approximate iteration methods dealing with detection and precoding, and it is more appropriate for the method with lower precision.

Furthermore, we conduct an experiment about BER of different approximate detection methods and exact detection as showed in figure 4. The configuration of users and BS antennas U×B is also 16× 160. Because GS and CG methods are not worthwhile to obtain the estimation of A−1, so we use D−1to calculate LLR for all approximate methods and EXACT detection (detection using exact A−1and calculating LLR using D−1), except for EXACT_II which uses exact detection and exact inversion of A to calculate LLR. Seeing from figure 4, when using D−1to calculate LLR, the BER performance is good and very close to EXACT detection for all approximate detection methods except for the cases of (1,0)for all methods and the case of (2,0) for DBNI.So with the same methods of calculating LLR,the difference of precision among different detection methods are covered by SISO Viterbi decoding, and the case of (1,1)_SELF or (2,0)which do not require extra hardware are the best choice for concerning both the complexity and precision. However, when compared with EXACT_II, there still exists about 1dB gap even for the BER performance of EXACT. So the inversion of A has an important effect on the performance of SISO Viterbi decoding.

Finally, we consider that how the precision of the inversion of A in fluences the BER performance of SISO Viterbi decoding. Because PE method and Jacobi method are the same and have the relationship with NI method, so we mainly consider NI, GS and CG methods.As mentioned in last paragraph, the inversion of A has an important effect on the performance of SISO Viterbi decoding, however it is not worthwhile to estimate the A−1with many iterations, especially when B/ U is small. In figure 5, we show the BER performance after exact MMSE detection and the SISO Viterbi decoding using the approximate iteration inversion with different iterations (iterations=0 means using D−1as A−1to calculate LLR).The points connected by the dash line on the right of each sub figure correspond to the BER performance of SISO Viterbi decoding using exact A−1to calculate LLR for different SNR.Interestingly, it does not show a monotone trend, especially when the ratio of B/ U is small. Moreover, when B/ U is small (8 or 10 in figure 5), we find using D−1to calculate LLR is close to the results using exact A−1,and only for the cases using the approximate inversion with iterations bigger than 3 which induces high complexity, the BER performance can surpass that of the case using D−1.However if the B/ U is too small to converge,the BER performance is still mainly determined by the error of approximate detection.In brief summary, given the complexity and precision all together, directly using D−1to calculate LLR is a good choice for massive MIMO data detection. If better BER performance is required, thus more iterations are needed, then the IR-based method is a good choice for massive MIMO systems.

Fig. 4. The BER of different approximate detection methods and exact detection.

Fig. 5. The BER of different approximate detection methods for using the estimate inversion of A with different number of iterations to calculate LLR.

VI. CONCLUSION

In this paper, we discuss about these approximate iteration detection and precoding methods and analysis their computation complexity. Then we combine the iteration re finement with these approximate iteration methods which are named as IR-based methods and are the combination of three approximate iteration methods in essence. Compared with methods without iteration refinement, IR-based methods for detection and precoding provide 27%,44%, 67% and 83% decrease on NMSE for GS, CG, PE and DBNI methods if refine the error with oneself, and little complexity and no extra hardware is required. Moreover, although IR-based methods have better NMSE performance than approximate iteration methods with the same number of iterations, the BER performance is mainly determined by SISO Viterbi decoding, and the BER performance of one method with IR after SISO Viterbi decoding is almost the same to that of the method without IR and is close to the case of exact detection with approximate LLR value.Finally, we find that the change of BER performance is not monotonic as the number of iterations for the estimate inversion of A (for calculating the LLR) increases, and using D−1to calculate LLR is a good choice for most cases in massive MIMO detection considering both complexity and precision.

[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, 2010, pp. 3590–3600.

[2] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson,T. L. Marzetta, O. Edfors, F. Tufvesson, P. Print,F. Rusek, and D. Persson, “Scaling up MIMO:Opportunities and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30,no. 1, 2013, pp. 40–60.

[3] C. X. Wang, F. Haider, X. Gao, X. H. You, Y. Yang,D. Yuan, H. Aggoune, H. Haas, S. Fletcher, and E. Hepsaydir, “Cellular architecture and key technologies for 5G wireless communication networks,” IEEE Communications Magazine, vol.52, no. 2, 2014, pp. 122–130.

[4] M. Wu, B. Yin, G. Wang, C. Dick, J. R. Cavallaro,and C. Studer, “Large-scale MIMO detection for 3GPP LTE: Algorithms and FPGA implementations,” IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 5, 2014, pp. 916–929.

[5] B. Yin, M. Wu, J. R. Cavallaro, and C. Studer,“Conjugate gradient-based soft-output detection and precoding in massive MIMO systems,”Proc. IEEE Global Communications Conference,2014, pp. 3696–3701.

[6] L. Dai, X. Gao, X. Su, S. Han, C.-L. I, and Z. Wang,“Low-complexity soft-output signal detection based on gauss-seidel method for uplink multiuser large-scale MIMO systems,” IEEE Transactions on Vehicular Technology, vol. 64, no. 10,2015, pp. 4839–4845.

[7] C. Tang, C. Liu, L. Yuan, and Z. Xing, “High precision low complexity matrix inversion based on newton iteration for data detection in the massive MIMO” IEEE Communications Letters, vol.20, no. 3, 2016, pp. 490–493.

[8] A. Kammoun, A. Müller, E. Bjornson, et al., “Linear Precoding Based on Polynomial Expansion:Large-Scale Multi-Cell MIMO Systems,” IEEE Journal of Selected Topics in Signal Processing,vol. 8, no. 5, 2014, pp. 861–875.

[9] A. Müller, et al., “Linear precoding based on polynomial expansion: reducing complexity in massive MIMO,” EURASIP Journal on Wireless Communications and Networking. No. 63, 2016, pp. 1-22.

[10] N. Shariati, E. Bjornson, M. Bengtsson, et al.,“Low-complexity channel estimation in largescale MIMO using polynomial expansion,” Proc.IEEE International Symposium on Personal Indoor and Mobile Radio Communications, 2013,pp. 1157-1162.

[11] Z. Chen, X. Hou, S. Han, C. Yang, G. Wang, and M. Lei, “Low complexity channel estimation in TDD coordinated multi-point transmission systems,” Proc. IEEE WCNC, 2013, pp. 3128–3133.

[12] C. Studer, S. Fateh, and D. Seethaler, “ASIC implementation of soft-input soft-output MIMO detection using MMSE parallel interference cancellation,” IEEE Journal of Solid-State Circuits,vol. 46, no. 7, 2011, pp. 1754 – 1765.

[13] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication-part i: channel inversion and regularization,” IEEE Transactions on Communications,vol. 53, no. 1, 2005, pp. 195–202.

[14] C. Studer, “Iterative MIMO decoding: Algorithms and VLSI implementation aspects,” Ph.D.dissertation, ETH Zurich, 2009.

[15] T. Xie, Q. Han, H. Xu, and Z. Qi, “A low-complexity linear precoding scheme based on SOR method for massive MIMO systems,” Proc. Vehicular Technology Conference, 2015, pp.1-5.

[16] D. Xiong, W. Peng, D. Chen, et al., “Adaptive joint precoding and pre-equalization with reduced complexity in massive MIMO systems,”Wireless Communications & Mobile Computing,vol. 16, no. 18, 2016, pp. 3190-3200.

[17] J. Ning, Z. Lu, T. Xie, and J. Quan, “Low complexity signal detector based on SSOR method for massive MIMO systems,” Proc. IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, 2015, pp. 1–4.

[18] M. Ylinen, A. Burian, and J. Takala, “Direct versus iterative methods for fixed-point implementation of matrix inversion,” Proc. International Symposium on Circuits and Systems, 2004, pp.III–225–8.

[19] V. Pan and R. Schreiber, “An improved newton iteration for the generalized inverse of a matrix,with applications,” Siam Journal on Scientific &Statistical Computing, vol. 12, no. 5, 1991, pp.1109–1131.

[20] J. J. Leader, “Numerical analysis and scientific computation,” Pearson Addison Wesley Boston,2004.

[21] C. Tang, C. Liu, L. Yuan, et al., “Approximate iteration detection with iterative refinement in massive MIMO systems,” IET Communications,vol. 11, no. 7, 2017, pp. 1152-1157.

[22] N. Shariati, E. Bjornson, M. Bengtsson, and M.Debbah, “Low-complexity polynomial channel estimation in large-scale MIMO with arbitrary statistics,” IEEE Journal of Selected Topics in Signal Processing, vol. 8, no. 5, 2014, pp. 815 – 830.