ZTE Communications, 2021, 19(4): 34-44 doi: 10.12142/ZTECOM.202104004

Message Passing Based Detection for Orthogonal Time Frequency Space Modulation

YUAN Zhengdao1, LIU Fei2, GUO Qinghua,3, WANG Zhongyong2

The Open University of Henan, Zhengzhou 450000, China

Zhengzhou University, Zhengzhou 450001, China

University of Wollongong, Wollongong NSW 2522, Australia

Received: 2021-10-10  

Fund supported: the National Natural Science Foundation of China61901417U180415261801434
Science and Technology Research Project of Henan Province212102210556212102210566212400410179

About authors

YUAN Zhengdao received the B.E. degree in communication and information system from Henan University of Science and Technology, China in 2006, the M.E. degree in communication engineering from Soochow University, China in 2009, and the Ph.D. degree in information and communication engineering from the National Digital Switching System Engineering and Technological Research Center, China in 2018. He is currently an associate professor with the Open University of Henan. He was a visiting scholar with the University of Wollongong, Australia in 2019. His research interests are mainly in massive MIMO, sparse channel estimation, message passing algorithm, and iterative receiver.

LIU Fei received the B.E. and M.E. degrees in information and communication engineering from Zhengzhou University, China in 2015 and 2017, respectively. He is currently working toward the Ph.D. degree with the School of Information and Engineering, Zhengzhou University, China. His research interests are message passing algorithm, sparse signal recovery, and OTFS.

GUO Qinghua (qguo@uow.edu.au) received the B.E. degree in electronic engineering and the M.E. degree in signal and information processing from Xidian University, China in 2001 and 2004, respectively, and the Ph.D. degree in electronic engineering from the City University of Hong Kong, China in 2008. He is currently an associate professor with the School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Australia and an adjunct associate professor with the School of Engineering, The University of Western Australia. His research interests include signal processing, machine learning and telecommunications. He was a recipient of the Australian Research Council’s inaugural Discovery Early Career Researcher Award in 2012. E-mail:qguo@uow.edu.au

WANG Zhongyong received the B.S. and M.S. degrees in automatic control from Harbin Shipbuilding Engineering Institute, China in 1986 and 1988, respectively, and the Ph.D. degree in automatic control theory and application from Xi’an Jiaotong University, China in 1998. From 1988 to 2002, he has been with the Department of Electronics, Zhengzhou University, China. Now he is a professor with the Department of Communication Engineering, Zhengzhou University. His research interests include numerous aspects within embedded systems, signal processing, and communication theory.

Abstract

The orthogonal time frequency space (OTFS) modulation has emerged as a promising modulation scheme for wireless communications in high-mobility scenarios. An efficient detector is of paramount importance to harvesting the time and frequency diversities promised by OTFS. Recently, some message passing based detectors have been developed by exploiting the features of the OTFS channel matrices. In this paper, we provide an overview of some recent message passing based OTFS detectors, compare their performance, and shed some light on potential research on the design of message passing based OTFS receivers.

Keywords: OTFS ; detection ; message passing ; belief propagation ; approximate message passing (AMP) ; unitary AMP (UAMP)

PDF (2103KB) Metadata Metrics Related articles Export EndNote| Ris| Bibtex  Favorite

Cite this article

YUAN Zhengdao, LIU Fei, GUO Qinghua, WANG Zhongyong. Message Passing Based Detection for Orthogonal Time Frequency Space Modulation. [J], 2021, 19(4): 34-44 doi:10.12142/ZTECOM.202104004

1 Introduction

Recently the orthogonal time frequency space (OTFS) modulation has attracted much attention due to its capability of achieving reliable communications in high mobility scenarios[16]. OTFS is a two-dimensional modulation scheme, and the information is modulated in the delay Doppler (DD) domain, which is in contrast to the time frequency (TF) domain modulation in the orthogonal frequency division multiplexing (OFDM). In OTFS, each symbol spreads over the time and frequency domains through the two dimensional (2D) inverse symplectic finite Fourier transform (SFFT), leading to both time and frequency diversities[12]. It has been shown that OTFS can significantly outperform OFDM in high mobility scenarios[7].

To harvest the diversities promised by OTFS, the design of a powerful detector is paramount. The optimal maximum a posteriori (MAP) detector is impractical due to its complexity growing exponentially with the length of the OTFS block. Recently, significant efforts have been devoted to the design of more efficient detectors. In Ref. [8], an effective channel matrix in the DD domain was derived, based on which a low-complexity two-stage detector was proposed. The first-order Neumann series was used in Ref. [9] to approximately solve the matrix inverse problem involved in the linear minimum mean squared error (MMSE) estimation based detection. A detection scheme was developed in Ref. [10], where the MMSE equalization was used in the first iteration, followed by parallel interference cancellation with a soft-output sphere decoder in subsequent iterations. A rectangular waveform was considered in Ref. [11], where the sparsity and quasi banded structure of channel matrices without fractional Doppler shifts were exploited to reduce the detection complexity. The linear equalizers were extended to the multiple input and multiple output (MIMO)-OTFS systems in Ref. [12]. A cross-domain method was proposed in Ref. [13], where a conventional linear MMSE estimator is adopted for equalization in the time domain and a low-complexity symbol-by-symbol detection is utilized in the DD domain. A low complexity iterative rake decision feedback detector was proposed in Ref. [14], which extracts and coherently combines the multiple copies of the symbols (due to multipath propagation) in the DD grid using maximal ratio combining (MRC).

Another line of OTFS detector design is based on factor graphs and message passing techniques[15, 23]. When the number of channel paths is small, the effective channel matrix in the DD domain is sparse, which allows efficient detection using the message passing algorithm (MPA)[2]. An expectation propagation (EP) algorithm was proposed in Ref. [16], where EP is used for message update with Gaussian approximation. A variational Bayes (VB) based detector was proposed in Ref. [17] to achieve better convergence. Studying the matched filtering processing, the authors in Ref. [18] proposed a message passing detector, which is combined with a probability clipping solution. The detectors in Refs. [2, 17, 19] take advantage of the sparsity of the channel matrix in the DD domain, and their complexity depends on the number of nonzero elements in each row of the channel matrix, which is denoted by S. Without considering fractional Doppler shifts, S is equal to the number of channel paths. In general, a wideband system can provide sufficient delay resolution. The Doppler resolution depends on the time duration of the OTFS block. To fulfill the low latency requirement in wireless communications, the time duration of an OTFS block should be relatively small, where it is necessary to consider fractional Doppler shifts[2, 20]. In this case, the value of S can be significantly larger than the number of channel paths. In the case of rich scattering environments, the complexity of these detectors can be a concern and the short loops in the corresponding system graph model may result in significant performance. To overcome the above issues, the design of OTFS detectors based approximate message passing (AMP)[2122] was investigated in Ref. [25]. AMP works well for independent and identically distributed (sub-)Gaussian system transfer matrix, but it suffers from performance loss or even diverges for a general system transfer matrix[2729]. Instead, the works in Refs. [2526] resort to the unitary AMP (UAMP)[2729], which is a variant of AMP and was formerly called UTAMP[27]. In UAMP, a unitary transformation of the original model is used, where the unitary matrix for transformation can be the conjugate transpose of the left singular matrix of the general system transfer matrix[27] obtained through singular value decomposition (SVD). It is shown in Ref. [25] that UAMP is well suitable for OTFS due to the structure of block circulant matrix with circulant block (BCCB) of the DD domain channel matrices, where the 2D discrete Fourier transform is used for the unitary transformation, leading to very efficient implementation using the 2D fast Fourier transform (FFT) algorithm. In addition, as the noise variance is normally unknown, the noise variance estimation is also incorporated into the UAMP-based detector in Ref. [25].

In this paper, we provide an overview of the message passing based detectors, provide some comparison results, and discuss potential research on the design of message passing based OTFS receivers. The notations used in this paper are as follows. Boldface lower-case and upper-case letters denote vectors and matrices, respectively. We use ()H and ()T to denote the conjugate transpose and the transpose, respectively. The superscript * denotes the conjugate operation. We define []M as the mod M operation. We use N(x|x̂,νx) to denote the probability density function of a complex Gaussian variable with mean x̂ and variance νx. The notation f(x)q(x) denotes the expectation of the function f(x) with respect to the distribution q(x). The relation f(x)=cg(x) for some positive constant c is written as f(x)g(x). The notation represents the Kronecker product, and ab and a/b represent the component-wise product and the division between vectors a and b, respectively. We use X=reshapeMx to denote that the vector x is reshaped as an M×N matrix X column by column, where the length of x is MN, and use x=vecX to represent vectorization of matrix X column by column. The notation diag(a) represents a diagonal matrix with the elements of a as its diagonal. We use |A|2 to denote the element-wise magnitude squared operation for matrix A. The notations 1 and 0 are used to denote an all-ones vector and an all-zeros vector with a proper length, respectively. The j-th entry of q is denoted by qj. The superscript t of st denotes the iteration index of the variable s involved in an iterative algorithm.

2 System Model

The modulation and demodulation for OTFS are illustrated in Fig. 1, which are implemented with the 2D inverse SFFT (ISFFT) and SFFT at the transmitter and receiver, respectively[1, 24]. Before the OTFS modulation, a (coded) bit sequence is mapped to symbols x[k,l],k=0,...,N-1, l=0,...,M-1 in the DD domain, where x[k,l]A={α1,...,α|A|}, |A| is the cardinality of A, l and k denote the indices of the delay and Doppler shifts, respectively, and N and M are the number of grids of the DD plane. At the transmitter side, ISFFT is performed to convert the DD domain symbols to signals in the time-frequency (TF) domain.

Figure 1

Figure 1   Modulation and demodulation in an orthogonal time frequency space (OTFS) system[2]


Xtf[n,m]=1MNk=0N-1l=0M-1x[k,l]ej2π(nkN-mlM).

After that, the signals Xtfm,n in the TF domain are converted to a continuous-time waveform st using the Heisenberg transform with a transmit waveform gtxt[2], i.e.,

s(t)=n=0N-1m=0M-1Xtf[n,m]gtx(t-nT)ej2πmΔf(t-nT),

where Δf is subcarrier spacing and T=1/Δf. Then the signal st is transmitted over a time-varying channel and the received signal in the time domain is given as:

r(t)=h(τ,ν)s(t-τ)ej2πν(t-τ)dτdν,

where h(τ,ν) is the channel impulse response in the continuous DD domain, and it can be expressed as[1]:

h(τ,ν)=i=0P-1hiδ(τ-τi)δ(ν-νi),

with δ() being the Dirac delta function, P being the number of channel paths, and hi, τi and νi being the gain, delay and Doppler shift associated with the i-th path, respectively. The delay and Doppler-shift taps for the i-th path are given by

τi=liMΔf, νi=ki+κiNT,

where li and ki are the delay and Doppler indices of the i-th path, and κi[-1/2,1/2] is a fractional Doppler associated with the i-th path. In the above equation, MΔf is the system bandwidth and NT is the duration of an OTFS block.

At the receiver, a receive waveform grxt is used to transform the received signal rt to the TF domain, i.e.,

Y(t,f)=grx*(t'-t)r(t')e-j2πf(t'-t)dt',

which is then sampled at t=nT and f=mΔf, yielding Yn, m. Then SFFT is applied to Yn, m to generate the DD domain signal yk,l, i.e.,

y[k,l]=1MNn=0N-1m=0M-1Y[n,m]e-j2π(nkN-mlM).

Assuming that the transmitted waveform and the received waveform satisfy the bi-orthogonal property[1], in the DD domain we have the input-output relationship[2].

y[k,l]=i=0P-1c=-NiNihix[k-ki+c]N,[l-li]M1N1-e-j2π(-c-κi)1-e-j2π-c-κiNe-j2πli(ki+κi)MN+ω[k,l],

where Ni<N is an integer, and ω[k,l] is the noise in the DD domain. We can see that for each path, the transmitted signal is circularly shifted, and scaled by a corresponding channel gain. We arrange {xk,l} as a vector xCMN×1, where the j-th element xj is xk,l with j=kM+l. Similarly, a vector yCMN×1 can also be constructed based on y[k,l]. Then Eq. (8) can be rewritten in a vector form as:

y=Hx+ω,

where HCMN×MN is the effective channel matrix in the DD domain, and ω denotes a white Gaussian noise with mean 0 and variance ϵ-1 (or precision ϵ). The channel matrix H in Eq. (9) can be represented as[25]:

H=i=0P-1c=-NiNiIN(-[c-ki]N)[IM(li)hi×1-e-j2π(-c-κi)N-Ne-j2π-c-κiNe-j2πli(ki+κi)MN],

where IN(-[q-ki]N) denotes an N×N matrix obtained by circularly shifting the rows of the identity matrix by -[q-ki]N, and IM(li) is obtained similarly. Without fractional Doppler, i.e., κi=0, the channel matrix H is reduced to

H=i=0P-1IN(ki)IM(li)hie-j2πlikiMN.

3 Message Passing (MP) Based Detectors

Based on the model (9) in the DD domain, several detectors have been proposed using the message passing techniques.

3.1 MP Detector in Ref. [2]

In model (9), the MN×MN DD domain complex channel matrix H is sparse (especially in the case without fractional Doppler shifts), which makes belief propagation suitable for implementing the OTFS detectors. In Eq. (2), y and ω are length-MN complex vectors with elements denoted by y[d] and ω[d], 1dMN, the element of H is denoted by H[d,c], 1d,cMN,x is a length-MN symbol vector with elements x[c]A, 1cMN, and 𝒜 denotes the modulation alphabet.

Thanks to the sparsity of H, the joint distribution of the random variables in model (9) can be represented with a sparsely-connected factor graph with MN variable nodes corresponding to x and MN observation nodes corresponding to y. As shown in Fig. 2, each observation node y[d] is connected to a set of variable nodes {x[es],es(d)}, and similarly, each variable node x[c] is connected to a set of observation nodes y[es], es𝒥[c], where (d) and 𝒥(c) respectively denote the sets of indexes of non-zero elements in the d-th row and c-th columns of H, (d)=𝒥(c)=S and 1sS. The probability mass function (PMF) pc,es={pc,es(aj)|ajA} represents the messages from variable nodes x[c] to factor nodes y[es].

Figure 2

Figure 2   Graph representation used to derive the message passing (MP) detector in Ref. [2]


Based on the factor graph in Fig. 2, a message passing algorithm was proposed in Ref. [2], and the detector is called MP detector in this paper. The following is a brief derivation of the message computations in the i-th iteration of the message computations.

1) Messages passing from observation node y[d] to variable node xes

The message is approximated to be Gaussian, and the mean μd,esi and variance (σd,esi)2are computed as

μd,esi=e𝒥d,eesj=1Qpe,di-1(aj)ajH[d,e],
(σd,esi)2=e𝒥d,eesj=1Qpe,di-1(aj)|aj|2|H[d,e]|2-j=1Qpe,di-1(aj)ajH[d,e]2+ϵ-1.

2) Messages passing from variable node x[c] to observation node y[es]

The PMF pc,di can be updated as

pc,esi(aj)=Δp˜c,esi(aj)+(1-Δ)p˜c,esi-1(aj),

where Δ[0,1] is the damping factor and

p˜c,esi(aj)e𝒥(c),eesPr(y[e]|x[c]=aj,H)=e𝒥(c),eesςi(e,c,j)k=1Qςi(e,c,j),

with

ςi(e,c,k)=exp-y[e]-μe,ci-H[e,c]ak2(σe,ci)2.

After a certain number of iterations by repeating 1) and 2), the decision on the transmitted symbol can be obtained, i.e.,

x̂[c]=argminajApci(aj),   c=1,...,MN,

where

pci(aj)=e𝒥(c)ςi(e,c,j)k=1Qςi(e,c,j).

The MP detector is summarized in Algorithm 1.

Algorithm 1. MPA detector in Ref. [2]

Input: y, H, Initialize: pc,es0=1/|A||, c=1,...,MN, es𝒥(c), i=1

1: Repeat

2: d: update μd,esi and (σd,esi)2 with Eqs. (12) and (13)

3: c: update pc,di with Eq. (14)

4: i=i+1

5: Until terminate

Output: The decision on transmitted symbols x̂[c] using Eq. (17)

The MP algorithm shown above is an approximation to loopy belief propagation since it approximates the interference to be Gaussian to achieve lower complexity. The complexity of the algorithm is 𝒪(MNS|𝒜|) per iteration, which depends on the sparsity of the channel, i.e., the value of S. When S is small, the detector is very attractive because it has low complexity and the detector delivers a good performance as no short loops in the factor graph model. However, in the case of rich-scatting environments and fractional Doppler shifts, the value of S can be large, leading to a denser factor graph model, which can affect the performance of the MP detector and result in a significant increase in computational complexity.

3.2 VB Detector

The VB detector was proposed in Ref. [17] to guarantee the convergence of the iterative detector, which can be implemented with variational message passing. With model (9), the optimal MAP detection can be formulated as:

x̂=argmaxxp(x|y).

However, the complexity of solving the above optimization problem increases exponentially with the size of x. VB is adopted to achieve low complexity approximate detection. In this method, a distribution q(x) from a tractable distribution family 𝒬 is found as an approximation to the a posteriori distribution p(x|y). The trial distribution q(x) can be obtained by minimizing the Kullback-Leibler divergence 𝒟(q||p), i.e.,

q*(x)=argmaxq𝒬𝒟(q||p)=argmaxq𝒬𝔼q[-lnq(x)+lnp(x|y)],

where the expectation is taken over x according to the trial distribution q(x).

To simplify the optimization problem, q(x) is assumed to be fully factorized, i.e.,

q(x)=k,lqk,l(xk,l),

where k[0,N-1], M[0,M-1] and xk,l denotes the (kM+l)-th entry of x. With this assumption, q(x) can be updated iteratively by maximizing . Since the noise sample ωk,l and data symbol xk,l,k,l are independent, and ωk,l~𝒞𝒩(ωk,l;0,ϵ-1), p(x|y) can be rewritten as:

p(x|y)k,lp(xk,l)p(yk,l|y),

where yk,l=hk,lTx+ωk,l, hk,l denotes the equivalent channel vector whose (kM+l)-th entry is hk,l[k,l]. Then the distribution p(x|y) can be further rewritten as:

p(x|y)k,lζk,l(xk,l)k',l'ψk,l(xk,l,xk',l'),

where

ζk,l(xk,l)=p(xk,l)exp-ρk,l|xk,l|2+ηk,lxk,lϵ-2,
ψk,l(xk,l,xk',l')=exp-ϱk,l,k',l'xk,lxk',l'ϵ-2,

with ρk,l=k',l'|hk',l'(k,l)|2, ηk,l=2k',l'Rhk',l'[k,l]yk,l', and ϱk,l,k',l'=2hk,l[k,l]hk,l*[k',l']. Substituting p(x|y) in Eq. (23) and q(x) into yields

=𝔼qk,llnψk,l(xk,l,xk',l')-k,llnqk,l(xk,l)ζk,l(xk,l)=𝔼q-k,lϱk,l,k',l'xk,lxk',l'ϵ-2-k,llnqk,l(xk,l)ζk,l(xk,l).

To find a stationary point of , the partial derivations of with respect to all local functions qk,l(xk,l), k, l need to be zero. Take the latent variable xk,l as an example. Setting the partial derivation /qk,l(xk,l) to zero leads to:

𝔼q\k,l-k',l'ϱk,l,k',l'xk,lxk',l'ϵ-2+lnζk,l(xk,l)-lnqk,liter(xk,l)+C=0,

where qk,l=(k',l')(k,l)qk',l'iter-1(xk,l), qk',l'iter-1(xk,l) is obtained in the (iter-1)-th iteration and C denotes a constant.

Then, solving Eq. (27) for qk,l(xk,l) results in the local distribution, which can be expressed as:

qk,liter(xk,l)ζk.l(xk,l)exp𝔼q\k,l-k',l'ϱk,l,k',l'xk,lxk',l'ϵ-2p(xk,l)exp-ρk,l|xk,l|2-mk,ldk,lϵ-2,

where mk,l=ηk,l-(k',l')(k,l)ϱk,l,k',l'𝔼qk',l'iter-1x[k',l'].

It is noted that the variance of xk,l is underestimated and only the noise variance is considered in Eq. (28). To fix the underestimation, a practical solution is to repeat the above procedure to approximate the a posteriori distribution for all the data symbols iteratively, resulting in the approximate marginal qk,l*(xk,l),k,l. Then, the decision on the symbols can be made by maximizing the approximate marginal distribution qk,l*(xk,l), i.e.,

x̂k,l=argmaxxk,lAqk,l*(xk,l).

The complexity of the algorithm per iteration is 𝒪(MNS|𝒜|).

3.3 UAMP Detector

Leveraging the UAMP algorithm, the UAMP detector was developed in Ref.[25], where the BCCB structure of the DD domain channel matrix is exploited, leading to a highly efficient OTFS detector with 2D FFT. It can be seen from Eqs. (10) and (11) that the DD domain channel matrix H has a BCCB structure. A useful property of the BCCB matrix H is that it can be diagonalized using 2D Discrete Fourier Transform matrix, i.e.,

H=FHΛF,

where F=FNFM with FN and FM being respectively the normalized N-point and M-point DFT matrices. In Eq. (30), matrix Λ is a diagonal matrix, i.e.,Λ=diag(d), and d is a length-MN vector that can be computed using 2D FFT.

d=vec(FFT2(C)),

where FFT2() represents the 2D FFT operation, C=reshapeMH(:,1) is an M×N matrix, and H(:,1) with length-MN is the first column of matrix H.

The above property is exploited in the design of the UAMP detector, leading to high computational efficiency while with outstanding performance compared with the existing detectors. Instead of using model (9) directly, the UAMP algorithm[2729] works with the unitary transform of the model. The channel matrix H admits the diagonalization in Eq. (30), leading to the following unitary transform of the OTFS system model:

r=ΛFx+ω',

where r=Fy, ω'=Fω, and the noise ω' has the same distribution with ω as F is an unitary matrix. The precision of the noise is still denoted by ϵ, which needs to be estimated. Define Φ=ΛF and an auxiliary vector z=Φx. Then we can factorize the joint distribution of the unknown variables x,z,ϵ given r as

p(x,z,ϵ|r)=p(ϵ)p(r|z,ϵ)p(z|x)p(x)=p(ϵ)jp(rj|zj,ϵ)p(zj|x)ip(xi)=fϵjfrj(zj,ϵ)fδj(zj,x)ifxi(xi),

where indices i,j[1:MN]. To facilitate the factor graph representation of the factorization in Eq. (33), the relevant notations are listed in Table 1, which shows the correspondence between the factor nodes and their associated distributions. The factor graph representation for the factorization in Eq. (33) is depicted in Fig. 3.

Table 1   Factors, underlying distributions and functional forms associated with Eq. (31)

FactorDistributionFunction Form
frjprj|zj,ϵNzj;rj,ϵ-1
fδjpzj|xδzj-Φjx
fxipxi(1/|A|)a=1Aδxi-αa
fϵp(ϵ)ϵ-1

New window| CSV


Figure 3

Figure 3   Factor graph representation of Eq. (31)


Following the UAMP algorithm, a UAMP based iterative detector can be designed, which is summarized in Algorithm 2. According to the derivation of (U)AMP using loopy belief propagation, UAMP provides the message from variable node zj to function node frj, which is Gaussian and denoted by mzjfrj(zj)=𝒩(zj|pj,νpj). Here, the mean pj and the variance νpj are given in Lines 1 and 2 of the Algorithm in a vector form. With the mean field rule[23] at the function node frj, we can compute the message passed from function node frj to variable node ϵ, i.e.,

mfrjϵ(ϵ)explogfrj(rj|zj,ϵ)bzjϵexp-ϵ(|rj-ẑj|2+vzj),

where b(zj) is the belief of zj. It turns out that b(zj) is also Gaussian with its variance and mean given by

νzj=1/1/νpj+ϵ̂,  ẑ=νzjpj/νpj+ϵ̂rj,

respectively, where ϵ̂ is the estimate of ϵ in the last iteration. They can be expressed in a vector form shown in Lines 3 and 4 in Algorithm 2. The estimate of ϵ can be obtained based on the belief b(ϵ) at the variable node ϵ shown in Fig. 3, i.e.,

b(ϵ)fϵ(ϵ)j=1MNmfrjϵ(ϵ).

And the estimate is given as

ϵ̂=0ϵb(ϵ)dϵ=MN/j=1MN|rj-ẑj|2+νzj,

which can be rewritten in a vector form shown in Line 5 of the algorithm. With the mean field rule at the function node frj again, the message passed from the function node frj to the variable node zj can be computed as:

mfrjzj(zj)explogfrj(rj|zj,ϵ̂)bϵ𝒩(hj|rj,ϵ̂-1).

Then the UAMP algorithm with known noise can be used as if the true noise precision is ϵ̂, leading to Lines 6–15 and Lines 1–2 of the Algorithm 2. In Lines 10–13, the Gaussian message is combined with the discrete prior to obtain the MMSE estimates of the symbols in terms of their posterior means and variances. There is an extra operation in Line 14, which averages the variances of xj. Thanks to the special form of the unitary matrix F, 2D FFT is used in the implementations in Lines 2 and 9. It can be seen that the UAMP detector does not require any matrix-vector products, the algorithm requires only element-wise vector operations or scalar operations, except Lines 2 and 9, which are implemented with FFT. So the complexity of the UAMP detector is 𝒪MNlog(MN)+𝒪MN|𝒜| per OTFS block per iteration, which is independent of S.

Algorithm 2. UAMP detector for OTFS

Unitary transform:r=Fy=ΛFx+ω with F=FNFM. Calculated d with Eq. (29), and define vector Λ=d·d*.

Initialize s-1=0, x̂=0, ϵ̂(0)=1, νx(0)=1, and t=0.

Input: y, H

Repeat

1: νp=νxtΛ

2: p=dvecFFT2reshapeM(x̂t)-νpst-1

3: νz=1./1./νp+ϵ̂t

4: z=νzp./νp+ϵ̂tr

5: ϵ̂t+1=MN/r-z22+1Tνz

6: νs=1./νp+1/ϵt+11

7: st=νsr-p̂

8: νq=ΛTνs/(MN)

9: q=x̂(t)+νqvecIFFT2reshapeM(dst)

10: j:ξj,a=exp-νq-1|αa-qj|2

11: j:βj,a=ξj,a/a=1|A|ξj,a

12: j:x̂jt+1=a=1|A|αaβj,a

13: j:νxjt+1=a=1|A|βj,a|αa-x̂jt+1|2

14: νxt+1=1MNj=1MNνxjt+1

15: t=t+1

Until terminated

Output: the estimate of x i.e., x̂

Compared with the UAMP detector, the MP and VB detectors have a complexity of 𝒪(MNS|𝒜|) per OTFS block per iteration, which can be considerably higher than that of the UAMP detector in the case of rich scattering environments and when fractional Doppler shifts have to be considered (leading to a large S). Moreover, the UAMP detector can deliver much better performance when the number of paths is relatively large. In particular, the UAMP detector with estimated noise precision can significantly outperform other detectors with perfect noise precision. We note that, the OTFS detector can be implemented directly with the AMP algorithm. However, due to the deviation of the channel matrix from the i.i.d. Gaussian matrix, the AMP detector may perform poorly.

4 Turbo Processing in Coded Systems

It is well known that joint decoding and detection can bring significant system performance improvement, and it can be realized in a way that the detector and decoder exchange information iteratively, i.e., the turbo processing[3031]. The OTFS detectors can be incorporated into a turbo receiver by endowing the OTFS detectors with the capabilities of taking the output log-likelihood ratios (LLRs) of the decoder as (soft) input and producing (soft) output in the form of extrinsic LLRs of the coded bits, i.e., the so-called soft input soft output (SISO) detector.

A typical turbo system is shown in Fig. 4, where Π and Π-1 represent interleaver and de-interleaver, respectively. The information bits are encoded and interleaved before symbol mapping, where each symbol xjA={α1,...,α|A|} in the DD domain is mapped from a sub-sequence of the coded bit sequence, which is denoted by cj=cj1,...,cjlog|A|. Each αacorresponds to a length-𝒜log|𝒜| binary sequence, which is denoted by αa1,...,αalog|A|. Based on the LLRs provided by the SISO decoder and the output of the OTFS demodulator as shown in Fig. 4, the task of the SISO OTFS detector is to compute the extrinsic LLR for each coded bit, i.e.,

Le(cjq)=lnP(cjq=0|r)P(cjq=1|r)-La(cjq),

Figure 4

Figure 4   Iterative joint detection and decoding in a coded OTFS system[25]


where La(cjq) is the output extrinsic LLR of the decoder in the last iteration. The extrinsic LLR Le(cjq) is passed to the decoder. The extrinsic LLR Le(cjq) can be expressed in terms of extrinsic mean and variance of the symbols[32], i.e.,

Le(cjq)=lnαaAq0exp(-|αa-mje|2vje)q'qP(cjq'=αaq')αaAq1exp(-|αa-mje|2vje)q'qP(cjq'=αaq'),

where mje and vje are the extrinsic mean and variance of xj, and 𝒜q0 and 𝒜q0 represent the subsets of all αa corresponding to cjq=0 and cjq=1, respectively. The extrinsic variance and mean are defined in Ref. [32].

vje=(1/vjp-1/vj)-1,mje=vje(mjp/vjp-mj/vj),

where mj and vj are the a priori mean and variance of xj calculated based on the output LLRs of the SISO decoder[30] and mjp and vjp are a posteriori mean and variance of xj.

Taking the UAMP detector as an example, we show the incorporation of the OTFS detector into a turbo receiver. According to the derivation of the UAMP algorithm, we can find that q and νq consist of the extrinsic means and variances of the symbols in x as they are the messages passed from the observation side and do not contain the immediate a priori information about x. Hence we have mje=qj and vje=νq. Then Eq. (40) can be readily used to compute the extrinsic LLRs of the coded bits. With the LLRs provided by the SISO decoder, one can compute the probability p(xj=αa) for each xj, which is no longer the “non-informative prior ” in Algorithm 2. Therefore, ξj,a in Line 7 of the algorithm is changed to

ξj,a=p(xj=αa)exp(-νq-1|αa-qj|2).

In addition, the iteration of the UAMP detector can be combined with the iteration between the SISO decoder and detector, which leads to a single loop iteration (i.e., inner iterations are not required).

The computational complexity of the detectors is summarized in Table 2. In the above discussion, we focus on the bi-orthogonal waveform. The detectors can be extended to OTFS systems with other waveforms, such as the simple rectangular waveform[25].

Table 2   Computational complexity of various detectors per iteration

DetectorsComplexity
MP detectorO(MNS|A|)
VB detectorO(MNS|A|)
UAMP detectorOMNlog(MN)+OMN|A|

MP: message passing UAMP: unitary approximate message passing

VB: variational Bayes

New window| CSV


5 Simulation Results

In this section, we compare the performance of the message passing based detectors. The low complexity MRC detector in Ref. [14] is also included. We set M=256 and N=32, i.e., there are 32 time slots and 256 subcarriers in the TF domain. Both quadrature phase shift keying (QPSK) modulation and 16-quadrature amplitude modulation (QAM) are considered. The carrier frequency is 3 GHz, and the subcarrier spacing is 2 kHz. The speed of the mobile user is set to v=135 km/h, leading to a maximum Doppler frequency shift index kmax=6. We assume that the maximum delay index is lmax=14. The Doppler index of the i-th path is uniformly drawn from the set [-kmax,kmax] and the delay index is in the range of [1,lmax] excluding the first path (l1=0). We assume that the fractional Doppler κi is uniformly distributed within [-1/2,1/2], and the channel coefficients hi are independently drawn from a complex Gaussian distribution with mean 0 and variance ηli, where the normalized power delay profile ηi=exp(-αli)/iexp(-αli) with α being 0 or 0.1. The maximum number of iterations is set to 15 for all iterative detectors. We note that, all detectors except the MRC detector require the noise variance. The UAMP detector performs noise precision estimation, while the other detectors (except the MRC detector) including the AMP detector assume perfect noise precision. We evaluate the performance of the detectors in a variety of scenarios including the bi-orthogonal and rectangular waveforms with integer or fractional Doppler shifts, and QPSK or 16-QAM for modulations. In addition, both uncoded and coded systems are evaluated.

Fig. 5 shows the BER performance of various detectors in the case of the bi-orthogonal waveform with different numbers of paths, where we assume no fractional Doppler shifts, i.e., S=P. We also assume α=0, and QPSK is used. From this figure, we can see that, the MP detector performs well when P=6, but with the increase of P, its performance becomes worse. The VB detector has a similar trend. The MRC detector performs similarly to the MP and VB detectors when P=6 and delivers better performance than the MP and VB detectors with larger P. The AMP and UAMP detectors perform well, where we can see that they enjoy the diversity gain and achieve better performance with the increase of P. In all cases, the UAMP based detector delivers the best performance and significantly outperforms other detectors.

Figure 5

Figure 5   BER performance of detectors with bi-orthogonal waveform and integer Doppler shifts (results are based on Ref. [25])


With the rectangular waveform and factional Doppler shifts, we compare the bit error ratio (BER) performance of the AMP, UAMP and MRC detectors in Fig. 6, where the number of paths P=9 and α=0.1 is used for the power delay profile. Both QPSK and 16-QAM are considered. Due to the deviation of the channel matrix from the i.i.d. (sub-) Gaussian matrix, AMP exhibits performance loss, leading to significantly worse performance compared with the UAMP detector. Thanks to the robustness of UAMP against a general matrix, UAMP performs well. We can see that the MRC detector performs better than the AMP detector. The UAMP detector performs the best and the gaps between other detectors with the UAMP detector become larger in the case of higher order modulation 16-QAM, compared with QPSK.

Figure 6

Figure 6   BER performance of detectors with the rectangular waveform and fractional Doppler shifts (results are based on Ref. [25])


We then evaluate the performance of the detectors in a coded OTFS system, where the turbo receiver in Fig. 4 is employed. The number of paths P=14, and a rectangular waveform is used. In Fig. 7(a), we show the performance of the uncoded system with the AMP and UAMP detectors. In Fig. 7(b), we use a rate-1/2 convolutional code with a generator [5,7]8 followed by a random interleaver and QPSK modulation. The length of the codeword is MN. The BCJR algorithm is used for the SISO decoder. We can find that the performance gaps between the AMP detector and the UAMP detector become larger in the coded system. The turbo receiver can achieve much better performance (about 3.5-4 dB at the BER of 10-4) thanks to the joint processing of decoding and detection. In Fig. 7(c), we investigate the performance of the system with a more powerful LDPC. The 8 192 information bits are coded at rate R=1/2 by an irregular LDPC code with an average column weight of 3, then the coded bits are randomly interleaved and mapped. As expected, the system performance is improved considerably when the LDPC is used. From Fig. 7(c), we can see that the use of the LDPC code can improve the performance of the UAMP based detector significantly and the performance gap between AMP and UAMP increases when the LDPC is used.

Figure 7

Figure 7   BER performance comparison of coded and uncoded system with rectangular waveform (part of the results is based on Ref. [25])


6 Conclusions and Potential Future Work

In this paper, we review and compare the recently proposed message passing based OTFS detectors, which exploit the structures of the OTFS channel matrices, such as sparsity and BCCB. According to the results, the MP and VB detectors are more suitable in the scenarios that the number of paths is relatively small and the modulation order is low, where they deliver good performance while with relatively low complexity. The UAMP detector seems very promising especially in the case of rich-scattering environments and/or when fractional Doppler shifts have to be considered, where the UAMP detector is attractive in both computational complexity and performance. The results also show that the OTFS system with a turbo receiver can provide significant performance gain.

The message passing techniques seem promising in the design of OTFS receivers. In this paper, we assume the OTFS channel matrix is known, which however has to be estimated for practical applications. Message passing based OTFS channel estimation has been investigated in the literature, such as the work in Ref. [33]. With the message passing techniques, channel estimation and detection can be integrated for joint channel estimation and detection, which is expected to lead to superior system performance and/or significant reduction of the training overhead. This is because the data symbols can be used to serve as a virtual training sequence and the guard band between the training symbols and data symbols is not necessary.

It has been shown that joint decoding and detection based on a turbo receiver can significantly improve the system performance. The system performance can be potentially further improved by optimizing the error control codes. This requires fast and accurate performance prediction of the iterative receiver, so that the error control codes, e.g., LDPC, can be optimized.

The message passing techniques could be used to implement sophisticated receivers in more complex systems, such as multi-user OTFS systems, grant-free multiple access with OTFS, multiple-output-multiple-input (MIMO)-OTFS, integrated sensing and communication with OTFS.

Reference

HADANI R, RAKIB S, TSATSANIS M, et al.

Orthogonal time frequency space modulation

[C]//2017 IEEE Wireless Communications and Networking Conference (WCNC). San Francisco, USA: IEEE, 2017: 16. DOI: 10.1109/WCNC.2017.7925924

[Cited within: 5]

RAVITEJA P, PHAN K T, HONG Y, et al.

Interference cancellation and iterative detection for orthogonal time frequency space modulation

[J]. IEEE transactions on wireless communications, 2018, 17(10): 65016515. DOI: 10.1109/TWC.2018.2860011

[Cited within: 11]

SURABHI G D, AUGUSTINE R M, CHOCKALINGAM A.

On the diversity of uncoded OTFS modulation in doubly-dispersive channels

[J]. IEEE transactions on wireless communications, 2019, 18(6): 30493063. DOI:10.1109/TWC.2019.2909205

HADANI R, MONK A.

OTFS: A new generation of modulation addressing the challenges of

5G [EB/OL]. [2021-10-01].

URL    

LI S Y, YUAN J H, YUAN W J, et al.

Performance analysis of coded OTFS systems over high-mobility channels

[J]. IEEE transactions on wireless communications, 2021, 20(9): 60336048. DOI: 10.1109/TWC.2021.3071493

WEI Z Q, YUAN W J, LI S Y, et al.

Orthogonal time-frequency space modulation: a promising next-generation waveform

[J]. IEEE wireless communications, 2021, 28(4): 136144. DOI: 10.1109/MWC.001.2000408

[Cited within: 1]

FARHANG A, REZAZADEHREYHANI A, DOYLE L E, et al.

Low complexity modem structure for OFDM-based orthogonal time frequency space modulation

[J]. IEEE wireless communications letters, 2018, 7(3): 344347. DOI: 10.1109/LWC.2017.2776942

[Cited within: 1]

LI L, WEI H, HUANG Y, et al.

A simple two-stage equalizer with simplified orthogonal time frequency space modulation over rapidly time-varying channels

[EB/OL]. [2021-10-10].

URL     [Cited within: 1]

LONG F, NIU K, DONG C, et al.

Low complexity iterative LMMSE-PIC equalizer for OTFS

[C]//2019 IEEE International Conference on Communications (ICC). Shanghai, China: IEEE, 2019: 16. DOI: 10.1109/ICC.2019.8761635

[Cited within: 1]

ZEMEN T, HOFER M, LOESCHENBRAND D.

Low-complexity equalization for orthogonal time and frequency signaling (OTFS)

[EB/OL]. [2021-10-10].

URL     [Cited within: 1]

SURABHI G D, CHOCKALINGAM A.

Low-complexity linear equalization for OTFS modulation

[J]. IEEE communications letters, 2020, 24(2): 330334. DOI: 10.1109/LCOMM.2019.2956709

[Cited within: 1]

SINGH P, MISHRA H B, BUDHIRAJA R.

Low-complexity linear MIMO-OTFS receivers

[C]//2021 IEEE International Conference on Communications Workshops (ICC Workshops). Montreal, Canada: IEEE, 2021: 16. DOI: 10.1109/ICCWorkshops50388.2021.9473839

[Cited within: 1]

LI S Y, YUAN W J, WEI Z Q, et al.

Cross domain iterative detection for orthogonal time frequency space modulation

[J]. IEEE transactions on wireless communications, 2021, (99): 1. DOI: 10.1109/TWC.2021.3110125

[Cited within: 1]

THAJ T, VITERBO E.

Low complexity iterative rake decision feedback equalizer for zero-padded OTFS systems

[J]. IEEE transactions on vehicular technology, 2020, 69(12): 15606–15622. DOI: 10.1109/TVT.2020.3044276

[Cited within: 2]

KSCHISCHANG F R, FREY B J, LOELIGER H A.

Factor graphs and the sum-product algorithm

[J]. IEEE transactions on information theory, 2001, 47(2): 498519. DOI: 10.1109/18.910572

[Cited within: 1]

LI H, DONG Y Y, GONG C H, et al.

Low complexity receiver via expectation propagation for OTFS modulation

[J]. IEEE communications letters, 2021, 25(10): 31803184. DOI: 10.1109/LCOMM.2021.3101827

[Cited within: 1]

YUAN W J, WEI Z Q, YUAN J H, et al.

A simple variational Bayes detector for orthogonal time frequency space (OTFS) modulation

[J]. IEEE transactions on vehicular technology, 2020, 69(7): 79767980. DOI:10.1109/TVT.2020.2991443

[Cited within: 3]

ZHANG H J, ZHANG T T.

A low-complexity message passing detector for OTFS modulation with probability clipping

[J]. IEEE wireless communications letters, 2021, 10(6): 12711275. DOI: 10.1109/LWC.2021.3063904

[Cited within: 1]

TIWARI S, DAS S S, RANGAMGARI V.

Low complexity LMMSE Receiver for OTFS

[J]. IEEE communications letters, 2019, 23(12): 22052209. DOI: 10.1109/LCOMM.2019.2945564

[Cited within: 1]

RAVITEJA P, VITERBO E, HONG Y.

OTFS performance on static multipath channels

[J]. IEEE wireless communications letters, 2019, 8(3): 745748. DOI: 10.1109/LWC.2018.2890643

[Cited within: 1]

DONOHO D L, MALEKI A, MONTANARI A.

Message passing algorithms for compressed sensing: motivation and construction

[C]//2010 IEEE Information Theory Workshop on Information Theory (ITW 2010, Cairo). Cairo, Egypt: IEEE, 2010: 15. DOI: 10.1109/ITWKSPS.2010.5503193

[Cited within: 1]

DONOHO D L, MALEKI A, MONTANARI A.

Message passing algorithms for compressed sensing: analysis and validation

[C]//2010 IEEE Information Theory Workshop on Information Theory (ITW 2010, Cairo). Cairo, Egypt: IEEE, 2010: 15. DOI: 10.1109/ITWKSPS.2010.5503228

[Cited within: 1]

WINN J, BISHOP C M.

Variational message passing

[J]. Journal of machine learning research, 2005, 6(4): 661694.

[Cited within: 2]

MONK A, HADANI R, TSATSANIS M, et al.

OTFS - Orthogonal Time Frequency Space

[EB-OL]. [2021-10-10]. .

URL     [Cited within: 1]

YUAN Z D, LIU F, YUAN W J, et al.

Iterative detection for orthogonal time frequency space modulation with unitary approximate message passing

[J]. IEEE transactions on wireless communications, 2021. DOI:10.1109/TWC.2021.3097173

[Cited within: 11]

LIU F, YUAN Z D, GUO Q H, WANG Z Y, et al.

Multi-block UAMP based detection for OTFS with rectangular waveform

[J]. IEEE wireless communications letters, 2021. DOI: 10.1109/LWC.2021.3126871

[Cited within: 1]

GUO Q H, XI J T.

Approximate message passing with unitary transformation

[EB/OL]. [2021-10-10].

URL     [Cited within: 5]

YUAN Z D, GUO Q H, LUO M.

Approximate message passing with unitary transformation for robust bilinear recovery

[J]. IEEE transactions on signal processing, 2021, 69: 617630. DOI: 10.1109/TSP.2020.3044847

LUO M, GUO Q H, JIN M, et al.

Unitary approximate message passing for sparse Bayesian learning

[J]. IEEE transactions on signal processing, 2021, 69: 60236039. DOI: 10.1109/TSP.2021.3114985

[Cited within: 3]

TUCHLER M, SINGER A C, KOETTER R.

Minimum mean squared error equalization using a priori information

[J]. IEEE transactions on signal processing, 2002, 50(3): 673683. DOI: 10.1109/78.984761

[Cited within: 2]

GUO Q H, PING L.

LMMSE turbo equalization based on factor graphs

[J]. IEEE journal on selected areas in communications, 2008, 26(2): 311319. DOI: 10.1109/JSAC.2008.080208

[Cited within: 1]

GUO Q H, HUANG D D.

A concise representation for the soft-in soft-out LMMSE detector

[J]. IEEE communications letters, 2011, 15(5): 566568. DOI: 10.1109/LCOMM.2011.032811.102073

[Cited within: 2]

LIU F, YUAN Z D, GUO Q H, et al.

Message passing based structured sparse signal recovery for estimation of OTFS channels with fractional Doppler shifts

[J]. IEEE transactions on wireless communications, 2021. DOI: 10.1109/TWC.2021.3087501

[Cited within: 1]

/