Читать книгу Multi-Processor System-on-Chip 2 - Liliana Andrade - Страница 40

2.3.1. Turbo decoder

Оглавление

Among the three code classes, Turbo codes are most challenging with respect to high-throughput decoding. The decoding structure consists of two component decoders connected through an interleaver/deinterleaver. These component decoders work cooperatively by exchanging extrinsic information in an iterative loop. The respective bit-wise log-likelihood ratios are computed in a forward and a backward recursion on the trellis in the MAP-based component decoder. Thus, decoding is inherently serial on component decoder (MAP recursions) and on Turbo decoder level (iterative loop). The kernel operation in the MAP is the Add-Compare-Select (ACS) operation. Depending on the encoder’s memory depth, several ACS operations have to be executed to process a single trellis step in the forward and backward recursion, respectively. In total, 4 · K trellis computations and two interleavings have to be performed for one single Turbo decoding iteration. Hence, P is determined by the number of parallel trellis step computations performed by a decoding architecture. The maximum value of P for one decoding iteration, i.e. P = 1, can be achieved by an architecture if the forward/backward recursions of the MAP algorithm are unrolled and pipelined (functional parallelism), yielding the XMAP architecture. P = I if, in addition, iterations are unrolled and pipelined. For a detailed overview and discussion of such highly parallel decoders, we refer to (Weithoffer et al. 2018). Figure 2.2 (left) shows the layout of a Turbo decoder exploiting all levels of parallelism (P = I) and an optimized interleaver such that ω is very small (Weithoffer et al. 2018). The decoder achieves 102 Gbit/s information bit throughput for an information block length of 128 bit, on a low Vt 28 nm Fully Depleted Silicon on Insulator (FD-SOI) technology under worst-case PVT conditions, running with 800 MHz and performing four decoding iterations. Taking into account the code rate of R = 1/3, the corresponding coded throughput is 306 Gbit/s. The total area is 23.61 mm2. Different colors represent the eight different MAP decoders originating from the four unrolled iterations (each iteration requires two MAP decoders). Although this architecture achieves an information bit throughput of more than 100 Gbit/s for small block sizes, it still suffers from limited flexibility in terms of block sizes and varying number of iterations, a large area and high power consumption. Hence, further research is mandatory.

Figure 2.2. Left: 306 Gbit/s turbo decoder. Middle: 288 Gbit/s LDPC decoder. Right: 764 Gbit/s polar decoder

Multi-Processor System-on-Chip 2

Подняться наверх