Artificial Intelligence and Quantum Computing for Advanced Wireless Networks - Savo G. Glisic
Design Example 3.1
As an illustration of the computations involved, we consider a simple network consisting of only two segments (cascaded linear FIR filters shown in Figure 3.8). The first segment is defined as
$$u(k) = \sum_{n} a_n\, x(k-n) = \mathbf{a}^{T}\mathbf{x}(k) \tag{3.35}$$
For simplicity, the second segment is limited to only three taps:
$$y(k) = b_0 u(k) + b_1 u(k-1) + b_2 u(k-2) = \mathbf{b}^{T}\mathbf{u}(k) \tag{3.36}$$
Figure 3.8 Oversimplified finite impulse response (FIR) network.
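The forward pass through the two cascaded segments can be sketched as follows. This is a minimal illustration, not the book's implementation: the function names, the coefficient values, and the convention that samples before time 0 are zero are all assumptions made for the example.

```python
def fir_output(coeffs, signal, k):
    """Output of an FIR filter at time k: sum over n of coeffs[n] * signal[k - n].

    Samples before time 0 are treated as zero (an assumption of this sketch).
    """
    return sum(c * signal[k - n] for n, c in enumerate(coeffs) if k - n >= 0)


def cascade(a, b, x):
    """Pass x through the first segment (a), then the second (b); return (u, y)."""
    u = [fir_output(a, x, k) for k in range(len(x))]
    y = [fir_output(b, u, k) for k in range(len(x))]
    return u, y


a = [0.5, -0.25]                 # hypothetical first-segment coefficients
b = [1.0, 2.0, 3.0]              # b0, b1, b2 of the three-tap second segment
x = [1.0, 0.0, 0.0, 0.0, 0.0]    # unit impulse as a test input

u, y = cascade(a, b, x)
print(u)  # impulse response of the first segment
print(y)  # impulse response of the whole cascade
```

With a unit impulse at the input, `u` reproduces the coefficients of the first segment and `y` is their convolution with the three taps of the second segment.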
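In (3.35), $\mathbf{a}$ is the vector of filter coefficients (not to be confused with the variable for the activation value used earlier). To adapt the filter coefficients, we evaluate the gradients $\partial e^{2}(k)/\partial \mathbf{a}$ and $\partial e^{2}(k)/\partial \mathbf{b}$, where $e(k)$ is the difference between the desired response and the output of the second filter. For filter $\mathbf{b}$, the desired response is available directly at the output of the filter of interest, and the gradient is $\partial e^{2}(k)/\partial \mathbf{b} = -2e(k)\mathbf{u}(k)$, which yields the standard LMS update $\Delta \mathbf{b}(k) = 2\mu e(k)\mathbf{u}(k)$. For filter $\mathbf{a}$, we have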
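$$\frac{\partial e^{2}(k)}{\partial \mathbf{a}} = -2e(k)\left[b_0\mathbf{x}(k) + b_1\mathbf{x}(k-1) + b_2\mathbf{x}(k-2)\right] \tag{3.37}$$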
which yields
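$$\Delta \mathbf{a}(k) = 2\mu e(k)\left[b_0\mathbf{x}(k) + b_1\mathbf{x}(k-1) + b_2\mathbf{x}(k-2)\right] \tag{3.38}$$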
Here, approximately 3M multiplications are required at each iteration of this update, which is the product of the orders of the two filters. This computational inefficiency corresponds to the original approach of unfolding a network in time to derive the gradient. However, we observe that at each iteration this weight update is repeated. Explicitly writing out the product terms for several iterations, we get
Iteration | Calculation
---|---
$k$ | $e(k)\,[\,\boxed{b_0\mathbf{x}(k)} + b_1\mathbf{x}(k-1) + b_2\mathbf{x}(k-2)\,]$
$k+1$ | $e(k+1)\,[\,b_0\mathbf{x}(k+1) + \boxed{b_1\mathbf{x}(k)} + b_2\mathbf{x}(k-1)\,]$
$k+2$ | $e(k+2)\,[\,b_0\mathbf{x}(k+2) + b_1\mathbf{x}(k+1) + \boxed{b_2\mathbf{x}(k)}\,]$
$k+3$ | $e(k+3)\,[\,b_0\mathbf{x}(k+3) + b_1\mathbf{x}(k+2) + b_2\mathbf{x}(k+1)\,]$
Therefore, rather than grouping along the horizontal in the above equations, we may group along the diagonal (boxed terms). Gathering these terms, we get
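$$\left[b_0 e(k) + b_1 e(k+1) + b_2 e(k+2)\right]\mathbf{x}(k) \equiv \delta(k)\,\mathbf{x}(k) \tag{3.39}$$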
where δ(k) is simply the error filtered backward through the second cascaded filter as illustrated in Figure 3.8. The alternative weight update is thus given by
$$\Delta \mathbf{a}(k) = 2\mu\,\delta(k)\,\mathbf{x}(k) \tag{3.40}$$
Equation (3.40) represents temporal backpropagation. Each update now requires only M + 3 multiplications, the sum of the two filter orders. A simple reordering of terms thus results in a more efficient algorithm. This is the major advantage of the temporal backpropagation algorithm.
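The regrouping argument can be checked numerically. The sketch below accumulates the horizontally grouped terms and the diagonally grouped terms over a finite record and compares the totals and the multiplication counts. It rests on several assumptions that are not part of the original example: the filter coefficients are held fixed over the record, the common factor 2μ is omitted, samples outside the record are treated as zero, the error sequence is random test data, and all function names are hypothetical. Because δ(k) uses e(k + 1) and e(k + 2), an online implementation can only form it two samples after time k.

```python
import random


def tap_vector(x, k, num_taps):
    """Return [x(k), x(k-1), ..., x(k-num_taps+1)], with zeros outside the record."""
    return [x[k - i] if 0 <= k - i < len(x) else 0.0 for i in range(num_taps)]


def direct_sum(e, x, b, num_taps):
    """Accumulate e(k)[b0 x(k) + b1 x(k-1) + b2 x(k-2)] over all k (horizontal grouping)."""
    total, mults = [0.0] * num_taps, 0
    for k in range(len(e)):
        for n, b_n in enumerate(b):
            scale = e[k] * b_n                     # 1 multiplication
            xv = tap_vector(x, k - n, num_taps)
            for i in range(num_taps):
                total[i] += scale * xv[i]          # num_taps multiplications
            mults += 1 + num_taps
    return total, mults


def temporal_sum(e, x, b, num_taps):
    """Accumulate delta(k) x(k) over all k (diagonal grouping)."""
    total, mults = [0.0] * num_taps, 0
    e_pad = e + [0.0] * (len(b) - 1)               # errors beyond the record are zero
    for k in range(len(e)):
        # Error filtered backward through b: delta(k) = sum over n of b_n e(k+n).
        delta = sum(b_n * e_pad[k + n] for n, b_n in enumerate(b))  # len(b) mults
        xv = tap_vector(x, k, num_taps)
        for i in range(num_taps):
            total[i] += delta * xv[i]              # num_taps multiplications
        mults += len(b) + num_taps
    return total, mults


random.seed(0)
M, N = 8, 50                                       # taps in filter a, record length
b = [0.7, -0.2, 0.1]                               # hypothetical b0, b1, b2
x = [random.uniform(-1, 1) for _ in range(N)]
e = [random.uniform(-1, 1) for _ in range(N)]      # stand-in error sequence

d_total, d_mults = direct_sum(e, x, b, M)
t_total, t_mults = temporal_sum(e, x, b, M)
max_diff = max(abs(p - q) for p, q in zip(d_total, t_total))
print("largest difference between the two totals:", max_diff)
print("multiplications per iteration:", d_mults // N, "vs", t_mults // N)
```

In this sketch the direct form spends 3(M + 1) multiplications per iteration and the regrouped form M + 3, while the accumulated totals agree up to floating-point rounding. In an adaptive run the coefficients change between iterations, so the two updates are no longer identical term by term.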