Artificial Intelligent Techniques for Wireless Communication and Networking
Delayed Rewards
Most real systems have delays in sensing the state, in the actuators, or in the reward feedback. Examples include the lag before a braking system takes effect, or the time between a recommendation system's choices and the user behavior that follows. Several methods can deal with this, including memory-based agents that use a memory retrieval mechanism to assign credit to distant past events that are useful for prediction [1, 15].
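To make the reward-delay case concrete, here is a minimal sketch that is not taken from the book. It assumes a fixed, known delay of `delay` steps, which real systems rarely guarantee, and the names `DelayedRewardBuffer` and `observed_rewards` are hypothetical helpers introduced only for illustration. It shows how the reward an agent observes at a given step can belong to an action taken several steps earlier.

```python
from collections import deque


class DelayedRewardBuffer:
    """Releases each reward `delay` steps after it was generated.

    Hypothetical helper: a fixed, known delay is an assumption made
    for illustration only.
    """

    def __init__(self, delay):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self.delay = delay
        self._pending = deque()  # rewards generated but not yet observed

    def step(self, reward):
        """Queue the reward generated now and return the one observed now.

        Returns 0.0 while no reward has had time to arrive.
        """
        self._pending.append(reward)
        if len(self._pending) > self.delay:
            return self._pending.popleft()
        return 0.0

    def flush(self):
        """Return the rewards still in flight, e.g. at the end of an episode."""
        remaining = list(self._pending)
        self._pending.clear()
        return remaining


def observed_rewards(true_rewards, delay):
    """Return (rewards seen per step, rewards still in flight at the end)."""
    buf = DelayedRewardBuffer(delay)
    seen = [buf.step(r) for r in true_rewards]
    return seen, buf.flush()
```

With a delay of two steps, a reward generated at step 1 is only observed at step 3, so an agent that credits the most recent action would reinforce the wrong one. This misattribution is the credit assignment problem that the memory-based approaches mentioned above aim to address.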