Читать книгу Artificial Intelligent Techniques for Wireless Communication and Networking - Группа авторов - Страница 18
1.2.3.2 Modifying the Objective Function
ОглавлениеIn order to optimize the policy acquired by a deep RL algorithm, one can implement an objective function that diverts from the real victim. By doing so, a bias is typically added, although this can help with generalization in some situations. The main approaches to modify the objective function are