Artificial Intelligent Techniques for Wireless Communication and Networking - Group of authors - Page 21
1.3 Deep Reinforcement Learning: Value-Based and Policy-Based Learning
1.3.1 Value-Based Method
Algorithms such as the Deep Q-Network (DQN) use Convolutional Neural Networks (CNNs) to help the agent select the best action [9]. While these algorithms are complicated in detail, they usually follow the same fundamental steps (Figure 1.4):
Figure 1.4 Value-based learning.
1 Take the image of the current state, convert it to grayscale, and crop the unnecessary parts.
2 Run the image through a series of convolution and pooling layers to extract the important features that will help the agent make its decision.
3 Calculate the Q-value of each possible action.
4 Perform back-propagation to make the Q-value estimates more accurate.
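The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation from [9]: the 10 x 10 random frames, the fixed 3 x 3 edge kernel, the four actions, the reward, and the discount factor `gamma` are all hypothetical values chosen for the example. For brevity the gradient step updates only the output layer toward the usual Q-learning target (reward plus discounted best next Q-value), whereas a real DQN learns its convolution kernels too and back-propagates through every layer.

```python
import random

def to_grayscale(frame):
    """Step 1a: average the RGB channels and scale each pixel to [0, 1]."""
    return [[sum(px) / (3 * 255.0) for px in row] for row in frame]

def crop(img, top, bottom, left, right):
    """Step 1b: keep rows top..bottom-1 and columns left..right-1."""
    return [row[left:right] for row in img[top:bottom]]

def conv2d_relu(img, kernel):
    """Step 2a: 'valid' 2-D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    out_h, out_w = len(img) - k + 1, len(img[0]) - k + 1
    return [[max(0.0, sum(kernel[a][b] * img[i + a][j + b]
                          for a in range(k) for b in range(k)))
             for j in range(out_w)] for i in range(out_h)]

def max_pool(img, size=2):
    """Step 2b: non-overlapping max pooling."""
    return [[max(img[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(img[0]) - size + 1, size)]
            for i in range(0, len(img) - size + 1, size)]

def features(frame, kernel):
    """Steps 1 and 2 combined: raw frame in, flat feature vector out."""
    img = crop(to_grayscale(frame), 1, 9, 1, 9)    # 10x10 -> 8x8
    pooled = max_pool(conv2d_relu(img, kernel))    # 8x8 -> 6x6 -> 3x3
    return [v for row in pooled for v in row]      # 9 features

def q_values(feats, weights, biases):
    """Step 3: one linear output (Q-value) per action."""
    return [sum(w * f for w, f in zip(w_row, feats)) + b
            for w_row, b in zip(weights, biases)]

def td_update(feats, action, target, weights, biases, lr=0.1):
    """Step 4: one gradient step on 0.5 * (Q(s, a) - target)^2.
    Simplification: only the output layer of the chosen action is updated."""
    error = q_values(feats, weights, biases)[action] - target
    weights[action] = [w - lr * error * f for w, f in zip(weights[action], feats)]
    biases[action] -= lr * error
    return error

# Hypothetical toy data: two random 10x10 RGB frames (state and next state).
rng = random.Random(0)
def random_frame():
    return [[[rng.randrange(256) for _ in range(3)] for _ in range(10)]
            for _ in range(10)]

# Fixed vertical-edge kernel (an assumption; real kernels are learned).
kernel = [[1 / 6, 0.0, -1 / 6] for _ in range(3)]
n_actions = 4
weights = [[rng.uniform(-0.01, 0.01) for _ in range(9)] for _ in range(n_actions)]
biases = [0.0] * n_actions

s, s_next = features(random_frame(), kernel), features(random_frame(), kernel)
q = q_values(s, weights, biases)
action = max(range(n_actions), key=lambda a: q[a])   # greedy action

reward, gamma = 1.0, 0.99                            # assumed values
target = reward + gamma * max(q_values(s_next, weights, biases))
err_before = td_update(s, action, target, weights, biases)
err_after = q_values(s, weights, biases)[action] - target
print(abs(err_before), abs(err_after))
```

After the update, the Q-value of the chosen action lies closer to the target than before, which is the sense in which back-propagation makes the Q-values "more accurate" in step 4.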