Читать книгу Green Internet of Things and Machine Learning - Группа авторов - Страница 20
1.2.2.4 Reinforcement Learning
ОглавлениеReinforcement learning does not need training examples. In the reinforcement learning, models are given an environment, group of some actions, a goal and a reward. This algorithm learns by rewards and penalties. For every correct output, a reward is given and a penalty for every wrong output. To produce the desired output, the algorithm has to maximize these rewards. It is named reinforcement learning because for every reward the model gets a reinforcement that it is on right path. The reward feedback system helps the model to predict future behavior [9]. Figure 1.4 shows the complete process of reinforcement learning.
The following are algorithms which are based reinforcement learning:
• State Action Reward State action (SARSA)
• Q-Learning
• Deep Q Neural Network (DQN)