Intelligent Security Management and Control in the IoT - Mohamed-Aymen Chalouf - Page 46

2.5. Access controller for IoT terminals based on reinforcement learning

The difficulty of observing the system state, described in section 2.4, led us to consider strategies that can infer the blocking factor even from very noisy measurements.

For this reason, we relied on deep learning techniques, which have proven highly effective at automatically extracting system characteristics (“features”) from noisy or even incomplete data (Rolnick et al. 2017).

Given the lack of available data, we turned to the class of reinforcement learning techniques, which learn from interaction with the system rather than from a pre-collected dataset.

More specifically, we considered the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm, which handles continuous action spaces and has been shown to learn faster and perform better than earlier approaches (Fujimoto et al. 2018).
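To make the TD3 idea more concrete, the following minimal Python sketch computes the learning target used for the critics, as described by Fujimoto et al. (2018): the target action is perturbed by clipped noise (target policy smoothing), and the smaller of two target critic estimates is kept (clipped double Q-learning). The function name `td3_target`, the helper `clip`, the stand-in actor and critics, the action range [0, 1] and the hyperparameter values are all assumptions chosen for illustration; they are not taken from the chapter.

```python
import random


def clip(x, low, high):
    """Restrict x to the interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target, rng,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               act_low=0.0, act_high=1.0):
    """Return the TD3 critic target for one transition (scalar action)."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(actor_target(next_state) + noise, act_low, act_high)
    # Clipped double Q-learning: keep the more pessimistic of the two estimates.
    q_next = min(critic1_target(next_state, next_action),
                 critic2_target(next_state, next_action))
    # No bootstrapping beyond a terminal transition.
    return reward + gamma * (0.0 if done else 1.0) * q_next


# Hypothetical stand-ins for the target networks (placeholders, not trained models).
rng = random.Random(0)
target = td3_target(
    reward=1.0, next_state=None, done=False,
    actor_target=lambda s: 0.5,
    critic1_target=lambda s, a: 1.0 + a,
    critic2_target=lambda s, a: 2.0,
    rng=rng,
)
```

The "delayed" part of TD3 is not shown here: the actor and the target networks are updated less often than the critics, which the original paper reports as helping to stabilize learning.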

In what follows, we formulate the IoT access problem as a reinforcement learning problem in which an agent iteratively finds a sub-optimal blocking factor that reduces access conflicts.
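The agent-environment loop behind such a formulation can be sketched with a toy model. Everything in the Python sketch below is a hypothetical illustration and not the chapter's model: the class `ToyAccessEnv`, the number of preambles, the fixed number of arrivals per slot, the Gaussian observation noise and the reward are placeholders. In particular, the blocking factor is assumed here to be the probability that a waiting terminal is allowed to attempt access, and a uniformly random policy stands in for the learned TD3 policy.

```python
import random


class ToyAccessEnv:
    """Hypothetical random-access model with a noisy view of the backlog."""

    def __init__(self, n_preambles=54, arrivals_per_slot=20, noise_std=5.0, seed=0):
        self.n_preambles = n_preambles              # placeholder value
        self.arrivals_per_slot = arrivals_per_slot  # placeholder value
        self.noise_std = noise_std                  # placeholder value
        self.rng = random.Random(seed)
        self.backlog = 0

    def _observe(self):
        # The agent only sees a noisy estimate of the number of waiting terminals.
        return self.backlog + self.rng.gauss(0.0, self.noise_std)

    def reset(self):
        self.backlog = self.arrivals_per_slot
        return self._observe()

    def step(self, blocking_factor):
        p = max(0.0, min(1.0, blocking_factor))
        # Assumption: each waiting terminal attempts access with probability p.
        attempts = sum(1 for _ in range(self.backlog) if self.rng.random() < p)
        # Each attempting terminal picks one preamble uniformly at random.
        counts = [0] * self.n_preambles
        for _ in range(attempts):
            counts[self.rng.randrange(self.n_preambles)] += 1
        # A preamble picked by exactly one terminal is an access without conflict.
        successes = sum(1 for c in counts if c == 1)
        self.backlog += self.arrivals_per_slot - successes
        reward = successes / self.n_preambles
        return self._observe(), reward


def run_episode(env, policy, n_steps=10):
    """Let the policy choose a blocking factor at each step; return total reward."""
    obs = env.reset()
    total = 0.0
    for _ in range(n_steps):
        obs, reward = env.step(policy(obs))
        total += reward
    return total


# Placeholder policy: a random blocking factor, standing in for a trained agent.
placeholder_rng = random.Random(1)
total_reward = run_episode(ToyAccessEnv(), lambda obs: placeholder_rng.random())
```

In this toy setting, a blocking factor that is too high lets many terminals collide on the same preambles, while one that is too low leaves preambles unused, so the reward favors an intermediate value that depends on the backlog the agent cannot observe exactly.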
