LG AI NI May 3, 2021

RL-IoT: Reinforcement Learning to Interact with IoT Devices

Giulia Milan, Luca Vassio, Idilio Drago, Marco Mellia

arXiv:2105.00884v3 1.6 Has Code

Originality: Incremental advance

AI Analysis

The paper addresses interoperability and automatic verification for IoT devices that use closed protocols. It is an incremental advance in applying reinforcement learning (RL) to a domain-specific problem.

The paper tackles autonomous interaction with IoT devices that use poorly documented protocols. It proposes RL-IoT, a reinforcement learning system that recovers protocol semantics and controls a device to reach a given goal with minimal interactions. In a case study with a Yeelight smart bulb, the system completes non-trivial patterns with as few as 400 interactions.

Our lives are increasingly filled with Internet of Things (IoT) devices. These devices often rely on closed or poorly documented protocols, with unknown formats and semantics. Learning how to interact with such devices autonomously is key to interoperability and to automatic verification of their capabilities. In this paper, we propose RL-IoT, a system that explores how to automatically interact with possibly unknown IoT devices. We leverage reinforcement learning (RL) to recover the semantics of protocol messages and to take control of the device to reach a given goal, while minimizing the number of interactions. We assume that only a database of possible IoT protocol messages is known, while their semantics are unknown. RL-IoT exchanges messages with the target IoT device, learning which commands are useful to reach the given goal. Our results show that RL-IoT is able to solve both simple and complex tasks. With properly tuned parameters, RL-IoT learns how to perform actions with the target device, a Yeelight smart bulb in our case study, completing non-trivial patterns with as few as 400 interactions. RL-IoT paves the way for automatic interactions with poorly documented IoT protocols, thus enabling interoperable systems.
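The idea in the abstract, learning which messages from a database of unknown semantics lead to a goal state, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the RL algorithm, so tabular Q-learning is used as a stand-in; `SimulatedBulb`, the opaque message names (`msg_a` to `msg_d`), the two-bit state, and the reward values are hypothetical and do not correspond to the Yeelight protocol or to the RL-IoT code.

```python
import random

# Hypothetical message database: the agent sees only opaque identifiers.
MESSAGES = ["msg_a", "msg_b", "msg_c", "msg_d"]
GOAL = (True, True)  # (power_on, bright): assumed goal state


class SimulatedBulb:
    """Toy stand-in for a device; not a model of any real protocol."""

    def __init__(self):
        self.state = (False, False)

    def reset(self):
        self.state = (False, False)
        return self.state

    def send(self, message):
        on, bright = self.state
        if message == "msg_a":            # hidden semantics: toggle power
            on = not on
        elif message == "msg_b" and on:   # hidden semantics: brighten (only when on)
            bright = True
        elif message == "msg_c" and on:   # hidden semantics: dim (only when on)
            bright = False
        # msg_d (and any message ignored by the device) changes nothing
        self.state = (on, bright)
        return self.state


def train(episodes=200, max_steps=10, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; returns the Q-table and the number of messages sent."""
    rng = random.Random(seed)
    bulb = SimulatedBulb()
    q = {}
    interactions = 0
    for _ in range(episodes):
        state = bulb.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:    # explore: try a random message
                action = rng.choice(MESSAGES)
            else:                         # exploit: best known message
                action = max(MESSAGES, key=lambda m: q.get((state, m), 0.0))
            next_state = bulb.send(action)
            interactions += 1
            done = next_state == GOAL
            reward = 1.0 if done else -0.1  # small cost per message sent
            best_next = 0.0 if done else max(
                q.get((next_state, m), 0.0) for m in MESSAGES
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q, interactions


def greedy_plan(q, max_steps=10):
    """Follow the learned greedy policy from the reset state."""
    bulb = SimulatedBulb()
    state = bulb.reset()
    plan = []
    for _ in range(max_steps):
        action = max(MESSAGES, key=lambda m: q.get((state, m), 0.0))
        plan.append(action)
        state = bulb.send(action)
        if state == GOAL:
            break
    return plan


if __name__ == "__main__":
    q_table, sent = train()
    print("learned plan:", greedy_plan(q_table))
    print("messages sent during training:", sent)
```

In this toy setup the shortest route to the goal is two messages (power on, then brighten). The per-message penalty is one simple way to encode the abstract's aim of minimizing the number of interactions; a real device would also require observing its state over the network, which this sketch sidesteps.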
