Active Learning of Symbolic Automata Over Rational Numbers
This work removes the restriction of the L* algorithm to finite alphabets, which makes the algorithm applicable to new domains in software engineering and AI; the contribution is an incremental extension of an existing method.
The paper extends the L* algorithm to learn symbolic automata over rational numbers, i.e., over infinite and dense alphabets, enabling applications in settings such as (real) RGX and time series. The number of queries asked is at most linear in the number of transitions and in the representation size of the predicates, which the authors describe as optimal.
Automata learning has many applications in artificial intelligence and software engineering. Central to these applications is the $L^*$ algorithm, introduced by Angluin. The $L^*$ algorithm learns deterministic finite-state automata (DFAs) in polynomial time when provided with a minimally adequate teacher, i.e., a teacher that answers membership and equivalence queries. Unfortunately, the $L^*$ algorithm can only learn DFAs over finite alphabets, which limits its applicability. In this paper, we extend $L^*$ to learn symbolic automata whose transitions use predicates over rational numbers, i.e., over infinite and dense alphabets. Our result makes the $L^*$ algorithm applicable to new settings such as (real) RGX and time series. Furthermore, our proposed algorithm is optimal in the sense that the number of queries it asks the teacher is at most linear in the number of transitions and in the representation size of the predicates.
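
To make the object being learned concrete, the following is a minimal sketch of a symbolic automaton whose transitions are guarded by predicates over rational numbers. It is an illustration only, not the learning algorithm of the paper: the names `Interval` and `SymbolicAutomaton` are hypothetical, and restricting predicates to half-open intervals is an assumption made here for brevity. The `accepts` method plays the role of the answer to a membership query.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """Hypothetical predicate: the half-open interval [lo, hi) over the rationals.

    A bound of None means the interval is unbounded on that side.
    """
    lo: Optional[Fraction] = None
    hi: Optional[Fraction] = None

    def contains(self, x: Fraction) -> bool:
        above_lo = self.lo is None or self.lo <= x
        below_hi = self.hi is None or x < self.hi
        return above_lo and below_hi


class SymbolicAutomaton:
    """Deterministic automaton whose transitions carry predicates, not letters."""

    def __init__(self, initial, accepting, transitions):
        self.initial = initial
        self.accepting = set(accepting)
        # Maps each state to a list of (guard, target state) pairs.
        self.transitions = transitions

    def accepts(self, word) -> bool:
        """Run the automaton on a sequence of rationals (a membership query)."""
        state = self.initial
        for symbol in word:
            for guard, target in self.transitions.get(state, []):
                if guard.contains(symbol):
                    state = target
                    break
            else:
                # No guard matched this symbol, so the word is rejected.
                return False
        return state in self.accepting


# Example language (an assumption for illustration): words over the rationals
# that contain at least one symbol greater than or equal to 1/2.
half = Fraction(1, 2)
sfa = SymbolicAutomaton(
    initial=0,
    accepting={1},
    transitions={
        0: [(Interval(hi=half), 0), (Interval(lo=half), 1)],
        1: [(Interval(), 1)],
    },
)
```

Although the alphabet is infinite and dense, this automaton has only three transitions, each described by a predicate of small representation size. These are the two quantities in which the abstract measures the number of queries.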