Learning Regular Languages over Large Ordered Alphabets
This work addresses a limitation of automata learning in domains with large alphabets, such as numerical data. The contribution is incremental, however, because it builds on existing methods.
The authors tackle the problem of learning regular languages over large or infinite alphabets. They extend Angluin's L* algorithm to automata whose transitions are labeled by elements of a finite partition of the alphabet, and they implement the algorithm and demonstrate it on alphabets that are subsets of the natural or real numbers.
This work is concerned with regular languages defined over large alphabets, either infinite or just too large to be expressed enumeratively. We define a generic model where transitions are labeled by elements of a finite partition of the alphabet. We then extend Angluin's L* algorithm for learning regular languages from examples to such automata. We have implemented this algorithm and we demonstrate its behavior when the alphabet is a subset of the natural or real numbers. We sketch the extension of the algorithm to a class of languages over partially ordered alphabets.
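To make the model concrete, the following is a minimal sketch of an automaton whose transitions are labeled by elements of a finite partition of an ordered alphabet. It is an illustration only, not the authors' implementation: the class name `SymbolicDFA`, its fields, and the choice to represent each partition as half-open intervals given by sorted cut points are all assumptions made for this example. It shows only how such an automaton reads words, not the learning algorithm.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class SymbolicDFA:
    """Hypothetical DFA over the ordered alphabet [lo, hi).

    For each state q, cuts[q] is a sorted list of cut points c_1 < ... < c_k
    that splits [lo, hi) into k + 1 half-open intervals:
        [lo, c_1), [c_1, c_2), ..., [c_k, hi).
    targets[q][i] is the successor state for any letter in the i-th interval.
    """
    lo: float
    hi: float
    initial: int
    accepting: frozenset
    cuts: dict
    targets: dict

    def step(self, state, letter):
        """Follow the single transition whose interval contains `letter`."""
        if not (self.lo <= letter < self.hi):
            raise ValueError(f"letter {letter!r} is outside the alphabet")
        # bisect_right counts the cut points <= letter, which is exactly
        # the index of the half-open interval containing the letter.
        index = bisect_right(self.cuts[state], letter)
        return self.targets[state][index]

    def accepts(self, word):
        """Run the automaton on a sequence of letters."""
        state = self.initial
        for letter in word:
            state = self.step(state, letter)
        return state in self.accepting


# Assumed example over the alphabet [0, 100) with two states.
# State 0: letters in [0, 50) stay in 0, letters in [50, 100) move to 1.
# State 1: letters in [0, 20) move to 0, letters in [20, 100) stay in 1.
dfa = SymbolicDFA(
    lo=0,
    hi=100,
    initial=0,
    accepting=frozenset({1}),
    cuts={0: [50], 1: [20]},
    targets={0: [0, 1], 1: [0, 1]},
)

print(dfa.accepts([10, 60]))    # word ends in state 1
print(dfa.accepts([60, 10.5]))  # word ends in state 0
```

The automaton stores two cut points in total although the alphabet contains infinitely many real numbers, which is why this kind of representation stays finite where an explicit transition table would not. A learner in this setting has to discover both the states and the cut points.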