AI · Apr 20, 2018

Understanding AI Data Repositories with Automatic Query Generation

arXiv:1804.07819v1 · 1.71 citations

Originality Synthesis-oriented

AI Analysis

The paper addresses the challenge of scaling AI systems and broadening knowledge coverage in domains such as health or geology, but the approach is early-stage and incremental.

The paper tackles the problem of understanding AI data repositories by proposing automatic query generation techniques that require no human domain expertise. The generated queries are meant to identify incomplete or erroneous knowledge and to test AI capabilities. Assessment of their efficacy is left for future work.

We describe a set of techniques to generate queries automatically based on one or more ingested input corpora. These queries require no a priori domain knowledge, and hence no human domain experts. Thus, these auto-generated queries help address the epistemological question of how we know what we know, or more precisely in this case, how an AI system with ingested data knows what it knows. These auto-generated queries can also be used to identify and remedy problem areas in ingested material, that is, areas for which the knowledge of the AI system is incomplete or even erroneous. Similarly, the proposed techniques facilitate tests of AI capability, in terms of both coverage and accuracy. By removing humans from the main learning loop, our approach also allows more effective scaling of AI and cognitive capabilities to provide (1) broader coverage in a single domain such as health or geology; and (2) more rapid deployment to new domains. The proposed techniques also allow ingested knowledge to be extended naturally. Our investigations are early, and this paper provides a description of the techniques. Assessing their efficacy is our next step.
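The abstract does not spell out the techniques themselves, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a simple cloze heuristic: blank out the rarest word of each corpus sentence to form a query, then score a system on coverage (fraction of queries answered) and accuracy (fraction of answered queries that are correct). The names `Query`, `generate_cloze_queries`, and `evaluate`, and the `system` callable, are all hypothetical.

```python
import re
from collections import Counter
from typing import Callable, List, NamedTuple, Optional, Tuple


class Query(NamedTuple):
    text: str    # sentence with one term blanked out
    answer: str  # the term that was removed (lowercased)


def _tokenize(sentence: str) -> List[str]:
    """Lowercased alphabetic tokens; a deliberately crude tokenizer."""
    return re.findall(r"[A-Za-z]+", sentence.lower())


def generate_cloze_queries(corpus: List[str]) -> List[Query]:
    """Build one fill-in-the-blank query per sentence (assumed heuristic).

    Corpus-wide word counts stand in for domain knowledge: the rarest
    word in a sentence is treated as the most informative one to ask about.
    """
    counts = Counter(w for sentence in corpus for w in _tokenize(sentence))
    queries = []
    for sentence in corpus:
        words = _tokenize(sentence)
        if len(words) < 3:
            continue  # too short to form a meaningful query
        # Rarest word; ties broken by longer length, then alphabetically.
        target = min(words, key=lambda w: (counts[w], -len(w), w))
        blanked = re.sub(
            rf"\b{re.escape(target)}\b", "____", sentence,
            count=1, flags=re.IGNORECASE,
        )
        queries.append(Query(blanked, target))
    return queries


def evaluate(
    system: Callable[[str], Optional[str]], queries: List[Query]
) -> Tuple[float, float]:
    """Return (coverage, accuracy) for a system that may decline to answer."""
    answered = correct = 0
    for q in queries:
        reply = system(q.text)
        if reply is None:
            continue  # unanswered queries count against coverage only
        answered += 1
        correct += reply.strip().lower() == q.answer
    coverage = answered / len(queries) if queries else 0.0
    accuracy = correct / answered if answered else 0.0
    return coverage, accuracy
```

Under these assumptions, queries the system declines to answer point to gaps in coverage, while wrong answers point to ingested material that may be erroneous. No human expert is needed to write the queries, because the answers come from the corpus itself.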
