J. D. Knowles

LG
3papers
81citations
Novelty45%
AI Score22

3 Papers

LGNov 2, 2018
Deep Optimisation: Solving Combinatorial Optimisation Problems using Deep Neural Networks

J. R. Caldwell, R. A. Watson, C. Thies et al.

Deep Optimisation (DO) combines evolutionary search with Deep Neural Networks (DNNs) in a novel way - not for optimising a learning algorithm, but for finding a solution to an optimisation problem. Deep learning has been successfully applied to classification, regression, decision and generative tasks and in this paper we extend its application to solving optimisation problems. Model Building Optimisation Algorithms (MBOAs), a branch of evolutionary algorithms, have been successful in combining machine learning methods and evolutionary search but, until now, they have not utilised DNNs. DO is the first algorithm to use a DNN to learn and exploit the problem structure to adapt the variation operator (changing the neighbourhood structure of the search process). We demonstrate the performance of DO using two theoretical optimisation problems within the MAXSAT class. The Hierarchical Transformation Optimisation Problem (HTOP) has controllable deep structure that provides a clear evaluation of how DO works and why using a layerwise technique is essential for learning and exploiting problem structure. The Parity Modular Constraint Problem (MCparity) is a simplistic example of a problem containing higher-order dependencies (greater than pairwise) which DO can solve and state of the art MBOAs cannot. Further, we show that DO can exploit deep structure in TSP instances. Together these results show that there exists problems that DO can find and exploit deep problem structure that other algorithms cannot. Making this connection between DNNs and optimisation allows for the utilisation of advanced tools applicable to DNNs that current MBOAs are unable to use.

LGMay 9, 2014
Hellinger Distance Trees for Imbalanced Streams

R. J. Lyon, J. M. Brooke, J. D. Knowles et al.

Classifiers trained on data sets possessing an imbalanced class distribution are known to exhibit poor generalisation performance. This is known as the imbalanced learning problem. The problem becomes particularly acute when we consider incremental classifiers operating on imbalanced data streams, especially when the learning objective is rare class identification. As accuracy may provide a misleading impression of performance on imbalanced data, existing stream classifiers based on accuracy can suffer poor minority class performance on imbalanced streams, with the result being low minority class recall rates. In this paper we address this deficiency by proposing the use of the Hellinger distance measure, as a very fast decision tree split criterion. We demonstrate that by using Hellinger a statistically significant improvement in recall rates on imbalanced data streams can be achieved, with an acceptable increase in the false positive rate.

IMJul 30, 2013
A Study on Classification in Imbalanced and Partially-Labelled Data Streams

R. J. Lyon, J. M. Brooke, J. D. Knowles et al.

The domain of radio astronomy is currently facing significant computational challenges, foremost amongst which are those posed by the development of the world's largest radio telescope, the Square Kilometre Array (SKA). Preliminary specifications for this instrument suggest that the final design will incorporate between 2000 and 3000 individual 15 metre receiving dishes, which together can be expected to produce a data rate of many TB/s. Given such a high data rate, it becomes crucial to consider how this information will be processed and stored to maximise its scientific utility. In this paper, we consider one possible data processing scenario for the SKA, for the purposes of an all-sky pulsar survey. In particular we treat the selection of promising signals from the SKA processing pipeline as a data stream classification problem. We consider the feasibility of classifying signals that arrive via an unlabelled and heavily class imbalanced data stream, using currently available algorithms and frameworks. Our results indicate that existing stream learners exhibit unacceptably low recall on real astronomical data when used in standard configuration; however, good false positive performance and comparable accuracy to static learners, suggests they have definite potential as an on-line solution to this particular big data challenge.