CL · Jul 17, 2023
On the application of Large Language Models for language teaching and assessment technology
Andrew Caines, Luca Benedetto, Shiva Taslimipoor et al. · cambridge
The recent release of very large language models such as PaLM and GPT-4 has made an unprecedented impact in the popular media and public consciousness, giving rise to a mixture of excitement and fear as to their capabilities and potential uses, and shining a light on natural language processing research which had not previously received so much attention. The developments offer great promise for education technology, and in this paper we look specifically at the potential for incorporating large language models in AI-driven language teaching and assessment systems. We consider several research areas and also discuss the risks and ethical considerations surrounding generative AI in education technology for language learners. Overall we find that larger language models offer improvements over previous models in text generation, opening up routes toward content generation which had not previously been plausible. For text generation they must be prompted carefully and their outputs may need to be reshaped before they are ready for use. For automated grading and grammatical error correction, tasks whose progress is checked on well-known benchmarks, early investigations indicate that large language models on their own do not improve on state-of-the-art results according to standard evaluation metrics. For grading it appears that linguistic features established in the literature should still be used for best performance, and for error correction it may be that the models can offer alternative feedback styles which are not measured sensitively with existing methods. In all cases, there is work to be done to experiment with the inclusion of large language models in education technology for language learners, in order to properly understand and report on their capacities and limitations, and to ensure that foreseeable risks such as misinformation and harmful bias are mitigated.
IV · Sep 22, 2021
Uncertainty-Aware Training for Cardiac Resynchronisation Therapy Response Prediction
Tareen Dawood, Chen Chen, Robin Andlauer et al.
Evaluation of predictive deep learning (DL) models beyond conventional performance metrics has become increasingly important for applications in sensitive environments like healthcare. Such models might have the capability to encode and analyse large sets of data but they often lack comprehensive interpretability methods, preventing clinical trust in predictive outcomes. Quantifying the uncertainty of a prediction is one way to provide such interpretability and promote trust. However, relatively little attention has been paid to how to incorporate such requirements into the training of the model. In this paper we: (i) quantify the data (aleatoric) and model (epistemic) uncertainty of a DL model for Cardiac Resynchronisation Therapy response prediction from cardiac magnetic resonance images, and (ii) propose and perform a preliminary investigation of an uncertainty-aware loss function that can be used to retrain an existing DL image-based classification model to encourage confidence in correct predictions and reduce confidence in incorrect predictions. Our initial results are promising, showing a significant increase in the (epistemic) confidence of true positive predictions, with some evidence of a reduction in false negative confidence.
IV · Jun 24, 2020
Interpretable Deep Models for Cardiac Resynchronisation Therapy Response Prediction
Esther Puyol-Antón, Chen Chen, James R. Clough et al.
Advances in deep learning (DL) have resulted in impressive accuracy in some medical image classification tasks, but often deep models lack interpretability. The ability of these models to explain their decisions is important for fostering clinical trust and facilitating clinical translation. Furthermore, for many problems in medicine there is a wealth of existing clinical knowledge to draw upon, which may be useful in generating explanations, but it is not obvious how this knowledge can be encoded into DL models; most models are learnt either from scratch or using transfer learning from a different domain. In this paper we address both of these issues. We propose a novel DL framework for image-based classification based on a variational autoencoder (VAE). The framework allows prediction of the output of interest from the latent space of the autoencoder, as well as visualisation (in the image domain) of the effects of crossing the decision boundary, thus enhancing the interpretability of the classifier. Our key contribution is that the VAE disentangles the latent space based on 'explanations' drawn from existing clinical knowledge. The framework can predict outputs as well as explanations for these outputs, and also raises the possibility of discovering new biomarkers that are separate (or disentangled) from the existing knowledge. We demonstrate our framework on the problem of predicting response of patients with cardiomyopathy to cardiac resynchronization therapy (CRT) from cine cardiac magnetic resonance images. The sensitivity and specificity of the proposed model on the task of CRT response prediction are 88.43% and 84.39% respectively, and we showcase the potential of our model in enhancing understanding of the factors contributing to CRT response.
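For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from binary labels. The function name and the toy labels are hypothetical and are not taken from the paper; the labels encode responder as 1 and non-responder as 0 purely as an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)  # fraction of actual positives detected
    specificity = tn / (tn + fp)  # fraction of actual negatives detected
    return sensitivity, specificity


# Toy example (illustrative labels only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```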
HC · Jun 24, 2019
Multisensory cues facilitate coordination of stepping movements with a virtual reality avatar
Omar Khan, Imran Ahmed, Joshua Cottingham et al.
The effectiveness of simple sensory cues for retraining gait has been demonstrated, yet the feasibility of humanoid avatars for entrainment has yet to be investigated. Here, we describe the development of a novel method of visually cued training, in the form of a virtual partner, and investigate its ability to provide movement guidance in the form of stepping. Real stepping movements were mapped onto an avatar using motion capture data. The trajectory of one of the avatar step cycles was then accelerated or decelerated by 15% to create a perturbation. Healthy participants were motion captured while instructed to step in time to the avatar's movements, as viewed through a virtual reality headset. Step onset times were used to measure the timing errors (asynchronies) between them. Participants completed either a visual-only condition, or auditory-visual with footstep sounds included. Participants' asynchronies exhibited slow drift in the Visual-Only condition, but became stable in the Auditory-Visual condition. Moreover, we observed a clear corrective response to the phase perturbation in both auditory-visual conditions. We conclude that an avatar's movements can be used to influence a person's own gait, but should include relevant auditory cues congruent with the movement to ensure a suitable accuracy is achieved.
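As a rough illustration of the asynchrony measure described above, the sketch below pairs participant and avatar step onsets by index and estimates drift as a least-squares slope across steps. The function names, the sign convention (positive means the participant steps late), the index-based pairing, and the onset values are all assumptions made for illustration; the abstract does not specify the authors' analysis pipeline.

```python
def asynchronies(participant_onsets, avatar_onsets):
    """Timing error per step: participant onset minus avatar onset (seconds).

    Assumes the two lists are already matched step for step.
    """
    if len(participant_onsets) != len(avatar_onsets):
        raise ValueError("onset lists must have the same length")
    return [p - a for p, a in zip(participant_onsets, avatar_onsets)]


def drift_per_step(asyncs):
    """Least-squares slope of asynchrony against step index."""
    n = len(asyncs)
    if n < 2:
        raise ValueError("need at least two steps to estimate drift")
    x_mean = (n - 1) / 2
    y_mean = sum(asyncs) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(asyncs))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Illustrative onsets in seconds (made-up values, not study data).
avatar = [0.0, 1.0, 2.0, 3.0]
participant = [0.05, 1.07, 2.09, 3.11]
errors = asynchronies(participant, avatar)
slope = drift_per_step(errors)
```

A slope near zero would correspond to stable asynchronies, while a consistently non-zero slope would correspond to the kind of slow drift the abstract reports for the visual-only condition.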