Graham Taylor

LG
h-index28
15papers
895citations
Novelty40%
AI Score33

15 Papers

LGDec 7, 2014Code
Theano-based Large-Scale Visual Recognition with Multiple GPUs

Weiguang Ding, Ruoyan Wang, Fei Mao et al.

In this report, we describe a Theano-based AlexNet (Krizhevsky et al., 2012) implementation and its naive data parallelism on multiple GPUs. Our performance on 2 GPUs is comparable with the state-of-art Caffe library (Jia et al., 2014) run on 1 GPU. To the best of our knowledge, this is the first open-source Python-based AlexNet implementation to-date.

CVMay 22, 2025
Optimizing Image Capture for Computer Vision-Powered Taxonomic Identification and Trait Recognition of Biodiversity Specimens

Alyson East, Elizabeth G. Campolongo, Luke Meyers et al.

1) Biological collections house millions of specimens with digital images increasingly available through open-access platforms. However, most imaging protocols were developed for human interpretation without considering automated analysis requirements. As computer vision applications revolutionize taxonomic identification and trait extraction, a critical gap exists between current digitization practices and computational analysis needs. This review provides the first comprehensive practical framework for optimizing biological specimen imaging for computer vision applications. 2) Through interdisciplinary collaboration between taxonomists, collection managers, ecologists, and computer scientists, we synthesized evidence-based recommendations addressing fundamental computer vision concepts and practical imaging considerations. We provide immediately actionable implementation guidance while identifying critical areas requiring community standards development. 3) Our framework encompasses ten interconnected considerations for optimizing image capture for computer vision-powered taxonomic identification and trait extraction. We translate these into practical implementation checklists, equipment selection guidelines, and a roadmap for community standards development including filename conventions, pixel density requirements, and cross-institutional protocols. 4)By bridging biological and computational disciplines, this approach unlocks automated analysis potential for millions of existing specimens and guides future digitization efforts toward unprecedented analytical capabilities.

LGFeb 4, 2025
LAST SToP For Modeling Asynchronous Time Series

Shubham Gupta, Thibaut Durand, Graham Taylor et al.

We present a novel prompt design for Large Language Models (LLMs) tailored to Asynchronous Time Series. Unlike regular time series, which assume values at evenly spaced time points, asynchronous time series consist of timestamped events occurring at irregular intervals, each described in natural language. Our approach effectively utilizes the rich natural language of event descriptions, allowing LLMs to benefit from their broad world knowledge for reasoning across different domains and tasks. This allows us to extend the scope of asynchronous time series analysis beyond forecasting to include tasks like anomaly detection and data imputation. We further introduce Stochastic Soft Prompting, a novel prompt-tuning mechanism that significantly improves model performance, outperforming existing fine-tuning methods such as QLoRA. Through extensive experiments on real world datasets, we demonstrate that our approach achieves state-of-the-art performance across different tasks and datasets.

LGJan 27, 2024
Towards Stable Preferences for Stakeholder-aligned Machine Learning

Haleema Sheraz, Stefan C. Kremer, Joshua August Skorburg et al.

In response to the pressing challenge of kidney allocation, characterized by growing demands for organs, this research sets out to develop a data-driven solution to this problem, which also incorporates stakeholder values. The primary objective of this study is to create a method for learning both individual and group-level preferences pertaining to kidney allocations. Drawing upon data from the 'Pairwise Kidney Patient Online Survey.' Leveraging two distinct datasets and evaluating across three levels - Individual, Group and Stability - we employ machine learning classifiers assessed through several metrics. The Individual level model predicts individual participant preferences, the Group level model aggregates preferences across participants, and the Stability level model, an extension of the Group level, evaluates the stability of these preferences over time. By incorporating stakeholder preferences into the kidney allocation process, we aspire to advance the ethical dimensions of organ transplantation, contributing to more transparent and equitable practices while promoting the integration of moral values into algorithmic decision-making.

LGNov 18, 2019
Learning Permutation Invariant Representations using Memory Networks

Shivam Kalra, Mohammed Adnan, Graham Taylor et al.

Many real-world tasks such as classification of digital histopathology images and 3D object detection involve learning from a set of instances. In these cases, only a group of instances or a set, collectively, contains meaningful information and therefore only the sets have labels, and not individual data instances. In this work, we present a permutation invariant neural network called Memory-based Exchangeable Model (MEM) for learning set functions. The MEM model consists of memory units that embed an input sequence to high-level features enabling the model to learn inter-dependencies among instances through a self-attention mechanism. We evaluated the learning ability of MEM on various toy datasets, point cloud classification, and classification of lung whole slide images (WSIs) into two subtypes of lung cancer---Lung Adenocarcinoma, and Lung Squamous Cell Carcinoma. We systematically extracted patches from lung WSIs downloaded from The Cancer Genome Atlas~(TCGA) dataset, the largest public repository of WSIs, achieving a competitive accuracy of 84.84\% for classification of two sub-types of lung cancer. The results on other datasets are promising as well, and demonstrate the efficacy of our model.

LGApr 3, 2018
Convolutional Neural Networks Regularized by Correlated Noise

Shamak Dutta, Bryan Tripp, Graham Taylor

Neurons in the visual cortex are correlated in their variability. The presence of correlation impacts cortical processing because noise cannot be averaged out over many neurons. In an effort to understand the functional purpose of correlated variability, we implement and evaluate correlated noise models in deep convolutional neural networks. Inspired by the cortex, correlation is defined as a function of the distance between neurons and their selectivity. We show how to sample from high-dimensional correlated distributions while keeping the procedure differentiable, so that back-propagation can proceed as usual. The impact of correlated variability is evaluated on the classification of occluded and non-occluded images with and without the presence of other regularization techniques, such as dropout. More work is needed to understand the effects of correlations in various conditions, however in 10/12 of the cases we studied, the best performance on occluded images was obtained from a model with correlated noise.

LGMar 2, 2018
Quantitatively Evaluating GANs With Divergences Proposed for Training

Daniel Jiwoong Im, He Ma, Graham Taylor et al.

Generative adversarial networks (GANs) have been extremely effective in approximating complex distributions of high-dimensional, input data samples, and substantial progress has been made in understanding and improving GAN performance in terms of both theory and application. However, we currently lack quantitative methods for model assessment. Because of this, while many GAN variants are being proposed, we have relatively little understanding of their relative abilities. In this paper, we evaluate the performance of various types of GANs using divergence and distance functions typically used only for training. We observe consistency across the various proposed metrics and, interestingly, the test-time metrics do not favour networks that use the same training-time criterion. We also compare the proposed metrics to human perceptual scores.

LGAug 16, 2017
BitNet: Bit-Regularized Deep Neural Networks

Aswin Raghavan, Mohamed Amer, Sek Chai et al.

We present a novel optimization strategy for training neural networks which we call "BitNet". The parameters of neural networks are usually unconstrained and have a dynamic range dispersed over all real values. Our key idea is to limit the expressive power of the network by dynamically controlling the range and set of values that the parameters can take. We formulate this idea using a novel end-to-end approach that circumvents the discrete parameter space by optimizing a relaxed continuous and differentiable upper bound of the typical classification loss function. The approach can be interpreted as a regularization inspired by the Minimum Description Length (MDL) principle. For each layer of the network, our approach optimizes real-valued translation and scaling factors and arbitrary precision integer-valued parameters (weights). We empirically compare BitNet to an equivalent unregularized model on the MNIST and CIFAR-10 datasets. We show that BitNet converges faster to a superior quality solution. Additionally, the resulting model has significant savings in memory due to the use of integer-valued parameters.

LGDec 13, 2016
Generative Adversarial Parallelization

Daniel Jiwoong Im, He Ma, Chris Dongjoo Kim et al.

Generative Adversarial Networks have become one of the most studied frameworks for unsupervised learning due to their intuitive formulation. They have also been shown to be capable of generating convincing examples in limited domains, such as low-resolution images. However, they still prove difficult to train in practice and tend to ignore modes of the data generating distribution. Quantitatively capturing effects such as mode coverage and more generally the quality of the generative model still remain elusive. We propose Generative Adversarial Parallelization, a framework in which many GANs or their variants are trained simultaneously, exchanging their discriminators. This eliminates the tight coupling between a generator and discriminator, leading to improved convergence and improved coverage of modes. We also propose an improved variant of the recently proposed Generative Adversarial Metric and show how it can score individual GANs or their collections under the GAP model.

CVSep 30, 2016
Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Roberto DiCecco, Griffin Lacey, Jasmina Vasiljevic et al.

Convolutional Neural Networks (CNNs) have gained significant traction in the field of machine learning, particularly due to their high accuracy in visual recognition. Recent works have pushed the performance of GPU implementations of CNNs to significantly improve their classification and training times. With these improvements, many frameworks have become available for implementing CNNs on both CPUs and GPUs, with no support for FPGA implementations. In this work we present a modified version of the popular CNN framework Caffe, with FPGA support. This allows for classification using CNN models and specialized FPGA implementations with the flexibility of reprogramming the device when necessary, seamless memory transactions between host and device, simple-to-use test benches, and the ability to create pipelined layer implementations. To validate the framework, we use the Xilinx SDAccel environment to implement an FPGA-based Winograd convolution engine and show that the FPGA layer can be used alongside other layers running on a host processor to run several popular CNNs (AlexNet, GoogleNet, VGG A, Overfeat). The results show that our framework achieves 50 GFLOPS across 3x3 convolutions in the benchmarks. This is achieved within a practical framework, which will aid in future development of FPGA-based CNNs.

CVFeb 24, 2016
Automatic Moth Detection from Trap Images for Pest Management

Weiguang Ding, Graham Taylor

Monitoring the number of insect pests is a crucial component in pheromone-based pest management systems. In this paper, we propose an automatic detection pipeline based on deep learning for identifying and counting pests in images taken inside field traps. Applied to a commercial codling moth dataset, our method shows promising performance both qualitatively and quantitatively. Compared to previous attempts at pest detection, our approach uses no pest-specific engineering which enables it to adapt to other species and environments with minimal human effort. It is amenable to implementation on parallel hardware and therefore capable of deployment in settings where real-time performance is required.

CVNov 20, 2015
Hand Pose Estimation through Semi-Supervised and Weakly-Supervised Learning

Natalia Neverova, Christian Wolf, Florian Nebout et al.

We propose a method for hand pose estimation based on a deep regressor trained on two different kinds of input. Raw depth data is fused with an intermediate representation in the form of a segmentation of the hand into parts. This intermediate representation contains important topological information and provides useful cues for reasoning about joint locations. The mapping from raw depth to segmentation maps is learned in a semi/weakly-supervised way from two different datasets: (i) a synthetic dataset created through a rendering pipeline including densely labeled ground truth (pixelwise segmentations); and (ii) a dataset with real images for which ground truth joint positions are available, but not dense segmentations. Loss for training on real images is generated from a patch-wise restoration process, which aligns tentative segmentation maps with a large dictionary of synthetic poses. The underlying premise is that the domain shift between synthetic and real data is smaller in the intermediate representation, where labels carry geometric and topological meaning, than in the raw input domain. Experiments on the NYU dataset show that the proposed training method decreases error on joints over direct regression of joints from depth data by 15.7%.

LGNov 12, 2015
Learning Human Identity from Motion Patterns

Natalia Neverova, Christian Wolf, Griffin Lacey et al.

We present a large-scale study exploring the capability of temporal deep neural networks to interpret natural human kinematics and introduce the first method for active biometric authentication with mobile inertial sensors. At Google, we have created a first-of-its-kind dataset of human movements, passively collected by 1500 volunteers using their smartphones daily over several months. We (1) compare several neural architectures for efficient learning of temporal multi-modal data representations, (2) propose an optimized shift-invariant dense convolutional mechanism (DCWRNN), and (3) incorporate the discriminatively-trained dynamic features in a probabilistic generative framework taking into account temporal characteristics. Our results demonstrate that human kinematics convey important information about user identity and can serve as a valuable component of multi-modal authentication systems.

NCJun 1, 2015
Learning with hidden variables

Yasser Roudi, Graham Taylor

Learning and inferring features that generate sensory input is a task continuously performed by cortex. In recent years, novel algorithms and learning rules have been proposed that allow neural network models to learn such features from natural images, written text, audio signals, etc. These networks usually involve deep architectures with many layers of hidden neurons. Here we review recent advancements in this area emphasizing, amongst other things, the processing of dynamical inputs by networks with hidden nodes and the role of single neuron models. These points and the questions they arise can provide conceptual advancements in understanding of learning in the cortex and the relationship between machine learning approaches to learning with hidden nodes and those in cortical circuits.

NEDec 22, 2014
Generative Class-conditional Autoencoders

Jan Rudy, Graham Taylor

Recent work by Bengio et al. (2013) proposes a sampling procedure for denoising autoencoders which involves learning the transition operator of a Markov chain. The transition operator is typically unimodal, which limits its capacity to model complex data. In order to perform efficient sampling from conditional distributions, we extend this work, both theoretically and algorithmically, to gated autoencoders (Memisevic, 2013), The proposed model is able to generate convincing class-conditional samples when trained on both the MNIST and TFD datasets.