BaTFLED: Bayesian Tensor Factorization Linked to External Data
This addresses the need for machine learning methods that handle tensor-structured responses, which is incremental as it builds on existing tensor factorization techniques.
The authors tackled the problem of predicting values in three-dimensional response tensors using input features for each dimension, and their BaTFLED algorithm far outperformed elastic net and neural net models on 'cold start' tasks from simulated data.
The vast majority of current machine learning algorithms are designed to predict single responses or a vector of responses, yet many types of response are more naturally organized as matrices or higher-order tensor objects where characteristics are shared across modes. We present a new machine learning algorithm BaTFLED (Bayesian Tensor Factorization Linked to External Data) that predicts values in a three-dimensional response tensor using input features for each of the dimensions. BaTFLED uses a probabilistic Bayesian framework to learn projection matrices mapping input features for each mode into latent representations that multiply to form the response tensor. By utilizing a Tucker decomposition, the model can capture weights for interactions between latent factors for each mode in a small core tensor. Priors that encourage sparsity in the projection matrices and core tensor allow for feature selection and model regularization. This method is shown to far outperform elastic net and neural net models on 'cold start' tasks from data simulated in a three-mode structure. Additionally, we apply the model to predict dose-response curves in a panel of breast cancer cell lines treated with drug compounds that was used as a Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge.