Completion of the DrugMatrix Toxicogenomics Database using 3-Dimensional Tensors
This work incrementally improves the accuracy of the world's largest in-vivo toxicogenomics database, aiding studies of drug effects that may carry over between species, for example from rats to humans.
The researchers completed the DrugMatrix toxicogenomics dataset with a 3-dimensional tensor completion method, which achieved lower mean squared error and mean absolute error than conventional Canonical Polyadic decomposition and 2-dimensional matrix factorization.
We apply tensor completion to fill in the missing entries of the DrugMatrix toxicogenomics dataset. Our hypothesis is that by preserving the 3-dimensional structure of the data, which comprises tissue, treatment, and transcriptomic measurements, and by leveraging a machine learning formulation, our approach will improve upon prior state-of-the-art results. Our results demonstrate that the new tensor-based method more accurately reflects the original data distribution and effectively captures organ-specific variability. The proposed tensor-based methodology achieved lower mean squared error and mean absolute error than both conventional Canonical Polyadic decomposition and 2-dimensional matrix factorization methods. In addition, our non-negative tensor completion implementation reveals relationships among tissues. Our findings not only complete the world's largest in-vivo toxicogenomics database with improved accuracy but also offer a promising methodology for future studies of drug effects that may carry over between species, for example from rats to humans.
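To make the idea of non-negative completion of a tissue × treatment × transcript tensor concrete, the following is a minimal sketch, not the authors' implementation. It assumes a rank-R non-negative Canonical Polyadic model fitted to the observed entries only, using masked multiplicative updates, a standard technique chosen here for brevity. The tensor shape, rank, iteration count, observed fraction, and all function names (`cp_reconstruct`, `masked_loss`, `nonneg_cp_complete`) are hypothetical, and the data are synthetic rather than DrugMatrix measurements.

```python
import numpy as np


def cp_reconstruct(A, B, C):
    # X_hat[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def masked_loss(X, mask, A, B, C):
    # Mean squared error over the observed entries only.
    diff = (X - cp_reconstruct(A, B, C))[mask]
    return float(np.mean(diff ** 2))


def nonneg_cp_complete(X, mask, rank, n_iter, seed=0, eps=1e-9):
    # Masked multiplicative updates (an assumed, generic scheme):
    # each factor is scaled by the ratio of the negative to the positive
    # part of the gradient of the masked squared error, so factors that
    # start positive stay non-negative.
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, rank)) + 0.1 for n in X.shape)
    W = mask.astype(float)
    Xw = np.where(mask, X, 0.0)  # unobserved entries contribute nothing
    for _ in range(n_iter):
        A *= np.einsum("ijk,jr,kr->ir", Xw, B, C) / (
            np.einsum("ijk,jr,kr->ir", W * cp_reconstruct(A, B, C), B, C) + eps)
        B *= np.einsum("ijk,ir,kr->jr", Xw, A, C) / (
            np.einsum("ijk,ir,kr->jr", W * cp_reconstruct(A, B, C), A, C) + eps)
        C *= np.einsum("ijk,ir,jr->kr", Xw, A, B) / (
            np.einsum("ijk,ir,jr->kr", W * cp_reconstruct(A, B, C), A, B) + eps)
    return A, B, C


# Synthetic stand-in for a (tissue, treatment, transcript) tensor.
rng = np.random.default_rng(42)
n_tissue, n_treat, n_gene, true_rank = 4, 30, 20, 2
X_true = cp_reconstruct(rng.random((n_tissue, true_rank)),
                        rng.random((n_treat, true_rank)),
                        rng.random((n_gene, true_rank)))
mask = rng.random(X_true.shape) < 0.7  # True marks an observed entry

# Same seed, so the zero-iteration call returns the starting point of the fit.
A0, B0, C0 = nonneg_cp_complete(X_true, mask, rank=2, n_iter=0)
A, B, C = nonneg_cp_complete(X_true, mask, rank=2, n_iter=300)
train_loss_init = masked_loss(X_true, mask, A0, B0, C0)
train_loss_fit = masked_loss(X_true, mask, A, B, C)

# Score the completion on the entries that were hidden during fitting.
X_hat = cp_reconstruct(A, B, C)
held_out = ~mask
mse = float(np.mean((X_true - X_hat)[held_out] ** 2))
mae = float(np.mean(np.abs(X_true - X_hat)[held_out]))
print(f"train MSE: {train_loss_init:.4f} -> {train_loss_fit:.4f}")
print(f"held-out MSE: {mse:.4f}, MAE: {mae:.4f}")
```

In this sketch the rows of `A` play the role of tissue factors, so comparing them (for instance by cosine similarity) is one plausible way to look for relationships among tissues. A 2-dimensional baseline would instead flatten two of the modes into one before factorizing, which discards the per-tissue structure that the tensor model keeps.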