LG AI CV ML Aug 13, 2018

Multimodal Deep Neural Networks using Both Engineered and Learned Representations for Biodegradability Prediction

Garrett B. Goh, Khushmeen Sakloth, Charles Siegel, Abhinav Vishnu, Jim Pfaendtner

arXiv:1808.04456v22.211 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of limited labeled data in domains like chemical sciences by improving biodegradability prediction, though it is incremental as it builds on existing methods.

The paper tackles biodegradability prediction in the chemical sciences with a multimodal CNN-MLP neural network that combines domain-specific feature engineering with learned representations. The network achieves a classification error rate of 0.125, which is 27% lower than the state of the art.
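As a quick check on what the 27% figure implies, the sketch below back-computes the state-of-the-art error rate. It assumes "27% lower" means a relative reduction (an assumption; the summary does not state the baseline value), and the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(error_rate, relative_reduction):
    """Return the baseline error e0 such that error_rate = e0 * (1 - relative_reduction).

    Assumption: the reported reduction is relative, not absolute.
    """
    return error_rate / (1.0 - relative_reduction)


# Values taken from the summary: error rate 0.125, 27% lower than prior work.
baseline = implied_baseline(0.125, 0.27)
print(round(baseline, 3))  # 0.171
```

Under that reading, the prior state of the art would sit at an error rate of roughly 0.17.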

Deep learning algorithms excel at extracting patterns from raw data, and with large datasets, they have been very successful in computer vision and natural language applications. However, in other domains, large datasets from which to learn representations may not exist. In this work, we develop a novel multimodal CNN-MLP neural network architecture that utilizes both domain-specific feature engineering and learned representations from raw data. We illustrate the effectiveness of such network designs in the chemical sciences, for predicting biodegradability. DeepBioD, a multimodal CNN-MLP network, is more accurate than either standalone network design and achieves a classification error rate of 0.125, which is 27% lower than the current state of the art. Thus, our work indicates that combining traditional feature engineering with representation learning can be effective, particularly in situations where labeled data is limited.
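The general idea of a multimodal CNN-MLP network can be illustrated with a minimal forward pass: a convolutional branch learns features from raw input, an MLP branch processes engineered descriptors, and the two feature vectors are concatenated before a final classifier. This is only a sketch under stated assumptions, not DeepBioD itself: the 1D convolution, layer sizes, random untrained weights, toy inputs, and all function names are hypothetical choices made for brevity, since the abstract does not specify the architecture's details.

```python
import math
import random


def relu(x):
    return x if x > 0.0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """Fully connected layer with ReLU; one weight row per output unit."""
    return [relu(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def conv1d_maxpool(signal, kernels):
    """Valid 1D convolution with ReLU, then global max pooling per kernel."""
    features = []
    for kernel in kernels:
        k = len(kernel)
        responses = [relu(sum(kernel[j] * signal[i + j] for j in range(k)))
                     for i in range(len(signal) - k + 1)]
        features.append(max(responses))
    return features


def multimodal_forward(raw_signal, descriptors, params):
    """Fuse learned (CNN) and engineered (MLP) features, then classify."""
    learned = conv1d_maxpool(raw_signal, params["kernels"])
    engineered = dense(descriptors, params["mlp_w"], params["mlp_b"])
    fused = learned + engineered  # list concatenation joins the two branches
    logit = sum(w * x for w, x in zip(params["out_w"], fused)) + params["out_b"]
    return sigmoid(logit)


def init_params(n_kernels, kernel_size, n_descriptors, n_hidden, seed=0):
    """Random, untrained weights; sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-0.5, 0.5)

    return {
        "kernels": [[rand() for _ in range(kernel_size)] for _ in range(n_kernels)],
        "mlp_w": [[rand() for _ in range(n_descriptors)] for _ in range(n_hidden)],
        "mlp_b": [0.0] * n_hidden,
        "out_w": [rand() for _ in range(n_kernels + n_hidden)],
        "out_b": 0.0,
    }


params = init_params(n_kernels=4, kernel_size=3, n_descriptors=5, n_hidden=6)
raw_signal = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0]  # toy raw input
descriptors = [0.2, -1.3, 0.7, 0.0, 1.1]               # toy engineered features
probability = multimodal_forward(raw_signal, descriptors, params)
print(probability)  # a value in (0, 1); meaningless until the weights are trained
```

Concatenating the branches before the output layer lets the classifier weigh engineered and learned features jointly, which is the property the abstract credits for the gain over either standalone network.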
