MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-based Protein Structure Prediction
This work addresses protein structure prediction in bioinformatics, offering a simpler and faster model with improved performance, though it appears incremental because it builds on existing deep learning approaches.
The authors tackled protein property prediction from amino acid sequences by proposing MUST-CNN, a deep convolutional neural network with a novel multilayer shift-and-stitch technique, which achieved state-of-the-art results on two large datasets.
Predicting protein properties such as solvent accessibility and secondary structure from the primary amino acid sequence is an important task in bioinformatics. Recently, a few deep learning models have surpassed the traditional window-based multilayer perceptron. Taking inspiration from the image classification domain, we propose a deep convolutional neural network architecture, MUST-CNN, to predict protein properties. This architecture uses a novel multilayer shift-and-stitch (MUST) technique to generate fully dense per-position predictions on protein sequences. Our model is significantly simpler than the state of the art, yet achieves better results. By combining MUST and the efficient convolution operation, we can consider far more parameters while retaining very fast prediction speeds. We beat the state-of-the-art performance on two large protein property prediction datasets.
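To make the shift-and-stitch idea concrete, the following is a minimal sketch for a single pooling layer, not the authors' implementation. It assumes a toy network (one 1D convolution, ReLU, and non-overlapping max pooling of width p) applied to a scalar sequence rather than real per-residue features. All function names are hypothetical. A pooled network emits roughly one output per p positions, so the sketch runs it on p shifted copies of the padded input and interleaves the results to recover one prediction per position.

```python
def conv_valid(x, w):
    """'Valid' 1D cross-correlation followed by ReLU."""
    k = len(w)
    return [max(0.0, sum(w[j] * x[i + j] for j in range(k)))
            for i in range(len(x) - k + 1)]


def max_pool(x, p):
    """Non-overlapping max pooling (window p, stride p); ragged tail dropped."""
    return [max(x[i:i + p]) for i in range(0, len(x) - p + 1, p)]


def coarse_net(x, w, p):
    """Toy network (assumption): conv + ReLU + pooling, about len(x) / p outputs."""
    return max_pool(conv_valid(x, w), p)


def pad_for_dense_output(x, k, p):
    """Zero-pad so that the stitched output has exactly len(x) entries."""
    total = k + p - 2
    left = total // 2
    return [0.0] * left + list(x) + [0.0] * (total - left)


def shift_and_stitch(x, w, p):
    """Run the coarse network on p shifted inputs and interleave the outputs."""
    padded = pad_for_dense_output(x, len(w), p)
    out = [None] * len(x)
    for s in range(p):
        # Output j of the copy shifted by s belongs to position s + j * p.
        for j, y in enumerate(coarse_net(padded[s:], w, p)):
            out[s + j * p] = y
    return out


def dense_reference(x, w, p):
    """Same predictions computed directly with a stride-1 pooling window."""
    c = conv_valid(pad_for_dense_output(x, len(w), p), w)
    return [max(c[t:t + p]) for t in range(len(c) - p + 1)]


if __name__ == "__main__":
    seq = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0]   # made-up input values
    kernel = [1.0, -1.0, 0.5]                     # made-up filter weights
    stitched = shift_and_stitch(seq, kernel, p=2)
    print(len(stitched) == len(seq))
    print(stitched == dense_reference(seq, kernel, p=2))
```

In this toy setting the stitched output matches the stride-1 reference exactly, because a valid convolution of a shifted input is the shifted convolution of the original input. With several pooling layers, the same shift-and-interleave step would presumably be repeated at each pooling stage, which is what the "multilayer" in MUST refers to.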