CV May 11, 2018

Classification of Protein Crystallization X-Ray Images Using Major Convolutional Neural Network Architectures

Soheil Ghafurian, Peter Orth, Corey Strickland, Hua Su, Sangita Patel, Steven Soisson, Belma Dogdas

arXiv:1805.04563v1
1.71 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses researchers' need for more reliable automation of protein crystallization analysis. Its contribution is incremental, since it applies established CNN architectures to this domain rather than proposing a new one.

The paper automates 10-way classification of protein crystallization X-ray images to improve accuracy over existing methods. ResNet achieved the best accuracy (81.43%), corresponding to a missed crystal rate of 5.9%, which drops below 0.1% under a top-3 classification strategy.

The generation of protein crystals is necessary for the study of protein molecular function and structure. This is done empirically by processing large numbers of crystallization trials and inspecting them regularly in search of those with forming crystals. To avoid missing the hard-won crystals, this visual inspection of the trial X-ray images is done manually rather than with the existing machine learning methods, which are less accurate. To achieve higher accuracy for automation, we applied some of the most successful convolutional neural networks (ResNet, Inception, VGG, and AlexNet) for 10-way classification of the X-ray images. We showed that substantial classification accuracy is gained by using such networks compared to two simpler ones previously proposed for this purpose. The best accuracy was obtained from ResNet (81.43%), which corresponds to a missed crystal rate of 5.9%. This rate could be lowered to less than 0.1% by using a top-3 classification strategy. Our dataset consisted of 486,000 internally annotated images, augmented to more than a million to address class imbalance. We also provide a label-wise analysis of the results, identifying the main sources of error and inaccuracy.
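The relationship between the top-3 strategy and the missed crystal rate can be illustrated with a small sketch. It rests on an assumption the abstract does not spell out: that a crystal image counts as "missed" only when none of its top-k predicted labels is a crystal class. The function names, the 4-class toy scores, and the choice of which labels are crystal classes are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Set


def top_k_labels(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def missed_crystal_rate(
    all_scores: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    crystal_labels: Set[int],
    k: int = 1,
) -> float:
    """Fraction of true-crystal images with no crystal label in the top-k.

    Assumed definition (not confirmed by the abstract): an image is flagged
    for review if any of its top-k predictions is a crystal class.
    """
    crystal_total = 0
    missed = 0
    for scores, label in zip(all_scores, true_labels):
        if label not in crystal_labels:
            continue  # only true crystal images can be "missed"
        crystal_total += 1
        if not any(p in crystal_labels for p in top_k_labels(scores, k)):
            missed += 1
    return missed / crystal_total if crystal_total else 0.0


# Hypothetical toy data: 4 classes, where labels 2 and 3 stand for crystals.
toy_scores = [
    [0.50, 0.15, 0.30, 0.05],  # true crystal (2), top-1 wrongly says class 0
    [0.10, 0.20, 0.10, 0.60],  # true crystal (3), top-1 is correct
    [0.70, 0.10, 0.10, 0.10],  # non-crystal, ignored by the metric
]
toy_labels = [2, 3, 0]
toy_crystals = {2, 3}

rate_top1 = missed_crystal_rate(toy_scores, toy_labels, toy_crystals, k=1)
rate_top3 = missed_crystal_rate(toy_scores, toy_labels, toy_crystals, k=3)
```

In this toy case the first image is missed under top-1 but recovered under top-3, because a crystal label is its second-ranked prediction. The trade-off is that a larger k flags more non-crystal images for manual review.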

