AdaptoVision: A Multi-Resolution Image Recognition Model for Robust and Scalable Classification
AdaptoVision provides a more efficient model for image classification in resource-constrained environments, though the contribution appears incremental because it builds on existing CNN techniques.
The paper tackles the problem of balancing computational complexity and classification accuracy in image recognition by introducing AdaptoVision, a CNN architecture that achieves state-of-the-art results on the BreakHis dataset and competitive accuracies of 95.3% on CIFAR-10 and 85.77% on CIFAR-100.
This paper introduces AdaptoVision, a novel convolutional neural network (CNN) architecture designed to balance computational complexity against classification accuracy. By combining enhanced residual units, depth-wise separable convolutions, and hierarchical skip connections, AdaptoVision significantly reduces parameter count and computational requirements while preserving competitive performance across benchmark and medical image datasets. Extensive experiments show that AdaptoVision achieves state-of-the-art results on the BreakHis dataset and competitive accuracy on standard benchmarks, notably 95.3% on CIFAR-10 and 85.77% on CIFAR-100, without relying on any pretrained weights. The model's streamlined architecture and strategic simplifications promote effective feature extraction and robust generalization, making it particularly suitable for deployment in real-time and resource-constrained environments.
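To make the parameter-reduction claim concrete, the following is a minimal sketch of the standard weight-count arithmetic for a depth-wise separable convolution compared with an ordinary convolution. The function names and the layer sizes (128 input channels, 256 output channels, 3x3 kernel) are hypothetical choices for illustration and are not taken from the AdaptoVision paper; biases are omitted.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depth-wise separable convolution (bias omitted).

    One k x k filter per input channel (depth-wise step), followed by
    a 1 x 1 convolution that mixes channels (point-wise step).
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 128, 256, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 11.5% of the weights of the standard layer. The saving reported for the full AdaptoVision network depends on its actual layer configuration, which this summary does not specify.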