CV · Jul 25, 2017

Representation Learning on Large and Small Data

Chun-Nan Chou, Chuen-Kai Shie, Fu-Chieh Chang, Jocelyn Chang, Edward Y. Chang

arXiv:1707.09873v1 · 1.78 citations

Originality: Synthesis-oriented

AI Analysis

This chapter provides practical guidance for representation learning across data scales, though it is an incremental synthesis of existing approaches rather than a source of new methods.

This book chapter addresses representation learning challenges for both large and small datasets, showing that CNN model enhancements improve performance on big data while transfer representation learning boosts accuracy for small data applications like medical diagnosis.

Deep learning owes its success to three key factors: scale of data, enhanced models that learn representations from data, and scale of computation. This book chapter presents the importance of the data-driven approach to learning good representations from both big data and small data. For big data, the research community widely accepts that more data improves both representation and classification. The questions are then how to learn representations from big data and how to perform representation learning when data is scarce. We address the first question by presenting CNN model enhancements in three aspects: representation, optimization, and generalization. To address the small data challenge, we show that transfer representation learning is effective. Transfer representation learning transfers the representation learned in a source domain, where abundant training data is available, to a target domain, where training data is scarce. It gave the otitis media (OM) and melanoma diagnosis modules of our XPRIZE Tricorder device (which finished $2^{nd}$ out of $310$ competing teams) a significant boost in diagnosis accuracy.
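The transfer step described in the abstract can be illustrated with a toy sketch. This is not the chapter's method: the chapter works with CNNs, whereas this sketch swaps in a one-dimensional linear projection (the top principal direction, found by power iteration) as the "representation" and a nearest-centroid rule as the target classifier. The synthetic data generator, all function names, and all parameter values are assumptions made for illustration only. The point it shows is the workflow: learn features where data is abundant, freeze them, and fit only a small classifier on the scarce target data.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def learn_representation(source_rows, steps=100):
    """Learn a 1-D linear feature (top principal direction) from source data."""
    dim = len(source_rows[0])
    n = len(source_rows)
    mean = [sum(r[j] for r in source_rows) / n for j in range(dim)]
    centered = [[r[j] - mean[j] for j in range(dim)] for r in source_rows]
    w = [1.0] * dim
    for _ in range(steps):
        # Power iteration: w <- X^T (X w), then renormalize to unit length.
        proj = [dot(r, w) for r in centered]
        w = [sum(p * r[j] for p, r in zip(proj, centered)) for j in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    return mean, w

def extract(rep, row):
    """Apply the frozen source-domain representation to one example."""
    mean, w = rep
    return dot([x - m for x, m in zip(row, mean)], w)

def fit_centroids(rep, rows, labels):
    """Fit a nearest-centroid classifier on top of the frozen features."""
    feats = {}
    for row, y in zip(rows, labels):
        feats.setdefault(y, []).append(extract(rep, row))
    return {y: sum(v) / len(v) for y, v in feats.items()}

def predict(rep, centroids, row):
    f = extract(rep, row)
    return min(centroids, key=lambda y: abs(f - centroids[y]))

def make_data(rng, n, signal, noise, dim=5):
    """Hypothetical data: two classes separated along axis 0, noise elsewhere."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        row = [rng.gauss(0, noise) for _ in range(dim)]
        row[0] += signal if y else -signal
        rows.append(row)
        labels.append(y)
    return rows, labels

rng = random.Random(0)

# Source domain: abundant data, used only to learn the representation.
source_rows, _ = make_data(rng, 500, signal=3.0, noise=1.0)
rep = learn_representation(source_rows)

# Target domain: only six labeled examples; the representation stays frozen.
train_rows, train_labels = make_data(rng, 6, signal=3.0, noise=1.0)
centroids = fit_centroids(rep, train_rows, train_labels)

test_rows, test_labels = make_data(rng, 200, signal=3.0, noise=1.0)
correct = sum(predict(rep, centroids, r) == y
              for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_labels)
print(f"target accuracy with 6 labeled examples: {accuracy:.2f}")
```

In a realistic setting the frozen extractor would be a network pre-trained on a large source dataset and the target classifier would be trained on its activations; the sketch only mirrors that division of labor.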
