ALiSNet: Accurate and Lightweight Human Segmentation Network for Fashion E-Commerce
This work addresses the need for lightweight, accurate on-device human segmentation, which protects user privacy and enables applications such as virtual try-on in fashion e-commerce. The contribution is incremental, since it builds on existing methods.
The paper tackles accurate segmentation of human bodies in photos for fashion e-commerce applications. Its model reaches 97.6% mIoU at a size of 4MB, outperforming Apple Person Segmentation, which reaches 94.4% mIoU on the same dataset.
Accurately estimating human body shape from photos can enable innovative applications in fashion, from mass customization to size and fit recommendations and virtual try-on. Body silhouettes calculated from user pictures are effective representations of the body shape for downstream tasks. Smartphones provide a convenient way for users to capture images of their body, and on-device image processing allows predicting body segmentation while protecting users' privacy. Existing off-the-shelf methods for human segmentation are closed source and cannot be specialized for our application of body shape and measurement estimation. Therefore, we create a new segmentation model by simplifying Semantic FPN with PointRend, an existing accurate model. We finetune this model on a high-quality dataset of humans in a restricted set of poses relevant for our application. We obtain our final model, ALiSNet, with a size of 4MB and 97.6$\pm$1.0$\%$ mIoU, compared to Apple Person Segmentation, which achieves 94.4$\pm$5.7$\%$ mIoU on our dataset.
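The mIoU figures above can be made concrete with a small sketch. The abstract does not spell out the evaluation protocol, so the following assumes that mIoU is the intersection-over-union of the person mask computed per image and averaged over images, and that the $\pm$ term is the standard deviation across images. The function names `iou` and `mean_iou` are hypothetical and are not taken from the paper; masks are represented as nested lists of 0/1 so the example needs only the Python standard library.

```python
from statistics import mean, pstdev


def iou(pred, target):
    """Intersection over union of two binary masks (equal-shaped nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly, so return 1.0.
    return inter / union if union else 1.0


def mean_iou(preds, targets):
    """Mean and population standard deviation of per-image IoU (assumed protocol)."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return mean(scores), pstdev(scores)


if __name__ == "__main__":
    # Two toy 2x2 "images": one half-correct prediction, one perfect prediction.
    preds = [[[1, 1], [0, 0]], [[1, 1], [1, 1]]]
    targets = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]
    m, s = mean_iou(preds, targets)
    print(f"mIoU = {m:.3f} +/- {s:.3f}")
```

Under this reading, a smaller spread (1.0 versus 5.7 percentage points) would indicate more consistent per-image quality, not only a higher average.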