Anomaly Object Segmentation with Vision-Language Models for Steel Scrap Recycling
This work addresses the challenge of reducing CO2 emissions in the steel industry by improving impurity detection, but it appears incremental, as it applies existing methods to a specific domain.
The paper tackles the problem of detecting impurities in steel scrap recycling by proposing a vision-language-model-based anomaly detection method, achieving automated fine-grained anomaly detection.
Recycling steel scrap can reduce carbon dioxide (CO2) emissions from the steel industry. However, a significant challenge in steel scrap recycling is the inclusion of impurities other than steel. To address this issue, we propose vision-language-model-based anomaly detection in which a model is finetuned in a supervised manner, enabling it to handle niche objects effectively. The finetuned model enables automated, fine-grained detection of anomalies within steel scrap. Specifically, we finetune the image encoder, which is equipped with a multi-scale mechanism, together with text prompts aligned with both normal and anomaly images. The finetuning process trains these modules using multiclass classification as supervision.
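The supervision described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 3-dimensional embeddings, the class names, the cosine-similarity scoring, and the temperature value are all assumptions chosen for illustration, and the multi-scale mechanism is omitted. The sketch only shows how image-region embeddings might be scored against one text prompt embedding per class and how a multiclass cross-entropy loss could serve as the training signal.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_patch(patch_embedding, prompt_embeddings, temperature=0.07):
    """Score one image-region embedding against every class prompt.

    The temperature is an assumed value, not taken from the paper.
    """
    logits = [cosine(patch_embedding, p) / temperature for p in prompt_embeddings]
    return softmax(logits)

def cross_entropy(probs, label):
    """Multiclass cross-entropy for a single example."""
    return -math.log(probs[label])

# Hypothetical classes: index 0 is normal, the rest are anomaly types.
class_names = ["steel scrap", "impurity A", "impurity B"]

# Toy stand-ins for learned text prompt embeddings (one per class).
prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Toy stand-ins for image-encoder outputs, with ground-truth class labels.
patches = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.2]]
labels = [0, 1]

probs = [classify_patch(p, prompts) for p in patches]
predictions = [max(range(len(pr)), key=pr.__getitem__) for pr in probs]

# Mean loss over the batch; in finetuning, its gradient would update
# both the image encoder and the prompt embeddings.
loss = sum(cross_entropy(pr, y) for pr, y in zip(probs, labels)) / len(labels)

print([class_names[i] for i in predictions], round(loss, 4))
```

In a real setup the embeddings would come from a pretrained vision-language model and the loss would be minimized with an automatic differentiation framework. The pure-Python version here only makes the scoring and loss computation explicit.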