CV | Feb 26

D-FINE-seg: Object Detection and Instance Segmentation Framework with multi-backend deployment

arXiv:2602.23043v1 | h-index: 1 | Has Code
Originality: Incremental advance
AI Analysis

This work provides an incremental improvement in real-time instance segmentation performance for practitioners using transformer-based models.

This paper extends the D-FINE object detection architecture to D-FINE-seg for real-time instance segmentation. It achieves an improved F1-score over Ultralytics YOLO26 on the TACO dataset while maintaining competitive latency.

Transformer-based real-time object detectors achieve strong accuracy-latency trade-offs, and D-FINE is among the top-performing recent architectures. However, real-time instance segmentation with transformers is still less common. We present D-FINE-seg, an instance segmentation extension of D-FINE that adds a lightweight mask head and segmentation-aware training, including box-cropped BCE and Dice mask losses, auxiliary and denoising mask supervision, and an adapted Hungarian matching cost. On the TACO dataset, D-FINE-seg improves F1-score over Ultralytics YOLO26 under a unified TensorRT FP16 end-to-end benchmarking protocol, while maintaining competitive latency. A second contribution is an end-to-end pipeline for training, exporting, and optimized inference across ONNX, TensorRT, and OpenVINO for both object detection and instance segmentation tasks. The framework is released as open source under the Apache-2.0 license. GitHub repository: https://github.com/ArgoHA/D-FINE-seg.
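The abstract names box-cropped BCE and Dice mask losses without giving formulas. As a rough illustration only, the sketch below shows one common way such losses are computed: both terms are evaluated only on pixels inside a bounding box, so predictions outside the box do not contribute. The function name `box_cropped_mask_losses`, the choice to crop to a pixel-aligned box with exclusive upper bounds, the mean reduction for BCE, and the smoothing constant `eps` are all assumptions made for this example, not details taken from the paper or the repository.

```python
import math


def _sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def box_cropped_mask_losses(logits, target, box, eps=1e-6):
    """Return (bce, dice) computed only on pixels inside `box`.

    logits, target: H x W nested lists (mask logits and 0/1 ground truth).
    box: (x0, y0, x1, y1) in pixel coordinates, upper bounds exclusive.
    This is a hypothetical helper for illustration, not the paper's code.
    """
    x0, y0, x1, y1 = box
    bce_sum = inter = prob_sum = target_sum = 0.0
    n_pixels = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            z, t = logits[y][x], target[y][x]
            # BCE with logits in its stable form:
            # max(z, 0) - z * t + log(1 + exp(-|z|))
            bce_sum += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
            p = _sigmoid(z)
            inter += p * t
            prob_sum += p
            target_sum += t
            n_pixels += 1
    if n_pixels == 0:
        raise ValueError("box covers no pixels")
    bce = bce_sum / n_pixels  # mean over cropped pixels (assumed reduction)
    dice = 1.0 - (2.0 * inter + eps) / (prob_sum + target_sum + eps)
    return bce, dice
```

With this sketch, a prediction that is confidently correct inside the box yields both losses close to zero even if it is wrong everywhere outside the box, which is the intended effect of cropping. How the two terms are weighted, and how they enter the Hungarian matching cost, is not stated in the abstract.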

