S3Net: Innovating Stereo Matching and Semantic Segmentation with a Single-Branch Semantic Stereo Network in Satellite Epipolar Imagery
This work addresses the lack of integrated multitask learning for 3D reconstruction in satellite imagery, offering incremental improvements over existing methods.
The paper tackles the problem of integrating stereo matching and semantic segmentation in satellite imagery by proposing S3Net, which improves mIoU from 61.38 to 67.39 for segmentation and reduces D1-Error from 10.051 to 9.579 and EPE from 1.439 to 1.403 for disparity estimation.
Stereo matching and semantic segmentation are significant tasks in binocular satellite 3D reconstruction. However, previous studies primarily view these as independent parallel tasks, lacking an integrated multitask learning framework. This work introduces a solution, the Single-branch Semantic Stereo Network (S3Net), which innovatively combines semantic segmentation and stereo matching using Self-Fuse and Mutual-Fuse modules. Unlike preceding methods that utilize semantic or disparity information independently, our method dentifies and leverages the intrinsic link between these two tasks, leading to a more accurate understanding of semantic information and disparity estimation. Comparative testing on the US3D dataset proves the effectiveness of our S3Net. Our model improves the mIoU in semantic segmentation from 61.38 to 67.39, and reduces the D1-Error and average endpoint error (EPE) in disparity estimation from 10.051 to 9.579 and 1.439 to 1.403 respectively, surpassing existing competitive methods. Our codes are available at:https://github.com/CVEO/S3Net.