CV | Aug 4, 2023 | Code
Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation
Jinyuan Liu, Zhu Liu, Guanyao Wu et al.
Multi-modality image fusion and segmentation play a vital role in autonomous driving and robotic operation. Early efforts focus on boosting the performance of only one task, e.g., fusion or segmentation, making it hard to reach the "Best of Both Worlds". To overcome this issue, we propose a Multi-interactive Feature learning architecture for image fusion and Segmentation, namely SegMiF, and exploit dual-task correlation to promote the performance of both tasks. SegMiF has a cascade structure, containing a fusion sub-network and a commonly used segmentation sub-network. By bridging intermediate features between the two components, the knowledge learned from the segmentation task can effectively assist the fusion task. In turn, the improved fusion network helps the segmentation network perform better. Besides, a hierarchical interactive attention block is established to ensure fine-grained mapping of all the vital information between the two tasks, so that the modality and semantic features can fully interact with each other. In addition, a dynamic weight factor is introduced to automatically adjust the corresponding weights of each task, which balances the interactive feature correspondence and removes the need for laborious tuning. Furthermore, we construct a smart multi-wave binocular imaging system and collect a full-time multi-modality benchmark with 15 annotated pixel-level categories for image fusion and segmentation. Extensive experiments on several public datasets and our benchmark demonstrate that the proposed method outputs visually appealing fused images and achieves on average 7.66% higher segmentation mIoU in real-world scenes than state-of-the-art approaches. The source code and benchmark are available at https://github.com/JinyuanLiu-CV/SegMiF.
CV | Aug 24, 2023 | Code
Learning Heavily-Degraded Prior for Underwater Object Detection
Chenping Fu, Xin Fan, Jiewen Xiao et al.
Underwater object detection suffers from low detection performance because the distance- and wavelength-dependent imaging process yields evident image quality degradations such as haze-like effects, low visibility, and color distortions. Therefore, we commit to resolving the issue of underwater object detection with compounded environmental degradations. Typical approaches attempt to develop sophisticated deep architectures to generate high-quality images or features. However, these methods only work for limited ranges because imaging factors are either unstable, too sensitive, or compounded. Unlike these approaches catering for high-quality images or features, this paper seeks transferable prior knowledge from detector-friendly images. The prior guides detectors to remove degradations that interfere with detection. It is based on the statistical observation that the heavily degraded regions of detector-friendly images (DFUI) and underwater images have evident feature distribution gaps, while their lightly degraded regions overlap each other. Therefore, we propose a residual feature transference module (RFTM) to learn a mapping between deep representations of the heavily degraded patches of DFUI and underwater images, and use the mapping as a heavily degraded prior (HDP) for underwater detection. Since the statistical properties are independent of image content, HDP can be learned without the supervision of semantic labels and plugged into popular CNN-based feature extraction networks to improve their performance on underwater object detection. Without bells and whistles, evaluations on URPC2020 and UODD show that our method outperforms CNN-based detectors by a large margin. With higher speed and fewer parameters, our method still performs better than transformer-based detectors. Our code and DFUI dataset can be found at https://github.com/xiaoDetection/Learning-Heavily-Degraed-Prior.
CV | Dec 29, 2022 | Code
Practical Exposure Correction: Great Truths Are Always Simple
Long Ma, Tianjiao Ma, Xinwei Xue et al.
Improving the visual quality of the given degraded observation by correcting exposure level is a fundamental task in the computer vision community. Existing works commonly lack adaptability towards unknown scenes because of the data-driven patterns (deep networks) and limited regularization (traditional optimization), and they usually need time-consuming inference. These two points heavily limit their practicability. In this paper, we establish a Practical Exposure Corrector (PEC) that assembles the characteristics of efficiency and performance. To be concrete, we rethink the exposure correction to provide a linear solution with exposure-sensitive compensation. Around generating the compensation, we introduce an exposure adversarial function as the key engine to fully extract valuable information from the observation. By applying the defined function, we construct a segmented shrinkage iterative scheme to generate the desired compensation. Its shrinkage nature supplies powerful support for algorithmic stability and robustness. Extensive experimental evaluations fully reveal the superiority of our proposed PEC. The code is available at https://rsliu.tech/PEC.
CV | Aug 7, 2023 | Code
Bilevel Generative Learning for Low-Light Vision
Yingchi Liu, Zhu Liu, Long Ma et al.
Recently, there has been a growing interest in constructing deep learning schemes for Low-Light Vision (LLV). Existing techniques primarily focus on designing task-specific and data-dependent vision models on the standard RGB domain, which inherently contain latent data associations. In this study, we propose a generic low-light vision solution by introducing a generative block to convert data from the RAW to the RGB domain. This novel approach connects diverse vision problems by explicitly depicting data generation, which is the first in the field. To precisely characterize the latent correspondence between the generative procedure and the vision task, we establish a bilevel model with the parameters of the generative block defined as the upper level and the parameters of the vision task defined as the lower level. We further develop two types of learning strategies targeting different goals, namely low cost and high accuracy, to acquire a new bilevel generative learning paradigm. The generative blocks embrace a strong generalization ability in other low-light vision tasks through the bilevel optimization on enhancement tasks. Extensive experimental evaluations on three representative low-light vision tasks, namely enhancement, detection, and segmentation, fully demonstrate the superiority of our proposed approach. The code will be available at https://github.com/Yingchi1998/BGL.
CV | Nov 20, 2022
CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion
Jinyuan Liu, Runjia Lin, Guanyao Wu et al.
Infrared and visible image fusion targets to provide an informative image by combining complementary information from different sensors. Existing learning-based fusion approaches attempt to construct various loss functions to preserve complementary features, while neglecting to discover the inter-relationship between the two modalities, leading to redundant or even invalid information on the fusion results. Moreover, most methods focus on strengthening the network with an increase in depth while neglecting the importance of feature transmission, causing vital information degeneration. To alleviate these issues, we propose a coupled contrastive learning network, dubbed CoCoNet, to realize infrared and visible image fusion in an end-to-end manner. Concretely, to simultaneously retain typical features from both modalities and to avoid artifacts emerging on the fused result, we develop a coupled contrastive constraint in our loss function. In a fused image, its foreground target / background detail part is pulled close to the infrared / visible source and pushed far away from the visible / infrared source in the representation space. We further exploit image characteristics to provide data-sensitive weights, allowing our loss function to build a more reliable relationship with source images. A multi-level attention module is established to learn rich hierarchical feature representation and to comprehensively transfer features in the fusion process. We also apply the proposed CoCoNet on medical image fusion of different types, e.g., magnetic resonance image, positron emission tomography image, and single photon emission computed tomography image. Extensive experiments demonstrate that our method achieves state-of-the-art (SOTA) performance under both subjective and objective evaluation, especially in preserving prominent targets and recovering vital textural details.
CV | May 26 | Code
Underwater360: Reconstructing Underwater Scenes from Panoramic Images with Omnidirectional Gaussian Splatting
Jiangbei Hu, Weichao Song, Shibo Yu et al.
Underwater scene reconstruction is essential for immersive exploration of aquatic environments, yet remains challenging due to complex participating-media effects such as absorption and scattering, as well as the limited field of view (FoV) of conventional cameras. Although combining panoramic imaging with 3D Gaussian Splatting (3DGS) offers a promising direction for photorealistic underwater rendering, traditional 3DGS struggles with both spherical projection distortion and underwater medium degradation. In this paper, we propose Underwater360, a physics-informed omnidirectional 3DGS framework for underwater panoramic scene reconstruction. First, we introduce an Omnidirectional Gaussian Splatting module that performs ray casting directly in spherical camera space instead of relying on 2D projection approximations, thereby reducing geometric distortions under a $360^\circ$ FoV. Second, we design a physics-based appearance-medium modeling architecture with pose-conditioned appearance embeddings to explicitly decouple intrinsic scene radiance from depth-dependent backscatter and attenuation, enabling physically grounded scene appearance restoration. Finally, we establish a new panoramic underwater benchmark dataset containing both synthetic and real-world scenes. Extensive experiments demonstrate that Underwater360 achieves superior performance in underwater novel view synthesis and scene appearance restoration, delivering improved rendering quality and cross-view consistency in complex underwater environments. The code and datasets are released at https://github.com/SwcK423/Underwater360
CV | Jul 21, 2023 | Code
DPM-OT: A New Diffusion Probabilistic Model Based on Optimal Transport
Zezeng Li, ShengHao Li, Zhanpeng Wang et al.
Sampling from diffusion probabilistic models (DPMs) can be viewed as a piecewise distribution transformation, which generally requires hundreds or thousands of steps along the inverse diffusion trajectory to obtain a high-quality image. Recent progress in designing fast samplers for DPMs achieves a trade-off between sampling speed and sample quality through knowledge distillation or by adjusting the variance schedule or the denoising equation. However, such samplers cannot be optimal in both respects and often suffer from mode mixture when only a few steps are used. To tackle this problem, we regard inverse diffusion as an optimal transport (OT) problem between latents at different stages and propose DPM-OT, a unified learning framework for fast DPMs with a direct expressway represented by an OT map, which can generate high-quality samples within around 10 function evaluations. By calculating the semi-discrete optimal transport map between the data latents and the white noise, we obtain an expressway from the prior distribution to the data distribution, while significantly alleviating the problem of mode mixture. In addition, we give an error bound for the proposed method, which theoretically guarantees the stability of the algorithm. Extensive experiments validate the effectiveness and advantages of DPM-OT in terms of speed and quality (FID and mode mixture), thus representing an efficient solution for generative modeling. Source code is available at https://github.com/cognaclee/DPM-OT
CV | Mar 14, 2022
Automated Learning for Deformable Medical Image Registration by Jointly Optimizing Network Architectures and Objective Functions
Xin Fan, Zi Li, Ziyang Li et al.
Deformable image registration plays a critical role in various tasks of medical image analysis. A successful registration algorithm, whether derived from conventional energy optimization or from deep networks, requires tremendous effort from computer experts to design the registration energy well or to carefully tune network architectures for the specific type of medical data. To tackle these problems, this paper proposes an automated learning registration algorithm (AutoReg) that cooperatively optimizes both architectures and their corresponding training objectives, enabling non-computer experts, e.g., medical/clinical users, to conveniently find off-the-shelf registration algorithms for diverse scenarios. Specifically, we establish a triple-level framework to deduce registration network architectures and objectives with an auto-searching mechanism and cooperating optimization. We conduct image registration experiments on multi-site volume datasets and various registration tasks. Extensive results demonstrate that AutoReg can automatically learn an optimal deep registration network for given volumes and achieve state-of-the-art performance, while also significantly improving computational efficiency over mainstream UNet architectures (from 0.558 to 0.270 seconds for a 3D image pair on the same configuration).
CV | Jun 14, 2023
OT-Net: A Reusable Neural Optimal Transport Solver
Zezeng Li, Shenghao Li, Lianbao Jin et al.
With the widespread application of optimal transport (OT), its calculation becomes essential, and various algorithms have emerged. However, the existing methods either have low efficiency or cannot represent discontinuous maps. A novel reusable neural OT solver, OT-Net, is thus presented, which first learns Brenier's height representation via a neural network to obtain its potential, and then obtains the OT map by computing the gradient of the potential. The algorithm has two merits: 1) it can easily represent discontinuous maps, which allows it to match any target distribution with discontinuous supports and achieve sharp boundaries, and this can effectively eliminate mode collapse in generative models; 2) the OT map can be calculated directly by the proposed algorithm when new target samples are added, which greatly improves the efficiency and reusability of the map. Moreover, the theoretical error bound of the algorithm is analyzed, and we demonstrate the empirical success of our approach in image generation, color transfer, and domain adaptation.
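The idea of reading an OT map off the gradient of a convex potential can be illustrated in the semi-discrete setting. The sketch below is not OT-Net: it assumes the classical piecewise-linear potential u(x) = max_i(<x, y_i> + h_i) over a finite target set, with heights h_i picked by hand (in practice they would be learned or optimized so that each target receives the right amount of source mass). The function names are hypothetical.

```python
def affine_scores(x, targets, heights):
    """Values <x, y_i> + h_i of every affine piece at the point x."""
    return [sum(a * b for a, b in zip(x, y)) + h
            for y, h in zip(targets, heights)]


def brenier_potential(x, targets, heights):
    """Piecewise-linear convex potential u(x) = max_i (<x, y_i> + h_i)."""
    return max(affine_scores(x, targets, heights))


def gradient_map(x, targets, heights):
    """Gradient of u at x: the target whose affine piece is active at x."""
    scores = affine_scores(x, targets, heights)
    active = max(range(len(targets)), key=scores.__getitem__)
    return targets[active]


# Two target points; the hand-picked heights (an assumption) split the
# plane along the vertical axis, so the map jumps across x[0] = 0.
targets = [(-1.0, 0.0), (1.0, 0.0)]
heights = [0.0, 0.0]
left = gradient_map((-0.3, 0.2), targets, heights)
right = gradient_map((0.4, -0.1), targets, heights)
```

Because the map is the gradient of a maximum of affine functions, it is piecewise constant and discontinuous across cell boundaries, which is the property the abstract highlights.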
NA | Jan 29, 2013
A new cubic nonconforming finite element on rectangles
Zhaoliang Meng, Zhongxuan Luo, Dongwoo Sheen
A new nonconforming rectangle element with cubic convergence for the energy norm is introduced. The degrees of freedom (DOFs) are defined by the twelve values at the three Gauss points on each of the four edges. Due to the existence of one linear relation among the above DOFs, it turns out the DOFs are eleven. The nonconforming element consists of $P_2\oplus \operatorname{Span}\{x^3y-xy^3\}$. We count the corresponding dimension for Dirichlet and Neumann boundary value problems of second-order elliptic problems. We also present the optimal error estimates in both broken energy and $L_2(\Omega)$ norms. Finally, numerical examples match our theoretical results very well.
NA | Oct 4, 2013
Convergence analysis of a family of 14-node brick elements
Zhaoliang Meng, Zhongxuan Luo, Dongwoo Sheen et al.
In this paper, we give a convergence analysis for a family of 14-node elements proposed by I. M. Smith and D. J. Kidger in 1992. The 14 DOFs are taken as the values at the eight vertices and six face centroids. For second-order elliptic problems, we show that among all the Smith-Kidger 14-node elements, the Type 1, Type 2 and Type 5 elements attain the optimal convergence order, while the Type 6 element attains a lower convergence order. Motivated by the proof, we also present a new 14-node nonconforming element. If we change the DOFs to the values at the eight vertices and the integrals over the six faces, we show that Type 1, Type 2 and Type 5 keep the optimal convergence order and the Type 6 element gains one order of accuracy, which means that it also attains the optimal convergence order.
NA | Jan 25, 2013
Constructing Cubature Formulas of Degree 5 with Few Points
Zhaoliang Meng, Zhongxuan Luo
This paper is devoted to constructing a family of fifth-degree cubature formulae for the $n$-cube with symmetric measure and for $n$-dimensional spherically symmetric regions. The formula for the $n$-cube contains at most $n^2+5n+3$ points, and the formula for the $n$-dimensional spherically symmetric region contains only $n^2+3n+3$ points. Moreover, the numbers can be reduced to $n^2+3n+1$ and $n^2+n+1$, respectively, if $n=7$, the latter of which is minimal.
AG | Jan 6, 2012
An Invariant of Algebraic Curves from the Pascal Theorem
Zhongxuan Luo
In the 1640s, Blaise Pascal discovered a remarkable property of a hexagon inscribed in a conic, the Pascal theorem, which gave birth to projective geometry. In this paper, a new geometric invariant of algebraic curves is discovered through a different understanding of Pascal's mystic hexagram, or of the Pascal theorem. Using this invariant, the Pascal theorem can be generalized to the case of cubics (and even to algebraic curves of higher degree), that is: for any given 9 intersections between a cubic $\Gamma_3$ and any three lines $a,b,c$ with no common zero, none of which is a component of $\Gamma_3$, the six points consisting of the three points determined by the Pascal mapping applied to any six points (no three of which are collinear) among those 9 intersections, together with the remaining three points of those 9 intersections, must lie on a conic. This generalization differs considerably from, and is much simpler than, Chasles's theorem and the Cayley-Bacharach theorems.
NA | Oct 26, 2013
On the Singularity of Multivariate Hermite Interpolation
Zhaoliang Meng, Zhongxuan Luo
In this paper we study the singularity of multivariate Hermite interpolation of type total degree. We present a method to judge the singularity of the interpolation schemes considered, and using this method we show that all Hermite interpolations of type total degree on $m=d+k$ points in $\mathbb{R}^d$ are singular if $d\geq 2k$. We then solve the Hermite interpolation problem on $m\leq d+3$ nodes completely. Precisely, all Hermite interpolations of type total degree on $m\leq d+1$ points with $d\geq 2$ are singular; for $m=d+2$ and $m=d+3$, only three cases and one case, respectively, can produce regular Hermite interpolation schemes. Besides, we also present a method to compute the interpolation space for Hermite interpolation of type total degree.
NA | Dec 14, 2018
High accuracy analysis of a nonconforming discrete Stokes complex over rectangular meshes
Xinchen Zhou, Zhaoliang Meng, Xin Fan et al.
This work is devoted to the high accuracy analysis of a discrete Stokes complex over rectangular meshes with a simple structure. The 0-form in the complex is a non $C^0$ nonconforming element space for biharmonic problems. This plate element contains only 12 degrees of freedom (DoFs) over a rectangular cell with a zeroth order weak continuity for the normal derivative, therefore only the lowest convergence order can be obtained by a standard consistency error analysis. Nevertheless, we prove that, if the rectangular mesh is uniform, an $O(h^2)$ convergence rate in discrete $H^2$-norm will be achieved. Moreover, based on a duality argument, it has an $O(h^3)$ convergence order in discrete $H^1$-norm if the solution region is convex. The 1-form and 2-form constitute a divergence-free pair for incompressible flow. We also show its higher accuracy than that derived from a usual error estimate under uniform partitions, which explains the phenomenon observed in our previous work. Numerical tests verify our theoretical results.
NA | Nov 19, 2018
Two 11-node nonconforming triangular prism elements for 3D elliptic problems
Xinchen Zhou, Zhaoliang Meng, Xin Fan et al.
This work introduces two 11-node triangular prism elements for 3D elliptic problems. The degrees of freedom (DoFs) of both elements are at the vertices and face centroids of a prism cell. The first element is $H^1$-nonconforming and works for second order problems, which achieves a second order convergence rate in discrete $H^1$-norm. The other is $H^2$-nonconforming and solves fourth order problems, with a first order convergence rate in discrete $H^2$-norm. Numerical examples verify our theoretical results.
NA | Jan 25, 2013
A decomposition method to construct cubature formulae of degree 3
Zhaoliang Meng, Zhongxuan Luo
Numerical integration formulas in $n$-dimensional Euclidean space of degree three are discussed. For the integrals with permutation symmetry we present a method to construct its third-degree integration formulas with $2n$ real points. We present a decomposition method and only need to deal with $n$ one-dimensional moment problems independently.
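As a point of reference for what a $2n$-point formula of degree three looks like, the sketch below checks a classical fully symmetric rule on the cube $[-1,1]^n$: the points $\pm\sqrt{n/3}\,e_i$ with equal weights $2^n/(2n)$. This is a textbook special case chosen for illustration, not the decomposition method of the paper, and all function names are hypothetical. Note that for $n>3$ the points of this particular rule lie outside the cube.

```python
import itertools
import math


def cube_rule(n):
    """2n points +/- sqrt(n/3) * e_i, each with weight 2^n / (2n)."""
    r = math.sqrt(n / 3.0)
    w = 2.0 ** n / (2 * n)
    rule = []
    for i in range(n):
        for s in (r, -r):
            p = [0.0] * n
            p[i] = s
            rule.append((tuple(p), w))
    return rule


def exact_monomial(exps):
    """Exact integral of prod_i x_i^{e_i} over [-1, 1]^n."""
    value = 1.0
    for e in exps:
        value *= 0.0 if e % 2 else 2.0 / (e + 1)
    return value


def apply_rule(rule, exps):
    """Weighted sum of the monomial over the rule's points."""
    return sum(w * math.prod(x ** e for x, e in zip(p, exps))
               for p, w in rule)


def max_error(n, degree=3):
    """Largest error of the rule over all monomials up to `degree`."""
    rule = cube_rule(n)
    worst = 0.0
    for exps in itertools.product(range(degree + 1), repeat=n):
        if sum(exps) <= degree:
            err = abs(apply_rule(rule, exps) - exact_monomial(exps))
            worst = max(worst, err)
    return worst
```

Calling `max_error(n)` for small `n` returns a value at rounding-error level, while `max_error(n, degree=4)` does not, which confirms that the rule has degree exactly three.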
CV | Mar 22
CTFS: Collaborative Teacher Framework for Forward-Looking Sonar Image Semantic Segmentation with Extremely Limited Labels
Ping Guo, Chengzhou Li, Guanchen Meng et al.
As one of the most important underwater sensing technologies, forward-looking sonar exhibits unique imaging characteristics. Sonar images are often affected by severe speckle noise, low texture contrast, acoustic shadows, and geometric distortions. These factors make it difficult for traditional teacher-student frameworks to achieve satisfactory performance in sonar semantic segmentation tasks under extremely limited labeled data conditions. To address this issue, we propose a Collaborative Teacher Semantic Segmentation Framework for forward-looking sonar images. This framework introduces a multi-teacher collaborative mechanism composed of one general teacher and multiple sonar-specific teachers. By adopting a multi-teacher alternating guidance strategy, the student model can learn general semantic representations while simultaneously capturing the unique characteristics of sonar images, thereby achieving more comprehensive and robust feature modeling. Considering the challenges of sonar images, which can lead teachers to generate a large number of noisy pseudo-labels, we further design a cross-teacher reliability assessment mechanism. This mechanism dynamically quantifies the reliability of pseudo-labels by evaluating the consistency and stability of predictions across multiple views and multiple teachers, thereby mitigating the negative impact caused by noisy pseudo-labels. Notably, on the FLSMD dataset, when only 2% of the data is labeled, our method achieves a 5.08% improvement in mIoU compared to other state-of-the-art approaches.
CV | May 25, 2023 | Code
A Task-guided, Implicitly-searched and Meta-initialized Deep Model for Image Fusion
Risheng Liu, Zhu Liu, Jinyuan Liu et al.
Image fusion plays a key role in a variety of multi-sensor-based vision systems, especially for enhancing visual quality and/or extracting aggregated features for perception. However, most existing methods just consider image fusion as an individual task, thus ignoring its underlying relationship with these downstream vision problems. Furthermore, designing proper fusion architectures often requires huge engineering labor. It also lacks mechanisms to improve the flexibility and generalization ability of current fusion approaches. To mitigate these issues, we establish a Task-guided, Implicit-searched and Meta-initialized (TIM) deep model to address the image fusion problem in a challenging real-world scenario. Specifically, we first propose a constrained strategy to incorporate information from downstream tasks to guide the unsupervised learning process of image fusion. Within this framework, we then design an implicit search scheme to automatically discover compact architectures for our fusion model with high efficiency. In addition, a pretext meta initialization technique is introduced to leverage divergence fusion data to support fast adaptation for different kinds of image fusion tasks. Qualitative and quantitative experimental results on different categories of image fusion problems and related downstream tasks (e.g., visual enhancement and semantic understanding) substantiate the flexibility and effectiveness of our TIM. The source code will be available at https://github.com/LiuZhu-CV/TIMFusion.
CV | Apr 21, 2022 | Code
Toward Fast, Flexible, and Robust Low-Light Image Enhancement
Long Ma, Tengyu Ma, Risheng Liu et al.
Existing low-light image enhancement techniques are mostly not only difficult to deal with both visual quality and computational efficiency but also commonly invalid in unknown complex scenarios. In this paper, we develop a new Self-Calibrated Illumination (SCI) learning framework for fast, flexible, and robust brightening images in real-world low-light scenarios. To be specific, we establish a cascaded illumination learning process with weight sharing to handle this task. Considering the computational burden of the cascaded pattern, we construct the self-calibrated module which realizes the convergence between results of each stage, producing the gains that only use the single basic block for inference (yet has not been exploited in previous works), which drastically diminishes computation cost. We then define the unsupervised training loss to elevate the model capability that can adapt to general scenes. Further, we make comprehensive explorations to excavate SCI's inherent properties (lacking in existing works) including operation-insensitive adaptability (acquiring stable performance under the settings of different simple operations) and model-irrelevant generality (can be applied to illumination-based existing works to improve performance). Finally, plenty of experiments and ablation studies fully indicate our superiority in both quality and efficiency. Applications on low-light face detection and nighttime semantic segmentation fully reveal the latent practical values for SCI. The source code is available at https://github.com/vis-opt-group/SCI.
CV | Aug 26, 2021 | Code
An Underwater Image Semantic Segmentation Method Focusing on Boundaries and a Real Underwater Scene Semantic Segmentation Dataset
Zhiwei Ma, Haojie Li, Zhihui Wang et al.
With the development of underwater object grabbing technology, high-accuracy underwater object recognition and segmentation has become a challenge. Existing underwater object detection technology can only give the general position of an object and cannot give more detailed information such as the outline of the object, which seriously affects grabbing efficiency. To address this problem, we label and establish the first underwater semantic segmentation dataset of real scenes (DUT-USEG: DUT Underwater Segmentation Dataset). The DUT-USEG dataset includes 6617 images, 1487 of which have semantic segmentation and instance segmentation annotations, and the remaining 5130 images have object detection box annotations. Based on this dataset, we propose a semi-supervised underwater semantic segmentation network focusing on boundaries (US-Net: Underwater Segmentation Network). By designing a pseudo-label generator and a boundary detection subnetwork, this network achieves fine learning of the boundaries between underwater objects and background, and improves the segmentation of boundary areas. Experiments show that the proposed method improves by 6.7% on the three categories of holothurian, echinus, and starfish in the DUT-USEG dataset, and achieves state-of-the-art results. The DUT-USEG dataset will be released at https://github.com/baxiyi/DUT-USEG.
CV | Dec 10, 2020 | Code
Optimization-Inspired Learning with Architecture Augmentations and Control Mechanisms for Low-Level Vision
Risheng Liu, Zhu Liu, Pan Mu et al.
In recent years, there has been a growing interest in combining learnable modules with numerical optimization to solve low-level vision tasks. However, most existing approaches focus on designing specialized schemes to generate image/feature propagation. There is a lack of unified consideration to construct propagative modules, provide theoretical analysis tools, and design effective learning mechanisms. To mitigate the above issues, this paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC for short) principles with strong generalization for diverse optimization models. Specifically, by introducing a general energy minimization model and formulating its descent direction from different viewpoints (i.e., in a generative manner, based on the discriminative metric and with optimality-based correction), we construct three propagative modules to effectively solve the optimization models with flexible combinations. We design two control mechanisms that provide the non-trivial theoretical guarantees for both fully- and partially-defined optimization formulations. Under the support of theoretical guarantees, we can introduce diverse architecture augmentation strategies such as normalization and search to ensure stable propagation with convergence and seamlessly integrate the suitable modules into the propagation respectively. Extensive experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC. The codes are available at https://github.com/LiuZhu-CV/GDC-OptimizationLearning
SE | Apr 16, 2017 | Code
Towards Effective Bug Triage with Software Data Reduction Techniques
Jifeng Xuan, He Jiang, Yan Hu et al.
Software companies spend over 45 percent of their costs on dealing with software bugs. An inevitable step in fixing bugs is bug triage, which aims to correctly assign a developer to a new bug. To decrease the time cost of manual work, text classification techniques are applied to conduct automatic bug triage. In this paper, we address the problem of data reduction for bug triage, i.e., how to reduce the scale and improve the quality of bug data. We combine instance selection with feature selection to simultaneously reduce data scale on the bug dimension and the word dimension. To determine the order of applying instance selection and feature selection, we extract attributes from historical bug data sets and build a predictive model for a new bug data set. We empirically investigate the performance of data reduction on a total of 600,000 bug reports from two large open source projects, namely Eclipse and Mozilla. The results show that our data reduction can effectively reduce the data scale and improve the accuracy of bug triage. Our work provides an approach to leveraging data processing techniques to form reduced and high-quality bug data in software development and maintenance.
CV | Nov 14, 2025
OT-ALD: Aligning Latent Distributions with Optimal Transport for Accelerated Image-to-Image Translation
Zhanpeng Wang, Shuting Cao, Yuhang Lu et al.
The Dual Diffusion Implicit Bridge (DDIB) is an emerging image-to-image (I2I) translation method that preserves cycle consistency while achieving strong flexibility. It links two independently trained diffusion models (DMs) in the source and target domains by first adding noise to a source image to obtain a latent code, then denoising it in the target domain to generate the translated image. However, this method faces two key challenges: (1) low translation efficiency, and (2) translation trajectory deviations caused by mismatched latent distributions. To address these issues, we propose a novel I2I translation framework, OT-ALD, grounded in optimal transport (OT) theory, which retains the strengths of DDIB-based approach. Specifically, we compute an OT map from the latent distribution of the source domain to that of the target domain, and use the mapped distribution as the starting point for the reverse diffusion process in the target domain. Our error analysis confirms that OT-ALD eliminates latent distribution mismatches. Moreover, OT-ALD effectively balances faster image translation with improved image quality. Experiments on four translation tasks across three high-resolution datasets show that OT-ALD improves sampling efficiency by 20.29% and reduces the FID score by 2.6 on average compared to the top-performing baseline models.
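To make the notion of an OT map between two latent distributions concrete, here is a deliberately small sketch for scalar latents. It relies on a standard fact rather than on anything specific to OT-ALD: for equal-size samples on the real line with quadratic cost, the optimal assignment matches the samples in sorted order. The function names and the toy data are hypothetical, and real latents are high-dimensional, where this shortcut does not apply.

```python
import itertools


def ot_map_1d(source, target):
    """Send the k-th smallest source sample to the k-th smallest target."""
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    order = sorted(range(len(source)), key=source.__getitem__)
    sorted_target = sorted(target)
    mapped = [0.0] * len(source)
    for rank, idx in enumerate(order):
        mapped[idx] = sorted_target[rank]
    return mapped


def transport_cost(source, mapped):
    """Mean squared displacement of the assignment."""
    return sum((s - m) ** 2 for s, m in zip(source, mapped)) / len(source)


def brute_force_cost(source, target):
    """Minimum cost over all assignments (only feasible for tiny inputs)."""
    return min(transport_cost(source, list(p))
               for p in itertools.permutations(target))


source_latents = [0.2, -1.0, 0.5]   # toy source-domain latents
target_latents = [3.0, 1.0, 2.0]    # toy target-domain latents
mapped_latents = ot_map_1d(source_latents, target_latents)
```

In the pipeline the abstract describes, the mapped latents would serve as the starting point of the reverse diffusion in the target domain; that step is outside the scope of this sketch.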
CV Jan 19
RSOD: Reliability-Guided Sonar Image Object Detection with Extremely Limited Labels
Chengzhou Li, Ping Guo, Guanchen Meng et al.
Object detection in sonar images is a key technology in underwater detection systems. Compared to natural images, sonar images contain fewer texture details and are more susceptible to noise, making it difficult for non-experts to distinguish subtle differences between classes. This leads to their inability to provide precise annotation data for sonar images. Therefore, designing effective object detection methods for sonar images with extremely limited labels is particularly important. To address this, we propose a teacher-student framework called RSOD, which aims to fully learn the characteristics of sonar images and develop a pseudo-label strategy suitable for these images to mitigate the impact of limited labels. First, RSOD calculates a reliability score by assessing the consistency of the teacher's predictions across different views. To leverage this score, we introduce an object mixed pseudo-label method to tackle the shortage of labeled data in sonar images. Finally, we optimize the performance of the student by implementing a reliability-guided adaptive constraint. By taking full advantage of unlabeled data, the student can perform well even in situations with extremely limited labels. Notably, on the UATD dataset, our method, using only 5% of labeled data, achieves results that can compete against those of our baseline algorithm trained on 100% labeled data. We also collected a new dataset to provide more valuable data for research in the field of sonar.
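The idea of scoring a pseudo-label by how consistently the teacher predicts it across views can be sketched as follows. The abstract does not state how consistency is measured, so this example assumes two views (the original image and a horizontally flipped copy) and uses box IoU as the score; the function names and the choice of IoU are hypothetical and not taken from RSOD.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unflip_horizontal(box, image_width):
    """Map a box predicted on a horizontally flipped image back to the original frame."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def reliability_score(box_original_view, box_flipped_view, image_width):
    """Assumed consistency score: IoU of the two teacher predictions in a common frame."""
    return iou(box_original_view, unflip_horizontal(box_flipped_view, image_width))
```

A score near 1 means the teacher places the object in the same location in both views; a low score flags a pseudo-label that a student should probably trust less.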
LG Oct 17, 2024
Solving Prior Distribution Mismatch in Diffusion Models via Optimal Transport
Zhanpeng Wang, Shenghao Li, Chen Wang et al.
In recent years, the knowledge surrounding diffusion models (DMs) has grown significantly, though several theoretical gaps remain. Particularly noteworthy is prior error, defined as the discrepancy between the termination distribution of the forward process and the initial distribution of the reverse process. To address these deficiencies, this paper explores the deeper relationship between optimal transport (OT) theory and DMs with discrete initial distribution. Specifically, we demonstrate that the two stages of DMs fundamentally involve computing time-dependent OT. However, unavoidable prior error results in deviation during the reverse process under quadratic transport cost. By proving that as the diffusion termination time increases, the probability flow exponentially converges to the gradient of the solution to the classical Monge-Ampère equation, we establish a vital link between these fields. Therefore, static OT emerges as the most intrinsic single-step method for bridging this theoretical potential gap. Additionally, we apply these insights to accelerate sampling in both unconditional and conditional generation scenarios. Experimental results across multiple image datasets validate the effectiveness of our approach.
CV Jun 6, 2024
Global Parameterization-based Texture Space Optimization
Wei Chen, Yuxue Ren, Na Lei et al.
Texture mapping is a common technology in computer graphics that maps the 3D surface space onto the 2D texture space. However, a loose texture space reduces the efficiency of data storage and GPU memory addressing in the rendering process. Many existing methods focus on repacking given textures, but they still suffer from high computational cost and rarely produce a wholly tight texture space. In this paper, we propose a method based on global parameterization that optimizes the texture space and produces a new, compact texture mapping. The proposed method is computationally robust and efficient. Experiments show the effectiveness of the proposed method and its potential to improve storage and rendering efficiency.
CV Mar 30, 2022
Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection
Jinyuan Liu, Xin Fan, Zhanbo Huang et al.
This study addresses the issue of fusing infrared and visible images that appear differently for object detection. Aiming at generating an image of high visual quality, previous approaches discover commons underlying the two modalities and fuse upon the common space either by iterative optimization or deep networks. These approaches neglect that modality differences implying the complementary information are extremely important for both fusion and subsequent detection task. This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network. The fusion network with one generator and dual discriminators seeks commons while learning from differences, which preserves structural information of targets from the infrared and textural details from the visible. Furthermore, we build a synchronized imaging system with calibrated infrared and optical sensors, and collect currently the most comprehensive benchmark covering a wide range of scenarios. Extensive experiments on several public datasets and our benchmark demonstrate that our method outputs not only visually appealing fusion but also higher detection mAP than the state-of-the-art approaches.
IV Dec 9, 2021
Learning Deep Context-Sensitive Decomposition for Low-Light Image Enhancement
Long Ma, Risheng Liu, Jiaao Zhang et al.
Enhancing the quality of low-light images plays a very important role in many image processing and multimedia applications. In recent years, a variety of deep learning techniques have been developed to address this challenging task. A typical framework simultaneously estimates the illumination and reflectance, but such methods disregard the scene-level contextual information encapsulated in feature spaces, causing many unfavorable outcomes, e.g., detail loss, color unsaturation, and artifacts. To address these issues, we develop a new context-sensitive decomposition network architecture to exploit the scene-level contextual dependencies on spatial scales. More concretely, we build a two-stream estimation mechanism including reflectance and illumination estimation networks. We design a novel context-sensitive decomposition connection to bridge the two-stream mechanism by incorporating the physical principle. The spatially-varying illumination guidance is further constructed for achieving the edge-aware smoothness property of the illumination component. According to different training patterns, we construct CSDNet (paired supervision) and CSDGAN (unpaired supervision) to fully evaluate our designed architecture. We test our method on seven testing benchmarks and conduct extensive analytical and evaluation experiments. Thanks to our designed context-sensitive decomposition connection, we obtain excellent enhancement results, which indicates our superiority over existing state-of-the-art approaches. Finally, considering the practical need for high efficiency, we develop a lightweight CSDNet (named LiteCSDNet) by reducing the number of channels. Further, by sharing an encoder for these two components, we obtain an even more lightweight version (SLiteCSDNet for short). SLiteCSDNet contains just 0.0301M parameters but achieves almost the same performance as CSDNet.
CV Dec 9, 2021
Learning with Nested Scene Modeling and Cooperative Architecture Search for Low-Light Vision
Risheng Liu, Long Ma, Tengyu Ma et al.
Images captured from low-light scenes often suffer from severe degradations, including low visibility, color cast, and intensive noise. These factors not only affect image quality, but also degrade the performance of downstream Low-Light Vision (LLV) applications. A variety of deep learning methods have been proposed to enhance the visual quality of low-light images. However, these approaches mostly rely on significant architecture engineering to obtain proper low-light models and often suffer from high computational burden. Furthermore, it is still challenging to extend these enhancement techniques to handle other LLVs. To partially address the above issues, we establish Retinex-inspired Unrolling with Architecture Search (RUAS), a general learning framework, which not only can address the low-light enhancement task, but also has the flexibility to handle other more challenging downstream vision applications. Specifically, we first establish a nested optimization formulation, together with an unrolling strategy, to explore underlying principles of a series of LLV tasks. Furthermore, we construct a differentiable strategy to cooperatively search specific scene and task architectures for RUAS. Last but not least, we demonstrate how to apply RUAS for both low- and high-level LLV applications (e.g., enhancement, detection and segmentation). Extensive experiments verify the flexibility, effectiveness, and efficiency of RUAS.
CV Dec 10, 2020
Retinex-inspired Unrolling with Cooperative Prior Architecture Search for Low-light Image Enhancement
Risheng Liu, Long Ma, Jiaao Zhang et al.
Low-light image enhancement plays a very important role in the low-level vision field. Recent works have built a large variety of deep learning models to address this task. However, these approaches mostly rely on significant architecture engineering and suffer from high computational burden. In this paper, we propose a new method, named Retinex-inspired Unrolling with Architecture Search (RUAS), to construct a lightweight yet effective enhancement network for low-light images in real-world scenarios. Specifically, building upon the Retinex rule, RUAS first establishes models to characterize the intrinsic underexposed structure of low-light images and unrolls their optimization processes to construct our holistic propagation structure. Then, by designing a cooperative reference-free learning strategy to discover low-light prior architectures from a compact search space, RUAS is able to obtain a top-performing image enhancement network that is fast and requires few computational resources. Extensive experiments verify the superiority of our RUAS framework against recently proposed state-of-the-art methods.
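For readers unfamiliar with the Retinex rule mentioned above, it models an observed image as the element-wise product of a reflectance layer and an illumination layer. The sketch below is a classical hand-crafted baseline under that model, not RUAS: it assumes the illumination can be approximated by the per-pixel maximum over color channels and brightens it with a gamma curve. The function name and the default `gamma` are illustrative assumptions.

```python
import numpy as np


def retinex_enhance(image, gamma=0.5, eps=1e-4):
    """Brighten an RGB image in [0, 1] under the model image = reflectance * illumination.

    Assumption: illumination is estimated as the per-pixel max over channels.
    """
    image = np.clip(np.asarray(image, dtype=float), 0.0, 1.0)

    # Illumination estimate, floored at eps to avoid division by zero.
    illumination = np.maximum(image.max(axis=-1, keepdims=True), eps)

    # Reflectance is what remains after dividing the illumination out.
    reflectance = image / illumination

    # A gamma below 1 lifts dark illumination values more than bright ones.
    brightened = illumination ** gamma
    return np.clip(reflectance * brightened, 0.0, 1.0)
```

Learning-based methods such as the one described in the abstract replace these hand-picked estimates with learned components, but the decomposition they start from has the same shape.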
CV Apr 30, 2020
Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond
Risheng Liu, Zi Li, Xin Fan et al.
Conventional deformable registration methods aim at solving an optimization model carefully designed on image pairs, and their computational costs are exceptionally high. In contrast, recent deep learning based approaches can provide fast deformation estimation. These heuristic network architectures are fully data-driven and thus lack explicit geometric constraints, e.g., topology preservation, which are indispensable for generating plausible deformations. We design a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation in order to integrate the advantages and avoid the limitations of these two categories of approaches. Specifically, we introduce a generic optimization model to formulate diffeomorphic registration and develop a series of learnable architectures to obtain propagative updating in the coarse-to-fine feature space. Moreover, we propose a novel bilevel self-tuned training strategy, allowing efficient search of task-specific hyper-parameters. This training strategy increases the flexibility to various types of data while reducing computational and human burdens. We conduct two groups of image registration experiments on 3D volume datasets, including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data. Extensive results demonstrate the state-of-the-art performance of the proposed method with diffeomorphic guarantee and extreme efficiency. We also apply our framework to challenging multi-modal image registration, and investigate how our registration supports downstream tasks for medical image analysis, including multi-modal fusion and image segmentation.
IV Oct 29, 2019
Converged Deep Framework Assembling Principled Modules for CS-MRI
Risheng Liu, Yuxi Zhang, Shichao Cheng et al.
Compressed Sensing Magnetic Resonance Imaging (CS-MRI) significantly accelerates MR data acquisition at a sampling rate much lower than the Nyquist criterion. A major challenge for CS-MRI lies in solving the severely ill-posed inverse problem to reconstruct aliasing-free MR images from the sparse k-space data. Conventional methods typically optimize an energy function, producing reconstructions of high quality, but their iterative numerical solvers unavoidably bring extremely slow processing. Recent data-driven techniques are able to provide fast restoration by either learning direct prediction to final reconstruction or plugging learned modules into the energy optimizer. Nevertheless, these data-driven predictors cannot guarantee that the reconstruction follows the constraints underlying the regularizers of conventional methods, so the reliability of their reconstruction results is questionable. In this paper, we propose a converged deep framework assembling principled modules for CS-MRI that fuses learning strategy with the iterative solver of a conventional reconstruction energy. This framework embeds an optimal condition checking mechanism, fostering efficient and reliable reconstruction. We also apply the framework to two practical tasks, i.e., parallel imaging and reconstruction with Rician noise. Extensive experiments on both benchmark and manufacturer-testing images demonstrate that the proposed method reliably converges to the optimal solution more efficiently and accurately than the state-of-the-art in various scenarios.
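To make the inverse problem concrete, the sketch below simulates undersampled k-space acquisition with a binary mask and computes the zero-filled reconstruction, the naive baseline that simply inverse-transforms the measured data with the missing entries set to zero. It is not the proposed framework; the function names and the single-coil, Cartesian-mask setup are assumptions made for illustration.

```python
import numpy as np


def undersample_kspace(image, mask):
    """Simulate acquisition: 2D Fourier transform, keeping only entries where mask == 1."""
    return np.fft.fft2(image) * mask


def zero_filled_recon(kspace):
    """Naive reconstruction: inverse transform with unmeasured entries left at zero."""
    return np.abs(np.fft.ifft2(kspace))


if __name__ == "__main__":
    image = np.arange(16, dtype=float).reshape(4, 4)

    # Keep only the first row of k-space (one of four phase-encoding lines).
    mask = np.zeros((4, 4))
    mask[0, :] = 1.0

    recon = zero_filled_recon(undersample_kspace(image, mask))
    print(np.abs(recon - image).max())  # nonzero: information was lost
```

With a full mask the image is recovered exactly; with a partial mask many images are consistent with the same measurements, which is why regularizers or learned priors are needed to pick one.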
CV Oct 18, 2019
Investigating Task-driven Latent Feasibility for Nonconvex Image Modeling
Risheng Liu, Pan Mu, Jian Chen et al.
Properly modeling latent image distributions plays an important role in a variety of image-related vision problems. Most existing approaches aim to formulate this problem as optimization models (e.g., Maximum A Posteriori, MAP) with handcrafted priors. In recent years, different CNN modules have also been considered as deep priors to regularize the image modeling process. However, these explicit regularization techniques require a deep understanding of the problem and elaborate mathematical skills. In this work, we provide a new perspective, named Task-driven Latent Feasibility (TLF), to incorporate specific task information to narrow down the solution space for the optimization-based image modeling problem. Thanks to the flexibility of TLF, both designed and trained constraints can be embedded into the optimization process. By introducing control mechanisms based on the monotonicity and boundedness conditions, we can also strictly prove the convergence of our proposed inference process. We demonstrate that different types of image modeling problems, such as image deblurring and rain streak removal, can all be appropriately addressed within our TLF framework. Extensive experiments also verify the theoretical results and show the advantages of our method against existing state-of-the-art approaches.
CV Jul 17, 2019
Underexposed Image Correction via Hybrid Priors Navigated Deep Propagation
Risheng Liu, Long Ma, Yuxi Zhang et al.
Enhancing the visual quality of underexposed images is a widely studied task that plays an important role in various areas of multimedia and computer vision. Most existing methods often fail to generate high-quality results with appropriate luminance and abundant details. To address these issues, in this work we develop a novel framework that integrates both knowledge from physical principles and implicit distributions from data to solve the underexposed image correction task. More concretely, we propose a new perspective to formulate this task as an energy-inspired model with advanced hybrid priors. A propagation procedure navigated by the hybrid priors is designed for simultaneously propagating the reflectance and illumination toward desired results. We conduct extensive experiments to verify the necessity of integrating both underlying principles (i.e., with knowledge) and distributions (i.e., from data) as navigated deep propagation. Extensive experimental results on underexposed image correction demonstrate that our proposed method performs favorably against the state-of-the-art methods on both subjective and objective assessments. Additionally, we perform face detection to further verify the naturalness and practical value of underexposed image correction. Moreover, we apply our method to single image haze removal, where the experimental results further demonstrate its superiority.
LG Feb 8, 2019
Mode Collapse and Regularity of Optimal Transportation Maps
Na Lei, Yang Guo, Dongsheng An et al.
This work builds the connection between the regularity theory of optimal transportation maps, the Monge-Ampère equation and GANs, which gives a theoretical understanding of the major drawbacks of GANs: convergence difficulty and mode collapse. According to the regularity theory of the Monge-Ampère equation, if the support of the target measure is disconnected or just non-convex, the optimal transportation mapping is discontinuous. General DNNs can only approximate continuous mappings. This intrinsic conflict leads to the convergence difficulty and mode collapse in GANs. We test our hypothesis that the supports of real data distributions are in general non-convex, and therefore the discontinuity is unavoidable, using an Autoencoder combined with a discrete optimal transportation map (AE-OT framework) on the CelebA data set. The testing result is positive. Furthermore, we propose to approximate the continuous Brenier potential directly based on discrete Brenier theory to tackle mode collapse. Compared with existing methods, this method is more accurate and effective.
CV Jan 15, 2019
Real-world Underwater Enhancement: Challenges, Benchmarks, and Solutions
Risheng Liu, Xin Fan, Ming Zhu et al.
Underwater image enhancement is such an important low-level vision task with many applications that numerous algorithms have been proposed in recent years. These algorithms, developed upon various assumptions, demonstrate successes from various aspects using different data sets and different metrics. In this work, we set up an undersea image capturing system, and construct a large-scale Real-world Underwater Image Enhancement (RUIE) data set divided into three subsets. The three subsets target three challenging aspects for enhancement, i.e., image visibility quality, color casts, and higher-level detection/classification, respectively. We conduct extensive and systematic experiments on RUIE to evaluate the effectiveness and limitations of various algorithms to enhance visibility and correct color casts on images with hierarchical categories of degradation. Moreover, underwater image enhancement in practice usually serves as a preprocessing step for mid-level and high-level vision tasks. We thus exploit the object detection performance on enhanced images as a brand new task-specific evaluation criterion. The findings from these evaluations not only confirm what is commonly believed, but also suggest promising solutions and new directions for visibility enhancement, color correction, and object detection on real-world underwater images.
CV Nov 9, 2018
A Theoretically Guaranteed Deep Optimization Framework for Robust Compressive Sensing MRI
Risheng Liu, Yuxi Zhang, Shichao Cheng et al.
Magnetic Resonance Imaging (MRI) is one of the most dynamic and safe imaging techniques available for clinical applications. However, the rather slow speed of MRI acquisitions limits the patient throughput and potential indications. Compressive Sensing (CS) has proven to be an efficient technique for accelerating MRI acquisition. The most widely used CS-MRI model, founded on the premise of reconstructing an image from an incompletely filled k-space, leads to an ill-posed inverse problem. In the past years, considerable effort has been made to efficiently optimize the CS-MRI model. Inspired by deep learning techniques, some preliminary works have tried to incorporate deep architectures into the CS-MRI process. Unfortunately, the convergence analysis (made difficult by the experience-based networks) and the robustness (i.e., real-world noise modeling) of these deeply trained optimization methods are still missing. In this work, we develop a new paradigm to integrate designed numerical solvers and data-driven architectures for CS-MRI. By introducing an optimal condition checking mechanism, we can successfully prove the convergence of our established deep CS-MRI optimization scheme. Furthermore, we explicitly formulate the Rician noise distributions within our framework and obtain an extended CS-MRI network to handle the real-world noise in the MRI process. Extensive experimental results verify that the proposed paradigm outperforms the existing state-of-the-art techniques in reconstruction accuracy and efficiency as well as in robustness to noise in real scenes.
CV Aug 16, 2018
On the Convergence of Learning-based Iterative Methods for Nonconvex Inverse Problems
Risheng Liu, Shichao Cheng, Yi He et al.
Numerous tasks at the core of statistics, learning and vision areas are specific cases of ill-posed inverse problems. Recently, learning-based (e.g., deep) iterative methods have been empirically shown to be useful for these problems. Nevertheless, integrating learnable structures into iterations is still a laborious process, which can only be guided by intuitions or empirical insights. Moreover, there is a lack of rigorous analysis about the convergence behaviors of these reimplemented iterations, and thus the significance of such methods is a little bit vague. This paper moves beyond these limits and proposes Flexible Iterative Modularization Algorithm (FIMA), a generic and provable paradigm for nonconvex inverse problems. Our theoretical analysis reveals that FIMA allows us to generate globally convergent trajectories for learning-based iterative methods. Meanwhile, the devised scheduling policies on flexible modules should also be beneficial for classical numerical methods in the nonconvex scenario. Extensive experiments on real applications verify the superiority of FIMA.
CV Aug 9, 2018
User-Guided Deep Anime Line Art Colorization with Conditional Adversarial Networks
Yuanzheng Ci, Xinzhu Ma, Zhihui Wang et al.
Scribble colors based line art colorization is a challenging computer vision problem since neither greyscale values nor semantic information is presented in line arts, and the lack of authentic illustration-line art training pairs also increases difficulty of model generalization. Recently, several Generative Adversarial Nets (GANs) based methods have achieved great success. They can generate colorized illustrations conditioned on given line art and color hints. However, these methods fail to capture the authentic illustration distributions and are hence perceptually unsatisfying in the sense that they often lack accurate shading. To address these challenges, we propose a novel deep conditional adversarial architecture for scribble based anime line art colorization. Specifically, we integrate the conditional framework with WGAN-GP criteria as well as the perceptual loss to enable us to robustly train a deep network that makes the synthesized images more natural and real. We also introduce a local features network that is independent of synthetic data. With GANs conditioned on features from such network, we notably increase the generalization capability over "in the wild" line arts. Furthermore, we collect two datasets that provide high-quality colorful illustrations and authentic line arts for training and benchmarking. With the proposed model trained on our illustration dataset, we demonstrate that images synthesized by the presented approach are considerably more realistic and precise than alternative approaches.
CV Jul 31, 2018
Learning Collaborative Generation Correction Modules for Blind Image Deblurring and Beyond
Risheng Liu, Yi He, Shichao Cheng et al.
Blind image deblurring plays a very important role in many vision and multimedia applications. Most existing works tend to introduce complex priors to estimate the sharp image structures for blur kernel estimation. However, it has been verified that directly optimizing these models is challenging and easy to fall into degenerate solutions. Although several experience-based heuristic inference strategies, including trained networks and designed iterations, have been developed, it is still hard to obtain theoretically guaranteed accurate solutions. In this work, a collaborative learning framework is established to address the above issues. Specifically, we first design two modules, named Generator and Corrector, to extract the intrinsic image structures from the data-driven and knowledge-based perspectives, respectively. By introducing a collaborative methodology to cascade these modules, we can strictly prove the convergence of our image propagations to a deblurring-related optimal solution. As a nontrivial byproduct, we also apply the proposed method to address other related tasks, such as image interpolation and edge-preserved smoothing. Plenty of experiments demonstrate that our method can outperform the state-of-the-art approaches on both synthetic and real datasets.
NA Oct 15, 2018
A nodal type polynomial finite element exact sequence over quadrilaterals
Xinchen Zhou, Zhaoliang Meng, Xin Fan et al.
This work proposes two nodal type nonconforming finite elements over convex quadrilaterals, which are parts of a finite element exact sequence. Both elements are of 12 degrees of freedom (DoFs) with polynomial shape function spaces selected. The first one is designed for fourth order elliptic singular perturbation problems, and the other works for Brinkman problems. Numerical examples are also provided.
CV Jul 5, 2018
A Single Shot Text Detector with Scale-adaptive Anchors
Qi Yuan, Bingwang Zhang, Haojie Li et al.
Currently, most top-performing text detection networks tend to employ fixed-size anchor boxes to guide the search for text instances. They usually rely on a large number of anchors with different scales to discover texts in scene images, thus leading to high computational cost. In this paper, we propose an end-to-end box-based text detector with scale-adaptive anchors, which can dynamically adjust the scales of anchors according to the sizes of underlying texts by introducing an additional scale regression layer. The proposed scale-adaptive anchors allow us to use a small number of anchors to handle multi-scale texts and therefore significantly improve the computational efficiency. Moreover, compared to the discrete scales used in previous methods, the learned continuous scales are more reliable, especially for small text detection. Additionally, we propose Anchor convolution to better exploit necessary feature information by dynamically adjusting the sizes of receptive fields according to the learned scales. Extensive experiments demonstrate that the proposed detector is fast, taking only 0.28 seconds per image, while outperforming most state-of-the-art methods in accuracy.
LG May 26, 2018
Geometric Understanding of Deep Learning
Na Lei, Zhongxuan Luo, Shing-Tung Yau et al.
Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, speech recognition, and so on. It has outperformed conventional methods in various fields and achieved great successes. Unfortunately, how it works remains poorly understood. Laying down the theoretical foundation for deep learning is of central importance. In this work, we give a geometric view to understand deep learning: we show that the fundamental principle underlying its success is the manifold structure in data, namely that natural high-dimensional data concentrates close to a low-dimensional manifold, and deep learning learns the manifold and the probability distribution on it. We further introduce the concepts of the rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedding manifold, which describes how difficult it is to learn. Then we show that for any deep neural network with fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
CV Apr 28, 2018
Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization
Risheng Liu, Shichao Cheng, Yi He et al.
Operator splitting methods have been successfully used in computational sciences, statistics, learning and vision areas to reduce complex problems into a series of simpler subproblems. However, prevalent splitting schemes are mostly established only based on the mathematical properties of some general optimization models. So it is a laborious process and often requires many iterations of ideation and validation to obtain practical and task-specific optimal solutions, especially for nonconvex problems in real-world scenarios. To break through the above limits, we introduce a new algorithmic framework, called Learnable Bregman Splitting (LBS), to perform deep-architecture-based operator splitting for nonconvex optimization based on specific task model. Thanks to the data-dependent (i.e., learnable) nature, our LBS can not only speed up the convergence, but also avoid unwanted trivial solutions for real-world tasks. Though with inexact deep iterations, we can still establish the global convergence and estimate the asymptotic convergence rate of LBS only by enforcing some fairly loose assumptions. Extensive experiments on different applications (e.g., image completion and deblurring) verify our theoretical results and show the superiority of LBS against existing methods.
CV Nov 23, 2017
Self-Reinforced Cascaded Regression for Face Alignment
Xin Fan, Risheng Liu, Kang Huyan et al.
Cascaded regression is prevailing in face alignment thanks to its accuracy and robustness, but typically demands manually annotated examples having low discrepancy between shape-indexed features and shape updates. In this paper, we propose a self-reinforced strategy that iteratively expands the quantity and improves the quality of training examples, thus upgrading the performance of cascaded regression itself. The reinforced term evaluates the example quality upon the consistence on both local appearance and global geometry of human faces, and constitutes the example evolution by the philosophy of "survival of the fittest". We train a set of discriminative classifiers, each associated with one landmark label, to prune those examples with inconsistent local appearance, and further validate the geometric relationship among groups of labeled landmarks against the common global geometry derived from a projective invariant. We embed this generic strategy into typical cascaded regressions, and the alignment results on several benchmark data sets demonstrate its effectiveness to predict good examples starting from a small subset.
CV Nov 21, 2017
Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework
Risheng Liu, Xin Fan, Shichao Cheng et al.
Deep learning models have gained great success in many real-world applications. However, most existing networks are typically designed in heuristic manners and thus lack rigorous mathematical principles and derivations. Several recent studies build deep structures by unrolling a particular optimization model that involves task information. Unfortunately, due to the dynamic nature of network parameters, their resultant deep propagation networks do not possess the nice convergence property that the original optimization scheme does. This paper provides a novel proximal unrolling framework to establish deep models by integrating experimentally verified network architectures and rich cues of the tasks. More importantly, we prove in theory that 1) the propagation generated by our unrolled deep model globally converges to a critical point of a given variational energy, and 2) the proposed framework is still able to learn priors from training data to generate a convergent propagation even when task information is only partially available. Indeed, these theoretical results are the best we can ask for, unless stronger assumptions are enforced. Extensive experiments on various real-world applications verify the theoretical convergence and demonstrate the effectiveness of the designed deep models.
CV | Nov 18, 2017
Learning Aggregated Transmission Propagation Networks for Haze Removal and Beyond
Risheng Liu, Xin Fan, Minjun Hou et al.
Single image dehazing is an important low-level vision task with many applications. Early research investigated different kinds of visual priors to address this problem. However, these priors may fail when their assumptions are not valid on specific images. Recent deep networks also achieve relatively good performance in this task. Unfortunately, because they disregard the rich physical rules of haze, they require large amounts of training data. More importantly, they may still fail when the testing images contain completely different haze distributions. By combining these two perspectives, this paper designs a novel residual architecture that aggregates both prior (i.e., domain knowledge) and data (i.e., haze distribution) information to propagate transmissions for scene radiance estimation. We further present a variational energy based perspective to investigate the intrinsic propagation behavior of our aggregated deep model. In this way, we bridge the gap between prior-driven models and data-driven networks, leveraging the advantages while avoiding the limitations of previous dehazing approaches. A lightweight learning framework is proposed to train our propagation network. Finally, by introducing a task-aware image separation formulation with a flexible optimization scheme, we extend the proposed model to more challenging vision tasks, such as underwater image enhancement and single image rain removal. Experiments on both synthetic and real-world images demonstrate the effectiveness and efficiency of the proposed framework.
SE | Apr 16, 2017
A Random Walk Based Algorithm for Structural Test Case Generation
Jifeng Xuan, He Jiang, Zhilei Ren et al.
Structural testing is a significant and expensive process in software development. By converting test data generation into an optimization problem, search-based software testing has become one of the key technologies of automated test case generation. Motivated by the success of random walk in solving the satisfiability problem (SAT), we propose a random walk based algorithm (WalkTest) to solve the structural test case generation problem. WalkTest provides a framework that iteratively calls a random walk operator to search for optimal solutions. To improve search efficiency, we sort the test goals entirely by the costs of their solutions rather than by the traditional dependence analysis of the control flow graph. Experimental results on condition-decision coverage demonstrate that WalkTest achieves better performance than existing algorithms (random testing and tabu search) in terms of running time and coverage rate.
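To make the idea of treating test data generation as a search problem concrete, the following is a minimal sketch of a noisy random walk guided by a branch distance. It is not the WalkTest algorithm from the paper: the target branch, the `branch_distance` function, the step size, the `noise` parameter, and the input bounds are all assumptions chosen for illustration, and the cost-based ordering of test goals is not modeled.

```python
import random


def branch_distance(x, y):
    """Hypothetical fitness: distance to covering the branch
    `x == 2 * y and x > 10`. A value of 0 means the branch is taken."""
    return abs(x - 2 * y) + max(0, 11 - x)


def random_walk_search(distance, lo=-50, hi=50, max_steps=20000,
                       noise=0.1, seed=0):
    """Search for integer inputs (x, y) with distance 0.

    Each step changes one input by +1 or -1. A move is kept when it does
    not increase the distance; with probability `noise` it is kept anyway,
    which lets the walk leave plateaus (similar in spirit to WalkSAT).
    Returns the covering inputs as a tuple, or None if the budget runs out.
    """
    rng = random.Random(seed)
    point = [rng.randint(lo, hi), rng.randint(lo, hi)]
    current = distance(*point)
    for _ in range(max_steps):
        if current == 0:
            break
        candidate = list(point)
        i = rng.randrange(2)  # which input to perturb
        candidate[i] = min(hi, max(lo, candidate[i] + rng.choice((-1, 1))))
        d = distance(*candidate)
        if d <= current or rng.random() < noise:
            point, current = candidate, d
    return tuple(point) if current == 0 else None


if __name__ == "__main__":
    inputs = random_walk_search(branch_distance)
    print("inputs covering the target branch:", inputs)
```

The returned pair can be used directly as a test input for the branch. In a fuller setting, one such search would be run per uncovered test goal, each with its own distance function.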
SE | Apr 16, 2017
Automatic Bug Triage using Semi-Supervised Text Classification
Jifeng Xuan, He Jiang, Zhilei Ren et al.
In this paper, we propose a semi-supervised text classification approach for bug triage to address the shortage of labeled bug reports in existing supervised approaches. This new approach combines a naive Bayes classifier with expectation-maximization to take advantage of both labeled and unlabeled bug reports. The approach first trains a classifier on a fraction of labeled bug reports. It then iteratively labels numerous unlabeled bug reports and trains a new classifier with the labels of all the bug reports. We also employ a weighted recommendation list to boost performance by imposing the weights of multiple developers when training the classifier. Experimental results on bug reports of Eclipse show that our new approach outperforms existing supervised approaches in terms of classification accuracy.
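The train, label, retrain loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a multinomial naive Bayes model with Laplace smoothing, soft (probabilistic) labels in the E-step, and toy bug reports with made-up developer names. The weighted recommendation list from the paper is not modeled, and all function names are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, soft_labels, classes, vocab, alpha=1.0):
    """Fit multinomial naive Bayes from fractional class weights per report."""
    prior = {c: alpha for c in classes}
    counts = {c: Counter() for c in classes}
    for words, weights in zip(docs, soft_labels):
        for c in classes:
            w = weights.get(c, 0.0)
            prior[c] += w
            for t in words:
                counts[c][t] += w
    total = sum(prior.values())
    log_prior = {c: math.log(prior[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(counts[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((counts[c][t] + alpha) / denom)
                       for t in vocab}
    return log_prior, log_like


def posterior(words, model, classes):
    """Return P(developer | report) as a dict; unseen words are ignored."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][t] for t in words
                                    if t in log_like[c])
              for c in classes}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def em_triage(labeled, unlabeled, n_iter=10):
    """labeled: list of (words, developer); unlabeled: list of word lists."""
    classes = sorted({dev for _, dev in labeled})
    vocab = sorted({t for words, _ in labeled for t in words}
                   | {t for words in unlabeled for t in words})
    lab_docs = [words for words, _ in labeled]
    lab_soft = [{dev: 1.0} for _, dev in labeled]
    # Initial classifier: labeled reports only.
    model = train_nb(lab_docs, lab_soft, classes, vocab)
    for _ in range(n_iter):
        # E-step: probabilistically label the unlabeled reports.
        unl_soft = [posterior(words, model, classes) for words in unlabeled]
        # M-step: retrain on labeled plus soft-labeled reports.
        model = train_nb(lab_docs + list(unlabeled), lab_soft + unl_soft,
                         classes, vocab)
    return model, classes


def recommend(words, model, classes, k=3):
    """Rank developers for a new report by posterior probability."""
    post = posterior(words, model, classes)
    return sorted(classes, key=lambda c: -post[c])[:k]


if __name__ == "__main__":
    labeled = [(["ui", "button", "crash"], "alice"),
               (["sql", "query", "timeout"], "bob")]
    unlabeled = [["ui", "layout", "render"], ["sql", "index", "deadlock"],
                 ["button", "layout"], ["query", "index"]]
    model, classes = em_triage(labeled, unlabeled)
    print(recommend(["layout", "render"], model, classes))
```

In this toy data, the words "layout" and "render" never appear in a labeled report, so a classifier trained on the labeled reports alone has no evidence for them. The unlabeled reports link them to words the labeled reports do cover, which is the benefit the semi-supervised step is meant to provide.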