CLDec 26, 2022
Automatic Text Simplification of News Articles in the Context of Public BroadcastingDiego Maupomé, Fanny Rancourt, Thomas Soulas et al.
This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Université de Montréal in August 2022. The team tackled a problem submitted by CBC/Radio-Canada on the theme of Automatic Text Simplification (ATS).
ROApr 27, 2024
An Attention-Based Deep Learning Architecture for Real-Time Monocular Visual Odometry: Applications to GPS-free Drone NavigationOlivier Brochu Dufour, Abolfazl Mohebbi, Sofiane Achiche
Drones are increasingly used in fields like industry, medicine, research, disaster relief, defense, and security. Technical challenges, such as navigation in GPS-denied environments, hinder further adoption. Research in visual odometry is advancing, potentially solving GPS-free navigation issues. Traditional visual odometry methods use geometry-based pipelines which, while popular, often suffer from error accumulation and high computational demands. Recent studies utilizing deep neural networks (DNNs) have shown improved performance, addressing these drawbacks. Deep visual odometry typically employs convolutional neural networks (CNNs) and sequence modeling networks like recurrent neural networks (RNNs) to interpret scenes and deduce visual odometry from video sequences. This paper presents a novel real-time monocular visual odometry model for drones, using a deep neural architecture with a self-attention module. It estimates the ego-motion of a camera on a drone, using consecutive video frames. An inference utility processes the live video feed, employing deep learning to estimate the drone's trajectory. The architecture combines a CNN for image feature extraction and a long short-term memory (LSTM) network with a multi-head attention module for video sequence modeling. Tested on two visual odometry datasets, this model converged 48% faster than a previous RNN model and showed a 22% reduction in mean translational drift and a 12% improvement in mean translational absolute trajectory error, demonstrating enhanced robustness to noise.
CVFeb 1, 2019
Fast and Optimal Laplacian Solver for Gradient-Domain Image Editing using Green Function ConvolutionDominique Beaini, Sofiane Achiche, Fabrice Nonez et al.
In computer vision, the gradient and Laplacian of an image are used in different applications, such as edge detection, feature extraction, and seamless image cloning. Computing the gradient of an image is straightforward since numerical derivatives are available in most computer vision toolboxes. However, the reverse problem is more difficult, since computing an image from its gradient requires to solve the Laplacian equation, also called Poisson equation. Current discrete methods are either slow or require heavy parallel computing. The objective of this paper is to present a novel fast and robust method of solving the image gradient or Laplacian with minimal error, which can be used for gradient domain editing. By using a single convolution based on a numerical Green's function, the whole process is faster and straightforward to implement with different computer vision libraries. It can also be optimized on a GPU using fast Fourier transforms and can easily be generalized for an n dimension image. The tests show that, for images of resolution 801x1200, the proposed GFC can solve 100 Laplacian in parallel in around 1.0 milliseconds ms. This is orders of magnitude faster than our nearest competitor which requires 294ms for a single image. Furthermore, we prove mathematically and demonstrate empirically that the proposed method is the least error solver for gradient domain editing. The developed method is also validated with examples of Poisson blending, gradient removal, and the proposed gradient domain merging GDM. Finally, we present how the GDM can be leveraged in future works for convolutional neural networks CNN.