Nazim Bendib

CV
h-index: 1
Papers: 4
Citations: 9
Novelty: 50%
AI Score: 38

4 Papers

LG · Sep 17, 2024
A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler

Mohammed Tirichine, Nassim Ameur, Nazim Bendib et al.

Code optimization is a crucial task that aims to enhance code performance. However, this process is often tedious and complex, highlighting the necessity for automatic code optimization techniques. Reinforcement Learning (RL) has emerged as a promising approach for tackling such complex optimization problems. In this project, we introduce MLIR RL, an RL environment for the MLIR compiler, dedicated to facilitating MLIR compiler research and enabling automatic code optimization. We propose a multi-discrete formulation of the action space where the action space is the Cartesian product of simpler action subspaces. We also propose a new method, called level pointers, to reduce the size of the action space related to the loop interchange transformation. This enables more efficient and effective learning of the policy. To demonstrate the effectiveness of MLIR RL, we train an RL agent to optimize MLIR Linalg code, targeting CPU. The code is generated from two domain-specific frameworks: deep-learning models generated from PyTorch, and LQCD (Lattice Quantum Chromodynamics) code generated from an LQCD compiler. The result of this work is a research environment that allows the community to experiment with novel ideas in RL-driven loop-nest optimization.
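The multi-discrete formulation described in this abstract can be illustrated with a small sketch. The subspace names, their values, and the helper functions below are hypothetical and are not taken from MLIR RL; the sketch only shows why factoring the action space into a Cartesian product keeps the policy output small. The level-pointers method is not covered here.

```python
from itertools import product
from math import prod

# Hypothetical sub-action spaces; names and sizes are illustrative only
# and are not taken from MLIR RL.
SUBSPACES = {
    "transformation": ["tile", "interchange", "vectorize", "stop"],
    "tile_size": [1, 4, 8, 16, 32],
    "loop_level": [0, 1, 2],
}


def flat_action_count(subspaces):
    """Number of joint actions in the Cartesian product of all subspaces."""
    return prod(len(values) for values in subspaces.values())


def policy_output_count(subspaces):
    """Outputs needed when each subspace gets its own categorical head."""
    return sum(len(values) for values in subspaces.values())


def decode(indices, subspaces):
    """Map one index per subspace to a concrete joint action."""
    return {
        name: values[i]
        for (name, values), i in zip(subspaces.items(), indices)
    }


# Enumerating the product is only feasible for tiny spaces like this one.
all_actions = list(product(*SUBSPACES.values()))

print(flat_action_count(SUBSPACES))    # joint actions in the product
print(policy_output_count(SUBSPACES))  # outputs across per-subspace heads
print(decode((1, 2, 0), SUBSPACES))
```

With one categorical head per subspace, the number of policy outputs grows with the sum of the subspace sizes, while the set of joint actions grows with their product.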

32.1 · AI · Apr 28
Improving Zero-Shot Offline RL via Behavioral Task Sampling

Nazim Bendib, Nicolas Perrin-Gilbert, Olivier Sigaud

Offline zero-shot reinforcement learning (RL) aims to learn agents that optimize unseen reward functions without additional environment interaction. The standard approach to this problem trains task-conditioned policies by sampling task vectors that define linear reward functions over learned state representations. In most existing algorithms, these task vectors are randomly sampled, implicitly assuming this adequately captures the structure of the task space. We argue that doing so leads to suboptimal zero-shot generalization. To address this limitation, we propose extracting task vectors directly from the offline dataset and using them to define the task distribution used for policy training. We introduce a simple and general reward function extraction procedure that integrates into existing offline zero-shot RL algorithms. Across multiple benchmark environments and baselines, our approach improves zero-shot performance by an average of 20%, highlighting the importance of principled task sampling in offline zero-shot RL.
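As a rough illustration of the setup in this abstract, the sketch below defines a linear reward over a state representation and contrasts two ways of choosing the task vector. The abstract does not spell out the extraction procedure, so `sample_dataset_task` is a hypothetical stand-in (it reuses the normalized features of a dataset state) and should not be read as the authors' method. All function names and the toy features are assumptions.

```python
import math
import random


def linear_reward(phi_s, z):
    """Reward of a state with representation phi_s under task vector z."""
    return sum(p * w for p, w in zip(phi_s, z))


def normalize(v):
    """Scale a non-zero vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sample_random_task(dim, rng):
    """Random task vector: a Gaussian direction projected onto the unit sphere."""
    return normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])


def sample_dataset_task(features, rng):
    """Hypothetical stand-in for dataset-based sampling (NOT the paper's
    procedure): use the normalized features of a dataset state as the task."""
    return normalize(rng.choice(features))


rng = random.Random(0)

# Toy state representations standing in for learned features of dataset states.
dataset_features = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 4.0, 0.0]]

z_random = sample_random_task(3, rng)
z_data = sample_dataset_task(dataset_features, rng)

print(linear_reward([1.0, 2.0, 3.0], [0.5, 0.0, 1.0]))
```

In this toy data every dataset state has a zero third feature, so a dataset-derived task vector never puts weight on that direction, whereas a random one almost always does. This is one way to picture how the two task distributions can differ.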

CV · Feb 19, 2023
Supervised Contrastive Learning and Feature Fusion for Improved Kinship Verification

Nazim Bendib

Facial kinship verification is the task of determining the degree of familial relationship between the people shown in two facial images. It has recently attracted considerable interest for applications spanning forensic science, social media, and demographic studies. In the past decade, deep learning-based approaches have emerged as a promising solution to this problem, achieving state-of-the-art performance. In this paper, we propose a novel method for kinship verification based on supervised contrastive learning, which trains the model to maximize the similarity between related individuals and minimize it between unrelated individuals. Our experiments show state-of-the-art results, achieving 81.1% accuracy on the Families in the Wild (FIW) dataset.
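The objective described in this abstract can be sketched with the general supervised contrastive (SupCon) loss, where samples sharing a label act as positives for each other. Whether the paper uses exactly this form is an assumption, and the function names, temperature, family labels, and toy embeddings below are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have at least
    one positive (another sample with the same label)."""
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy embeddings: the first two and the last two samples point the same way.
embeddings = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]

loss_matching = supcon_loss(embeddings, labels=[0, 0, 1, 1])    # kin are close
loss_mismatched = supcon_loss(embeddings, labels=[0, 1, 0, 1])  # kin are far
print(loss_matching < loss_mismatched)
```

The loss is small when samples with the same (hypothetical) family label already sit close together and large when they do not, which is the pressure that pulls related individuals together during training.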

CV · May 12, 2024
CoViews: Adaptive Augmentation Using Cooperative Views for Enhanced Contrastive Learning

Nazim Bendib

Data augmentation plays a critical role in generating high-quality positive and negative pairs necessary for effective contrastive learning. However, common practices involve using a single augmentation policy repeatedly to generate multiple views, potentially leading to inefficient training pairs due to a lack of cooperation between views. Furthermore, to find the optimal set of augmentations, many existing methods require extensive supervised evaluation, overlooking the evolving nature of the model that may require different augmentations throughout the training. Other approaches train differentiable augmentation generators, thus limiting the use of non-differentiable transformation functions from the literature. In this paper, we address these challenges by proposing a framework for learning efficient adaptive data augmentation policies for contrastive learning with minimal computational overhead. Our approach continuously generates new data augmentation policies during training and produces effective positives/negatives without any supervision. Within this framework, we present two methods: IndepViews, which generates augmentation policies used across all views, and CoViews, which generates dependent augmentation policies for each view. This enables us to learn dependencies between the transformations applied to each view and ensures that the augmentation strategies applied to different views complement each other, leading to more meaningful and discriminative representations. Through extensive experimentation on multiple datasets and contrastive learning frameworks, we demonstrate that our method consistently outperforms baseline solutions and that training with a view-dependent augmentation policy outperforms training with an independent policy shared across views, showcasing its effectiveness in enhancing contrastive learning performance.
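The difference between a policy shared across views and a view-dependent policy can be pictured with a toy sampler. This is only a sketch under strong assumptions: the transformation names and probabilities are hypothetical and fixed by hand, whereas the abstract describes policies that are learned and updated during training.

```python
import random

# Hypothetical transformation set; not taken from the paper.
TRANSFORMS = ["crop", "color_jitter", "blur"]

# Shared policy: one distribution, sampled independently for every view.
SHARED = [0.5, 0.3, 0.2]

# Dependent policy: the distribution for view 2 is conditioned on the
# transformation already chosen for view 1 (hand-picked, illustrative values).
CONDITIONAL = {
    "crop": [0.1, 0.6, 0.3],
    "color_jitter": [0.6, 0.1, 0.3],
    "blur": [0.5, 0.5, 0.0],  # never blur both views
}


def sample_independent(rng):
    """Both views draw from the same shared distribution, independently."""
    first, second = rng.choices(TRANSFORMS, weights=SHARED, k=2)
    return first, second


def sample_dependent(rng):
    """View 2 is sampled conditionally on the transformation used for view 1."""
    first = rng.choices(TRANSFORMS, weights=SHARED, k=1)[0]
    second = rng.choices(TRANSFORMS, weights=CONDITIONAL[first], k=1)[0]
    return first, second


rng = random.Random(0)
dependent_pairs = [sample_dependent(rng) for _ in range(1000)]
independent_pairs = [sample_independent(rng) for _ in range(1000)]
print(("blur", "blur") in dependent_pairs)
```

A shared policy cannot express a constraint such as "do not blur both views", while the conditional table can rule that pair out entirely. Coordination of this kind between views is what the view-dependent variant is meant to capture.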