Sofiene Lassoued

AI
h-index17
4papers
24citations
Novelty54%
AI Score42

4 Papers

AIJan 8
Flexible Manufacturing Systems Intralogistics: Dynamic Optimization of AGVs and Tool Sharing Using Coloured-Timed Petri Nets and Actor-Critic RL with Actions Masking

Sofiene Lassoued, Laxmikant Shrikant Bahetic, Nathalie Weiß-Borkowskib et al.

Flexible Manufacturing Systems (FMS) are pivotal in optimizing production processes in today's rapidly evolving manufacturing landscape. This paper advances the traditional job shop scheduling problem by incorporating additional complexities through the simultaneous integration of automated guided vehicles (AGVs) and tool-sharing systems. We propose a novel approach that combines Colored-Timed Petri Nets (CTPNs) with actor-critic model-based reinforcement learning (MBRL), effectively addressing the multifaceted challenges associated with FMS. CTPNs provide a formal modeling structure and dynamic action masking, significantly reducing the action search space, while MBRL ensures adaptability to changing environments through the learned policy. Leveraging the advantages of MBRL, we incorporate a lookahead strategy for optimal positioning of AGVs, improving operational efficiency. Our approach was evaluated on small-sized public benchmarks and a newly developed large-scale benchmark inspired by the Taillard benchmark. The results show that our approach matches traditional methods on smaller instances and outperforms them on larger ones in terms of makespan while achieving a tenfold reduction in computation time. To ensure reproducibility, we propose a gym-compatible environment and an instance generator. Additionally, an ablation study evaluates the contribution of each framework component to its overall performance.

AIJan 16
Policy-Based Deep Reinforcement Learning Hyperheuristics for Job-Shop Scheduling Problems

Sofiene Lassoued, Asrat Gobachew, Stefan Lier et al.

This paper proposes a policy-based deep reinforcement learning hyper-heuristic framework for solving the Job Shop Scheduling Problem. The hyper-heuristic agent learns to switch scheduling rules based on the system state dynamically. We extend the hyper-heuristic framework with two key mechanisms. First, action prefiltering restricts decision-making to feasible low-level actions, enabling low-level heuristics to be evaluated independently of environmental constraints and providing an unbiased assessment. Second, a commitment mechanism regulates the frequency of heuristic switching. We investigate the impact of different commitment strategies, from step-wise switching to full-episode commitment, on both training behavior and makespan. Additionally, we compare two action selection strategies at the policy level: deterministic greedy selection and stochastic sampling. Computational experiments on standard JSSP benchmarks demonstrate that the proposed approach outperforms traditional heuristics, metaheuristics, and recent neural network-based scheduling methods

AIJan 14
Policy-Based Reinforcement Learning with Action Masking for Dynamic Job Shop Scheduling under Uncertainty: Handling Random Arrivals and Machine Failures

Sofiene Lassoued, Stefan Lier, Andreas Schwung

We present a novel framework for solving Dynamic Job Shop Scheduling Problems under uncertainty, addressing the challenges introduced by stochastic job arrivals and unexpected machine breakdowns. Our approach follows a model-based paradigm, using Coloured Timed Petri Nets to represent the scheduling environment, and Maskable Proximal Policy Optimization to enable dynamic decision-making while restricting the agent to feasible actions at each decision point. To simulate realistic industrial conditions, dynamic job arrivals are modeled using a Gamma distribution, which captures complex temporal patterns such as bursts, clustering, and fluctuating workloads. Machine failures are modeled using a Weibull distribution to represent age-dependent degradation and wear-out dynamics. These stochastic models enable the framework to reflect real-world manufacturing scenarios better. In addition, we study two action-masking strategies: a non-gradient approach that overrides the probabilities of invalid actions, and a gradient-based approach that assigns negative gradients to invalid actions within the policy network. We conduct extensive experiments on dynamic JSSP benchmarks, demonstrating that our method consistently outperforms traditional heuristic and rule-based approaches in terms of makespan minimization. The results highlight the strength of combining interpretable Petri-net-based models with adaptive reinforcement learning policies, yielding a resilient, scalable, and explainable framework for real-time scheduling in dynamic and uncertain manufacturing environments.

AIJan 23, 2024
Introducing PetriRL: An Innovative Framework for JSSP Resolution Integrating Petri nets and Event-based Reinforcement Learning

Sofiene Lassoued, Andreas Schwung

Resource utilization and production process optimization are crucial for companies in today's competitive industrial landscape. Addressing the complexities of job shop scheduling problems (JSSP) is essential to improving productivity, reducing costs, and ensuring timely delivery. We propose PetriRL, a novel framework integrating Petri nets and deep reinforcement learning (DRL) for JSSP optimization. PetriRL capitalizes on the inherent strengths of Petri nets in modelling discrete event systems while leveraging the advantages of a graph structure. The Petri net governs automated components of the process, ensuring adherence to JSSP constraints. This allows for synergistic collaboration with optimization algorithms such as DRL, particularly in critical decision-making. Unlike traditional methods, PetriRL eliminates the need to preprocess JSSP instances into disjunctive graphs and enhances the explainability of process status through its graphical structure based on places and transitions. Additionally, the inherent graph structure of Petri nets enables the dynamic additions of job operations during the inference phase without requiring agent retraining, thus enhancing flexibility. Experimental results demonstrate PetriRL's robust generalization across various instance sizes and its competitive performance on public test benchmarks and randomly generated instances. Results are compared to a wide range of optimization solutions such as heuristics, metaheuristics, and learning-based algorithms. Finally, the added values of the framework's key elements, such as event-based control and action masking, are studied in the ablation study.