Zeyd Boukhers

CV
h-index22
29papers
95citations
Novelty42%
AI Score52

29 Papers

ROMay 27
ICAN-Deploy: Identity-Stable Canary Deployment for Safety-Critical Embodied Agents

Xue Qin, Simin Luan, John See et al.

Canary deployment routes a fraction of traffic to a new software version, monitors metrics, and rolls back on regression. Mainstream controllers (Argo Rollouts, Spinnaker, Flagger) change the deployed system's cryptographic identity during the canary window. The drift is harmless for stateless microservices but breaks the claim that "the agent you certified is still the agent you have" for safety-critical embodied agents, forcing re-certification per canary. We present ICAN-Deploy (Identity-stable CANary Deployment), a middleware construction whose state machine holds the identity hash invariant across the canary window by separating capability names (frozen, hashed) from capability versions (mutable runtime state). We implement ICAN-Deploy inside a runtime governance layer for LLM-driven robots and verify invariance by closed-form proof, AST lint, and TLA+ model-checking, then corroborate over N=100 real canary cycles on a Franka Panda arm in MuJoCo (zero drift; entry latency 95% BCa CI [1.52, 2.01] ms). A feature-flagged strawman that folds versions into the manifest falsifies on the same workload. A system certified once at identity-creation time can then ship arbitrary capability evolution under that same certification, within the version-and-name envelope.

DBMar 15, 2023
Enhancing Data Space Semantic Interoperability through Machine Learning: a Visionary Perspective

Zeyd Boukhers, Christoph Lange, Oya Beyan

Our vision paper outlines a plan to improve the future of semantic interoperability in data spaces through the application of machine learning. The use of data spaces, where data is exchanged among members in a self-regulated environment, is becoming increasingly popular. However, the current manual practices of managing metadata and vocabularies in these spaces are time-consuming, prone to errors, and may not meet the needs of all stakeholders. By leveraging the power of machine learning, we believe that semantic interoperability in data spaces can be significantly improved. This involves automatically generating and updating metadata, which results in a more flexible vocabulary that can accommodate the diverse terminologies used by different sub-communities. Our vision for the future of data spaces addresses the limitations of conventional data exchange and makes data more accessible and valuable for all members of the community.

CVOct 10, 2023
Skeleton Ground Truth Extraction: Methodology, Annotation Tool and Benchmarks

Cong Yang, Bipin Indurkhya, John See et al.

Skeleton Ground Truth (GT) is critical to the success of supervised skeleton extraction methods, especially with the popularity of deep learning techniques. Furthermore, we see skeleton GTs used not only for training skeleton detectors with Convolutional Neural Networks (CNN) but also for evaluating skeleton-related pruning and matching algorithms. However, most existing shape and image datasets suffer from the lack of skeleton GT and inconsistency of GT standards. As a result, it is difficult to evaluate and reproduce CNN-based skeleton detectors and algorithms on a fair basis. In this paper, we present a heuristic strategy for object skeleton GT extraction in binary shapes and natural images. Our strategy is built on an extended theory of diagnosticity hypothesis, which enables encoding human-in-the-loop GT extraction based on clues from the target's context, simplicity, and completeness. Using this strategy, we developed a tool, SkeView, to generate skeleton GT of 17 existing shape and image datasets. The GTs are then structurally evaluated with representative methods to build viable baselines for fair comparisons. Experiments demonstrate that GTs generated by our strategy yield promising quality with respect to standard consistency, and also provide a balance between simplicity and completeness.

CVMar 16Code
RSGen: Enhancing Layout-Driven Remote Sensing Image Generation with Diverse Edge Guidance

Xianbao Hou, Yonghao He, Zeyd Boukhers et al.

Diffusion models have significantly mitigated the impact of annotated data scarcity in remote sensing (RS). Although recent approaches have successfully harnessed these models to enable diverse and controllable Layout-to-Image (L2I) synthesis, they still suffer from limited fine-grained control and fail to strictly adhere to bounding box constraints. To address these limitations, we propose RSGen, a plug-and-play framework that leverages diverse edge guidance to enhance layout-driven RS image generation. Specifically, RSGen employs a progressive enhancement strategy: 1) it first enriches the diversity of edge maps composited from retrieved training instances via Image-to-Image generation; and 2) subsequently utilizes these diverse edge maps as conditioning for existing L2I models to enforce pixel-level control within bounding boxes, ensuring the generated instances strictly adhere to the layout. Extensive experiments across three baseline models demonstrate that RSGen significantly boosts the capabilities of existing L2I models. For instance, with CC-Diff on the DOTA dataset for oriented object detection, we achieve remarkable gains of +9.8/+12.0 in YOLOScore mAP50/mAP50-95 and +1.6 in mAP on the downstream detection task. Our code will be publicly available: https://github.com/D-Robotics-AI-Lab/RSGen

DLJul 11, 2022
Whois? Deep Author Name Disambiguation using Bibliographic Data

Zeyd Boukhers, Nagaraj Asundi Bahubali

As the number of authors is increasing exponentially over years, the number of authors sharing the same names is increasing proportionally. This makes it challenging to assign newly published papers to their adequate authors. Therefore, Author Name Ambiguity (ANA) is considered a critical open problem in digital libraries. This paper proposes an Author Name Disambiguation (AND) approach that links author names to their real-world entities by leveraging their co-authors and domain of research. To this end, we use a collection from the DBLP repository that contains more than 5 million bibliographic records authored by around 2.6 million co-authors. Our approach first groups authors who share the same last names and same first name initials. The author within each group is identified by capturing the relation with his/her co-authors and area of research, which is represented by the titles of the validated publications of the corresponding author. To this end, we train a neural network model that learns from the representations of the co-authors and titles. We validated the effectiveness of our approach by conducting extensive experiments on a large dataset.

DLMar 17, 2023
Deep Author Name Disambiguation using DBLP Data

Zeyd Boukhers, Nagaraj Bahubali Asundi

In the academic world, the number of scientists grows every year and so does the number of authors sharing the same names. Consequently, it challenging to assign newly published papers to their respective authors. Therefore, Author Name Ambiguity (ANA) is considered a critical open problem in digital libraries. This paper proposes an Author Name Disambiguation (AND) approach that links author names to their real-world entities by leveraging their co-authors and domain of research. To this end, we use data collected from the DBLP repository that contains more than 5 million bibliographic records authored by around 2.6 million co-authors. Our approach first groups authors who share the same last names and same first name initials. The author within each group is identified by capturing the relation with his/her co-authors and area of research, represented by the titles of the validated publications of the corresponding author. To this end, we train a neural network model that learns from the representations of the co-authors and titles. We validated the effectiveness of our approach by conducting extensive experiments on a large dataset.

CRMar 27, 2023
PADME-SoSci: A Platform for Analytics and Distributed Machine Learning for the Social Sciences

Zeyd Boukhers, Arnim Bleier, Yeliz Ucer Yediel et al.

Data privacy and ownership are significant in social data science, raising legal and ethical concerns. Sharing and analyzing data is difficult when different parties own different parts of it. An approach to this challenge is to apply de-identification or anonymization techniques to the data before collecting it for analysis. However, this can reduce data utility and increase the risk of re-identification. To address these limitations, we present PADME, a distributed analytics tool that federates model implementation and training. PADME uses a federated approach where the model is implemented and deployed by all parties and visits each data location incrementally for training. This enables the analysis of data across locations while still allowing the model to be trained as if all data were in a single location. Training the model on data in its original location preserves data ownership. Furthermore, the results are not provided until the analysis is completed on all data locations to ensure privacy and avoid bias in the results.

AIMay 11
Autonomous FAIR Digital Objects: From Passive Assertions to Active Knowledge

Zeyd Boukhers, Oya Beyan, Cong Yang et al.

Scientific knowledge on the Web is published as passive assertions and cannot decide when to validate evidence, reconcile contradictions, or update confidence as findings accumulate. Curation depends on centralised middleware and institutional continuity, but when registries close, active stewardship stops even when data remain online. We advance the concept of Autonomous FAIR Digital Objects (aFDOs) from an abstract idea to an operational model, to offer a route from passive scientific publication toward accountable, standards-aligned automation that can outlive its publishing institutions. aFDO augments FDOs with three capabilities anchored in Semantic Web standards, namely 1) a policy layer over RDF-star aligned with PROV-O, SHACL, and ODRL for portable condition-action rules, 2) an announcement layer over ActivityStreams 2.0 that bounds per-announcement evaluation cost, and 3) an agreement layer that resolves multi-source contradictions through reputation and confidence weighted agreement under a bounded adversarial model. We provide a formal definition that distinguishes policy specifications, event handlers, and communication interfaces. We evaluate an open reference implementation on 4,305 FDOs grounded in rare-disease ontologies, namely ClinVar, HPO, and Orphanet, combined with controlled synthetic observations. The consensus mechanism resolves 56.3% of 3,914 naturally occurring ClinVar conflicts where multiple submitters disagree and an expert panel has subsequently adjudicated. Under Sybil, collusion, and poisoning attacks, the mechanism degrades gracefully within its design Byzantine-tolerance bound (f < n/5), and fails as predicted beyond that bound.

LGMay 4
Enhancing RL Generalizability in Robotics through SHAP Analysis of Algorithms and Hyperparameters

Lingxiao Kong, Cong Yang, Oya Deniz Beyan et al.

Despite significant advances in Reinforcement Learning (RL), model performance remains highly sensitive to algorithm and hyperparameter configurations, while generalization gaps across environments complicate real-world deployment. Although prior work has studied RL generalization, the relative contribution of specific configurations to the generalization gap has not been quantitatively decomposed and systematically leveraged for configuration selection. To address this limitation, we propose an explainable framework that evaluates RL performance across robotic environments using SHapley Additive exPlanations (SHAP) to quantify configuration impacts. We establish a theoretical foundation connecting Shapley values to generalizability, empirically analyze configuration impact patterns, and introduce SHAP-guided configuration selection to enhance generalization. Our results reveal distinct patterns across algorithms and hyperparameters, with consistent configuration impacts across diverse tasks and environments. By applying these insights to configuration selection, we achieve improved RL generalizability and provide actionable guidance for practitioners.

CLFeb 8, 2025
ELMTEX: Fine-Tuning Large Language Models for Structured Clinical Information Extraction. A Case Study on Clinical Reports

Aynur Guluzade, Naguib Heiba, Zeyd Boukhers et al.

Europe's healthcare systems require enhanced interoperability and digitalization, driving a demand for innovative solutions to process legacy clinical data. This paper presents the results of our project, which aims to leverage Large Language Models (LLMs) to extract structured information from unstructured clinical reports, focusing on patient history, diagnoses, treatments, and other predefined categories. We developed a workflow with a user interface and evaluated LLMs of varying sizes through prompting strategies and fine-tuning. Our results show that fine-tuned smaller models match or surpass larger counterparts in performance, offering efficiency for resource-limited settings. A new dataset of 60,000 annotated English clinical summaries and 24,000 German translations was validated with automated and manual checks. The evaluations used ROUGE, BERTScore, and entity-level metrics. The work highlights the approach's viability and outlines future improvements.

LGNov 11, 2024
Large Language Model in Medical Informatics: Direct Classification and Enhanced Text Representations for Automatic ICD Coding

Zeyd Boukhers, AmeerAli Khan, Qusai Ramadan et al.

Addressing the complexity of accurately classifying International Classification of Diseases (ICD) codes from medical discharge summaries is challenging due to the intricate nature of medical documentation. This paper explores the use of Large Language Models (LLM), specifically the LLAMA architecture, to enhance ICD code classification through two methodologies: direct application as a classifier and as a generator of enriched text representations within a Multi-Filter Residual Convolutional Neural Network (MultiResCNN) framework. We evaluate these methods by comparing them against state-of-the-art approaches, revealing LLAMA's potential to significantly improve classification outcomes by providing deep contextual insights into medical texts.

CLMay 5, 2025
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning

Lingxiao Kong, Cong Yang, Susanne Neufang et al.

Recent advances in reinforcement learning (RL) for large language model (LLM) fine-tuning show promise in addressing multi-objective tasks but still face significant challenges, including competing objective balancing, low training efficiency, poor scalability, and limited explainability. Leveraging ensemble learning principles, we introduce an Ensemble Multi-Objective RL (EMORL) framework that fine-tunes multiple models with individual objectives while optimizing their aggregation after the fine-tuning to improve efficiency and flexibility. Our method is the first to aggregate the hidden states of individual models, incorporating contextual information from multiple objectives. This approach is supported by a hierarchical grid search algorithm that identifies optimal weighted combinations. We evaluate EMORL on counselor reflection generation tasks, using text classification models to score the generations and provide rewards during RL fine-tuning. Through comprehensive experiments on the PAIR and Psych8k datasets, we demonstrate the advantages of EMORL against existing baselines: significantly lower and more stable training consumption ($17,529\pm 1,650$ data points and $6,573\pm 147.43$ seconds), improved scalability and explainability, and comparable performance across multiple objectives.

LGMay 14, 2024
Falcon 7b for Software Mention Detection in Scholarly Documents

AmeerAli Khan, Qusai Ramadan, Cong Yang et al.

This paper aims to tackle the challenge posed by the increasing integration of software tools in research across various disciplines by investigating the application of Falcon-7b for the detection and classification of software mentions within scholarly texts. Specifically, the study focuses on solving Subtask I of the Software Mention Detection in Scholarly Publications (SOMD), which entails identifying and categorizing software mentions from academic literature. Through comprehensive experimentation, the paper explores different training strategies, including a dual-classifier approach, adaptive sampling, and weighted loss scaling, to enhance detection accuracy while overcoming the complexities of class imbalance and the nuanced syntax of scholarly writing. The findings highlight the benefits of selective labelling and adaptive sampling in improving the model's performance. However, they also indicate that integrating multiple strategies does not necessarily result in cumulative improvements. This research offers insights into the effective application of large language models for specific tasks such as SOMD, underlining the importance of tailored approaches to address the unique challenges presented by academic text analysis.

CLSep 25, 2025
Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective

Lingxiao Kong, Cong Yang, Oya Deniz Beyan et al.

Multi-Objective Reinforcement Learning (MORL) presents significant challenges and opportunities for optimizing multiple objectives in Large Language Models (LLMs). We introduce a MORL taxonomy and examine the advantages and limitations of various MORL methods when applied to LLM optimization, identifying the need for efficient and flexible approaches that accommodate personalization functionality and inherent complexities in LLMs and RL. We propose a vision for a MORL benchmarking framework that addresses the effects of different methods on diverse objective relationships. As future research directions, we focus on meta-policy MORL development that can improve efficiency and flexibility through its bi-level learning paradigm, highlighting key research questions and potential solutions for improving LLM performance.

CVSep 23, 2025
MoiréNet: A Compact Dual-Domain Network for Image Demoiréing

Shuwei Guo, Simin Luan, Yan Ke et al.

Moiré patterns arise from spectral aliasing between display pixel lattices and camera sensor grids, manifesting as anisotropic, multi-scale artifacts that pose significant challenges for digital image demoiréing. We propose MoiréNet, a convolutional neural U-Net-based framework that synergistically integrates frequency and spatial domain features for effective artifact removal. MoiréNet introduces two key components: a Directional Frequency-Spatial Encoder (DFSE) that discerns moiré orientation via directional difference convolution, and a Frequency-Spatial Adaptive Selector (FSAS) that enables precise, feature-adaptive suppression. Extensive experiments demonstrate that MoiréNet achieves state-of-the-art performance on public and actively used datasets while being highly parameter-efficient. With only 5.513M parameters, representing a 48% reduction compared to ESDNet-L, MoiréNet combines superior restoration quality with parameter efficiency, making it well-suited for resource-constrained applications including smartphone photography, industrial imaging, and augmented reality.

CVSep 3, 2025
InstaDA: Augmenting Instance Segmentation Data with Dual-Agent System

Xianbao Hou, Yonghao He, Zeyd Boukhers et al.

Acquiring high-quality instance segmentation data is challenging due to the labor-intensive nature of the annotation process and significant class imbalances within datasets. Recent studies have utilized the integration of Copy-Paste and diffusion models to create more diverse datasets. However, these studies often lack deep collaboration between large language models (LLMs) and diffusion models, and underutilize the rich information within the existing training data. To address these limitations, we propose InstaDA, a novel, training-free Dual-Agent system designed to augment instance segmentation datasets. First, we introduce a Text-Agent (T-Agent) that enhances data diversity through collaboration between LLMs and diffusion models. This agent features a novel Prompt Rethink mechanism, which iteratively refines prompts based on the generated images. This process not only fosters collaboration but also increases image utilization and optimizes the prompts themselves. Additionally, we present an Image-Agent (I-Agent) aimed at enriching the overall data distribution. This agent augments the training set by generating new instances conditioned on the training images. To ensure practicality and efficiency, both agents operate as independent and automated workflows, enhancing usability. Experiments conducted on the LVIS 1.0 validation set indicate that InstaDA achieves significant improvements, with an increase of +4.0 in box average precision (AP) and +3.3 in mask AP compared to the baseline. Furthermore, it outperforms the leading model, DiverGen, by +0.3 in box AP and +0.1 in mask AP, with a notable +0.7 gain in box AP on common categories and mask AP gains of +0.2 on common categories and +0.5 on frequent categories.

CVAug 19, 2025
Unleashing Semantic and Geometric Priors for 3D Scene Completion

Shiyuan Chen, Wei Sui, Bohao Zhang et al.

Camera-based 3D semantic scene completion (SSC) provides dense geometric and semantic perception for autonomous driving and robotic navigation. However, existing methods rely on a coupled encoder to deliver both semantic and geometric priors, which forces the model to make a trade-off between conflicting demands and limits its overall performance. To tackle these challenges, we propose FoundationSSC, a novel framework that performs dual decoupling at both the source and pathway levels. At the source level, we introduce a foundation encoder that provides rich semantic feature priors for the semantic branch and high-fidelity stereo cost volumes for the geometric branch. At the pathway level, these priors are refined through specialised, decoupled pathways, yielding superior semantic context and depth distributions. Our dual-decoupling design produces disentangled and refined inputs, which are then utilised by a hybrid view transformation to generate complementary 3D features. Additionally, we introduce a novel Axis-Aware Fusion (AAF) module that addresses the often-overlooked challenge of fusing these features by anisotropically merging them into a unified representation. Extensive experiments demonstrate the advantages of FoundationSSC, achieving simultaneous improvements in both semantic and geometric metrics, surpassing prior bests by +0.23 mIoU and +2.03 IoU on SemanticKITTI. Additionally, we achieve state-of-the-art performance on SSCBench-KITTI-360, with 21.78 mIoU and 48.61 IoU. The code will be released upon acceptance.

CLJul 10, 2025
Exploring the Limits of Model Compression in LLMs: A Knowledge Distillation Study on QA Tasks

Joyeeta Datta, Niclas Doll, Qusai Ramadan et al.

Large Language Models (LLMs) have demonstrated outstanding performance across a range of NLP tasks, however, their computational demands hinder their deployment in real-world, resource-constrained environments. This work investigates the extent to which LLMs can be compressed using Knowledge Distillation (KD) while maintaining strong performance on Question Answering (QA) tasks. We evaluate student models distilled from the Pythia and Qwen2.5 families on two QA benchmarks, SQuAD and MLQA, under zero-shot and one-shot prompting conditions. Results show that student models retain over 90% of their teacher models' performance while reducing parameter counts by up to 57.1%. Furthermore, one-shot prompting yields additional performance gains over zero-shot setups for both model families. These findings underscore the trade-off between model efficiency and task performance, demonstrating that KD, combined with minimal prompting, can yield compact yet capable QA systems suitable for resource-constrained applications.

IRJan 9, 2025
Comparison of Feature Learning Methods for Metadata Extraction from PDF Scholarly Documents

Zeyd Boukhers, Cong Yang

The availability of metadata for scientific documents is pivotal in propelling scientific knowledge forward and for adhering to the FAIR principles (i.e. Findability, Accessibility, Interoperability, and Reusability) of research findings. However, the lack of sufficient metadata in published documents, particularly those from smaller and mid-sized publishers, hinders their accessibility. This issue is widespread in some disciplines, such as the German Social Sciences, where publications often employ diverse templates. To address this challenge, our study evaluates various feature learning and prediction methods, including natural language processing (NLP), computer vision (CV), and multimodal approaches, for extracting metadata from documents with high template variance. We aim to improve the accessibility of scientific documents and facilitate their wider use. To support our comparison of these methods, we provide comprehensive experimental results, analyzing their accuracy and efficiency in extracting metadata. Additionally, we provide valuable insights into the strengths and weaknesses of various feature learning and prediction methods, which can guide future research in this field.

CVFeb 10, 2024
Gyroscope-Assisted Motion Deblurring Network

Simin Luan, Cong Yang, Zeyd Boukhers et al.

Image research has shown substantial attention in deblurring networks in recent years. Yet, their practical usage in real-world deblurring, especially motion blur, remains limited due to the lack of pixel-aligned training triplets (background, blurred image, and blur heat map) and restricted information inherent in blurred images. This paper presents a simple yet efficient framework to synthetic and restore motion blur images using Inertial Measurement Unit (IMU) data. Notably, the framework includes a strategy for training triplet generation, and a Gyroscope-Aided Motion Deblurring (GAMD) network for blurred image restoration. The rationale is that through harnessing IMU data, we can determine the transformation of the camera pose during the image exposure phase, facilitating the deduction of the motion trajectory (aka. blur trajectory) for each point inside the three-dimensional space. Thus, the synthetic triplets using our strategy are inherently close to natural motion blur, strictly pixel-aligned, and mass-producible. Through comprehensive experiments, we demonstrate the advantages of the proposed framework: only two-pixel errors between our synthetic and real-world blur trajectories, a marked improvement (around 33.17%) of the state-of-the-art deblurring method MIMO on Peak Signal-to-Noise Ratio (PSNR).

STFeb 12, 2022
Beyond Trading Data: The Hidden Influence of Public Awareness and Interest on Cryptocurrency Volatility

Zeyd Boukhers, Azeddine Bouabdallah, Cong Yang et al.

Since Bitcoin first appeared on the scene in 2009, cryptocurrencies have become a worldwide phenomenon as important decentralized financial assets. Their decentralized nature, however, leads to notable volatility against traditional fiat currencies, making the task of accurately forecasting the crypto-fiat exchange rate complex. This study examines the various independent factors that affect the volatility of the Bitcoin-Dollar exchange rate. To this end, we propose CoMForE, a multimodal AdaBoost-LSTM ensemble model, which not only utilizes historical trading data but also incorporates public sentiments from related tweets, public interest demonstrated by search volumes, and blockchain hash-rate data. Our developed model goes a step further by predicting fluctuations in the overall cryptocurrency value distribution, thus increasing its value for investment decision-making. We have subjected this method to extensive testing via comprehensive experiments, thereby validating the importance of multimodal combination over exclusive reliance on trading data. Further experiments show that our method significantly surpasses existing forecasting tools and methodologies, demonstrating a 19.29% improvement. This result underscores the influence of external independent factors on cryptocurrency volatility.

CVJan 10, 2022
COIN: Counterfactual Image Generation for VQA Interpretation

Zeyd Boukhers, Timo Hartmann, Jan Jürjens

Due to the significant advancement of Natural Language Processing and Computer Vision-based models, Visual Question Answering (VQA) systems are becoming more intelligent and advanced. However, they are still error-prone when dealing with relatively complex questions. Therefore, it is important to understand the behaviour of the VQA models before adopting their results. In this paper, we introduce an interpretability approach for VQA models by generating counterfactual images. Specifically, the generated image is supposed to have the minimal possible change to the original image and leads the VQA model to give a different answer. In addition, our approach ensures that the generated image is realistic. Since quantitative metrics cannot be employed to evaluate the interpretability of the model, we carried out a user study to assess different aspects of our approach. In addition to interpreting the result of VQA models on single images, the obtained results and the discussion provides an extensive explanation of VQA models' behaviour.

CLAug 8, 2021
Leveraging Commonsense Knowledge on Classifying False News and Determining Checkworthiness of Claims

Ipek Baris Schlicht, Erhan Sezerer, Selma Tekir et al.

Widespread and rapid dissemination of false news has made fact-checking an indispensable requirement. Given its time-consuming and labor-intensive nature, the task calls for an automated support to meet the demand. In this paper, we propose to leverage commonsense knowledge for the tasks of false news classification and check-worthy claim detection. Arguing that commonsense knowledge is a factor in human believability, we fine-tune the BERT language model with a commonsense question answering task and the aforementioned tasks in a multi-task learning environment. For predicting fine-grained false news types, we compare the proposed fine-tuned model's performance with the false news classification models on a public dataset as well as a newly collected dataset. We compare the model's performance with the single-task BERT model and a state-of-the-art check-worthy claim detection tool to evaluate the check-worthy claim detection. Our experimental analysis demonstrates that commonsense knowledge can improve performance in both tasks.

DLJul 9, 2021
Bib2Auth: Deep Learning Approach for Author Disambiguation using Bibliographic Data

Zeyd Boukhers, Nagaraj Bahubali, Abinaya Thulsi Chandrasekaran et al.

Author name ambiguity remains a critical open problem in digital libraries due to synonymy and homonymy of names. In this paper, we propose a novel approach to link author names to their real-world entities by relying on their co-authorship pattern and area of research. Our supervised deep learning model identifies an author by capturing his/her relationship with his/her co-authors and area of research, which is represented by the titles and sources of the target author's publications. These attributes are encoded by their semantic and symbolic representations. To this end, Bib2Auth uses ~ 22K bibliographic records from the DBLP repository and is trained with each pair of co-authors. The extensive experiments have proved the capability of the approach to distinguish between authors sharing the same name and recognize authors with different name variations. Bib2Auth has shown good performance on a relatively large dataset, which qualifies it to be directly integrated into bibliographic indices.

DLJun 23, 2021
BiblioDAP: The 1st Workshop on Bibliographic Data Analysis and Processing

Zeyd Boukhers, Philipp Mayr, Silvio Peroni

Automatic processing of bibliographic data becomes very important in digital libraries, data science and machine learning due to its importance in keeping pace with the significant increase of published papers every year from one side and to the inherent challenges from the other side. This processing has several aspects including but not limited to I) Automatic extraction of references from PDF documents, II) Building an accurate citation graph, III) Author name disambiguation, etc. Bibliographic data is heterogeneous by nature and occurs in both structured (e.g. citation graph) and unstructured (e.g. publications) formats. Therefore, it requires data science and machine learning techniques to be processed and analysed. Here we introduce BiblioDAP'21: The 1st Workshop on Bibliographic Data Analysis and Processing.

IRJun 4, 2021
MexPub: Deep Transfer Learning for Metadata Extraction from German Publications

Zeyd Boukhers, Nada Beili, Timo Hartmann et al.

Extracting metadata from scientific papers can be considered a solved problem in NLP due to the high accuracy of state-of-the-art methods. However, this does not apply to German scientific publications, which have a variety of styles and layouts. In contrast to most of the English scientific publications that follow standard and simple layouts, the order, content, position and size of metadata in German publications vary greatly among publications. This variety makes traditional NLP methods fail to accurately extract metadata from these publications. In this paper, we present a method that extracts metadata from PDF documents with different layouts and styles by viewing the document as an image. We used Mask R-CNN that is trained on COCO dataset and finetuned with PubLayNet dataset that consists of ~200K PDF snapshots with five basic classes (e.g. text, figure, etc). We refine-tuned the model on our proposed synthetic dataset consisting of ~30K article snapshots to extract nine patterns (i.e. author, title, etc). Our synthetic dataset is generated using contents in both languages German and English and a finite set of challenging templates obtained from German publications. Our method achieved an average accuracy of around $90\%$ which validates its capability to accurately extract metadata from a variety of PDF documents with challenging templates.

LGMar 11, 2021
Hybrid Physics and Deep Learning Model for Interpretable Vehicle State Prediction

Alexandra Baier, Zeyd Boukhers, Steffen Staab

Physical motion models offer interpretable predictions for the motion of vehicles. However, some model parameters, such as those related to aero- and hydrodynamics, are expensive to measure and are often only roughly approximated reducing prediction accuracy. Recurrent neural networks achieve high prediction accuracy at low cost, as they can use cheap measurements collected during routine operation of the vehicle, but their results are hard to interpret. To precisely predict vehicle states without expensive measurements of physical parameters, we propose a hybrid approach combining deep learning and physical motion models including a novel two-phase training procedure. We achieve interpretability by restricting the output range of the deep neural network as part of the hybrid model, which limits the uncertainty introduced by the neural network to a known quantity. We have evaluated our approach for the use case of ship and quadcopter motion. The results show that our hybrid model can improve model interpretability with no decrease in accuracy compared to existing deep learning approaches.

CLJan 14, 2021
ECOL: Early Detection of COVID Lies Using Content, Prior Knowledge and Source Information

Ipek Baris, Zeyd Boukhers

Social media platforms are vulnerable to fake news dissemination, which causes negative consequences such as panic and wrong medication in the healthcare domain. Therefore, it is important to automatically detect fake news in an early stage before they get widely spread. This paper analyzes the impact of incorporating content information, prior knowledge, and credibility of sources into models for the early detection of fake news. We propose a framework modeling those features by using BERT language model and external sources, namely Simple English Wikipedia and source reliability tags. The conducted experiments on CONSTRAINT datasets demonstrated the benefit of integrating these features for the early detection of fake news in the healthcare domain.

LGNov 23, 2020
LaHAR: Latent Human Activity Recognition using LDA

Zeyd Boukhers, Danniene Wete, Steffen Staab

Processing sequential multi-sensor data becomes important in many tasks due to the dramatic increase in the availability of sensors that can acquire sequential data over time. Human Activity Recognition (HAR) is one of the fields which are actively benefiting from this availability. Unlike most of the approaches addressing HAR by considering predefined activity classes, this paper proposes a novel approach to discover the latent HAR patterns in sequential data. To this end, we employed Latent Dirichlet Allocation (LDA), which is initially a topic modelling approach used in text analysis. To make the data suitable for LDA, we extract the so-called "sensory words" from the sequential data. We carried out experiments on a challenging HAR dataset, demonstrating that LDA is capable of uncovering underlying structures in sequential data, which provide a human-understandable representation of the data. The extrinsic evaluations reveal that LDA is capable of accurately clustering HAR data sequences compared to the labelled activities.