LGMar 13, 2023
Meta-learning approaches for few-shot learning: A survey of recent advancesHassan Gharoun, Fereshteh Momenifar, Fang Chen et al.
Despite its astounding success in learning deeper multi-dimensional data, the performance of deep learning declines on new unseen tasks mainly due to its focus on same-distribution prediction. Moreover, deep learning is notorious for poor generalization from few samples. Meta-learning is a promising approach that addresses these issues by adapting to new tasks with few-shot datasets. This survey first briefly introduces meta-learning and then investigates state-of-the-art meta-learning methods and recent advances in: (I) metric-based, (II) memory-based, (III), and learning-based methods. Finally, current challenges and insights for future researches are discussed.
AO-PHJul 14, 2022
Soil Erosion in the United States. Present and Future (2020-2050)Shahab Aldin Shojaeezadeh, Malik Al-Wardy, Mohammad Reza Nikoo et al.
Soil erosion is a significant threat to the environment and long-term land management around the world. Accelerated soil erosion by human activities inflicts extreme changes in terrestrial and aquatic ecosystems, which is not fully surveyed/predicted for the present and probable future at field-scales (30-m). Here, we estimate/predict soil erosion rates by water erosion, (sheet and rill erosion), using three alternative (2.6, 4.5, and 8.5) Shared Socioeconomic Pathway and Representative Concentration Pathway (SSP-RCP) scenarios across the contiguous United States. Field Scale Soil Erosion Model (FSSLM) estimations rely on a high resolution (30-m) G2 erosion model integrated by satellite- and imagery-based estimations of land use and land cover (LULC), gauge observations of long-term precipitation, and scenarios of the Coupled Model Intercomparison Project Phase 6 (CMIP6). The baseline model (2020) estimates soil erosion rates of 2.32 Mg ha 1 yr 1 with current agricultural conservation practices (CPs). Future scenarios with current CPs indicate an increase between 8% to 21% under different combinations of SSP-RCP scenarios of climate and LULC changes. The soil erosion forecast for 2050 suggests that all the climate and LULC scenarios indicate either an increase in extreme events or a change in the spatial location of extremes largely from the southern to the eastern and northeastern regions of the United States.
LGSep 6, 2023
Unveiling the frontiers of deep learning: innovations shaping diverse domainsShams Forruque Ahmed, Md. Sakib Bin Alam, Maliha Kabir et al.
Deep learning (DL) allows computer models to learn, visualize, optimize, refine, and predict data. To understand its present state, examining the most recent advancements and applications of deep learning across various domains is essential. However, prior reviews focused on DL applications in only one or two domains. The current review thoroughly investigates the use of DL in four different broad fields due to the plenty of relevant research literature in these domains. This wide range of coverage provides a comprehensive and interconnected understanding of DL's influence and opportunities, which is lacking in other reviews. The study also discusses DL frameworks and addresses the benefits and challenges of utilizing DL in each field, which is only occasionally available in other reviews. DL frameworks like TensorFlow and PyTorch make it easy to develop innovative DL applications across diverse domains by providing model development and deployment platforms. This helps bridge theoretical progress and practical implementation. Deep learning solves complex problems and advances technology in many fields, demonstrating its revolutionary potential and adaptability. CNN LSTM models with attention mechanisms can forecast traffic with 99 percent accuracy. Fungal diseased mango leaves can be classified with 97.13 percent accuracy by the multi layer CNN model. However, deep learning requires rigorous data collection to analyze and process large amounts of data because it is independent of training data. Thus, large scale medical, research, healthcare, and environmental data compilation are challenging, reducing deep learning effectiveness. Future research should address data volume, privacy, domain complexity, and data quality issues in DL datasets.
NEFeb 1, 2023
A fuzzy adaptive metaheuristic algorithm for identifying sustainable, economical, lightweight, and earthquake-resistant reinforced concrete cantilever retaining wallsFarshid Keivanian, Raymond Chiong, Ali R. Kashani et al.
In earthquake-prone zones, the seismic performance of reinforced concrete cantilever (RCC) retaining walls is significant. In this study, the seismic performance was investigated using horizontal and vertical pseudo-static coefficients. To tackle RCC weights and forces resulting from these earth pressures, 26 constraints for structural strengths and geotechnical stability along with 12 geometric variables are associated with each design. These constraints and design variables form a constraint optimization problem with a twelve-dimensional solution space. To conduct effective search and produce sustainable, economical, lightweight RCC designs robust against earthquake hazards, a novel adaptive fuzzy-based metaheuristic algorithm is applied. The proposed method divides the search space to sub-regions and establishes exploration, information sharing, and exploitation search capabilities based on its novel search components. Further, fuzzy inference systems were employed to address parameterization and computational cost evaluation issues. It was found that the proposed algorithm can achieve low-cost, low-weight, and low CO2 emission RCC designs under nine seismic conditions in comparison with several classical and best-performing design optimizers.
LGDec 20, 2022
Graph Neural Networks in Computer Vision -- Architectures, Datasets and Common ApproachesMaciej Krzywda, Szymon Łukasik, Amir H. Gandomi
Graph Neural Networks (GNNs) are a family of graph networks inspired by mechanisms existing between nodes on a graph. In recent years there has been an increased interest in GNN and their derivatives, i.e., Graph Attention Networks (GAT), Graph Convolutional Networks (GCN), and Graph Recurrent Networks (GRN). An increase in their usability in computer vision is also observed. The number of GNN applications in this field continues to expand; it includes video analysis and understanding, action and behavior recognition, computational photography, image and video synthesis from zero or few shots, and many more. This contribution aims to collect papers published about GNN-based approaches towards computer vision. They are described and summarized from three perspectives. Firstly, we investigate the architectures of Graph Neural Networks and their derivatives used in this area to provide accurate and explainable recommendations for the ensuing investigations. As for the other aspect, we also present datasets used in these works. Finally, using graph analysis, we also examine relations between GNN-based studies in computer vision and potential sources of inspiration identified outside of this field.
LGSep 18, 2023
Noise-Augmented Boruta: The Neural Network Perturbation Infusion with Boruta Feature SelectionHassan Gharoun, Navid Yazdanjoe, Mohammad Sadegh Khorshidi et al.
With the surge in data generation, both vertically (i.e., volume of data) and horizontally (i.e., dimensionality), the burden of the curse of dimensionality has become increasingly palpable. Feature selection, a key facet of dimensionality reduction techniques, has advanced considerably to address this challenge. One such advancement is the Boruta feature selection algorithm, which successfully discerns meaningful features by contrasting them to their permutated counterparts known as shadow features. However, the significance of a feature is shaped more by the data's overall traits than by its intrinsic value, a sentiment echoed in the conventional Boruta algorithm where shadow features closely mimic the characteristics of the original ones. Building on this premise, this paper introduces an innovative approach to the Boruta feature selection algorithm by incorporating noise into the shadow variables. Drawing parallels from the perturbation analysis framework of artificial neural networks, this evolved version of the Boruta method is presented. Rigorous testing on four publicly available benchmark datasets revealed that this proposed technique outperforms the classic Boruta algorithm, underscoring its potential for enhanced, accurate feature selection.
IVSep 19, 2024
Beyond Uncertainty Quantification: Learning Uncertainty for Trust-Informed Neural Network Decisions - A Case Study in COVID-19 ClassificationHassan Gharoun, Mohammad Sadegh Khorshidi, Fang Chen et al.
Reliable uncertainty quantification is critical in high-stakes applications, such as medical diagnosis, where confidently incorrect predictions can erode trust in automated decision-making systems. Traditional uncertainty quantification methods rely on a predefined confidence threshold to classify predictions as confident or uncertain. However, this approach assumes that predictions exceeding the threshold are trustworthy, while those below it are uncertain, without explicitly assessing the correctness of high-confidence predictions. As a result, confidently incorrect predictions may still occur, leading to misleading uncertainty assessments. To address this limitation, this study proposed an uncertainty-aware stacked neural network, which extends conventional uncertainty quantification by learning when predictions should be trusted. The framework consists of a two-tier model: the base model generates predictions with uncertainty estimates, while the meta-model learns to assign a trust flag, distinguishing confidently correct cases from those requiring expert review. The proposed approach is evaluated against the traditional threshold-based method across multiple confidence thresholds and pre-trained architectures using the COVIDx CXR-4 dataset. Results demonstrate that the proposed framework significantly reduces confidently incorrect predictions, offering a more trustworthy and efficient decision-support system for high-stakes domains.
LGFeb 24, 2024
Clustering in Dynamic Environments: A Framework for Benchmark Dataset Generation With Heterogeneous ChangesDanial Yazdani, Juergen Branke, Mohammad Sadegh Khorshidi et al.
Clustering in dynamic environments is of increasing importance, with broad applications ranging from real-time data analysis and online unsupervised learning to dynamic facility location problems. While meta-heuristics have shown promising effectiveness in static clustering tasks, their application for tracking optimal clustering solutions or robust clustering over time in dynamic environments remains largely underexplored. This is partly due to a lack of dynamic datasets with diverse, controllable, and realistic dynamic characteristics, hindering systematic performance evaluations of clustering algorithms in various dynamic scenarios. This deficiency leads to a gap in our understanding and capability to effectively design algorithms for clustering in dynamic environments. To bridge this gap, this paper introduces the Dynamic Dataset Generator (DDG). DDG features multiple dynamic Gaussian components integrated with a range of heterogeneous, local, and global changes. These changes vary in spatial and temporal severity, patterns, and domain of influence, providing a comprehensive tool for simulating a wide range of dynamic scenarios.
CVApr 3
ProtoFlow: Mitigating Forgetting in Class-Incremental Remote Sensing Segmentation via Low-Curvature Prototype FlowJiekai Wu, Rong Fu, Chuangqi Li et al.
Remote sensing segmentation in real deployment is inherently continual: new semantic categories emerge, and acquisition conditions shift across seasons, cities, and sensors. Despite recent progress, many incremental approaches still treat training steps as isolated updates, which leaves representation drift and forgetting insufficiently controlled. We present ProtoFlow, a time-aware prototype dynamics framework that models class prototypes as trajectories and learns their evolution with an explicit temporal vector field. By jointly enforcing low-curvature motion and inter-class separation, ProtoFlow stabilizes prototype geometry throughout incremental learning. Experiments on standard class- and domain-incremental remote sensing benchmarks show consistent gains over strong baselines, including up to 1.5-2.0 points improvement in mIoUall, together with reduced forgetting. These results suggest that explicitly modeling temporal prototype evolution is a practical and interpretable strategy for robust continual remote sensing segmentation.
LGJan 10, 2025
Kolmogorov-Arnold networks for metal surface defect classificationMaciej Krzywda, Mariusz Wermiński, Szymon Łukasik et al.
This paper presents the application of Kolmogorov-Arnold Networks (KAN) in classifying metal surface defects. Specifically, steel surfaces are analyzed to detect defects such as cracks, inclusions, patches, pitted surfaces, and scratches. Drawing on the Kolmogorov-Arnold theorem, KAN provides a novel approach compared to conventional multilayer perceptrons (MLPs), facilitating more efficient function approximation by utilizing spline functions. The results show that KAN networks can achieve better accuracy than convolutional neural networks (CNNs) with fewer parameters, resulting in faster convergence and improved performance in image classification.
LGJan 11, 2024
Semantic-Preserving Feature Partitioning for Multi-View Ensemble LearningMohammad Sadegh Khorshidi, Navid Yazdanjue, Hassan Gharoun et al.
In machine learning, the exponential growth of data and the associated ``curse of dimensionality'' pose significant challenges, particularly with expansive yet sparse datasets. Addressing these challenges, multi-view ensemble learning (MEL) has emerged as a transformative approach, with feature partitioning (FP) playing a pivotal role in constructing artificial views for MEL. Our study introduces the Semantic-Preserving Feature Partitioning (SPFP) algorithm, a novel method grounded in information theory. The SPFP algorithm effectively partitions datasets into multiple semantically consistent views, enhancing the MEL process. Through extensive experiments on eight real-world datasets, ranging from high-dimensional with limited instances to low-dimensional with high instances, our method demonstrates notable efficacy. It maintains model accuracy while significantly improving uncertainty measures in scenarios where high generalization performance is achievable. Conversely, it retains uncertainty metrics while enhancing accuracy where high generalization accuracy is less attainable. An effect size analysis further reveals that the SPFP algorithm outperforms benchmark models by large effect size and reduces computational demands through effective dimensionality reduction. The substantial effect sizes observed in most experiments underscore the algorithm's significant improvements in model performance.
CLApr 9
Graph Neural Networks for Misinformation Detection: Performance-Efficiency Trade-offsSoveatin Kuntur, Maciej Krzywda, Anna Wróblewska et al.
The rapid spread of online misinformation has led to increasingly complex detection models, including large language models and hybrid architectures. However, their computational cost and deployment limitations raise concerns about practical applicability. In this work, we benchmark graph neural networks (GNNs) against non-graph-based machine learning methods under controlled and comparable conditions. We evaluate lightweight GNN architectures (GCN, GraphSAGE, GAT, ChebNet) against Logistic Regression, Support Vector Machines, and Multilayer Perceptrons across seven public datasets in English, Indonesian, and Polish. All models use identical TF-IDF features to isolate the impact of relational structure. Performance is measured using F1 score, with inference time reported to assess efficiency. GNNs consistently outperform non-graph baselines across all datasets. For example, GraphSAGE achieves 96.8% F1 on Kaggle and 91.9% on WELFake, compared to 73.2% and 66.8% for MLP, respectively. On COVID-19, GraphSAGE reaches 90.5% F1 vs. 74.9%, while ChebNet attains 79.1% vs. 66.4% on FakeNewsNet. These gains are achieved with comparable or lower inference times. Overall, the results show that classic GNNs remain effective and efficient, challenging the need for increasingly complex architectures in misinformation detection.
NESep 16, 2025
From Embeddings to Equations: Genetic-Programming Surrogates for Interpretable Transformer ClassificationMohammad Sadegh Khorshidi, Navid Yazdanjue, Hassan Gharoun et al.
We study symbolic surrogate modeling of frozen Transformer embeddings to obtain compact, auditable classifiers with calibrated probabilities. For five benchmarks (SST2G, 20NG, MNIST, CIFAR10, MSC17), embeddings from ModernBERT, DINOv2, and SigLIP are partitioned on the training set into disjoint, information-preserving views via semantic-preserving feature partitioning (SPFP). A cooperative multi-population genetic program (MEGP) then learns additive, closed-form logit programs over these views. Across 30 runs per dataset we report F1, AUC, log-loss, Brier, expected calibration error (ECE), and symbolic complexity; a canonical model is chosen by a one-standard-error rule on validation F1 with a parsimony tie-break. Temperature scaling fitted on validation yields substantial ECE reductions on test. The resulting surrogates achieve strong discrimination (up to F1 around 0.99 on MNIST, CIFAR10, MSC17; around 0.95 on SST2G), while 20NG remains most challenging. We provide reliability diagrams, dimension usage and overlap statistics, contribution-based importances, and global effect profiles (PDP and ALE), demonstrating faithful, cross-modal explanations grounded in explicit programs.
NESep 16, 2025
Multi-population Ensemble Genetic Programming via Cooperative Coevolution and Multi-view Learning for ClassificationMohammad Sadegh Khorshidi, Navid Yazdanjue, Hassan Gharoun et al.
This paper introduces Multi-population Ensemble Genetic Programming (MEGP), a computational intelligence framework that integrates cooperative coevolution and the multiview learning paradigm to address classification challenges in high-dimensional and heterogeneous feature spaces. MEGP decomposes the input space into conditionally independent feature subsets, enabling multiple subpopulations to evolve in parallel while interacting through a dynamic ensemble-based fitness mechanism. Each individual encodes multiple genes whose outputs are aggregated via a differentiable softmax-based weighting layer, enhancing both model interpretability and adaptive decision fusion. A hybrid selection mechanism incorporating both isolated and ensemble-level fitness promotes inter-population cooperation while preserving intra-population diversity. This dual-level evolutionary dynamic facilitates structured search exploration and reduces premature convergence. Experimental evaluations across eight benchmark datasets demonstrate that MEGP consistently outperforms a baseline GP model in terms of convergence behavior and generalization performance. Comprehensive statistical analyses validate significant improvements in Log-Loss, Precision, Recall, F1 score, and AUC. MEGP also exhibits robust diversity retention and accelerated fitness gains throughout evolution, highlighting its effectiveness for scalable, ensemble-driven evolutionary learning. By unifying population-based optimization, multi-view representation learning, and cooperative coevolution, MEGP contributes a structurally adaptive and interpretable framework that advances emerging directions in evolutionary machine learning.
CLJul 24, 2025
The Role of Orthographic Consistency in Multilingual Embedding Models for Text Classification in Arabic-Script LanguagesAbdulhady Abas Abdullah, Amir H. Gandomi, Tarik A Rashid et al.
In natural language processing, multilingual models like mBERT and XLM-RoBERTa promise broad coverage but often struggle with languages that share a script yet differ in orthographic norms and cultural context. This issue is especially notable in Arabic-script languages such as Kurdish Sorani, Arabic, Persian, and Urdu. We introduce the Arabic Script RoBERTa (AS-RoBERTa) family: four RoBERTa-based models, each pre-trained on a large corpus tailored to its specific language. By focusing pre-training on language-specific script features and statistics, our models capture patterns overlooked by general-purpose models. When fine-tuned on classification tasks, AS-RoBERTa variants outperform mBERT and XLM-RoBERTa by 2 to 5 percentage points. An ablation study confirms that script-focused pre-training is central to these gains. Error analysis using confusion matrices shows how shared script traits and domain-specific content affect performance. Our results highlight the value of script-aware specialization for languages using the Arabic script and support further work on pre-training strategies rooted in script and language specificity.
DBFeb 1, 2024
Text-Based Product Matching -- Semi-Supervised Clustering ApproachAlicja Martinek, Szymon Łukasik, Amir H. Gandomi
Matching identical products present in multiple product feeds constitutes a crucial element of many tasks of e-commerce, such as comparing product offerings, dynamic price optimization, and selecting the assortment personalized for the client. It corresponds to the well-known machine learning task of entity matching, with its own specificity, like omnipresent unstructured data or inaccurate and inconsistent product descriptions. This paper aims to present a new philosophy to product matching utilizing a semi-supervised clustering approach. We study the properties of this method by experimenting with the IDEC algorithm on the real-world dataset using predominantly textual features and fuzzy string matching, with more standard approaches as a point of reference. Encouraging results show that unsupervised matching, enriched with a small annotated sample of product links, could be a possible alternative to the dominant supervised strategy, requiring extensive manual data labeling.
LGOct 19, 2025
Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration MetricsHassan Gharoun, Mohammad Sadegh Khorshidi, Kasra Ranjbarigderi et al.
Despite extensive research on neural network calibration, existing methods typically apply global transformations that treat all predictions uniformly, overlooking the heterogeneous reliability of individual predictions. Furthermore, the relationship between improved calibration and effective uncertainty-aware decision-making remains largely unexplored. This paper presents a post-hoc calibration framework that leverages prediction reliability assessment to jointly enhance calibration quality and uncertainty-aware decision-making. The framework employs proximity-based conformal prediction to stratify calibration samples into putatively correct and putatively incorrect groups based on semantic similarity in feature space. A dual calibration strategy is then applied: standard isotonic regression calibrated confidence in putatively correct predictions, while underconfidence-regularized isotonic regression reduces confidence toward uniform distributions for putatively incorrect predictions, facilitating their identification for further investigations. A comprehensive evaluation is conducted using calibration metrics, uncertainty-aware performance measures, and empirical conformal coverage. Experiments on CIFAR-10 and CIFAR-100 with BiT and CoAtNet backbones show that the proposed method achieves lower confidently incorrect predictions, and competitive Expected Calibration Error compared with isotonic and focal-loss baselines. This work bridges calibration and uncertainty quantification through instance-level adaptivity, offering a practical post-hoc solution that requires no model retraining while improving both probability alignment and uncertainty-aware decision-making.
AIOct 14, 2025
Evolution of meta's llama models and parameter-efficient fine-tuning of large language models: a surveyAbdulhady Abas Abdullah, Arkaitz Zubiaga, Seyedali Mirjalili et al.
This review surveys the rapid evolution of Meta AI's LLaMA (Large Language Model Meta AI) series - from LLaMA 1 through LLaMA 4 and the specialized parameter-efficient fine-tuning (PEFT) methods developed for these models. We first describe the LLaMA family of foundation models (7B-65B to 288B parameters), their architectures (including native multimodal and Mixtureof-Experts variants), and key performance characteristics. We then describe and discuss the concept of PEFT, which adapts large pre-trained models by updating only a small subset of parameters, and review five PEFT methods that have been applied to LLaMA: LoRA (Low-Rank Adaptation), LLaMA-Adapter V1 and V2, LLaMA-Excitor, and QLoRA (Quantized LoRA). We discuss each method's mechanism, parameter savings, and example application to LLaMA (e.g., instruction tuning, multimodal tasks). We provide structured discussion and analysis of model and adapter architectures, parameter counts, and benchmark results (including examples where fine-tuned LLaMA models outperform larger baselines). Finally, we examine real-world use cases where LLaMA-based models and PEFT have been successfully applied (e.g., legal and medical domains), and we discuss ongoing challenges and future research directions (such as scaling to even larger contexts and improving robustness). This survey paper provides a one-stop resource for ML researchers and practitioners interested in LLaMA models and efficient fine-tuning strategies.
NESep 20, 2025
Domain-Informed Genetic Superposition Programming: A Case Study on SFRC BeamsMohammad Sadegh Khorshidi, Navid Yazdanjue, Hassan Gharoun et al.
This study presents domain-informed genetic superposition programming (DIGSP), a symbolic regression framework tailored for engineering systems governed by separable physical mechanisms. DIGSP partitions the input space into domain-specific feature subsets and evolves independent genetic programming (GP) populations to model material-specific effects. Early evolution occurs in isolation, while ensemble fitness promotes inter-population cooperation. To enable symbolic superposition, an adaptive hierarchical symbolic abstraction mechanism (AHSAM) is triggered after stagnation across all populations. AHSAM performs analysis of variance- (ANOVA) based filtering to identify statistically significant individuals, compresses them into symbolic constructs, and injects them into all populations through a validation-guided pruning cycle. The DIGSP is benchmarked against a baseline multi-gene genetic programming (BGP) model using a dataset of steel fiber-reinforced concrete (SFRC) beams. Across 30 independent trials with 65% training, 10% validation, and 25% testing splits, DIGSP consistently outperformed BGP in training and test root mean squared error (RMSE). The Wilcoxon rank-sum test confirmed statistical significance (p < 0.01), and DIGSP showed tighter error distributions and fewer outliers. No significant difference was observed in validation RMSE due to limited sample size. These results demonstrate that domain-informed structural decomposition and symbolic abstraction improve convergence and generalization. DIGSP offers a principled and interpretable modeling strategy for systems where symbolic superposition aligns with the underlying physical structure.
CVSep 11, 2025
Proximity-Based Evidence Retrieval for Uncertainty-Aware Neural NetworksHassan Gharoun, Mohammad Sadegh Khorshidi, Kasra Ranjbarigderi et al.
This work proposes an evidence-retrieval mechanism for uncertainty-aware decision-making that replaces a single global cutoff with an evidence-conditioned, instance-adaptive criterion. For each test instance, proximal exemplars are retrieved in an embedding space; their predictive distributions are fused via Dempster-Shafer theory. The resulting fused belief acts as a per-instance thresholding mechanism. Because the supporting evidences are explicit, decisions are transparent and auditable. Experiments on CIFAR-10/100 with BiT and ViT backbones show higher or comparable uncertainty-aware performance with materially fewer confidently incorrect outcomes and a sustainable review load compared with applying threshold on prediction entropy. Notably, only a few evidences are sufficient to realize these gains; increasing the evidence set yields only modest changes. These results indicate that evidence-conditioned tagging provides a more reliable and interpretable alternative to fixed prediction entropy thresholds for operational uncertainty-aware decision-making.
CLJul 19, 2025
A Language Model-Driven Semi-Supervised Ensemble Framework for Illicit Market Detection Across Deep/Dark Web and Social PlatformsNavid Yazdanjue, Morteza Rakhshaninejad, Hossein Yazdanjouei et al.
Illegal marketplaces have increasingly shifted to concealed parts of the internet, including the deep and dark web, as well as platforms such as Telegram, Reddit, and Pastebin. These channels enable the anonymous trade of illicit goods including drugs, weapons, and stolen credentials. Detecting and categorizing such content remains challenging due to limited labeled data, the evolving nature of illicit language, and the structural heterogeneity of online sources. This paper presents a hierarchical classification framework that combines fine-tuned language models with a semi-supervised ensemble learning strategy to detect and classify illicit marketplace content across diverse platforms. We extract semantic representations using ModernBERT, a transformer model for long documents, finetuned on domain-specific data from deep and dark web pages, Telegram channels, Subreddits, and Pastebin pastes to capture specialized jargon and ambiguous linguistic patterns. In addition, we incorporate manually engineered features such as document structure, embedded patterns including Bitcoin addresses, emails, and IPs, and metadata, which complement language model embeddings. The classification pipeline operates in two stages. The first stage uses a semi-supervised ensemble of XGBoost, Random Forest, and SVM with entropy-based weighted voting to detect sales-related documents. The second stage further classifies these into drug, weapon, or credential sales. Experiments on three datasets, including our multi-source corpus, DUTA, and CoDA, show that our model outperforms several baselines, including BERT, ModernBERT, DarkBERT, ALBERT, Longformer, and BigBird. The model achieves an accuracy of 0.96489, an F1-score of 0.93467, and a TMCC of 0.95388, demonstrating strong generalization, robustness under limited supervision, and effectiveness in real-world illicit content detection.
LGJun 13, 2024
A Review of 315 Benchmark and Test Functions for Machine Learning Optimization Algorithms and Metaheuristics with Mathematical and Visual DescriptionsM. Z. Naser, Mohammad Khaled al-Bashiti, Arash Teymori Gharah Tapeh et al.
In the rapidly evolving optimization and metaheuristics domains, the efficacy of algorithms is crucially determined by the benchmark (test) functions. While several functions have been developed and derived over the past decades, little information is available on the mathematical and visual description, range of suitability, and applications of many such functions. To bridge this knowledge gap, this review provides an exhaustive survey of more than 300 benchmark functions used in the evaluation of optimization and metaheuristics algorithms. This review first catalogs benchmark and test functions based on their characteristics, complexity, properties, visuals, and domain implications to offer a wide view that aids in selecting appropriate benchmarks for various algorithmic challenges. This review also lists the 25 most commonly used functions in the open literature and proposes two new, highly dimensional, dynamic and challenging functions that could be used for testing new algorithms. Finally, this review identifies gaps in current benchmarking practices and suggests directions for future research.