IR Sep 13, 2024
NeSHFS: Neighborhood Search with Heuristic-based Feature Selection for Click-Through Rate Prediction
Dogukan Aksu, Ismail Hakki Toroslu, Hasan Davulcu
Click-through rate (CTR) prediction plays an important role in online advertising and ad recommender systems. Over the past decade, maximizing CTR has been the main focus of model development, and researchers and practitioners have proposed various models and solutions to make CTR prediction more effective. Most of the existing literature focuses on capturing either implicit or explicit feature interactions. Although some studies capture implicit interactions successfully, explicit interactions remain a challenge because they require extracting both low-order and high-order feature interactions. Unnecessary and irrelevant features can increase computation time and lower prediction performance. Furthermore, certain features may perform well with specific predictive models while underperforming with others, and feature distributions may fluctuate with traffic variations. Most importantly, in live production environments resources are limited, and inference time is just as crucial as training time. For all these reasons, feature selection is one of the most important factors in improving CTR prediction model performance. Simple filter-based feature selection algorithms are not sufficient. An effective and efficient feature selection algorithm is needed to consistently select the most useful features during the live CTR prediction process. In this paper, we propose a heuristic algorithm named Neighborhood Search with Heuristic-based Feature Selection (NeSHFS) to enhance CTR prediction performance while reducing dimensionality and training time costs. We conduct comprehensive experiments on three public datasets to validate the efficiency and effectiveness of our proposed solution.
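The abstract does not spell out how NeSHFS works, so the following is only a generic illustration of neighborhood search over feature subsets, not the authors' algorithm. It assumes a neighborhood made of all subsets obtained by dropping one feature, and a caller-supplied `score` function (in practice this might be validation AUC of a CTR model retrained on the subset). The names `neighborhood_search` and `toy_score`, and the toy feature names, are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def neighborhood_search(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy local search: repeatedly drop one feature if the score does not get worse."""
    current = frozenset(features)
    best = score(current)
    improved = True
    while improved and len(current) > 1:
        improved = False
        # Neighborhood (assumption): every subset with exactly one feature removed.
        for feature in sorted(current):
            candidate = current - {feature}
            candidate_score = score(candidate)
            if candidate_score >= best:
                # First-improvement move: accept and restart from the smaller subset.
                current, best = candidate, candidate_score
                improved = True
                break
    return current, best


# Toy stand-in for model evaluation: reward two "useful" features,
# charge a small cost per feature kept.
USEFUL = frozenset({"user_id", "ad_id"})


def toy_score(subset: FrozenSet[str]) -> float:
    return len(subset & USEFUL) - 0.01 * len(subset)


selected, final_score = neighborhood_search(
    ["user_id", "ad_id", "noise_a", "noise_b"], toy_score
)
print(sorted(selected), final_score)
```

With a real model, each call to `score` means a training run, which is why a search procedure that evaluates few candidate subsets matters for the training-time cost the abstract mentions.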
LG Jun 17, 2025
Fair for a few: Improving Fairness in Doubly Imbalanced Datasets
Ata Yalcin, Asli Umay Ozturk, Yigit Sever et al.
Fairness has been identified as an important aspect of Machine Learning and Artificial Intelligence solutions for decision making. Recent literature offers a variety of approaches for debiasing; however, many of them fall short when the data collection is imbalanced. In this paper, we focus on a particular case, fairness in doubly imbalanced datasets, where the data collection is imbalanced both for the label and for the groups in the sensitive attribute. First, we present an exploratory analysis to illustrate the limitations of debiasing on a doubly imbalanced dataset. Then, we propose a multi-criteria solution for finding the most suitable sampling and distribution for the label and the sensitive attribute, in terms of fairness and classification accuracy.
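To make "doubly imbalanced" concrete, here is a minimal sketch that summarizes how skewed a dataset is along both the label and the sensitive attribute. It is not the paper's analysis or its multi-criteria method; the function name `imbalance_summary` and the toy data are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def imbalance_summary(
    labels: Sequence[Hashable], groups: Sequence[Hashable]
) -> Tuple[Dict[Hashable, float], Dict[Hashable, float], Counter]:
    """Return label shares, sensitive-group shares, and joint (group, label) counts."""
    if len(labels) != len(groups):
        raise ValueError("labels and groups must have the same length")
    n = len(labels)
    label_share = {k: v / n for k, v in Counter(labels).items()}
    group_share = {k: v / n for k, v in Counter(groups).items()}
    joint_counts = Counter(zip(groups, labels))
    return label_share, group_share, joint_counts


# Toy data: positives are rare (1 of 10) and group "b" is rare (2 of 10).
labels = [0] * 9 + [1]
groups = ["a"] * 8 + ["b"] * 2

label_share, group_share, joint_counts = imbalance_summary(labels, groups)
print(label_share, group_share, dict(joint_counts))
```

In this toy sample the cell for group "a" with a positive label is empty, which hints at why methods that estimate per-group positive rates can struggle when both imbalances occur together.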
DB Feb 24, 2014
Secure Logical Schema and Decomposition Algorithm for Proactive Context Dependent Attribute Based Access Control
Ugur Turan, Ismail Hakki Toroslu
Traditional database access control mechanisms use role-based methods, generally with row-based and attribute-based constraints for granularity, and privacy is achieved mainly by using views. However, if only a set of views conforming to the policy is made accessible to users, then this set must be checked against the policy for the whole probable query history. The aim of this work is to define a proactive decomposition algorithm based on attribute-based policy rules and to build a secure logical schema in which relations are decomposed into several smaller ones in order to inhibit joins or inferences that may violate predefined privacy constraints. Attributes whose association should not be inferred are defined as having a security dependency among them, and they form a new kind of context-dependent attribute-based policy rule called a security dependent set. The decomposition algorithm works on a logical schema with given security dependent sets and aims to prohibit inference of the association among the elements of these sets. It is also proven that the decomposition technique generates a secure logical schema that complies with the given security dependent set constraints.
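The idea of a security dependent set can be illustrated with a simple necessary check: no single relation in the decomposed schema should contain every attribute of such a set. This sketch is not the paper's decomposition algorithm, and it does not cover associations that could be rebuilt by joining relations, which the paper's approach also aims to inhibit. The function name `violating_relations` and the example schema are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def violating_relations(
    schema: Dict[str, Set[str]],
    security_dependent_sets: Iterable[Iterable[str]],
) -> List[Tuple[str, FrozenSet[str]]]:
    """List (relation, set) pairs where one relation holds a whole security dependent set."""
    violations = []
    for sds in security_dependent_sets:
        sds_attrs = frozenset(sds)
        for relation, attributes in schema.items():
            # A relation exposes the association directly if it contains all attributes.
            if sds_attrs <= set(attributes):
                violations.append((relation, sds_attrs))
    return violations


# Hypothetical policy: a patient's name must not be associated with a diagnosis.
policy = [{"name", "diagnosis"}]

original = {"patient": {"id", "name", "diagnosis", "ward"}}
decomposed = {
    "patient_identity": {"id", "name"},
    "patient_case": {"case_id", "diagnosis", "ward"},
}

print(violating_relations(original, policy))    # one violation
print(violating_relations(decomposed, policy))  # no violations
```

Note that the decomposed example deliberately gives the two relations no shared key; if both kept `id`, a join would restore the association even though this check would still pass.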