Juergen Mueller

4 papers

68 citations

Novelty: 45%

AI Score: 22

Ranked #186,757 of 205,806 authors (top 91%); #1,882 in IR (top 85%)

4 Papers

CV · Dec 14, 2017

Adaptive kNN using Expected Accuracy for Classification of Geo-Spatial Data

Mark Kibanov, Martin Becker, Juergen Mueller et al.

The k-Nearest Neighbor (kNN) classification approach is conceptually simple yet widely applied, since it often performs well in practical applications. However, using a global constant k does not always provide an optimal solution, e.g., for datasets with an irregular density distribution of data points. This paper proposes an adaptive kNN classifier where k is chosen dynamically for each instance (point) to be classified, such that the expected accuracy of classification is maximized. We define the expected accuracy as the accuracy of a set of structurally similar observations. An arbitrary similarity function can be used to find these observations. We introduce and evaluate different similarity functions. For the evaluation, we use five different classification tasks based on geo-spatial data. Each classification task consists of (tens of) thousands of items. We demonstrate that the presented expected accuracy measures can be a good estimator for kNN performance, and that the proposed adaptive kNN classifier outperforms common kNN and previously introduced adaptive kNN algorithms. We also show that the range of considered k can be significantly reduced to speed up the algorithm without a negative influence on classification accuracy.
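The per-instance choice of k described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the similarity function (plain Euclidean proximity), the leave-one-out estimate of expected accuracy, and the names `knn_predict`, `expected_accuracy`, `adaptive_knn_predict`, `k_values`, and `n_similar` are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k, exclude=None):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    candidates = [
        (dist(x, query), label)
        for i, (x, label) in enumerate(train)
        if i != exclude  # lets a training point be left out of its own vote
    ]
    candidates.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in candidates[:k])
    return votes.most_common(1)[0][0]


def expected_accuracy(train, similar_idx, k):
    """Leave-one-out accuracy of kNN with this k on the similar observations."""
    hits = sum(
        knn_predict(train, train[i][0], k, exclude=i) == train[i][1]
        for i in similar_idx
    )
    return hits / len(similar_idx)


def adaptive_knn_predict(train, query, k_values=(1, 3, 5), n_similar=5):
    """Pick the k with the highest expected accuracy, then classify the query."""
    # Assumption: "structurally similar" means the n_similar closest points.
    order = sorted(range(len(train)), key=lambda i: dist(train[i][0], query))
    similar_idx = order[:n_similar]
    # Ties are resolved in favor of the first (smallest) k in k_values.
    best_k = max(k_values, key=lambda k: expected_accuracy(train, similar_idx, k))
    return knn_predict(train, query, best_k), best_k
```

Restricting `k_values` to a short range mirrors the abstract's remark that the range of considered k can be reduced to save time; how far it can be reduced without hurting accuracy is an empirical question the sketch does not answer.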

IR · Oct 27, 2017

Combining Aspects of Genetic Algorithms with Weighted Recommender Hybridization

Juergen Mueller

Recommender systems are established means to inspire users to watch interesting movies, discover baby names, or read books. The recommendation quality improves further when the results of multiple recommendation algorithms are combined using hybridization methods. In this paper, we focus on the task of combining unscored recommendations into a single ensemble. Our proposed method is inspired by genetic algorithms. It repeatedly selects items from the recommendations to create a population of items that will be used for the final ensemble. We compare our method with a weighted voting method and test the performance of both in a movie- and a name-recommendation scenario. We were able to outperform the weighted method on both datasets by 20.3% and 31.1% and decreased the overall execution time by up to 19.9%. Our results not only propose a new kind of hybridization method but also open the field of recommender hybridization to further work with genetic algorithms.
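The abstract gives no algorithmic detail beyond "repeatedly selects items from the recommendations to create a population", so the following is only a loose illustration of that idea, not the author's method. The function name `sample_ensemble`, the per-recommender `weights`, the rank-based selection probability `1 / (rank + 1)`, and the `rounds` and `seed` parameters are all hypothetical.

```python
import random
from collections import Counter


def sample_ensemble(rec_lists, weights, size, rounds=200, seed=0):
    """Build an ensemble by repeatedly sampling items into a population.

    rec_lists: ranked, unscored recommendation lists (best item first).
    weights:   one hypothetical trust weight per recommender.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    population = Counter()
    for _ in range(rounds):
        # Selection step: choose a recommender proportionally to its weight.
        recs = rng.choices(rec_lists, weights=weights, k=1)[0]
        # Assumed fitness: earlier ranks are more likely to be selected.
        rank_weights = [1.0 / (rank + 1) for rank in range(len(recs))]
        item = rng.choices(recs, weights=rank_weights, k=1)[0]
        population[item] += 1
    # The final ensemble is the most frequently selected items.
    return [item for item, _ in population.most_common(size)]
```

For example, `sample_ensemble([["a", "b", "c"], ["b", "d"]], [0.7, 0.3], size=3)` returns up to three distinct items drawn from the two lists, ordered by how often they were selected.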

IR · May 9, 2017

Predicting Rising Follower Counts on Twitter Using Profile Information

Juergen Mueller, Gerd Stumme

When evaluating the cause of one's popularity on Twitter, one thing is considered to be the main driver: many tweets. There is debate about the kind of tweet one should publish, but little discussion of factors beyond tweets. Of particular interest is the information provided by each Twitter user's profile page. One such feature is the given name on the profile. Studies in psychology and economics have identified correlations between the first name and, e.g., one's school marks or chances of getting a job interview in the US. Therefore, we are interested in the influence of this profile information on the follower count. We addressed this question by analyzing the profiles of about 6 million Twitter users. All profiles are separated into three groups: users that have a first name, English words, or neither in their name field. The assumption is that names and words influence the discoverability of a user and subsequently his/her follower count. We propose a classifier that labels users who will increase their follower count within a month by applying different models based on the user's group. The classifiers are evaluated with the area under the receiver operating characteristic curve (AUC) and achieve a score above 0.800.
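The evaluation metric named in this abstract, the area under the ROC curve, can be computed directly from labels and classifier scores. The sketch below uses the standard pairwise definition (the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counting one half); the function name `roc_auc` and the 0/1 label encoding are assumptions, and the quadratic loop is meant for illustration rather than for millions of profiles.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for users whose follower count rose, 0 otherwise (assumed encoding).
    scores: classifier scores, higher meaning "more likely to rise".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, a wrong order counts 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives context for the reported value above 0.800.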

CL · Jun 17, 2016

Gender Inference using Statistical Name Characteristics in Twitter

Juergen Mueller, Gerd Stumme

Much attention has been given to the task of gender inference of Twitter users. Although names are strong gender indicators, the names of Twitter users are rarely used as a feature, probably due to the high number of ill-formed names that cannot be found in any name dictionary. Instead of relying solely on a name database, we propose a novel name classifier. Our approach extracts characteristics from the user names and uses them to assign each name to a gender. This enables us to classify international first names as well as ill-formed names.