Patrick O. Glauner

CV
5papers
141citations
Novelty16%
AI Score18

5 Papers

IROct 1, 2013Code
Use of Solr and Xapian in the Invenio document repository software

Patrick O. Glauner, Jan Iwaszkiewicz, Jean-Yves Le Meur et al.

Invenio is a free comprehensive web-based document repository and digital library software suite originally developed at CERN. It can serve a variety of use cases from an institutional repository or digital library to a web journal. In order to fully use full-text documents for efficient search and ranking, Solr was integrated into Invenio through a generic bridge. Solr indexes extracted full-texts and most relevant metadata. Consequently, Invenio takes advantage of Solr's efficient search and word similarity ranking capabilities. In this paper, we first give an overview of Invenio, its capabilities and features. We then present our open source Solr integration as well as scalability challenges that arose for an Invenio-based multi-million record repository: the CERN Document Server. We also compare our Solr adapter to an alternative Xapian adapter using the same generic bridge. Both integrations are distributed with the Invenio package and ready to be used by the institutions using or adopting Invenio.

LGFeb 26, 2016
Large-Scale Detection of Non-Technical Losses in Imbalanced Data Sets

Patrick O. Glauner, Andre Boechat, Lautaro Dolberg et al.

Non-technical losses (NTL) such as electricity theft cause significant harm to our economies, as in some countries they may range up to 40% of the total electricity distributed. Detecting NTLs requires costly on-site inspections. Accurate prediction of NTLs for customers using machine learning is therefore crucial. To date, related research largely ignore that the two classes of regular and non-regular customers are highly imbalanced, that NTL proportions may change and mostly consider small data sets, often not allowing to deploy the results in production. In this paper, we present a comprehensive approach to assess three NTL detection models for different NTL proportions in large real world data sets of 100Ks of customers: Boolean rules, fuzzy logic and Support Vector Machine. This work has resulted in appreciable results that are about to be deployed in a leading industry solution. We believe that the considerations and observations made in this contribution are necessary for future smart meter research in order to report their effectiveness on imbalanced and large real world data sets.

CVJan 30, 2016
Deep Learning For Smile Recognition

Patrick O. Glauner

Inspired by recent successes of deep learning in computer vision, we propose a novel application of deep convolutional neural networks to facial expression recognition, in particular smile recognition. A smile recognition test accuracy of 99.45% is achieved for the Denver Intensity of Spontaneous Facial Action (DISFA) database, significantly outperforming existing approaches based on hand-crafted features with accuracies ranging from 65.55% to 79.67%. The novelty of this approach includes a comprehensive model selection of the architecture parameters, allowing to find an appropriate architecture for each expression such as smile. This is feasible because all experiments were run on a Tesla K40c GPU, allowing a speedup of factor 10 over traditional computations on a CPU.

CVAug 26, 2015
Deep Convolutional Neural Networks for Smile Recognition

Patrick O. Glauner

This thesis describes the design and implementation of a smile detector based on deep convolutional neural networks. It starts with a summary of neural networks, the difficulties of training them and new training methods, such as Restricted Boltzmann Machines or autoencoders. It then provides a literature review of convolutional neural networks and recurrent neural networks. In order to select databases for smile recognition, comprehensive statistics of databases popular in the field of facial expression recognition were generated and are summarized in this thesis. It then proposes a model for smile detection, of which the main part is implemented. The experimental results are discussed in this thesis and justified based on a comprehensive model selection performed. All experiments were run on a Tesla K40c GPU benefiting from a speedup of up to factor 10 over the computations on a CPU. A smile detection test accuracy of 99.45% is achieved for the Denver Intensity of Spontaneous Facial Action (DISFA) database, significantly outperforming existing approaches with accuracies ranging from 65.55% to 79.67%. This experiment is re-run under various variations, such as retaining less neutral images or only the low or high intensities, of which the results are extensively compared.

LGApr 26, 2015
Comparison of Training Methods for Deep Neural Networks

Patrick O. Glauner

This report describes the difficulties of training neural networks and in particular deep neural networks. It then provides a literature review of training methods for deep neural networks, with a focus on pre-training. It focuses on Deep Belief Networks composed of Restricted Boltzmann Machines and Stacked Autoencoders and provides an outreach on further and alternative approaches. It also includes related practical recommendations from the literature on training them. In the second part, initial experiments using some of the covered methods are performed on two databases. In particular, experiments are performed on the MNIST hand-written digit dataset and on facial emotion data from a Kaggle competition. The results are discussed in the context of results reported in other research papers. An error rate lower than the best contribution to the Kaggle competition is achieved using an optimized Stacked Autoencoder.