LGFeb 1
PolySAE: Modeling Feature Interactions in Sparse Autoencoders via Polynomial DecodingPanagiotis Koromilas, Andreas D. Demou, James Oldfield et al.
Sparse autoencoders (SAEs) have emerged as a promising method for interpreting neural network representations by decomposing activations into sparse combinations of dictionary atoms. However, SAEs assume that features combine additively through linear reconstruction, an assumption that cannot capture compositional structure: linear models cannot distinguish whether "Starbucks" arises from the composition of "star" and "coffee" features or merely their co-occurrence. This forces SAEs to allocate monolithic features for compound concepts rather than decomposing them into interpretable constituents. We introduce PolySAE, which extends the SAE decoder with higher-order terms to model feature interactions while preserving the linear encoder essential for interpretability. Through low-rank tensor factorization on a shared projection subspace, PolySAE captures pairwise and triple feature interactions with small parameter overhead (3% on GPT2). Across four language models and three SAE variants, PolySAE achieves an average improvement of approximately 8% in probing F1 while maintaining comparable reconstruction error, and produces 2-10$\times$ larger Wasserstein distances between class-conditional feature distributions. Critically, learned interaction weights exhibit negligible correlation with co-occurrence frequency ($r = 0.06$ vs. $r = 0.82$ for SAE feature covariance), suggesting that polynomial terms capture compositional structure, such as morphological binding and phrasal composition, largely independent of surface statistics.
LGSep 30, 2019
A Gradient Free Neural Network Framework Based on Universal Approximation TheoremNikolaos P. Bakas, Andreas Langousis, Mihalis Nicolaou et al.
We present a numerical scheme for computation of Artificial Neural Networks (ANN) weights, which stems from the Universal Approximation Theorem, avoiding laborious iterations. The proposed algorithm adheres to the underlying theory, is highly fast, and results in remarkably low errors when applied for regression and classification of complex data-sets, such as the Griewank function of multiple variables $\mathbf{x} \in \mathbb{R}^{100}$ with random noise addition, and MNIST database for handwritten digits recognition, with $7\times10^4$ images. The same mathematical formulation is found capable of approximating highly nonlinear functions in multiple dimensions, with low errors (e.g. $10^{-10}$) for the test-set of the unknown functions, their higher-order partial derivatives, as well as numerically solving Partial Differential Equations. The method is based on the calculation of the weights of each neuron in small neighborhoods of the data, such that the corresponding local approximation matrix is invertible. Accordingly, optimization of hyperparameters is not necessary, as the number of neurons stems directly from the dimensionality of the data, further improving the algorithmic speed. Under this setting, overfitting is inherently avoided, and the results are interpretable and reproducible. The complexity of the proposed algorithm is of class P with $\mathcal{O}(mn^2)+\mathcal{O}(\frac{m^3}{n^2})-\mathcal{O}(\log(n+1))$ computing time, with respect to the observations $m$ and features $n$, in contrast with the NP-Complete class of standard algorithms for ANN training. The performance of the method is high, irrespective of the size of the dataset, and the test-set errors are similar or smaller than the training errors, indicating the generalization efficiency of the algorithm.
CVMay 1, 2019
3DFaceGAN: Adversarial Nets for 3D Face Representation, Generation, and TranslationStylianos Moschoglou, Stylianos Ploumpis, Mihalis Nicolaou et al.
Over the past few years, Generative Adversarial Networks (GANs) have garnered increased interest among researchers in Computer Vision, with applications including, but not limited to, image generation, translation, imputation, and super-resolution. Nevertheless, no GAN-based method has been proposed in the literature that can successfully represent, generate or translate 3D facial shapes (meshes). This can be primarily attributed to two facts, namely that (a) publicly available 3D face databases are scarce as well as limited in terms of sample size and variability (e.g., few subjects, little diversity in race and gender), and (b) mesh convolutions for deep networks present several challenges that are not entirely tackled in the literature, leading to operator approximations and model instability, often failing to preserve high-frequency components of the distribution. As a result, linear methods such as Principal Component Analysis (PCA) have been mainly utilized towards 3D shape analysis, despite being unable to capture non-linearities and high frequency details of the 3D face - such as eyelid and lip variations. In this work, we present 3DFaceGAN, the first GAN tailored towards modeling the distribution of 3D facial surfaces, while retaining the high frequency details of 3D face shapes. We conduct an extensive series of both qualitative and quantitative experiments, where the merits of 3DFaceGAN are clearly demonstrated against other, state-of-the-art methods in tasks such as 3D shape representation, generation, and translation.
CVDec 15, 2017
Multi-Attribute Robust Component Analysis for Facial UV MapsStylianos Moschoglou, Evangelos Ververas, Yannis Panagakis et al.
Recently, due to the collection of large scale 3D face models, as well as the advent of deep learning, a significant progress has been made in the field of 3D face alignment "in-the-wild". That is, many methods have been proposed that establish sparse or dense 3D correspondences between a 2D facial image and a 3D face model. The utilization of 3D face alignment introduces new challenges and research directions, especially on the analysis of facial texture images. In particular, texture does not suffer any more from warping effects (that occurred when 2D face alignment methods were used). Nevertheless, since facial images are commonly captured in arbitrary recording conditions, a considerable amount of missing information and gross outliers is observed (e.g., due to self-occlusion, or subjects wearing eye-glasses). Given that many annotated databases have been developed for face analysis tasks, it is evident that component analysis techniques need to be developed in order to alleviate issues arising from the aforementioned challenges. In this paper, we propose a novel component analysis technique that is suitable for facial UV maps containing a considerable amount of missing information and outliers, while additionally, incorporates knowledge from various attributes (such as age and identity). We evaluate the proposed Multi-Attribute Robust Component Analysis (MA-RCA) on problems such as UV completion and age progression, where the proposed method outperforms compared techniques. Finally, we demonstrate that MA-RCA method is powerful enough to provide weak annotations for training deep learning systems for various applications, such as illumination transfer.