Jose Maureira

6.7 · DC · Apr 14

Leveraging Mathematical Reasoning of LLMs for Efficient GPU Thread Mapping

Jose Maureira, Cristóbal A. Navarro, Hector Ferrada et al.

Mapping parallel threads onto non-box-shaped domains is a known challenge in GPU computing; efficient mapping prevents performance penalties from unnecessary resource allocation. Currently, achieving this requires significant analytical human effort to manually derive bespoke mapping functions for each geometry. This work introduces a novel approach leveraging the symbolic reasoning of Large Language Models (LLMs) to automate this derivation entirely through in-context learning. Focusing on state-of-the-art open-weights models, we conducted a rigorous comparative analysis across spatial domains of increasing complexity. Our results demonstrate that modern local LLMs successfully infer exact O(1) and O(log N) mapping equations for complex 2D/3D dense domains and 2D fractals, vastly outperforming traditional symbolic regression methods. Crucially, we profile the energetic viability of this approach on high-performance infrastructure, distinguishing between the code-generation and execution phases. While one-time inference incurs a high energy penalty -- particularly for reasoning-focused models like DeepSeek-R1 -- this is a single upfront investment. Once integrated, the generated analytical kernels eliminate block waste entirely, yielding massive energy and time savings (e.g., up to 4833x speedup and 2890x energy reduction) during actual GPU workloads. Finally, we identify a current "reasoning ceiling" when these models face highly recursive 3D fractals (e.g., the Menger Sponge). This limitation benchmarks the present maturity of open-weight architectures, charting a viable path toward fully automated, energy-efficient GPU resource optimization.
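To make the idea of an exact O(1) mapping concrete, the sketch below uses the well-known closed-form map from a linear block index to a 2D lower-triangular domain. It is an illustration only, not one of the equations the LLMs derived in the paper; the names `triangular_map` and `wasted_blocks` are hypothetical and chosen for this example. It is written in Python rather than as a GPU kernel so the arithmetic can be checked directly.

```python
import math


def triangular_map(k: int) -> tuple[int, int]:
    """Map linear block index k to (row, col) in a lower-triangular domain.

    Illustrative closed form (assumption: the triangle includes the diagonal
    and blocks are numbered row by row starting at 0).
    """
    # The row is the largest integer i with i*(i+1)/2 <= k.
    # math.isqrt keeps the computation exact for large k.
    row = (math.isqrt(8 * k + 1) - 1) // 2
    # The column is the offset of k within its row.
    col = k - row * (row + 1) // 2
    return row, col


def wasted_blocks(n: int) -> int:
    """Blocks an n x n bounding-box launch spends outside the triangle."""
    return n * n - n * (n + 1) // 2


if __name__ == "__main__":
    n = 4
    # Launch only n*(n+1)/2 blocks and map each one into the triangle.
    coords = [triangular_map(k) for k in range(n * (n + 1) // 2)]
    print(coords)
    print(wasted_blocks(n))
```

With such a map, a launch needs only n(n+1)/2 blocks instead of the n² blocks of a bounding box, which is the kind of block waste the abstract refers to.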

2.6 · CV · Jul 26, 2021

Synthetic Periocular Iris PAI from a Small Set of Near-Infrared-Images

Jose Maureira, Juan Tapia, Claudia Arellano et al.

Biometrics has grown in relevance because it supports applications such as access control. Unfortunately, as biometric applications are deployed more widely, attacks on them are also increasing. Algorithms that detect such attacks, known as Presentation Attack Detection (PAD), have therefore become more important. The LivDet-2020 competition, which focuses on PAD algorithms, has shown that open problems remain, especially for unknown attack scenarios. To improve the robustness of biometric systems, it is crucial to improve PAD methods. This can be achieved by increasing the number of presentation attack instruments (PAI) and bona fide images used to train such algorithms. Unfortunately, capturing and creating presentation attack instruments, and even capturing bona fide images, is sometimes difficult. This paper proposes a novel synthetically created PAI (SPI-PAI) using four state-of-the-art GAN algorithms (cGAN, WGAN, WGAN-GP, and StyleGAN2) and a small set of periocular NIR images. The GAN algorithms are benchmarked using the Fréchet Inception Distance (FID) between the generated images and the original images used for training. We tested the best PAD algorithm reported by the LivDet-2020 competition on the synthetic PAI obtained with the StyleGAN2 algorithm. Surprisingly, the PAD algorithm was not able to detect the synthetic images as a presentation attack, categorizing all of them as bona fide. These results demonstrate the feasibility of using synthetic images to fool presentation attack detection algorithms, and the need for such algorithms to be constantly updated and trained with a larger number of images and PAI scenarios.
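For reference, the Fréchet Inception Distance named in the abstract is commonly defined as below. This is the standard formulation, not a detail taken from the paper; the symbols are assumptions for this note: (μ_r, Σ_r) and (μ_g, Σ_g) denote the mean and covariance of Inception-network features for the real and generated images, respectively. Lower values indicate generated images whose feature statistics are closer to the real ones.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```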


2 Papers