Fulu Li

h-index: 12

4 papers

863 citations

Novelty: 22%

AI Score: 18

Ranked #189,841 of 194,257 authors (top 98%); #39,852 in LG (top 99%)

4 Papers

2.1 · IV · Jul 6

Non-contact, Real-time, Heart-rate Measurement using Image Processing with Commodity Cameras and AI Agents

Kelly Li, Fulu Li

Heart rate measurement is a key requirement for real-time health monitoring, in particular for elderly care. Traditional heart rate measurement relies on contact sensing, such as clinical heart rate monitors in hospitals or wearable devices with embedded sensors such as the Apple Watch. In this paper, we develop a system for non-contact, real-time heart rate measurement using image processing with commodity cameras, such as the embedded camera on a laptop, where we use an innovative algorithm to capture the relevant signals for computing heart rate as a time series in real-life environments. The presented heart rate computation (HRC) process consists of four major steps: (a) identify the frame rate of the camera in use, e.g., 30 frames per second for a given camera; (b) perform face detection (FD) with a shape predictor of 68 face landmarks using a deep learning (DL) method; (c) apply a time sliding window (TSW) algorithm to de-noise the signal by smoothing; and (d) compute the heart rate from the identified signal periodicity. We test the developed prototypes against Apple Watch heart rate readings taken from the same person at the same time, checking the range of the difference over multiple rounds and computing the mean difference. As future work, we will further tune and optimize the present methods and deploy the system as a personal AI agent [6] for health monitoring.
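Steps (c) and (d) above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the face-region signal has already been reduced to one brightness value per frame, uses a plain trailing moving average as the sliding window, and reads the periodicity from the autocorrelation peak. The function names, the 40 to 180 bpm search range, and the synthetic test signal are all assumptions made for illustration.

```python
import math
import random


def synthetic_pulse(fps, bpm, seconds, noise=0.05, seed=0):
    """Hypothetical stand-in for a per-frame face-region signal: a noisy sinusoid."""
    rng = random.Random(seed)
    freq_hz = bpm / 60.0
    return [
        math.sin(2.0 * math.pi * freq_hz * n / fps) + rng.gauss(0.0, noise)
        for n in range(int(fps * seconds))
    ]


def sliding_window_mean(signal, window):
    """Smooth a signal with a trailing moving average over `window` samples."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(signal):
        running += value
        if i >= window:
            running -= signal[i - window]  # drop the sample that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


def estimate_bpm(signal, fps, min_bpm=40, max_bpm=180):
    """Estimate beats per minute from the lag with the highest autocorrelation."""
    mean = sum(signal) / len(signal)
    centered = [v - mean for v in signal]
    min_lag = int(fps * 60 / max_bpm)  # shortest plausible beat period, in frames
    max_lag = int(fps * 60 / min_bpm)  # longest plausible beat period, in frames
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = len(centered) - lag
        score = sum(centered[i] * centered[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return 60.0 * fps / best_lag


frames = synthetic_pulse(fps=30, bpm=72, seconds=10)
print(estimate_bpm(sliding_window_mean(frames, window=5), fps=30))
```

At 30 frames per second the lag resolution is one frame, so estimates near 72 bpm can only move in steps of roughly 3 bpm; a real system would likely need interpolation or a longer window to do better.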

2.6 · LG · Sep 14, 2024

Cross-Entropy Optimization for Hyperparameter Optimization in Stochastic Gradient-based Approaches to Train Deep Neural Networks

Kevin Li, Fulu Li

In this paper, we present a cross-entropy optimization method for hyperparameter optimization in stochastic gradient-based approaches to training deep neural networks. The value of a hyperparameter of a learning algorithm often has a great impact on model performance, such as convergence speed and generalization metrics. While in some cases the hyperparameters of a learning algorithm can be treated as learnable parameters, in other scenarios the hyperparameters of a stochastic optimization algorithm such as Adam [5] and its variants are either held constant or changed monotonically over time. We give an in-depth analysis of the presented method in the framework of expectation maximization (EM). The presented algorithm of cross-entropy optimization for hyperparameter optimization of a learning algorithm (CEHPO) is equally applicable to other optimization problems in deep learning. We hope that the presented methods can provide different perspectives and offer some insights for optimization problems in different areas of machine learning and beyond.
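The general cross-entropy method that the abstract builds on can be sketched as follows. This is the textbook loop (sample, score, keep the elite fraction, refit the sampling distribution), not the CEHPO algorithm itself, whose details are not given here. The toy objective, the Gaussian over log10 of the learning rate, and every function name and default value are assumptions for illustration.

```python
import random
import statistics


def final_loss(log10_lr, curvature=10.0, steps=20, w0=1.0):
    """Toy stand-in for a training run: loss of f(w) = 0.5 * curvature * w^2
    after `steps` gradient-descent updates with learning rate 10 ** log10_lr."""
    lr = 10.0 ** log10_lr
    w = w0
    for _ in range(steps):
        w -= lr * curvature * w  # gradient of f is curvature * w
    return 0.5 * curvature * w * w


def cross_entropy_search(objective, mean, std, iterations=20, population=40,
                         elite=8, low=-6.0, high=1.0, seed=0):
    """Minimize `objective` over one real hyperparameter with the cross-entropy method."""
    rng = random.Random(seed)
    for _ in range(iterations):
        # Sample candidates from the current Gaussian, clamped to a safe range.
        samples = [min(high, max(low, rng.gauss(mean, std))) for _ in range(population)]
        # Keep the candidates with the lowest objective value.
        elites = sorted(samples, key=objective)[:elite]
        # Refit the sampling distribution to the elites.
        mean = statistics.fmean(elites)
        std = max(statistics.pstdev(elites), 1e-3)  # floor avoids total collapse
    return mean


best_log10_lr = cross_entropy_search(final_loss, mean=-3.0, std=1.5)
print(best_log10_lr, final_loss(best_log10_lr))
```

For this toy quadratic the best learning rate is 1 / curvature, so the search should settle near log10(lr) = -1; with a real network the objective would be a validation metric from a short training run and would be far noisier.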

2.3 · AI · Sep 29, 2024

Analysis on Riemann Hypothesis with Cross Entropy Optimization and Reasoning

Kevin Li, Fulu Li

In this paper, we present a novel framework for the analysis of the Riemann Hypothesis [27], which is composed of three key components: a) probabilistic modeling with cross entropy optimization and reasoning; b) the application of the law of large numbers; c) the application of mathematical induction. The analysis is mainly conducted by virtue of probabilistic modeling of cross entropy optimization and reasoning with rare event simulation techniques. The application of the law of large numbers [2, 3, 6] and of mathematical induction makes the analysis of the Riemann Hypothesis self-contained and complete, ensuring that the whole complex plane is covered as conjectured in the Riemann Hypothesis. We also discuss the method of enhanced top-p sampling with large language models (LLMs) for reasoning, where next-token prediction is based not only on the estimated probabilities of each possible token in the current round but also on accumulated path probabilities among multiple top-k chain-of-thought (CoT) paths. The probabilistic modeling of cross entropy optimization and reasoning may be well suited to the analysis of the Riemann Hypothesis, as Riemann zeta functions inherently deal with sums of infinitely many terms of a complex number series. We hope that our analysis in this paper can shed some light on the Riemann Hypothesis. The framework and techniques presented in this paper, coupled with recent developments in chain of thought (CoT) or diagram of thought (DoT) reasoning in large language models (LLMs) with reinforcement learning (RL) [1, 7, 18, 21, 24, 34, 39-41], could pave the way for an eventual proof of the Riemann Hypothesis [27].
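One possible reading of the enhanced top-p sampling idea is sketched below: each candidate path carries its accumulated log-probability, each step expands a path only with tokens in its top-p nucleus, and the k best paths by accumulated score are kept. This is an assumption-laden illustration rather than the authors' procedure; `top_p_filter`, `extend_paths`, and the fixed toy distribution are hypothetical names and data.

```python
import math


def top_p_filter(probs, p):
    """Keep the smallest set of highest-probability tokens whose total mass
    reaches p, and renormalize the kept probabilities to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break
    return {token: prob / total for token, prob in kept}


def extend_paths(paths, next_token_probs, p, k):
    """Extend each (tokens, log_prob) path by one token from its top-p nucleus
    and keep the k candidates with the highest accumulated log-probability."""
    candidates = []
    for tokens, log_prob in paths:
        nucleus = top_p_filter(next_token_probs(tokens), p)
        for token, prob in nucleus.items():
            candidates.append((tokens + (token,), log_prob + math.log(prob)))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:k]


def toy_model(tokens):
    """Hypothetical next-token distribution that ignores the context."""
    return {"a": 0.5, "b": 0.3, "c": 0.2}


paths = [((), 0.0)]  # start from one empty path with log-probability 0
for _ in range(2):
    paths = extend_paths(paths, toy_model, p=0.7, k=2)
print(paths)
```

A sampling variant would draw from the candidates in proportion to their accumulated probabilities instead of keeping the top k deterministically.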

2.6 · LG · Oct 24, 2024

The Nature of Mathematical Modeling and Probabilistic Optimization Engineering in Generative AI

Fulu Li

In this paper, we give an in-depth analysis of the mathematical problem formulations and the probabilistic optimization explorations for some of the key components of the Transformer model [33] in the field of generative AI. We explore and discuss potential enhancements to current state-of-the-art methods for some key underlying technologies of generative AI models from an algorithmic and probabilistic optimization perspective. In particular, we present an optimal solution for sub-word encoding (SWE) based on initial settings similar to those of the byte-pair encoding (BPE) algorithm in [9], with objectives similar to those of the WordPiece approach in [28, 31], to maximize the likelihood of the training data. We also present a cross entropy optimization method to optimize hyperparameters for the word2vec model [17]. In addition, we propose a factored combination of rotary positional encoding (RoPE) [32] and attention with linear biases (ALiBi) [23] with a harmonic series. We also present a probabilistic FlashAttention [6, 7] (PrFlashAttention) method that uses a probability distribution over block distances in the matrix to decide which block is likely to participate in a given round of attention computation, while maintaining the lower-triangular shape of the tensor for autoregressive language models by reshaping the tensors. Finally, we present staircase adaptive quantization (SAQ) of the key-value (KV) cache for multi-query attention (MQA), based on the framework presented in [16], to achieve gradual quantization degradation while maintaining reasonable model quality and cost savings.
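As background for the ALiBi component mentioned above, the standard ALiBi bias can be sketched as below. It shows only the commonly described baseline: per-head slopes forming a geometric sequence, and a penalty on each attention score that grows linearly with the distance between query and key. The proposed factored combination with RoPE and a harmonic series is not specified here and is not reproduced. The function names are hypothetical, and the slope formula assumes the number of heads is a power of two.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ..., 2^(-8) for n heads
    (assumes n is a power of two)."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_causal_bias(seq_len, slope):
    """Bias added to attention scores before the softmax for one head.

    Entry [i][j] is slope * (j - i) for keys at or before the query position
    (zero on the diagonal, increasingly negative further back) and -inf for
    future keys, which keeps the mask lower triangular.
    """
    neg_inf = float("-inf")
    return [
        [slope * (j - i) if j <= i else neg_inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)
bias = alibi_causal_bias(4, slopes[0])
for row in bias:
    print(row)
```

Heads with larger slopes attend mostly to nearby tokens, while heads with smaller slopes keep a longer effective context.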