LG Jan 28, 2025
A 1-D CNN inference engine for constrained platforms
Ishwar Mudraje, Kai Vogelgesang, Thorsten Herfet
1D-CNNs are used for time series classification in various domains with a high degree of accuracy. Most implementations collect the incoming data samples in a buffer before performing inference on it. On edge devices, which are typically constrained and single-threaded, such an implementation may interfere with time-critical tasks. One such task is that of sample acquisition. In this work, we propose an inference scheme that interleaves the convolution operations between sample intervals, which allows us to reduce the inference latency. Furthermore, our scheme is well-suited for storing data in ring buffers, yielding a small memory footprint. We demonstrate these improvements by comparing our approach to TFLite's inference method, giving a 10% reduction in the inference delay while almost halving the memory usage. Our approach is feasible on common consumer devices, which we show using an AVR-based Arduino board and an ARM-based Arduino board.
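The idea of spreading convolution work across sample intervals, with a ring buffer holding only the most recent samples, can be illustrated with a minimal sketch. It is not the authors' engine: the class `StreamingConv1D`, the helper `batch_conv1d`, and the single-layer, single-channel setup are assumptions made for illustration only.

```python
from collections import deque


class StreamingConv1D:
    """Hypothetical single-channel 1-D convolution layer, fed one sample at a time."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        # Ring buffer: keeps only the last len(kernel) samples.
        self.window = deque(maxlen=len(self.kernel))

    def push(self, sample):
        """Store a new sample; return one output value once the window is full."""
        self.window.append(sample)
        if len(self.window) < len(self.kernel):
            return None  # not enough samples yet
        return sum(w * x for w, x in zip(self.kernel, self.window))


def batch_conv1d(signal, kernel):
    """Reference: buffer the whole signal first, then convolve (valid mode)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]

layer = StreamingConv1D(kernel)
streamed = [y for y in (layer.push(s) for s in signal) if y is not None]
```

Each call to `push` does a bounded amount of work as a sample arrives, so no full input buffer is needed, and the streamed outputs match the buffered reference.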
IV Jul 22, 2021
Fristograms: Revealing and Exploiting Light Field Internals
Thorsten Herfet, Kelvin Chelli, Tobias Lange et al.
In recent years, light field (LF) capture and processing have become an integral part of media production. The richness of information available in LFs has enabled novel applications like post-capture depth-of-field editing, 3D reconstruction, segmentation and matting, saliency detection, object detection and recognition, and mixed reality. The efficacy of such applications depends on certain underlying requirements, which are often ignored. For example, some operations such as noise reduction or hyperfan filtering are only possible if a scene point is a Lambertian radiator. Other operations, such as the removal of obstacles or looking behind objects, are only possible if there is at least one ray capturing the required scene point. Consequently, the ray distribution representing a certain scene point is an important characteristic for evaluating processing possibilities. The primary idea in this paper is to establish a relation between the capturing setup and the rays of the LF. To this end, we discretize the view frustum. Traditionally, a uniform discretization of the view frustum results in voxels that each represent a single sample on a regularly spaced 3-D grid. Instead, we use frustum-shaped voxels (froxels), obtained through a depth- and capturing-setup-dependent discretization of the view frustum. Based on this discretization, we count the number of rays mapping to the same pixel on the capturing device(s). By means of this count, we propose histograms of ray counts over the froxels (fristograms). Fristograms can be used as a tool to analyze and reveal interesting aspects of the underlying LF, like the number of rays originating from a scene point and the color distribution of these rays. As an example, we demonstrate this ability by significantly reducing the number of rays, which enables noise reduction while maintaining the realistic rendering of non-Lambertian or partially occluded regions.
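The counting step described above can be sketched roughly as follows. This is an illustrative toy, not the paper's implementation: the ray representation `(u, v, depth)`, the function names, and the choice of depth slices that are uniform in inverse depth are all assumptions.

```python
from collections import Counter


def froxel_index(u, v, depth, near, far, num_slices):
    """Map a ray sample (pixel u, v and scene depth) to a froxel index.

    Assumption: depth slices are uniform in inverse depth between near and far.
    """
    t = (1.0 / near - 1.0 / depth) / (1.0 / near - 1.0 / far)
    s = min(max(int(t * num_slices), 0), num_slices - 1)  # clamp to valid slices
    return (int(u), int(v), s)


def ray_counts(rays, near, far, num_slices):
    """Count how many rays fall into each froxel."""
    return Counter(froxel_index(u, v, d, near, far, num_slices) for u, v, d in rays)


def fristogram(counts):
    """Histogram of ray counts: maps 'rays per froxel' to 'number of froxels'."""
    return Counter(counts.values())


# Hypothetical rays: two hit the same froxel, one hits another.
rays = [(0, 0, 1.0), (0, 0, 1.0), (1, 0, 2.0)]
counts = ray_counts(rays, near=1.0, far=4.0, num_slices=4)
hist = fristogram(counts)
```

Froxels with a high ray count are candidates for operations such as noise reduction, while a count of zero means no ray observes that region.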
NI Sep 27, 2018
Cross-Layer Effects on Training Neural Algorithms for Video Streaming
Pablo Gil Pereira, Andreas Schmidt, Thorsten Herfet
Nowadays, Dynamic Adaptive Streaming over HTTP (DASH) is the most prevalent solution on the Internet for multimedia streaming and is responsible for the majority of global traffic. DASH uses adaptive bit rate (ABR) algorithms, which select the video quality considering performance metrics such as throughput and playout buffer level. Pensieve is a system that allows ABR algorithms to be trained using reinforcement learning within a simulated network environment, and it outperforms existing approaches in terms of achieved performance. In this paper, we demonstrate that the performance of the trained ABR algorithms depends on the implementation of the simulated environment used to train the neural network. We also show that the congestion control algorithm in use impacts the algorithms' performance due to cross-layer effects.