Marco Mezzavilla

h-index: 63

9 papers

41 citations

Novelty: 29%

AI Score: 48

Ranked #51,155 of 201,326 authors (top 25%); #57 in SP (top 8%)

9 Papers

NI | May 26

A Preliminary Assessment of Midhaul Links at 140 GHz using Ray-Tracing

Sravan Reddy Chintareddy, Marco Mezzavilla, Sundeep Rangan et al.

The ever-growing demand for mobile data necessitates a transport network architecture that can withstand the 5G-and-beyond multi-Gbps traffic requirements. To cater for such unprecedented demand, studies are being conducted to incorporate terahertz (THz) communications in future mobile networks. In this paper, we consider an urban environment and evaluate the feasibility of THz wireless midhaul links for the transport networks between the Central Units (CUs) and Distributed Units (DUs) in a disaggregated 5G network architecture with functional splits. Our goal is to study the feasibility of midhaul links at 140 GHz by minimizing the number of CUs required to serve all the DUs. To this end, we define several policies for selecting CU and DU nodes in order to determine the peak data rate that can be supported over each link between a CU and a DU. Our numerical results based on ray-tracing suggest that wireless links at 140 GHz with 3GPP option 2 as the High Layer Split (HLS) represent a promising technology for midhaul transport networks.

CV | Jan 26 | Code

Exploring the Use of VLMs for Navigation Assistance for People with Blindness and Low Vision

Yu Li, Yuchen Zheng, Giles Hamilton-Fletcher et al.

This paper investigates the potential of vision-language models (VLMs) to assist people with blindness and low vision (pBLV) in navigation tasks. We evaluate state-of-the-art closed-source models, including GPT-4V, GPT-4o, Gemini-1.5-Pro, and Claude-3.5-Sonnet, alongside open-source models, such as Llava-v1.6-mistral and Llava-onevision-qwen, to analyze their capabilities in foundational visual skills: counting ambient obstacles, relative spatial reasoning, and common-sense wayfinding-pertinent scene understanding. We further assess their performance in navigation scenarios, using pBLV-specific prompts designed to simulate real-world assistance tasks. Our findings reveal notable performance disparities between these models: GPT-4o consistently outperforms others across all tasks, particularly in spatial reasoning and scene understanding. In contrast, open-source models struggle with nuanced reasoning and adaptability in complex environments. Common challenges include difficulties in accurately counting objects in cluttered settings, biases in spatial reasoning, and a tendency to prioritize object details over spatial feedback, limiting their usability for pBLV in navigation tasks. Despite these limitations, VLMs show promise for wayfinding assistance when better aligned with human feedback and equipped with improved spatial reasoning. This research provides actionable insights into the strengths and limitations of current VLMs, guiding developers on effectively integrating VLMs into assistive technologies while addressing key limitations for enhanced usability.

RO | Jul 3, 2022

Wireless Channel Prediction in Partially Observed Environments

Mingsheng Yin, Yaqi Hu, Tommy Azzino et al.

Site-specific radio frequency (RF) propagation prediction increasingly relies on models built from visual data such as cameras and LIDAR sensors. When operating in dynamic settings, the environment may only be partially observed. This paper introduces a method to extract statistical channel models, given partial observations of the surrounding environment. We propose a simple heuristic algorithm that performs ray tracing on the partial environment and then uses machine-learning trained predictors to estimate the channel and its uncertainty from features extracted from the partial ray tracing results. It is shown that the proposed method can interpolate between fully statistical models when no partial information is available and fully deterministic models when the environment is completely observed. The method can also capture the degree of uncertainty of the propagation predictions depending on the amount of region that has been explored. The methodology is demonstrated in a robotic navigation application simulated on a set of indoor maps with detailed models constructed using state-of-the-art navigation, simultaneous localization and mapping (SLAM), and computer vision methods.

SP | Apr 7

Interference Suppression for Massive MU-MIMO Long-Term Beamforming with Matrix Inversion Approximation

Amirreza Kiani, Ali Rasteh, Marco Mezzavilla et al.

Long-term beamforming (LTBF) is a widely used, scalable alternative to instantaneous multi-user MIMO processing that leverages slowly varying spatial channel statistics. VLSI implementations require matrix inversions that become computationally challenging for massive MIMO systems with a large number of antennas. In this work, we show that dominant interferers significantly degrade the numerical conditioning of the LTBF covariance matrix, leading to severe performance loss in finite-precision implementations of polynomial and conjugate gradient (CG) based inversion methods. To address this issue, we propose a subspace nulling approach that operates solely on long-term channel statistics and acts as an implicit preconditioning step for LTBF. By projecting the received signal onto the orthogonal complement of the dominant interference subspace, the proposed method reduces the eigenvalue spread of the covariance matrix and improves numerical stability. Through ray-tracing simulations in a realistic 5G scenario, we demonstrate that the proposed method substantially reduces the number of CG iterations required to achieve near-optimal performance across floating-point and fixed-point implementations while preserving the low-overhead nature of LTBF.
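As a rough illustration of why nulling a dominant interferer can improve conditioning, consider a deliberately simplified covariance model. The model is an assumption made here for illustration and is not taken from the paper: white noise of power $\sigma^2$ plus a single rank-one interferer of power $p$ along a unit-norm direction $\mathbf{u}$.

```latex
\mathbf{R} = \sigma^2 \mathbf{I} + p\,\mathbf{u}\mathbf{u}^{\mathsf{H}},
\qquad
\kappa(\mathbf{R}) = \frac{\sigma^2 + p}{\sigma^2} = 1 + \frac{p}{\sigma^2}

\mathbf{P} = \mathbf{I} - \mathbf{u}\mathbf{u}^{\mathsf{H}},
\qquad
\mathbf{P}\mathbf{u} = \mathbf{0}
\;\Longrightarrow\;
\mathbf{P}\mathbf{R}\mathbf{P} = \sigma^2 \mathbf{P}
```

Under this toy model the eigenvalue spread of $\mathbf{R}$ grows linearly with the interference-to-noise ratio $p/\sigma^2$, whereas after projection every nonzero eigenvalue equals $\sigma^2$, so the spread on the orthogonal complement of $\mathbf{u}$ is 1. With desired users also present, the spread after nulling would be set by their powers rather than by the interferer.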

SP | May 4

Low-rank Preconditioning in Beamspace Domain For Massive MU-MIMO Long-Term Beamforming

Amirreza Kiani, Ali Rasteh, Marco Mezzavilla et al.

Long-term beamforming substantially reduces the channel estimation and inversion overhead of conventional massive MU-MIMO receivers; yet, its construction still hinges on the inversion of a large Hermitian matrix, whose condition number deteriorates with the per-user SNR dynamic range. When this inversion is approximated in hardware via the conjugate gradient (CG) algorithm, the deterioration directly inflates the iteration count and, consequently, the energy and latency budget. We propose a hardware-friendly low-rank preconditioning framework that targets exactly this bottleneck. The preconditioner is constructed from the top eigenpairs of the long-term covariance matrix through a randomized complex eigenvalue decomposition (RC-EVD), whose inner QR factorizations are realized via a Cholesky-based scheme (QRC), confining the dominant cost to generalized matrix multiplication (GEMM) and small triangular solves that map naturally onto systolic arrays. We further show that performing the preconditioned CG inversion in the beamspace domain induces sparsification of the system matrix and provides additional convergence acceleration at negligible transformation cost. Ray-tracing simulations confirm that the joint scheme reduces the required CG iteration count by two to three while matching the post-equalization SINR of the exact inversion.
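A Cholesky-based QR factorization can be sketched as follows. This is the textbook CholeskyQR procedure, shown for real-valued matrices in plain Python; it may differ from the paper's QRC scheme, which operates on complex data and targets hardware. The function names `gram`, `cholesky_upper`, and `cholesky_qr` are hypothetical and chosen only for this illustration.

```python
import math


def gram(A):
    """Return G = A^T A for a matrix A given as a list of rows."""
    n, k = len(A), len(A[0])
    return [[sum(A[r][i] * A[r][j] for r in range(n)) for j in range(k)]
            for i in range(k)]


def cholesky_upper(G):
    """Return upper-triangular R with R^T R = G (G symmetric positive definite)."""
    k = len(G)
    R = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            # Subtract the contribution of the rows already computed.
            s = G[i][j] - sum(R[m][i] * R[m][j] for m in range(i))
            if i == j:
                if s <= 0.0:
                    raise ValueError("Gram matrix is not positive definite")
                R[i][i] = math.sqrt(s)
            else:
                R[i][j] = s / R[i][i]
    return R


def cholesky_qr(A):
    """Factor A = Q R, with orthonormal columns in Q, via Cholesky of A^T A."""
    R = cholesky_upper(gram(A))
    k = len(R)
    Q = []
    for row in A:
        # Solve q R = row for each row of A; R is upper triangular,
        # so the entries of q can be found left to right.
        q = [0.0] * k
        for j in range(k):
            q[j] = (row[j] - sum(q[m] * R[m][j] for m in range(j))) / R[j][j]
        Q.append(q)
    return Q, R


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    Q, R = cholesky_qr(A)
    print("R =", R)
    print("Q^T Q =", gram(Q))  # should be close to the identity
```

The only large operation is the Gram product `A^T A`, a matrix multiplication, followed by a small Cholesky factorization and triangular solves, which is consistent with the abstract's remark that the dominant cost is confined to GEMM and small triangular solves. One known caveat of this textbook form is that forming the Gram matrix squares the condition number of `A`, so orthogonality degrades for ill-conditioned inputs.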

IT | Apr 25, 2024

Channel Modeling for FR3 Upper Mid-band via Generative Adversarial Networks

Yaqi Hu, Mingsheng Yin, Marco Mezzavilla et al.

The upper mid-band (FR3) has recently attracted interest for new generations of mobile networks, as it provides a promising balance between spectrum availability and coverage, which are inherent limitations of the sub-6 GHz and millimeter-wave bands, respectively. Channel modeling plays a key role in efficiently designing and optimizing the network, since FR3 systems are expected to operate at multiple frequency bands. Data-driven methods, especially generative adversarial networks (GANs), can capture the intricate relationships among data samples and provide an appropriate tool for FR3 channel modeling. In this work, we present the architecture, link state model, and path generative network of GAN-based FR3 channel modeling. In our comparison, the model closely matches the ray-tracing simulated data.

SP | Sep 30, 2025

Transformer-Based Rate Prediction for Multi-Band Cellular Handsets

Ruibin Chen, Haozhe Lei, Hao Guo et al.

Cellular wireless systems are witnessing the proliferation of frequency bands over a wide spectrum, particularly with the expansion of new bands in FR3. These bands must be supported in user equipment (UE) handsets with multiple antennas in a constrained form factor. Rapid variations in channel quality across the bands from motion and hand blockage, limited field-of-view of antennas, and hardware and power-constrained measurement sparsity pose significant challenges to reliable multi-band channel tracking. This paper formulates the problem of predicting achievable rates across multiple antenna arrays and bands with sparse historical measurements. We propose a transformer-based neural architecture that takes asynchronous rate histories as input and outputs per-array rate predictions. Evaluated on ray-traced simulations in a dense urban micro-cellular setting with FR1 and FR3 arrays, our method demonstrates superior performance over baseline predictors, enabling more informed band selection under realistic mobility and hardware constraints.

IV | Dec 25, 2021

Network-Aware 5G Edge Computing for Object Detection: Augmenting Wearables to "See" More, Farther and Faster

Zhongzheng Yuan, Tommy Azzino, Yu Hao et al.

Advanced wearable devices are increasingly incorporating high-resolution multi-camera systems. As state-of-the-art neural networks for processing the resulting image data are computationally demanding, there has been growing interest in leveraging fifth generation (5G) wireless connectivity and mobile edge computing for offloading this processing to the cloud. To assess this possibility, this paper presents a detailed simulation and evaluation of 5G wireless offloading for object detection within a powerful, new smart wearable called VIS4ION, for the Blind-and-Visually Impaired (BVI). The current VIS4ION system is an instrumented book-bag with high-resolution cameras, vision processing and haptic and audio feedback. The paper considers uploading the camera data to a mobile edge cloud to perform real-time object detection and transmitting the detection results back to the wearable. To determine the video requirements, the paper evaluates the impact of video bit rate and resolution on object detection accuracy and range. A new street scene dataset with labeled objects relevant to BVI navigation is leveraged for analysis. The vision evaluation is combined with a detailed full-stack wireless network simulation to determine the distribution of throughputs and delays with real navigation paths and ray-tracing from new high-resolution 3D models in an urban environment. For comparison, the wireless simulation considers both a standard 4G-Long Term Evolution (LTE) carrier and high-rate 5G millimeter-wave (mmWave) carrier. The work thus provides a thorough and realistic assessment of edge computing with mmWave connectivity in an application with both high bandwidth and low latency requirements.

RO | Aug 19, 2020

Enabling Remote Whole-Body Control with 5G Edge Computing

Huaijiang Zhu, Manali Sharma, Kai Pfeiffer et al.

Real-world applications require light-weight, energy-efficient, fully autonomous robots. Yet, increasing autonomy is oftentimes synonymous with escalating computational requirements. It might thus be desirable to offload intensive computation--not only sensing and planning, but also low-level whole-body control--to remote servers in order to reduce on-board computational needs. Fifth Generation (5G) wireless cellular technology, with its low latency and high bandwidth capabilities, has the potential to unlock cloud-based high performance control of complex robots. However, state-of-the-art control algorithms for legged robots can only tolerate very low control delays, which even ultra-low latency 5G edge computing can sometimes fail to achieve. In this work, we investigate the problem of cloud-based whole-body control of legged robots over a 5G link. We propose a novel approach that consists of a standard optimization-based controller on the network edge and a local linear, approximately optimal controller that significantly reduces on-board computational needs while increasing robustness to delay and possible loss of communication. Simulation experiments on humanoid balancing and walking tasks that include a realistic 5G communication model demonstrate significant improvement in the reliability of robot locomotion under the jitter and delays likely to be experienced in 5G wireless links.