Xiaoli Xu

CV
h-index: 7
Papers: 5
Citations: 317
Novelty: 51%
AI Score: 41

5 Papers

SP | Oct 28, 2024 | Code
Deep Learning-Based CKM Construction with Image Super-Resolution

Shiyu Wang, Xiaoli Xu, Yong Zeng

Channel knowledge map (CKM) is a novel technique for achieving environment awareness, and thereby improving the communication and sensing performance for wireless systems. A fundamental problem associated with CKM is how to construct a complete CKM that provides channel knowledge for a large number of locations based solely on sparse data measurements. This problem bears similarities to the super-resolution (SR) problem in image processing. In this letter, we propose an effective deep learning-based CKM construction method that leverages the image SR network known as SRResNet. Unlike most existing studies, our approach does not require any additional input beyond the sparsely measured data. In addition to the conventional path loss map construction, our approach can also be applied to construct channel angle maps (CAMs), thanks to the use of a new dataset called CKMImageNet. The numerical results demonstrate that our method outperforms interpolation-based methods such as nearest neighbour and bicubic interpolation, as well as the SRGAN method in CKM construction. Furthermore, only 1/16 of the locations need to be measured in order to achieve a root mean square error (RMSE) of 1.1 dB in path loss.
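As a rough illustration of the interpolation baseline and the 1/16 sampling ratio mentioned above, the following sketch keeps every fourth grid point along each axis of a synthetic path loss map, rebuilds the full map by block replication (a simple form of nearest-neighbour upsampling), and computes the RMSE. The map generator, grid size, and function names are assumptions made for illustration; they are not taken from the letter or from CKMImageNet, and the resulting error says nothing about the reported 1.1 dB figure.

```python
import numpy as np

def synthetic_path_loss_map(size=64):
    """Toy log-distance path loss map in dB (illustrative stand-in, not CKMImageNet)."""
    ys, xs = np.mgrid[0:size, 0:size]
    # Distance to a hypothetical transmitter at the grid centre; clip to avoid log10(0).
    d = np.hypot(xs - size / 2, ys - size / 2).clip(min=1.0)
    return 40.0 + 30.0 * np.log10(d)

def sparse_sample(full_map, factor=4):
    """Keep one measurement per factor x factor block (1/16 of locations for factor=4)."""
    return full_map[::factor, ::factor]

def nearest_neighbour_upsample(sparse, factor=4):
    """Replicate each sparse sample over its factor x factor block."""
    return np.repeat(np.repeat(sparse, factor, axis=0), factor, axis=1)

def rmse(a, b):
    """Root mean square error between two maps, in the maps' own unit (dB here)."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

full = synthetic_path_loss_map(64)
sparse = sparse_sample(full, 4)
recon = nearest_neighbour_upsample(sparse, 4)
print("fraction of locations measured:", sparse.size / full.size)
print("nearest-neighbour RMSE (dB):", rmse(full, recon))
```

A learned SR model would replace `nearest_neighbour_upsample` with a network that maps the sparse grid to the dense one; the sampling and RMSE steps would stay the same.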

42.0 | CV | Apr 21
Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items

Mengting Chen, Zhengrui Chen, Yongchao Du et al.

Recent advances in image generation and editing have opened new opportunities for virtual try-on. However, existing methods still struggle to meet complex real-world demands. We present Tstars-Tryon 1.0, a commercial-scale virtual try-on system that is robust, realistic, versatile, and highly efficient. First, our system maintains a high success rate across challenging cases like extreme poses, severe illumination variations, motion blur, and other in-the-wild conditions. Second, it delivers highly photorealistic results with fine-grained details, faithfully preserving garment texture, material properties, and structural characteristics, while largely avoiding common AI-generated artifacts. Third, beyond apparel try-on, our model supports flexible multi-image composition (up to 6 reference images) across 8 fashion categories, with coordinated control over person identity and background. Fourth, to overcome the latency bottlenecks of commercial deployment, our system is heavily optimized for inference speed, delivering near real-time generation for a seamless user experience. These capabilities are enabled by an integrated system design spanning end-to-end model architecture, a scalable data engine, robust infrastructure, and a multi-stage training paradigm. Extensive evaluation and large-scale product deployment demonstrate that Tstars-Tryon 1.0 achieves leading overall performance. To support future research, we also release a comprehensive benchmark. The model has been deployed at an industrial scale on the Taobao App, serving millions of users with tens of millions of requests.

SP | Mar 17, 2020 | Code
Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV with Deep Reinforcement Learning

Yong Zeng, Xiaoli Xu, Shi Jin et al.

The cellular-connected unmanned aerial vehicle (UAV) is a promising technology to unlock the full potential of UAVs in the future. However, how to achieve ubiquitous three-dimensional (3D) communication coverage for UAVs in the sky is a new challenge. In this paper, we tackle this challenge with a new coverage-aware navigation approach, which exploits the UAV's controllable mobility to design its navigation/trajectory so that it avoids the cellular BSs' coverage holes while accomplishing its mission. We formulate a UAV trajectory optimization problem to minimize the weighted sum of its mission completion time and expected communication outage duration, and propose a new solution approach based on deep reinforcement learning (DRL). To further improve the performance, we propose a new framework called simultaneous navigation and radio mapping (SNARM), where the UAV's signal measurement is used not only to train the deep Q network (DQN) directly, but also to create a radio map that can predict the outage probabilities at all locations in the area of interest. This enables the generation of simulated UAV trajectories and the prediction of their expected returns, which are then used to further train the DQN via the Dyna technique, thus greatly improving the learning efficiency.
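The objective described above, a weighted sum of mission completion time and expected outage duration, can be sketched on a small grid. This is only an illustration under stated assumptions: the grid, the outage probabilities, the unit step time, the weight value, and the name `trajectory_cost` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def trajectory_cost(path, outage_prob_map, step_time=1.0, weight=10.0):
    """Weighted sum of completion time and expected outage duration for a grid path.

    path: sequence of (row, col) cells, one move per time step (assumed).
    outage_prob_map: 2D array of outage probabilities per cell (assumed known here).
    """
    path = np.asarray(path)
    # One time step is spent per move along the path.
    completion_time = step_time * (len(path) - 1)
    # Expected outage duration: outage probability of each cell entered, times the step time.
    probs = outage_prob_map[path[1:, 0], path[1:, 1]]
    expected_outage = step_time * float(probs.sum())
    return completion_time + weight * expected_outage

outage = np.zeros((3, 3))
outage[1, 1] = 0.9  # hypothetical coverage hole at the centre cell

direct = [(0, 0), (1, 1), (2, 2)]                  # shorter, crosses the hole
detour = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]  # longer, avoids the hole

print("direct:", trajectory_cost(direct, outage))
print("detour:", trajectory_cost(detour, outage))
```

With a large enough weight, the longer detour has the lower cost, which is the trade-off a coverage-aware policy has to learn. In a Dyna-style setup, a learned radio map would play the role of `outage_prob_map` when scoring simulated trajectories.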

CV | Aug 27, 2019
Mobile Video Action Recognition

Yuqi Huo, Xiaoli Xu, Yao Lu et al.

Video action recognition, a topical problem in computer vision and video analysis, aims to assign a short video clip to a pre-defined category such as brushing hair or climbing stairs. Recent works focus on action recognition with deep neural networks that achieve state-of-the-art results but require high-performance platforms. Despite the fast development of mobile computing, video action recognition on mobile devices has not been fully discussed. In this paper, we focus on the novel mobile video action recognition task, where only the computational capabilities of mobile devices are accessible. Instead of raw videos, which require huge storage, we choose to extract multiple modalities (including I-frames, motion vectors, and residuals) directly from compressed videos. Employing MobileNetV2 as the backbone, we propose a novel Temporal Trilinear Pooling (TTP) module to fuse the multiple modalities for mobile video action recognition. In addition to motion vectors, we also provide a temporal fusion method to explicitly induce the temporal context. The efficiency test on a mobile device indicates that our model can perform mobile video action recognition at about 40 FPS. The comparative results on two benchmarks show that our model outperforms existing action recognition methods in model size and running time while achieving competitive accuracy.

NI | May 9, 2019
Path Design for Cellular-Connected UAV with Reinforcement Learning

Yong Zeng, Xiaoli Xu

This paper studies the path design problem for a cellular-connected unmanned aerial vehicle (UAV), which aims to minimize its mission completion time while maintaining good connectivity with the cellular network. We first argue that the conventional path design approach of formulating and solving optimization problems faces several practical challenges, and then propose a new reinforcement learning-based UAV path design algorithm that applies the temporal-difference method to directly learn the state-value function of the corresponding Markov Decision Process. The proposed algorithm is further extended by using linear function approximation with tile coding to deal with large state spaces. The proposed algorithms only require the raw measured or simulation-generated signal strength as the input and are suitable for both online and offline implementations. Numerical results show that the proposed path designs can successfully avoid the coverage holes of cellular networks even in complex urban environments.
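To make the temporal-difference idea above concrete, here is a minimal tabular TD(0) sketch that learns the state-value function of a fixed policy on a toy corridor. Everything in it is an assumption for illustration: the five-cell corridor, the always-move-right policy, the per-step reward of -1, the extra penalty for entering a hypothetical coverage-hole cell, and the name `td0_state_values`. It does not reproduce the paper's state space, reward design, or tile-coding extension.

```python
import numpy as np

def td0_state_values(n_states=5, hole_states=(2,), step_reward=-1.0,
                     hole_penalty=-5.0, alpha=0.5, gamma=1.0, episodes=200):
    """Tabular TD(0) evaluation of a fixed 'move right' policy on a 1D corridor."""
    values = np.zeros(n_states)  # the last state is terminal, so its value stays 0
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            s_next = s + 1  # fixed policy: always move one cell to the right
            # Each step costs time; entering a coverage-hole cell costs extra.
            r = step_reward + (hole_penalty if s_next in hole_states else 0.0)
            # TD(0) update: move V(s) toward the bootstrapped target r + gamma * V(s').
            values[s] += alpha * (r + gamma * values[s_next] - values[s])
            s = s_next
    return values

values = td0_state_values()
print(values)
```

Because this toy environment is deterministic, the estimates settle on the exact returns of the fixed policy. For a large state space, the table `values` would be replaced by a parametric approximator such as the linear tile-coded one the abstract mentions.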