LGJul 5, 2022
Sedentary Behavior Estimation with Hip-worn Accelerometer Data: Segmentation, Classification and ThresholdingYiren Wang, Fatima Tuz-Zahra, Rong Zablocki et al.
Cohort studies are increasingly using accelerometers for physical activity and sedentary behavior estimation. These devices tend to be less error-prone than self-report, can capture activity throughout the day, and are economical. However, previous methods for estimating sedentary behavior based on hip-worn data are often invalid or suboptimal under free-living situations and subject-to-subject variation. In this paper, we propose a local Markov switching model that takes this situation into account, and introduce a general procedure for posture classification and sedentary behavior analysis that fits the model naturally. Our method features changepoint detection methods in time series and also a two stage classification step that labels data into 3 classes(sitting, standing, stepping). Through a rigorous training-testing paradigm, we showed that our approach achieves > 80% accuracy. In addition, our method is robust and easy to interpret.
47.6NAMar 27
A Quantum Spectral Method for Non-Periodic Boundary Value ProblemsEky Febrianto, Yiren Wang, Burigede Liu et al.
Quantum computing holds the promise of solving computational mechanics problems in polylogarithmic time, meaning computational time scales as $\mathscr{O}((\log N)^c)$, where $N$ is the problem size and $c$ a constant. We propose a quantum spectral method with polylogarithmic complexity for solving non-periodic boundary value problems with arbitrary Dirichlet boundary conditions. Our method extends the recently proposed approach by Liu et al. (2025), in which periodic problems are discretised using truncated Fourier series. In such spectral methods, the discretisation of boundary value problems with constant coefficients leads to a set of algebraic equations in the Fourier space. We implement the respective diagonal solution operator by first approximating it with a polynomial and then quantum encoding the polynomial. The mapping between the physical and Fourier spaces is accomplished using the quantum Fourier transform (QFT). To impose zero Dirichlet boundary conditions, we double the domain size and reflect all physical fields antisymmetrically. The respective reflection matrix defines the quantum sine transform (QST) by pre- and post-multiplying with the QFT. For non-zero Dirichlet boundary conditions, the solution is decomposed into a boundary-conforming and a homogeneous part. The homogenous part is determined by solving a problem with a suitably modified forcing vector. We illustrate the basic approach with a Dirichlet-Poisson problem and demonstrate its generality by applying it to a fractional stochastic PDE for modelling spatial random fields. We discuss the circuit implementation of the proposed approach and provide numerical evidence confirming its polylogarithmic complexity.
94.1NIMar 31
Statistical Verification of Medium-Access Parameterization for Power-Grid Edge Ad Hoc Sensor NetworksHaitian Wang, Xia Cheng, Yiren Wang et al.
The widespread deployment of power grid ad hoc sensor networks based on IEEE 802.15.4 raises reliability challenges when nodes selfishly adapt CSMA/CA parameters to maximize individual performance. Such behavior degrades reliability, energy efficiency, and compliance with strict grid constraints. Existing analytical and simulation approaches often fail to rigorously evaluate configurations under asynchronous, event-driven, and resource-limited conditions. We develop a verification framework that integrates stochastic timed hybrid automata with statistical model checking (SMC) with confidence bounds to formally assess CSMA/CA parameterizations under grid workloads. By encoding node- and system-level objectives in temporal logic and automating protocol screening via large-scale statistical evaluation, the method certifies Nash equilibrium strategies that remain robust to unilateral deviations. In a substation-scale scenario, the certified equilibrium improves utility from 0.862 to 0.914 and raises the delivery ratio from 89.5% to 93.2% when compared with an aggressive tuning baseline. Against a delivery-oriented baseline, it reduces mean per-cycle energy from 152.8 mJ to 149.2 mJ while maintaining comparable delivery performance. Certified configurations satisfy latency, reliability, and energy constraints with robustness coefficients above 0.97 and utility above 0.91.
CVJun 19, 2025Code
P2MFDS: A Privacy-Preserving Multimodal Fall Detection System for Elderly People in Bathroom EnvironmentsHaitian Wang, Yiren Wang, Xinyu Wang et al.
By 2050, people aged 65 and over are projected to make up 16 percent of the global population. As aging is closely associated with increased fall risk, particularly in wet and confined environments such as bathrooms where over 80 percent of falls occur. Although recent research has increasingly focused on non-intrusive, privacy-preserving approaches that do not rely on wearable devices or video-based monitoring, these efforts have not fully overcome the limitations of existing unimodal systems (e.g., WiFi-, infrared-, or mmWave-based), which are prone to reduced accuracy in complex environments. These limitations stem from fundamental constraints in unimodal sensing, including system bias and environmental interference, such as multipath fading in WiFi-based systems and drastic temperature changes in infrared-based methods. To address these challenges, we propose a Privacy-Preserving Multimodal Fall Detection System for Elderly People in Bathroom Environments. First, we develop a sensor evaluation framework to select and fuse millimeter-wave radar with 3D vibration sensing, and use it to construct and preprocess a large-scale, privacy-preserving multimodal dataset in real bathroom settings, which will be released upon publication. Second, we introduce P2MFDS, a dual-stream network combining a CNN-BiLSTM-Attention branch for radar motion dynamics with a multi-scale CNN-SEBlock-Self-Attention branch for vibration impact detection. By uniting macro- and micro-scale features, P2MFDS delivers significant gains in accuracy and recall over state-of-the-art approaches. Code and pretrained models will be made available at: https://github.com/HaitianWang/P2MFDS-A-Privacy-Preserving-Multimodal-Fall-Detection-Network-for-Elderly-Individuals-in-Bathroom.
CLJul 3, 2019Code
Depth Growing for Neural Machine TranslationLijun Wu, Yiren Wang, Yingce Xia et al.
While very deep neural networks have shown effectiveness for computer vision and text classification applications, how to increase the network depth of neural machine translation (NMT) models for better translation quality remains a challenging problem. Directly stacking more blocks to the NMT model results in no improvement and even reduces performance. In this work, we propose an effective two-stage approach with three specially designed components to construct deeper NMT models, which result in significant improvements over the strong Transformer baselines on WMT$14$ English$\to$German and English$\to$French translation tasks\footnote{Our code is available at \url{https://github.com/apeterswu/Depth_Growing_NMT}}.
27.2CVMar 17
Edge-Efficient Two-Stream Multimodal Architecture for Non-Intrusive Bathroom Fall DetectionHaitian Wang, Yiren Wang, Xinyu Wang et al.
Falls in wet bathroom environments are a major safety risk for seniors living alone. Recent work has shown that mmWave-only, vibration-only, and existing multimodal schemes, such as vibration-triggered radar activation, early feature concatenation, and decision-level score fusion, can support privacy-preserving, non-intrusive fall detection. However, these designs still treat motion and impact as loosely coupled streams, depending on coarse temporal alignment and amplitude thresholds, and do not explicitly encode the causal link between radar-observed collapse and floor impact or address timing drift, object drop confounders, and latency and energy constraints on low-power edge devices. To this end, we propose a two-stream architecture that encodes radar signals with a Motion--Mamba branch for long-range motion patterns and processes floor vibration with an Impact--Griffin branch that emphasizes impact transients and cross-axis coupling. Cross-conditioned fusion uses low-rank bilinear interaction and a Switch--MoE head to align motion and impact tokens and suppress object-drop confounders. The model keeps inference cost suitable for real-time execution on a Raspberry Pi 4B gateway. We construct a bathroom fall detection benchmark dataset with frame-level annotations, comprising more than 3~h of synchronized mmWave radar and triaxial vibration recordings across eight scenarios under running water, together with subject-independent training, validation, and test splits. On the test split, our model attains 96.1% accuracy, 94.8% precision, 88.0% recall, a 91.1% macro F1 score, and an AUC of 0.968. Compared with the strongest baseline, it improves accuracy by 2.0 percentage points and fall recall by 1.3 percentage points, while reducing latency from 35.9 ms to 15.8 ms and lowering energy per 2.56 s window from 14200 mJ to 10750 mJ on the Raspberry Pi 4B gateway.
57.7NAApr 7
QAFE$^2$: Quantum Accelerated Multiscale Finite Element AnalysisYiren Wang, Michael Ortiz, Fehmi Cirak
The computational cost of concurrent multiscale finite element methods is dominated by the repeated solution of microscopic representative volume element (RVE) problems at macroscopic quadrature points. In this work, we introduce a quantum-classical framework for multiscale finite element analysis (QAFE$^2$) that leverages quantum parallelism to fundamentally alter the scaling of RVE-based homogenisation. At the single-RVE level, the proposed quantum solver attains polylogarithmic complexity with respect to the microscopic discretisation size, yielding an exponential asymptotic speedup over the best available classical solvers. More importantly, QAFE$^2$ exploits quantum superposition and entanglement to evaluate, in a single quantum execution, the entire ensemble of RVE problems associated with all macroscopic quadrature points. This capability is a form of intrinsic quantum concurrency with no classical analogue. Numerical experiments on one- and two-dimensional model problems with known analytical solutions confirm the accuracy of the proposed formulation and verify the theoretical computational scaling and parallel performance.
IVJul 21, 2025
Quantization-Aware Neuromorphic Architecture for Efficient Skin Disease Classification on Resource-Constrained DevicesHaitian Wang, Xinyu Wang, Yiren Wang et al.
Accurate and efficient skin lesion classification on edge devices is critical for accessible dermatological care but remains challenging due to computational, energy, and privacy constraints. We introduce QANA, a novel quantization-aware neuromorphic architecture for incremental skin lesion classification on resource-limited hardware. QANA effectively integrates ghost modules, efficient channel attention, and squeeze-and-excitation blocks for robust feature representation with low-latency and energy-efficient inference. Its quantization-aware head and spike-compatible transformations enable seamless conversion to spiking neural networks (SNNs) and deployment on neuromorphic platforms. Evaluation on the large-scale HAM10000 benchmark and a real-world clinical dataset shows that QANA achieves 91.6% Top-1 accuracy and 82.4% macro F1 on HAM10000, and 90.8%/81.7% on the clinical dataset, significantly outperforming state-of-the-art CNN-to-SNN models under fair comparison. Deployed on BrainChip Akida hardware, QANA achieves 1.5 ms inference latency and 1.7,mJ energy per image, reducing inference latency and energy use by over 94.6%/98.6% compared to GPU-based CNNs surpassing state-of-the-art CNN-to-SNN conversion baselines. These results demonstrate the effectiveness of QANA for accurate, real-time, and privacy-sensitive medical analysis in edge environments.
CLMay 24, 2023
Leveraging GPT-4 for Automatic Translation Post-EditingVikas Raunak, Amr Sharaf, Yiren Wang et al.
While Neural Machine Translation (NMT) represents the leading approach to Machine Translation (MT), the outputs of NMT models still require translation post-editing to rectify errors and enhance quality under critical settings. In this work, we formalize the task of direct translation post-editing with Large Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit NMT outputs across several language pairs. Our results demonstrate that GPT-4 is adept at translation post-editing, producing meaningful and trustworthy edits to translations that help improve its general quality as well as remove different classes of major errors in translations. In particular, human evaluations on assessing edit trustworthiness show that GPT-4 exhibits a large improvement over the prior state-of-the-art LLM. Notably, we improve upon state-of-the-art performance on WMT-22 English-Chinese, English-German, Chinese-English and German-English language pairs using GPT-4 based post-editing, as evaluated by state-of-the-art MT quality metrics. However, we also show that GPT-4 could produce hallucinated edits, thereby urging caution in its use as an expert translation post-editor.
CLOct 6, 2020
Multi-task Learning for Multilingual Neural Machine TranslationYiren Wang, ChengXiang Zhai, Hany Hassan Awadalla
While monolingual data has been shown to be useful in improving bilingual neural machine translation (NMT), effectively and efficiently leveraging monolingual data for Multilingual NMT (MNMT) systems is a less explored area. In this work, we propose a multi-task learning (MTL) framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data. We conduct extensive empirical studies on MNMT systems with 10 language pairs from WMT datasets. We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages with large margin, achieving significantly better results than the individual bilingual models. We also demonstrate the efficacy of the proposed approach in the zero-shot setup for language pairs without bitext training data. Furthermore, we show the effectiveness of MTL over pre-training approaches for both NMT and cross-lingual transfer learning NLU tasks; the proposed approach outperforms massive scale models trained on single task.
CLNov 22, 2019
Improving N-gram Language Models with Pre-trained Deep TransformerYiren Wang, Hongzhao Huang, Zhe Liu et al.
Although n-gram language models (LMs) have been outperformed by the state-of-the-art neural LMs, they are still widely used in speech recognition due to its high efficiency in inference. In this paper, we demonstrate that n-gram LM can be improved by neural LMs through a text generation based data augmentation method. In contrast to previous approaches, we employ a large-scale general domain pre-training followed by in-domain fine-tuning strategy to construct deep Transformer based neural LMs. Large amount of in-domain text data is generated with the well trained deep Transformer to construct new n-gram LMs, which are then interpolated with baseline n-gram systems. Empirical studies on different speech recognition tasks show that the proposed approach can effectively improve recognition accuracy. In particular, our proposed approach brings significant relative word error rate reduction up to 6.0% for domains with limited in-domain data.
CLNov 7, 2019
Microsoft Research Asia's Systems for WMT19Yingce Xia, Xu Tan, Fei Tian et al.
We Microsoft Research Asia made submissions to 11 language directions in the WMT19 news translation tasks. We won the first place for 8 of the 11 directions and the second place for the other three. Our basic systems are built on Transformer, back translation and knowledge distillation. We integrate several of our rececent techniques to enhance the baseline systems: multi-agent dual learning (MADL), masked sequence-to-sequence pre-training (MASS), neural architecture optimization (NAO), and soft contextual data augmentation (SCA).
CLFeb 22, 2019
Non-Autoregressive Machine Translation with Auxiliary RegularizationYiren Wang, Fei Tian, Di He et al.
As a new neural machine translation approach, Non-Autoregressive machine Translation (NAT) has attracted attention recently due to its high efficiency in inference. However, the high efficiency has come at the cost of not capturing the sequential dependency on the target side of translation, which causes NAT to suffer from two kinds of translation errors: 1) repeated translations (due to indistinguishable adjacent decoder hidden states), and 2) incomplete translations (due to incomplete transfer of source side information via the decoder hidden states). In this paper, we propose to address these two problems by improving the quality of decoder hidden representations via two auxiliary regularization terms in the training process of an NAT model. First, to make the hidden states more distinguishable, we regularize the similarity between consecutive hidden states based on the corresponding target tokens. Second, to force the hidden states to contain all the information in the source sentence, we leverage the dual nature of translation tasks (e.g., English to German and German to English) and minimize a backward reconstruction error to ensure that the hidden states of the NAT decoder are able to recover the source side sentence. Extensive experiments conducted on several benchmark datasets show that both regularization strategies are effective and can alleviate the issues of repeated translations and incomplete translations in NAT models. The accuracy of NAT models is therefore improved significantly over the state-of-the-art NAT models with even better efficiency for inference.