Bin Han

CV
h-index: 40
50 papers
763 citations
Novelty: 42%
AI Score: 55

50 Papers

CV · Jul 22, 2022 · Code
Optimization of Forcemyography Sensor Placement for Arm Movement Recognition

Xiaohao Xu, Zihao Du, Huaxin Zhang et al.

How to design an optimal wearable device for human movement recognition is vital to reliable and accurate human-machine collaboration. Previous works mainly fabricate wearable devices heuristically. Instead, this paper raises an academic question: can we design an optimization algorithm to optimize the fabrication of wearable devices such as figuring out the best sensor arrangement automatically? Specifically, this work focuses on optimizing the placement of Forcemyography (FMG) sensors for FMG armbands in the application of arm movement recognition. Firstly, based on graph theory, the armband is modeled considering sensors' signals and connectivity. Then, a Graph-based Armband Modeling Network (GAM-Net) is introduced for arm movement recognition. Afterward, the sensor placement optimization for FMG armbands is formulated and an optimization algorithm with greedy local search is proposed. To study the effectiveness of our optimization algorithm, a dataset for mechanical maintenance tasks using FMG armbands with 16 sensors is collected. Our experiments show that using only 4 sensors optimized with our algorithm can help maintain a comparable recognition accuracy to using all sensors. Finally, the optimized sensor placement result is verified from a physiological view. This work would like to shed light on the automatic fabrication of wearable devices considering downstream tasks, such as human biological signal collection and movement recognition. Our code and dataset are available at https://github.com/JerryX1110/IROS22-FMG-Sensor-Optimization
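The sensor-placement search described in this abstract can be illustrated with a small sketch. It uses plain greedy forward selection as a stand-in for the paper's greedy local search, and `toy_score` is a hypothetical objective (per-sensor weights with a penalty for adjacent sensors on a 16-sensor ring), not the GAM-Net recognition accuracy used by the authors.

```python
def greedy_select(candidates, budget, score):
    """Greedy forward selection: repeatedly add the sensor that most improves score."""
    chosen = []
    remaining = set(candidates)
    while remaining and len(chosen) < budget:
        # sorted() makes tie-breaking deterministic (lowest index wins)
        best = max(sorted(remaining), key=lambda s: score(frozenset(chosen + [s])))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical per-sensor usefulness; in practice this would come from validation accuracy.
WEIGHTS = {i: 1.0 + 0.1 * (i % 4) for i in range(16)}

def toy_score(subset):
    """Toy objective: sum of weights minus a redundancy penalty for ring-adjacent sensors."""
    total = sum(WEIGHTS[s] for s in subset)
    penalty = sum(0.5 for s in subset if (s + 1) % 16 in subset)
    return total - penalty

selected = greedy_select(range(16), 4, toy_score)
print(selected)
```

With a real recognizer, `score` would retrain or re-evaluate the model on the candidate subset, which is why a greedy strategy is attractive compared with enumerating all 4-of-16 subsets.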

CL · Apr 13, 2023
LeafAI: query generator for clinical cohort discovery rivaling a human programmer

Nicholas J Dobbins, Bin Han, Weipeng Zhou et al.

Objective: Identifying study-eligible patients within clinical databases is a critical step in clinical research. However, accurate query design typically requires extensive technical and biomedical expertise. We sought to create a system capable of generating data model-agnostic queries while also providing novel logical reasoning capabilities for complex clinical trial eligibility criteria. Materials and Methods: The task of query creation from eligibility criteria requires solving several text-processing problems, including named entity recognition and relation extraction, sequence-to-sequence transformation, normalization, and reasoning. We incorporated hybrid deep learning and rule-based modules for these, as well as a knowledge base of the Unified Medical Language System (UMLS) and linked ontologies. To enable data-model agnostic query creation, we introduce a novel method for tagging database schema elements using UMLS concepts. To evaluate our system, called LeafAI, we compared the capability of LeafAI to a human database programmer to identify patients who had been enrolled in 8 clinical trials conducted at our institution. We measured performance by the number of actual enrolled patients matched by generated queries. Results: LeafAI matched a mean 43% of enrolled patients with 27,225 eligible across 8 clinical trials, compared to 27% matched and 14,587 eligible in queries by a human database programmer. The human programmer spent 26 total hours crafting queries compared to several minutes by LeafAI. Conclusions: Our work contributes a state-of-the-art data model-agnostic query generation system capable of conditional reasoning using a knowledge base. We demonstrate that LeafAI can rival an experienced human programmer in finding patients eligible for clinical trials.

CL · Jun 12, 2023
Prompt-based Extraction of Social Determinants of Health Using Few-shot Learning

Giridhar Kaushik Ramachandran, Yujuan Fu, Bin Han et al. · uw

Social determinants of health (SDOH) documented in the electronic health record through unstructured text are increasingly being studied to understand how SDOH impacts patient health outcomes. In this work, we utilize the Social History Annotation Corpus (SHAC), a multi-institutional corpus of de-identified social history sections annotated for SDOH, including substance use, employment, and living status information. We explore the automatic extraction of SDOH information with SHAC in both standoff and inline annotation formats using GPT-4 in a one-shot prompting setting. We compare GPT-4 extraction performance with a high-performing supervised approach and perform thorough error analyses. Our prompt-based GPT-4 method achieved an overall 0.652 F1 on the SHAC test set, similar to the 7th best-performing system among all teams in the n2c2 challenge with SHAC.

CV · Oct 24, 2023 · Code
Breaking of brightness consistency in optical flow with a lightweight CNN network

Yicheng Lin, Shuo Wang, Yunlong Jiang et al.

Sparse optical flow is widely used in various computer vision tasks; however, assuming brightness consistency limits its performance in High Dynamic Range (HDR) environments. In this work, a lightweight network is used to extract illumination-robust convolutional features and corners with strong invariance. Replacing the typical brightness consistency of the optical flow method with convolutional feature consistency yields a light-robust hybrid optical flow method. The proposed network runs at 190 FPS on a commercial CPU because it uses only four convolutional layers to extract feature maps and score maps simultaneously. Since the shallow network is difficult to train directly, a deep network is designed to compute a reliability map that assists its training. An end-to-end unsupervised training mode is used for both networks. To validate the proposed method, we compare corner repeatability and matching performance with the original optical flow under dynamic illumination. In addition, a more accurate visual inertial system is constructed by replacing the optical flow method in VINS-Mono. On a public HDR dataset, it reduces translation errors by 93%. The code is publicly available at https://github.com/linyicheng1/LET-NET.

IT · Apr 27
Covariance-Aware Demapping on Fourier-Curve Constellations

Bin Han, Muxia Sun, H. Vincent Poor et al.

Injecting artificial noise (AN) along the tangent space of a curved constellation makes each transmitted symbol induce a Gaussian observation with a symbol-dependent rank-one covariance, so the matched maximum-likelihood (ML) decoder differs from the Euclidean nearest-neighbor decoder by a single rank-one correction per candidate. We develop a baseband-demapper realization of this correction for the Fourier-curve constellation and instantiate a regular $(3,6)$ low-density parity-check (LDPC)-coded link at $(k,M){=}(20,64)$. Against four baselines (Euclidean-mismatched, flat-constellation isotropic-AN, no-AN, and same-spectral-efficiency narrowband), the matched decoder extends the BLER${=}10^{-1}$ operating range by approximately $5$\,dB over the Euclidean-mismatched counterpart on the same tangent-AN transmitter, at a cost of $2kM$ additional multiply-accumulate operations per symbol ($+50\%/+100\%$ under residual/template-correlation accounting) and a $20$\,KB constellation--tangent lookup table ($10$\,KB incremental over a Euclidean template-only LUT). A bit-interleaved coded-modulation achievable-rate (BICM-AIR) computation supports the same matched-metric advantage at the tested labeling and max-log demapper, indicating that the BLER gain is not merely an artifact of this particular LDPC simulation, and a Woodbury extension generalizes the rank-one correction to per-tone Ricean fading. In the tested Monte-Carlo runs, a design-aware bounded-search eavesdropper without the phase-key shows no successful LDPC decoding at any tested $k\in\{2,8,20\}$ within a $B{=}10^{3}$ non-code-aided search budget; code-aided, multi-frame, and known-preamble attacks are left to follow-up work. LUT quantization down to $6$ bits yields no measurable coded-BLER degradation at the tested operating points.
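The rank-one correction mentioned in this abstract can be made concrete with a toy sketch. Assume, for illustration only, a real-valued observation with covariance σ²I + ν·t tᵀ for a unit tangent t. By the Sherman-Morrison identity the ML metric then reduces to ‖r‖² − ν/(σ²+ν)·(tᵀr)², where r is the residual to the candidate symbol (the log-determinant term is identical for every candidate when ‖t‖ = 1). The unit circle below is a stand-in for the Fourier curve, and all names and parameter values are hypothetical.

```python
import math

def circle_constellation(M):
    """Toy curved constellation: M points on the unit circle with unit tangents."""
    pts, tans = [], []
    for m in range(M):
        th = 2 * math.pi * m / M
        pts.append((math.cos(th), math.sin(th)))
        tans.append((-math.sin(th), math.cos(th)))
    return pts, tans

def euclidean_metric(y, s):
    """Squared Euclidean distance (nearest-neighbor decoding)."""
    return sum((yi - si) ** 2 for yi, si in zip(y, s))

def matched_metric(y, s, t, sigma2, nu):
    """Euclidean metric minus a rank-one correction along the tangent t."""
    r = [yi - si for yi, si in zip(y, s)]
    proj = sum(ti * ri for ti, ri in zip(t, r))
    return sum(ri * ri for ri in r) - (nu / (sigma2 + nu)) * proj ** 2

pts, tans = circle_constellation(8)
y = (1.0, 0.5)  # symbol 0 displaced along its own tangent by artificial noise
euclid = min(range(8), key=lambda m: euclidean_metric(y, pts[m]))
matched = min(range(8), key=lambda m: matched_metric(y, pts[m], tans[m], 0.01, 1.0))
print(euclid, matched)
```

In this example the Euclidean decoder is pulled toward the neighboring point, while the matched metric discounts displacement along the tangent and recovers the transmitted symbol.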

IT · Apr 16
Matched and Euclidean-Mismatched Decoding on Fourier-Curve Constellations with Tangent Noise

Bin Han, Hao Chen, Muxia Sun et al.

We study matched and Euclidean-mismatched decoding on finite Fourier-curve constellations with tangent-space artificial noise. Each hypothesis induces a Gaussian law with symbol-dependent rank-one covariance. We derive exact Euclidean pairwise errors for arbitrary pairs and an exact Gaussian-expectation representation for matched decoding on bilaterally tangent-orthogonal pairs. For uniform even constellations, the Euclidean side yields explicit distance spectra and symbol-error bounds across all offset classes; the matched side is exact on antipodal pairs and benchmarked numerically at the full-codebook level via Monte Carlo. By isolating the detection-theoretic consequence of tangent-space artificial noise, these results clarify analytically how noise fraction and constellation density enter the mismatch behavior; secrecy-rate implications require additional channel and adversary modeling.

CR · Apr 21
Physical Layer Deception as a Stackelberg Game: Strategy Regimes, Equilibrium, and Robust Design

Wenwen Chen, Bin Han, Yao Zhu et al.

Physical layer deception (PLD) combines physical layer security (PLS) with deception: the transmitter actively misleads the eavesdropper with falsified information. We model the transmitter-eavesdropper interaction as a Stackelberg game in which the transmitter commits to a resource allocation and encryption strategy, and each receiver best-responds by selecting among three decryption modes: Perception, Dropping, and Exclusion. Using semantic distortion as the metric, we derive closed-form switching surfaces that partition the parameter space into strategy regimes and identify conditions under which each regime dominates. The robust operating point, at the peak of the worst-case distortion envelope, is shown to be a Stackelberg equilibrium; iterative best-response dynamics oscillate around it with strictly lower time-averaged security. We evaluate the design under Nakagami-m fading with static and adaptive transmitter strategies, benchmarked against a classical PLS baseline. Numerical results validate the regime characterization and show 12-55% higher eavesdropper distortion than the erasure-only baseline across all fading conditions.

MA · Apr 20, 2022
Massive Twinning to Enhance Emergent Intelligence

Siyu Yuan, Bin Han, Dennis Krummacker et al.

As a complement to conventional AI solutions, emergent intelligence (EI) is competitive in 6G IIoT scenarios thanks to outstanding features including robustness, privacy protection, and scalability. However, despite its low computational complexity, EI is challenged by its high data-traffic demand in massive deployments. We propose to leverage massive twinning, which 6G is envisaged to support, to reduce the data traffic in EI and thereby enhance its performance.

NE · Oct 27, 2022
Trust-Awareness to Secure Swarm Intelligence from Data Injection Attack

Bin Han, Dennis Krummacker, Qiuheng Zhou et al.

Enabled by the emerging industrial agent (IA) technology, swarm intelligence (SI) is envisaged to play an important role in the future industrial Internet of Things (IIoT) that is shaped by Sixth Generation (6G) mobile communications and digital twins (DT). However, its fragility against data injection attacks may hinder its practical deployment. In this paper, we propose an efficient trust approach to address this security concern for SI.

AS · Oct 25, 2022
Artificial ASMR: A Cyber-Psychological Approach

Zexin Fang, Bin Han, C. Clark Cao et al.

The popularity of Autonomous Sensory Meridian Response (ASMR) has skyrocketed over the past decade, but scientific studies on what exactly triggers the ASMR effect remain few and immature. One commonly acknowledged trigger is that ASMR clips typically provide rich semantic information. With our attention caught by the common acoustic patterns in ASMR audio, we investigate the correlation between the cyclic features of audio signals and their effectiveness in triggering ASMR effects. A cyber-psychological approach that combines signal processing, artificial intelligence, and experimental psychology is taken, with which we are able to quantize ASMR-related acoustic features and thereby synthesize ASMR clips with random cyclic patterns that deliver no identifiable scenarios to the audience; these clips proved effective in triggering ASMR effects.

GT · Apr 13
The Price of Ignorance: Information-Free Quotation for Data Retention in Machine Unlearning

Bin Han, Di Feng, Zexin Fang et al.

When users exercise data deletion rights under the General Data Protection Regulation (GDPR) and similar regulations, mobile network operators face a tradeoff: excessive machine unlearning degrades model accuracy and incurs retraining costs, yet existing pricing mechanisms for data retention require the server to know every user's private privacy and accuracy preferences, which is infeasible under the very regulations that motivate unlearning. We ask: what is the welfare cost of operating without this private information? We design an information-free ascending quotation mechanism where the server broadcasts progressively higher prices and users self-select their data supply, requiring no knowledge of users' parameters. Under complete information, the protocol admits a unique subgame-perfect Nash equilibrium characterized by single-period selling. We formalize the Price of Ignorance -- the welfare gap between optimal personalized pricing (which knows everything) and our information-free quotation (which knows nothing) -- and prove a three-regime efficiency ordering. Numerical evaluation across seven mechanisms and 5000 Monte Carlo runs shows that this price is near zero: the information-free mechanism achieves >=99% of the welfare of its information-intensive benchmarks, while providing noise-robust guarantees and comparable fairness.
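A minimal sketch of an ascending quotation round, under simplifying assumptions: each user is modeled as a private accept/reject rule that the server can only query with a price, and a user sells once, at the first broadcast price they accept (echoing the single-period selling mentioned in the abstract). The function name, the user rules, and the price ladder are hypothetical; the paper's actual utility model is not reproduced here.

```python
def ascending_quotation(prices, users):
    """Broadcast ascending prices; each user self-selects the first price they accept.

    prices: iterable of prices, assumed sorted in ascending order.
    users:  dict mapping user name -> callable(price) -> bool (private decision rule).
    Returns a dict mapping each selling user to the price at which they sold.
    """
    sold = {}
    for price in prices:
        for name, accepts in users.items():
            # The server never inspects the rule itself, only the yes/no response.
            if name not in sold and accepts(price):
                sold[name] = price
    return sold

# Hypothetical private preferences, hidden inside closures.
users = {
    "a": lambda p: p >= 2.0,
    "b": lambda p: p >= 3.5,
    "c": lambda p: False,  # never sells: insists on deletion at any quoted price
}
result = ascending_quotation([1.0, 2.0, 3.0, 4.0], users)
print(result)
```

The server learns only who accepted at which quote, which is the sense in which the mechanism is information-free in this sketch.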

CL · Jul 4, 2023
KDSTM: Neural Semi-supervised Topic Modeling with Knowledge Distillation

Weijie Xu, Xiaoyu Jiang, Jay Desai et al. · amazon-science

In text classification tasks, fine-tuning pretrained language models like BERT and GPT-3 yields competitive accuracy; however, both methods require pretraining on large text datasets. In contrast, general topic modeling methods possess the advantage of analyzing documents to extract meaningful patterns of words without the need for pretraining. To leverage topic modeling's unsupervised insight extraction on text classification tasks, we develop the Knowledge Distillation Semi-supervised Topic Modeling (KDSTM). KDSTM requires no pretrained embeddings and few labeled documents, and is efficient to train, making it ideal under resource-constrained settings. Across a variety of datasets, our method outperforms existing supervised topic modeling methods in classification accuracy, robustness, and efficiency, and achieves similar performance compared to state-of-the-art weakly supervised text classification methods.

NA · May 11
Galerkin Scheme Using Biorthogonal Wavelets on Intervals for Elliptic Interface Problems

Bin Han, Michelle Michelle

This paper presents a wavelet Galerkin method for solving elliptic interface problems of the form $-\nabla\cdot(a\nabla u)=f$ in $Ω\backslash Γ$, where $Γ$ is a smooth interface within $Ω$. Since the scalar variable coefficient $a>0$ and source term $f$ are often discontinuous across $Γ$, the solution $u$ typically has discontinuous gradient $\nabla u$ across $Γ$ and hence $u\not\in H^{1.5}(Ω)$, posing significant challenges for traditional numerical methods. By utilizing a compactly supported biorthogonal wavelet for $H^1_0(Ω)$, we develop a strategy that incorporates additional wavelet elements (or basis functions) along the interface to resolve the complex geometry of the interface $Γ$ and the resulting gradient discontinuities. For the two-dimensional (2D) elliptic interface problem, the proposed method achieves near-optimal convergence rates: $\mathcal{O}(h |\log(h)|)$ in the $H^1(Ω)$-norm and $\mathcal{O}(h^2 |\log(h)|^2)$ in the $L^{2}$-norm with respect to the approximation order. A key theoretical contribution is the use of the dual biorthogonal wavelet basis to establish the $H^1(Ω)$ convergence results. This is supported by the development of weighted Bessel properties for wavelets and several inequalities in fractional Sobolev spaces. To maintain high accuracy and robustness against high-contrast coefficients, our method leverages an augmented set of wavelet elements, similar to meshfree approaches, thereby eliminating the need for the complex re-meshing required by finite element methods. Unlike existing techniques, this wavelet Riesz basis framework captures the geometry of $Γ$ seamlessly while ensuring that the condition numbers of the coefficient matrices remain small and uniformly bounded, independent of the problem size.

DC · May 25
Bandwidth-Aware LLM Inference on Heterogeneous Many-Core Supercomputers

Yao Lu, Zhongzhi Luan, Gen Li et al.

Large language model (LLM) inference is limited by high computational cost and memory bandwidth demands, making deployment on heterogeneous many-core processors challenging. Taking the MT-3000 processor used in the Tianhe supercomputer as an example, its limited main-memory bandwidth and distributed memory hierarchy exemplify these bottlenecks, making it difficult to directly migrate existing GPU-based inference frameworks. To address this problem, we propose THInfer, a hardware-aware inference framework that maximizes data locality under bandwidth-constrained conditions through hardware-software co-design and parallel strategy optimization. THInfer incorporates three key techniques: (1) a high-performance operator library for the VLIW SIMD architecture, providing hand-optimized FP16 kernels that achieve up to 70 percent of the peak performance per cluster; (2) a density-driven computation graph fusion and unified kernel scheduling mechanism, combined with a staged pipelined attention fusion method; and (3) a Prefill-Buffer-Decode (P-B-D) pipeline and bounded buffer management strategy, which supports hybrid parallelism and enables efficient multi-cluster collaboration through two-level communication based on MPI and hthreads. Experiments on the Llama model series show that THInfer improves throughput on the 7B model by 62 percent to 73 percent over DeepSpeed on two V100S GPUs and by 67 percent to 84 percent over the A800 GPU. The 13B and 30B models also demonstrate comparable or better performance. Moreover, THInfer maintains stable performance on the 70B model, whereas typical GPU-based frameworks fail to run under the same setting. Overall, THInfer significantly enhances throughput, reduces latency, and improves scalability, providing a feasible system solution for efficient and scalable LLM inference on heterogeneous many-core architectures.

NI · Mar 12
The Structure of Service Level Agreement of Slice-based 5G Network

Mohammad Asif Habibi, Bin Han, Meysam Nasimi et al.

Network slicing is considered to be one of the key enablers of the Fifth Generation (5G) communication system. Legacy telecommunication networks have been providing various services to all kinds of customers through a single network infrastructure. In contrast, with the deployment of network slicing, operators are now able to partition the entire network into different slices, each with its own configuration and Quality of Service (QoS) requirements. There are many applications across industry, each of which needs an independent slice with its own functions and features. All these applications open new business opportunities, which require new business models, and therefore every single slice needs an individual Service Level Agreement (SLA). In this paper, we propose a comprehensive end-to-end SLA structure between the tenant and the service provider of a slice-based 5G network, which balances the interests of both sides. The proposed SLA is expected to define the reliability, availability, and performance of delivered telecommunication services in order to ensure that the right information gets to the right destination at the right time, safely and securely. We also discuss the metrics of a slice-based network SLA, such as throughput, penalty, cost, revenue, profit, and QoS-related metrics, which we consider critical during the agreement.

CV · Aug 1, 2024
Towards Zero-Shot Annotation of the Built Environment with Vision-Language Models (Vision Paper)

Bin Han, Yiwei Yang, Anat Caspi et al.

Equitable urban transportation applications require high-fidelity digital representations of the built environment: not just streets and sidewalks, but bike lanes, marked and unmarked crossings, curb ramps and cuts, obstructions, traffic signals, signage, street markings, potholes, and more. Direct inspections and manual annotations are prohibitively expensive at scale. Conventional machine learning methods require substantial annotated training data for adequate performance. In this paper, we consider vision language models as a mechanism for annotating diverse urban features from satellite images, reducing the dependence on human annotation to produce large training sets. While these models have achieved impressive results in describing common objects in images captured from a human perspective, their training sets are less likely to include strong signals for esoteric features in the built environment, and their performance in these settings is therefore unclear. We demonstrate proof-of-concept combining a state-of-the-art vision language model and variants of a prompting strategy that asks the model to consider segmented elements independently of the original image. Experiments on two urban features -- stop lines and raised tables -- show that while direct zero-shot prompting correctly annotates nearly zero images, the pre-segmentation strategies can annotate images with near 40% intersection-over-union accuracy. We describe how these results inform a new research agenda in automatic annotation of the built environment to improve equity, accessibility, and safety at broad scale and in diverse environments.

NA · May 20
Efficient and simple fourth-order compact finite difference methods for convection-diffusion-reaction equations on arbitrary curved domains

Qiwei Feng, Bin Han, Peter Minev

In this paper, we discuss the 2D convection-diffusion-reaction equation with variable smooth coefficients and the Dirichlet boundary condition on a complicated, thin, and curved domain. We propose a fourth-order compact FDM at every grid point with the uniform Cartesian mesh. For the regular stencil center, we utilize the fourth-order compact 9-point FDM to approximate the solution. According to the preliminary analysis, we use vertical and horizontal transformations to derive fourth-order compact FDMs in 10 cases for all irregular stencil centers. To obtain the left-hand side of the stencil of the fourth-order FDM in each case, we only need to solve a linear system of size at most $6 \times 24$, which is presented with an explicit formula. The right-hand side of the FDM is also constructed as an explicit expression for every irregular stencil center. To achieve fourth-order consistency, up to second-order partial derivatives of convection, diffusion, reaction, and source terms are used for the FDM at the regular stencil center, whereas the FDM at an irregular stencil center only requires first-order partial derivatives of convection, diffusion, reaction, and source terms, and up to third-order derivatives of the Dirichlet boundary function and the parametric expression of the boundary curve. We test challenging domains with 100-leaf, high-curvature, high-frequency, sharply varying, and nearly overlapping boundary curves; the proposed FDM produces high accuracy and a stable fourth-order convergence rate in the $l_2$ and $l_{\infty}$ norms. All stencils of our FDMs have a simple, desirable structure: both regular and boundary stencils keep only the grid points inside $Ω$ from the standard compact 9-point stencil, without assuming any information outside the domain $Ω$.

CV · Oct 1, 2023
Top-down Green-ups: Satellite Sensing and Deep Models to Predict Buffelgrass Phenology

Lucas Rosenblatt, Bin Han, Erin Posthumus et al.

An invasive species of grass known as "buffelgrass" contributes to severe wildfires and biodiversity loss in the Southwest United States. We tackle the problem of predicting buffelgrass "green-ups" (i.e. readiness for herbicidal treatment). To make our predictions, we explore temporal, visual and multi-modal models that combine satellite sensing and deep learning. We find that all of our neural-based approaches improve over conventional buffelgrass green-up models, and discuss how neural model deployment promises significant resource savings.

CV · Jan 10, 2023
Adapting to Skew: Imputing Spatiotemporal Urban Data with 3D Partial Convolutions and Biased Masking

Bin Han, Bill Howe

We adapt image inpainting techniques to impute large, irregular missing regions in urban settings characterized by sparsity, variance in both space and time, and anomalous events. Missing regions in urban data can be caused by sensor or software failures, data quality issues, interference from weather events, incomplete data collection, or varying data use regulations; any missing data can render the entire dataset unusable for downstream applications. To ensure coverage and utility, we adapt computer vision techniques for image inpainting to operate on 3D histograms (2D space + 1D time) commonly used for data exchange in urban settings. Adapting these techniques to the spatiotemporal setting requires handling skew: urban data tend to follow population density patterns (small dense regions surrounded by large sparse areas); these patterns can dominate the learning process and fool the model into ignoring local or transient effects. To combat skew, we 1) train simultaneously in space and time, and 2) focus attention on dense regions by biasing the masks used for training to the skew in the data. We evaluate the core model and these two extensions using the NYC taxi data and the NYC bikeshare data, simulating different conditions for missing data. We show that the core model is effective qualitatively and quantitatively, and that biased masking during training reduces error in a variety of scenarios. We also articulate a tradeoff in varying the number of timesteps per training sample: too few timesteps and the model ignores transient events; too many timesteps and the model is slow to train with limited performance gain.
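The biased-masking idea in this abstract (placing training masks preferentially over dense regions) can be sketched as follows. This is a simplified 2D illustration, assuming the mask center is sampled with probability proportional to cell density and the mask is a square patch; the paper works on 3D histograms with irregular masks, and the function name and parameters here are hypothetical.

```python
import random

def biased_mask(density, radius, rng):
    """Return a boolean grid that masks a square patch centered on a density-weighted cell.

    density: 2D list of non-negative counts (at least one must be positive).
    radius:  half-width of the square patch, in cells.
    rng:     a random.Random instance, passed in for reproducibility.
    """
    rows, cols = len(density), len(density[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [density[r][c] for r, c in cells]
    # Dense cells are proportionally more likely to be chosen as the mask center.
    cr, cc = rng.choices(cells, weights=weights, k=1)[0]
    return [[abs(r - cr) <= radius and abs(c - cc) <= radius for c in range(cols)]
            for r in range(rows)]

# Toy 6x6 grid where all activity sits in one cell, so the center is forced to (2, 3).
grid = [[0] * 6 for _ in range(6)]
grid[2][3] = 10
mask = biased_mask(grid, radius=1, rng=random.Random(0))
print(sum(cell for row in mask for cell in row))  # number of masked cells
```

With uniform masking, most training masks would fall on near-empty cells in skewed data; weighting by density is one simple way to counteract that.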

LG · Jun 9, 2023
SARN: Structurally-Aware Recurrent Network for Spatio-Temporal Disaggregation

Bin Han, Bill Howe

Open data is frequently released spatially aggregated, usually to comply with privacy policies. But coarse, heterogeneous aggregations complicate learning and integration for downstream AI/ML systems. In this work, we consider models to disaggregate spatio-temporal data from a low-resolution, irregular partition (e.g., census tract) to a high-resolution, irregular partition (e.g., city block). We propose an overarching model named the Structurally-Aware Recurrent Network (SARN), which integrates structurally-aware spatial attention (SASA) layers into the Gated Recurrent Unit (GRU) model. The spatial attention layers capture spatial interactions among regions, while the gated recurrent module captures the temporal dependencies. Each SASA layer calculates both global and structural attention -- global attention facilitates comprehensive interactions between different geographic levels, while structural attention leverages the containment relationship between different geographic levels (e.g., a city block being wholly contained within a census tract) to ensure coherent and consistent results. For scenarios with limited historical training data, we explore transfer learning and show that a model pre-trained on one city variable can be fine-tuned for another city variable using only a few hundred samples. Evaluating these techniques on two mobility datasets, we find that on both datasets, SARN significantly outperforms other neural models (5% and 1%) and typical heuristic methods (40% and 14%), enabling us to generate realistic, high-quality fine-grained data for downstream applications.

HC · Jul 17, 2024
In-Depth Analysis of Emotion Recognition through Knowledge-Based Large Language Models

Bin Han, Cleo Yau, Su Lei et al.

Emotion recognition in social situations is a complex task that requires integrating information from both facial expressions and the situational context. While traditional approaches to automatic emotion recognition have focused on decontextualized signals, recent research emphasizes the importance of context in shaping emotion perceptions. This paper contributes to the emerging field of context-based emotion recognition by leveraging psychological theories of human emotion perception to inform the design of automated methods. We propose an approach that combines emotion recognition methods with Bayesian Cue Integration (BCI) to integrate emotion inferences from decontextualized facial expressions and contextual knowledge inferred via Large-language Models. We test this approach in the context of interpreting facial expressions during a social task, the prisoner's dilemma. Our results provide clear support for BCI across a range of automatic emotion recognition methods. The best automated method achieved results comparable to human observers, suggesting the potential for this approach to advance the field of affective computing.
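As a rough illustration of Bayesian Cue Integration, the sketch below assumes the commonly used form in which the fused posterior over emotions is proportional to P(e | face) · P(e | context) / P(e). The distributions and function name are hypothetical toy inputs, not outputs of the recognizers or language models used in the paper.

```python
def bayesian_cue_integration(p_face, p_context, prior):
    """Fuse two per-emotion posteriors under a shared prior and renormalize.

    All arguments are lists of probabilities over the same ordered emotion labels.
    """
    unnormalized = [f * c / p for f, c, p in zip(p_face, p_context, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Toy example over two labels, e.g. ["joy", "regret"].
prior = [0.5, 0.5]
p_face = [0.6, 0.4]      # face alone mildly suggests the first label
p_context = [0.2, 0.8]   # context strongly suggests the second label
posterior = bayesian_cue_integration(p_face, p_context, prior)
print(posterior)
```

Dividing by the prior avoids counting it twice, since each cue-specific posterior already incorporates it.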

SP · Mar 16
Generative Semantic HARQ: Latent-Space Text Retransmission and Combining

Bin Han, Yulin Hu, Hans D. Schotten

Semantic communication conveys meaning rather than raw bits, but reliability at the semantic level remains an open challenge. We propose a semantic-level hybrid automatic repeat request (HARQ) framework for text communication, in which a Transformer-variational autoencoder (VAE) codec operates as a lightweight overlay on the conventional protocol stack. The stochastic encoder inherently generates diverse latent representations across retransmissions, providing incremental knowledge (IK) from a single model without dedicated protocol design. On the receiver side, a soft quality estimator triggers retransmissions and a quality-aware combiner merges the received latent vectors within a consistent latent space. We systematically benchmark six semantic quality metrics and four soft combining strategies under hybrid semantic distortion that mixes systematic bias with additive noise. The results suggest combining Weighted-Average or MRC-Inspired combining with self-consistency-based HARQ triggering for the best performance.
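A minimal sketch of quality-aware latent combining, assuming each retransmission yields a latent vector plus a scalar quality estimate, and that combining is a normalized weighted average. The exact weighting rules behind the paper's Weighted-Average and MRC-Inspired strategies are not given in the abstract, so the weights here are purely illustrative.

```python
def combine(latents, weights):
    """Merge received latent vectors as a weighted average.

    latents: list of equal-length lists of floats, one per (re)transmission.
    weights: list of non-negative quality weights, one per latent, not all zero.
    """
    total = sum(weights)
    dim = len(latents[0])
    return [sum(w * z[d] for w, z in zip(weights, latents)) / total
            for d in range(dim)]

# Two receptions of the same message; the second is judged three times more reliable.
merged = combine([[0.0, 0.0], [2.0, 4.0]], [1.0, 3.0])
print(merged)
```

In classical maximal-ratio combining the weights would track per-branch signal-to-noise ratio; a latent-space analogue could derive them from the soft quality estimator.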

SP · Apr 9
Quality-Aware Denoising of Ultra-Short TDoA Measurements for 5G-NR UAV Localization

Zexin Fang, Bin Han, Anjie Qiu et al.

Reliable positioning is essential for Uncrewed Aerial Vehicles (UAVs) in safety-critical urban operations, yet achieving sub-meter accuracy under stringent latency constraints remains challenging. While 3rd Generation Partnership Project (3GPP) specifies repeated Positioning Reference Signals (PRS) transmissions for accurate Time Difference of Arrival (TDoA) measurements, denoising techniques specifically tailored for extremely limited measurement sequences within 3GPP frameworks remain underexplored. We propose Adaptive Gain Exponential Smoother (AGES), a lightweight filter combining exponentially weighted averaging with adaptive gains informed by 3GPP measurement quality reports. Simulations demonstrate AGES achieves 30-40% reduction in positioning error with only 3-5 repeated measurements while maintaining Fifth Generation New Radio (5G-NR) infrastructure compatibility.
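The general shape of an exponential smoother with quality-dependent gain can be sketched as below. The linear mapping from a quality score in [0, 1] to a gain between `g_min` and `g_max` is an assumption made for illustration; the abstract does not specify how AGES derives its gains from 3GPP quality reports.

```python
def adaptive_gain_smooth(measurements, qualities, g_min=0.2, g_max=0.8):
    """Exponentially smooth a short measurement sequence with per-sample gains.

    measurements: list of TDoA-like scalar measurements (non-empty).
    qualities:    list of quality scores in [0, 1], one per measurement.
    A higher quality score gives the new measurement more influence.
    """
    estimate = measurements[0]  # initialize with the first measurement
    for z, q in zip(measurements[1:], qualities[1:]):
        gain = g_min + (g_max - g_min) * q  # hypothetical quality-to-gain mapping
        estimate += gain * (z - estimate)
    return estimate

# Three repeated measurements; the outlier at 20.0 carries a low quality score.
smoothed = adaptive_gain_smooth([10.0, 12.0, 20.0], [1.0, 0.5, 0.0])
print(smoothed)
```

Because the update is a single multiply-add per measurement, a filter of this kind remains cheap even when only 3 to 5 repetitions are available.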

NA · Mar 22
Wavelet-based Galerkin Scheme with Arbitrarily High-Order Convergence for 1D Elliptic Interface Problems

Bin Han, Michelle Michelle

The solution $u$ of an elliptic interface problem in a domain $Ω$ is often smooth away from the interface $Γ\subset Ω$, but its gradient is discontinuous across $Γ$, resulting in low regularity; in particular, $u \notin H^{1.5}(Ω)$. This paper focuses on 1D elliptic interface problems using wavelet methods. We propose a Galerkin method using locally supported biorthogonal wavelet bases on bounded intervals with $m$th approximation order for any integer $m \ge 2$. Additionally, we rigorously prove that its convergence rates are of order $m-1$ in the $H^1(Ω)$-norm and order $m$ in the $L^2(Ω)$-norm, which are optimal with respect to the scheme's approximation order $m$. Our approach involves incorporating wavelet basis functions from higher scale levels to capture the singularity in the neighbourhood of the interface $Γ$. The results in this paper both complement and sharply contrast our findings in Han and Michelle (2024), where we consider a similar wavelet-based method for solving $d$-dimensional elliptic interface problems with $d\ge 2$.

MLNov 14, 2025
Knowledge vs. Experience: Asymptotic Limits of Impatience in Edge Tenants

Anthony Kiggundu, Bin Han, Hans D. Schotten

We study how two information feeds, a closed-form Markov estimator of residual sojourn and an online-trained actor-critic, affect reneging and jockeying in a dual M/M/1 system. Analytically, for unequal service rates and total-time patience, we show that total wait grows linearly, so abandonment is inevitable and the probability of a successful jockey vanishes as the backlog approaches infinity. Furthermore, under a mild sub-linear error condition both information models yield the same asymptotic limits (robustness). We empirically validate these limits and quantify finite backlog differences. Our findings show that learned and analytic feeds produce different delays, reneging rates and transient jockeying behavior at practical sizes, but converge to the same asymptotic outcome implied by our theory. The results characterize when value-of-information matters (finite regimes) and when it does not (asymptotics), informing lightweight telemetry and decision-logic design for low-cost, jockeying-aware systems.
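
As a rough illustration of the linear growth, recall the standard M/M/1 first-come-first-served fact that a tenant arriving behind $n$ jobs at a server with rate $\mu$ must wait for $n+1$ exponential service completions. The symbols $n$, $\mu$, and the patience budget $T$ are notation assumed here, not taken from the paper:

```latex
\mathbb{E}[W \mid n] = \frac{n+1}{\mu},
\qquad
\mathbb{E}[W \mid n] > T \iff n > \mu T - 1 .
```

For any fixed patience $T$, the expected wait therefore exceeds the budget once the backlog is large enough.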

CVMay 24, 2025Code
Why Not Replace? Sustaining Long-Term Visual Localization via Handcrafted-Learned Feature Collaboration on CPU

Yicheng Lin, Yunlong Jiang, Xujia Jiao et al.

Robust long-term visual localization in complex industrial environments is critical for mobile robotic systems. Existing approaches face limitations: handcrafted features are illumination-sensitive, learned features are computationally intensive, and semantic- or marker-based methods are environmentally constrained. Handcrafted and learned features share similar representations but differ functionally. Handcrafted features are optimized for continuous tracking, while learned features excel in wide-baseline matching. Their complementarity calls for integration rather than replacement. Building on this, we propose a hierarchical localization framework. It leverages real-time handcrafted feature extraction for relative pose estimation. In parallel, it employs selective learned keypoint detection on optimized keyframes for absolute positioning. This design enables CPU-efficient, long-term visual localization. Experiments progress through three validation phases: first, a comparative analysis establishes feature complementarity; second, computational latency is profiled across algorithm stages on CPU platforms; finally, evaluation under photometric variations (including seasonal transitions and diurnal cycles) demonstrates a 47% average error reduction with significantly improved localization consistency. The code implementation is publicly available at https://github.com/linyicheng1/ORB_SLAM3_localization.

CVSep 1, 2021Code
BVMatch: Lidar-based Place Recognition Using Bird's-eye View Images

Lun Luo, Si-Yuan Cao, Bin Han et al.

Recognizing places using Lidar in large-scale environments is challenging due to the sparse nature of point cloud data. In this paper we present BVMatch, a Lidar-based frame-to-frame place recognition framework, that is capable of estimating 2D relative poses. Based on the assumption that the ground area can be approximated as a plane, we uniformly discretize the ground area into grids and project 3D Lidar scans to bird's-eye view (BV) images. We further use a bank of Log-Gabor filters to build a maximum index map (MIM) that encodes the orientation information of the structures in the images. We analyze the orientation characteristics of MIM theoretically and introduce a novel descriptor called bird's-eye view feature transform (BVFT). The proposed BVFT is insensitive to rotation and intensity variations of BV images. Leveraging the BVFT descriptors, we unify the Lidar place recognition and pose estimation tasks into the BVMatch framework. The experiments conducted on three large-scale datasets show that BVMatch outperforms the state-of-the-art methods in terms of both recall rate of place recognition and pose estimation accuracy. The source code of our method is publicly available at https://github.com/zjuluolun/BVMatch.
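
The projection step described above (discretize the ground plane into a uniform grid, then project the 3D scan onto it) can be sketched as follows. Using the per-cell point count as the pixel value, and the parameter names `grid_size` and `half_range`, are assumptions of this sketch rather than details given in the abstract.

```python
import numpy as np


def project_to_bv_image(points, grid_size=0.4, half_range=40.0):
    """Project 3D points onto a bird's-eye view grid of point counts.

    points:     array-like of shape (n, 3) with x, y, z in metres
    grid_size:  edge length of one square cell (hypothetical default)
    half_range: the image covers [-half_range, half_range) in x and y
    """
    pts = np.asarray(points, dtype=float)
    n_cells = int(round(2 * half_range / grid_size))
    # Map x, y to integer cell indices; z is dropped by the projection.
    ij = np.floor((pts[:, :2] + half_range) / grid_size).astype(int)
    inside = np.all((ij >= 0) & (ij < n_cells), axis=1)
    image = np.zeros((n_cells, n_cells), dtype=np.int64)
    # np.add.at accumulates correctly when several points share a cell.
    np.add.at(image, (ij[inside, 0], ij[inside, 1]), 1)
    return image
```

Points outside the covered square are simply discarded, which keeps the image size fixed regardless of scan extent.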

NAOct 27, 2025
An Efficient Finite Difference-Based PML Technique for Acoustic Scattering Problems

Bin Han, Jiwoon Sim

The acoustic scattering problem is modeled by the exterior Helmholtz equation, which is challenging to solve due to both the unboundedness of the domain and the high dispersion error, known as the pollution effect. We develop high-order compact finite difference methods (FDMs) in polar coordinates to numerically solve the problem with multiple arbitrarily shaped scatterers. The unbounded domain is effectively truncated and compressed via perfectly matched layers (PMLs), while the pollution effect is handled by the high order of our method and a novel pollution minimization technique. This technique is easy to implement, rigorously proven to be effective and shows superior performance in our numerous numerical results. The FDMs we propose in regular polar coordinates achieve fourth consistency order. Yet, combined with exponential stretching and mesh refinement, we can reach sixth consistency order by slightly enlarging the stencil at certain locations. Our numerical examples demonstrate that the proposed FDMs are effective and robust under various wavenumbers, PML layer thickness and shapes of scatterers.

AIMay 31, 2025
Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs

Chenjun Xu, Bingbing Wen, Bin Han et al. · allen-ai, uw

Psychology research has shown that humans are poor at estimating their performance on tasks, tending towards underconfidence on easy tasks and overconfidence on difficult tasks. We examine three LLMs, Llama-3-70B-instruct, Claude-3-Sonnet, and GPT-4o, on a range of QA tasks of varying difficulty, and show that models exhibit subtle differences from human patterns of overconfidence: less sensitive to task difficulty, and when prompted to answer based on different personas -- e.g., expert vs layman, or different race, gender, and ages -- the models will respond with stereotypically biased confidence estimations even though their underlying answer accuracy remains the same. Based on these observations, we propose Answer-Free Confidence Estimation (AFCE) to improve confidence calibration and LLM interpretability in these settings. AFCE is a self-assessment method that employs two stages of prompting, first eliciting only confidence scores on questions, then asking separately for the answer. Experiments on the MMLU and GPQA datasets spanning subjects and difficulty show that this separation of tasks significantly reduces overconfidence and delivers more human-like sensitivity to task difficulty.
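
The two-stage prompting idea can be sketched as below. The `ask` callable stands in for any LLM client and is hypothetical, as are the prompt wordings and the percentage parsing; the paper's actual prompts are not given in the abstract.

```python
from typing import Callable, Tuple


def answer_free_confidence(question: str, ask: Callable[[str], str]) -> Tuple[float, str]:
    """Elicit a confidence score first, then ask for the answer separately.

    `ask` is a hypothetical function that sends a prompt to a model and
    returns its text reply.
    """
    confidence_prompt = (
        "Without answering it, state how confident you are (0-100) that you "
        "could answer this question correctly.\n\nQuestion: " + question
    )
    raw = ask(confidence_prompt)
    confidence = float(raw.strip().rstrip("%")) / 100.0  # e.g. "80%" becomes 0.8

    # Stage two: a separate call that requests only the answer.
    answer = ask("Answer the following question.\n\n" + question)
    return confidence, answer.strip()
```

Keeping the two calls independent means the confidence estimate cannot be anchored on an answer the model has already committed to.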

SPApr 9
Balancing Functionality and GDPR-Driven Privacy in ISAC Trajectory Sharing

Zexin Fang, Bin Han, Zhuojun Tian et al.

Integrated Sensing and Communications (ISAC) enables trajectory sharing that enhances beamforming, resource allocation, and cooperative perception, yet raises fundamental privacy concerns under the General Data Protection Regulation (GDPR) data minimisation principle. This paper proposes a Fisher Information Density (FID)-constrained trajectory sharing framework that enforces a local lower bound on estimation uncertainty, providing hard, quantifiable privacy guarantees by construction. Unlike fixed-noise approaches, the proposed method bounds the Privacy Leak Ratio (PLR) regardless of sensing power or adversarial post-processing, ensuring that no trajectory segment can be reconstructed beyond a prescribed accuracy threshold. Simulations on the OpenTraj dataset demonstrate that the framework keeps the average PLR below 20-25% and the maximum leakage segment duration under 2-2.5 s, while preserving data utility for downstream tasks such as movement prediction. The resulting criterion is interpretable, model-agnostic, and compatible with GDPR-compliant ISAC system design.

CLFeb 1
Personality Expression Across Contexts: Linguistic and Behavioral Variation in LLM Agents

Bin Han, Deuksin Kwon, Jonathan Gratch

Large Language Models (LLMs) can be conditioned with explicit personality prompts, yet their behavioral realization often varies depending on context. This study examines how identical personality prompts lead to distinct linguistic, behavioral, and emotional outcomes across four conversational settings: ice-breaking, negotiation, group decision, and empathy tasks. Results show that contextual cues systematically influence both personality expression and emotional tone, suggesting that the same traits are expressed differently depending on social and affective demands. This raises an important question for LLM-based dialogue agents: whether such variations reflect inconsistency or context-sensitive adaptation akin to human behavior. Viewed through the lens of Whole Trait Theory, these findings highlight that LLMs exhibit context-sensitive rather than fixed personality expression, adapting flexibly to social interaction goals and affective conditions.

LGOct 22, 2025
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Ling Team, Bin Han, Caizhi Tang et al.

In this technical report, we present the Ring-linear model series, specifically including Ring-mini-linear-2.0 and Ring-flash-linear-2.0. Ring-mini-linear-2.0 comprises 16B parameters and 957M activations, while Ring-flash-linear-2.0 contains 104B parameters and 6.1B activations. Both models adopt a hybrid architecture that effectively integrates linear attention and softmax attention, significantly reducing I/O and computational overhead in long-context inference scenarios. Compared to a 32 billion parameter dense model, this series reduces inference cost to 1/10, and compared to the original Ring series, the cost is also reduced by over 50%. Furthermore, through systematic exploration of the ratio between different attention mechanisms in the hybrid architecture, we have identified the currently optimal model structure. Additionally, by leveraging our self-developed high-performance FP8 operator library, linghe, overall training efficiency has been improved by 50%. Benefiting from the high alignment between the training and inference engine operators, the models can undergo long-term, stable, and highly efficient optimization during the reinforcement learning phase, consistently maintaining SOTA performance across multiple challenging complex reasoning benchmarks.

AIAug 7, 2025
Can Large Language Models Integrate Spatial Data? Empirical Insights into Reasoning Strengths and Computational Weaknesses

Bin Han, Robert Wolfe, Anat Caspi et al.

We explore the application of large language models (LLMs) to empower domain experts in integrating large, heterogeneous, and noisy urban spatial datasets. Traditional rule-based integration methods are unable to cover all edge cases, requiring manual verification and repair. Machine learning approaches require collecting and labeling of large numbers of task-specific samples. In this study, we investigate the potential of LLMs for spatial data integration. Our analysis first considers how LLMs reason about environmental spatial relationships mediated by human experience, such as between roads and sidewalks. We show that while LLMs exhibit spatial reasoning capabilities, they struggle to connect the macro-scale environment with the relevant computational geometry tasks, often producing logically incoherent responses. But when provided relevant features, thereby reducing dependence on spatial reasoning, LLMs are able to generate high-performing results. We then adapt a review-and-refine method, which proves remarkably effective in correcting erroneous initial responses while preserving accurate responses. We discuss practical implications of employing LLMs for spatial data integration in real-world contexts and outline future research directions, including post-training, multi-modal integration methods, and support for diverse data formats. Our findings position LLMs as a promising and flexible alternative to traditional rule-based heuristics, advancing the capabilities of adaptive spatial data integration.

AIMay 23, 2025
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation

Jihan Yao, Yushi Hu, Yujie Yi et al. · allen-ai, uw

Automatically evaluating multimodal generation presents a significant challenge, as automated metrics often struggle to align reliably with human evaluation, especially for complex tasks that involve multiple modalities. To address this, we present MMMG, a comprehensive and human-aligned benchmark for multimodal generation across 4 modality combinations (image, audio, interleaved text and image, interleaved text and audio), with a focus on tasks that present significant challenges for generation models, while still enabling reliable automatic evaluation through a combination of models and programs. MMMG encompasses 49 tasks (including 29 newly developed ones), each with a carefully designed evaluation pipeline, and 937 instructions to systematically assess reasoning, controllability, and other key capabilities of multimodal generation models. Extensive validation demonstrates that MMMG is highly aligned with human evaluation, achieving an average agreement of 94.3%. Benchmarking results on 24 multimodal generation models reveal that even though the state-of-the-art model, GPT Image, achieves 78.3% accuracy for image generation, it falls short on multimodal reasoning and interleaved generation. Furthermore, results suggest considerable headroom for improvement in audio generation, highlighting an important direction for future research.

CLSep 19, 2025
Evaluating Behavioral Alignment in Conflict Dialogue: A Multi-Dimensional Comparison of LLM Agents and Humans

Deuksin Kwon, Kaleen Shrestha, Bin Han et al.

Large Language Models (LLMs) are increasingly deployed in socially complex, interaction-driven tasks, yet their ability to mirror human behavior in emotionally and strategically complex contexts remains underexplored. This study assesses the behavioral alignment of personality-prompted LLMs in adversarial dispute resolution by simulating multi-turn conflict dialogues that incorporate negotiation. Each LLM is guided by a matched Five-Factor personality profile to control for individual variation and enhance realism. We evaluate alignment across three dimensions: linguistic style, emotional expression (e.g., anger dynamics), and strategic behavior. GPT-4.1 achieves the closest alignment with humans in linguistic style and emotional dynamics, while Claude-3.7-Sonnet best reflects strategic behavior. Nonetheless, substantial alignment gaps persist. Our findings establish a benchmark for alignment between LLMs and humans in socially complex interactions, underscoring both the promise and the limitations of personality conditioning in dialogue modeling.

CVJul 17, 2025
Salience Adjustment for Context-Based Emotion Recognition

Bin Han, Jonathan Gratch

Emotion recognition in dynamic social contexts requires an understanding of the complex interaction between facial expressions and situational cues. This paper presents a salience-adjusted framework for context-aware emotion recognition with Bayesian Cue Integration (BCI) and Visual-Language Models (VLMs) to dynamically weight facial and contextual information based on the expressivity of facial cues. We evaluate this approach using human annotations and automatic emotion recognition systems in prisoner's dilemma scenarios, which are designed to evoke emotional reactions. Our findings demonstrate that incorporating salience adjustment enhances emotion recognition performance, offering promising directions for future research to extend this framework to broader social contexts and multimodal applications.

LGMay 20, 2025
Fragments to Facts: Partial-Information Fragment Inference from LLMs

Lucas Rosenblatt, Bin Han, Robert Wolfe et al.

Large language models (LLMs) can leak sensitive training data through memorization and membership inference attacks. Prior work has primarily focused on strong adversarial assumptions, including attacker access to entire samples or long, ordered prefixes, leaving open the question of how vulnerable LLMs are when adversaries have only partial, unordered sample information. For example, if an attacker knows a patient has "hypertension," under what conditions can they query a model fine-tuned on patient data to learn the patient also has "osteoarthritis?" In this paper, we introduce a more general threat model under this weaker assumption and show that fine-tuned LLMs are susceptible to these fragment-specific extraction attacks. To systematically investigate these attacks, we propose two data-blind methods: (1) a likelihood ratio attack inspired by methods from membership inference, and (2) a novel approach, PRISM, which regularizes the ratio by leveraging an external prior. Using examples from both medical and legal settings, we show that both methods are competitive with a data-aware baseline classifier that assumes access to labeled in-distribution data, underscoring their robustness.
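
A likelihood ratio test of the kind used in membership inference can be sketched in a few lines. The inputs are log-probabilities that a caller would obtain by scoring the candidate fragment under the fine-tuned model and under a reference model; the function names and the zero threshold are assumptions of this sketch, and the PRISM regularisation is not reproduced here.

```python
import math


def likelihood_ratio_score(logp_target: float, logp_reference: float) -> float:
    """Log-likelihood ratio of a candidate fragment.

    logp_target:    log-probability of the fragment under the fine-tuned model
    logp_reference: log-probability under a reference (non-fine-tuned) model
    """
    return logp_target - logp_reference


def infer_fragment(logp_target: float, logp_reference: float, threshold: float = 0.0) -> bool:
    """Flag the fragment as likely present if the ratio exceeds the threshold."""
    return likelihood_ratio_score(logp_target, logp_reference) > threshold
```

A positive score means the fine-tuned model finds the fragment more plausible than the reference does, which is the signal such an attack would exploit.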

LGMar 29, 2025
Buyer-Initiated Auction Mechanism for Data Redemption in Machine Unlearning

Bin Han, Di Feng, Jie Wang et al.

The rapid growth of artificial intelligence (AI) has raised privacy concerns over user data, leading to regulations like the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). With the essential toolbox provided by machine unlearning, AI service providers are now able to remove user data from their trained models as well as the training datasets, so as to comply with such regulations. However, extensive data redemption can be costly and degrade model accuracy. To balance the cost of unlearning and the privacy protection, we propose a buyer-initiated auction mechanism for data redemption, enabling the service provider to purchase data from willing users with appropriate compensation. This approach does not require the server to have any a priori knowledge about the users' privacy preference, and provides an efficient solution for maximizing the social welfare in the investigated problem.

LGDec 18, 2024
Quantum Machine Learning in Log-based Anomaly Detection: Challenges and Opportunities

Jiaxing Qi, Chang Zeng, Zhongzhi Luan et al.

Log-based anomaly detection (LogAD) is the main component of Artificial Intelligence for IT Operations (AIOps), which can detect anomalies on the fly while the system is running. Existing methods commonly extract log sequence features using classical machine learning techniques to identify whether a new sequence is an anomaly or not. However, these classical approaches often require trade-offs between efficiency and accuracy. The advent of quantum machine learning (QML) offers a promising alternative. By transforming parts of classical machine learning computations into parameterized quantum circuits (PQCs), QML can significantly reduce the number of trainable parameters while maintaining accuracy comparable to classical counterparts. In this work, we introduce a unified framework for evaluating QML models in the context of LogAD. This framework incorporates diverse log data, integrated QML models, and comprehensive evaluation metrics. State-of-the-art methods such as DeepLog, LogAnomaly, and LogRobust, along with their quantum-transformed counterparts, are included in our framework. Beyond standard metrics like F1 score, precision, and recall, our evaluation extends to factors critical to QML performance, such as specificity, the number of circuits, circuit design, and quantum state encoding. Using this framework, we conduct extensive experiments to assess the performance of these models and their quantum counterparts, uncovering valuable insights and paving the way for future research in QML model selection and design for LogAD.

LGApr 3, 2024
Robust Federated Learning for Wireless Networks: A Demonstration with Channel Estimation

Zexin Fang, Bin Han, Hans D. Schotten

Federated learning (FL) offers a privacy-preserving collaborative approach for training models in wireless networks, with channel estimation emerging as a promising application. Despite extensive studies on FL-empowered channel estimation, the security concerns associated with FL require meticulous attention. In a scenario where small base stations (SBSs) serve as local models trained on cached data, and a macro base station (MBS) functions as the global model setting, an attacker can exploit the vulnerability of FL, launching various adversarial attacks or employing different deployment tactics. In this paper, we analyze such vulnerabilities, propose corresponding solutions, and validate them through simulation.

HCNov 19, 2021
Multi-Sensory HMI for Human-Centric Industrial Digital Twins: A 6G Vision of Future Industry

Bin Han, Hans D. Schotten

The next industrial revolution will reshape industry, and society as a whole, around humans. The human presence in industrial environments and human participation in industrial processes will be greater than ever before. To cope with the emerging challenges raised by this revolution, 6G aims to bridge the three domains of digital information, physical assets and humans into one merged cyber-physical-human world. This creates not only an unprecedented demand for digital twin (DT) solutions, but also new technical requirements. In particular, for a human-centric industrial DT system, novel multi-sensory human-machine interfaces will play a key role in this paradigm shift.

IRMay 6, 2021
Users' Perception of Search Engine Biases and Satisfaction

Bin Han, Chirag Shah, Daniel Saelid

Search engines can consistently favor certain values over others because of their built-in infrastructure, which is considered a form of bias. Many studies have been dedicated to detecting, controlling, and mitigating the impacts of these biases from the perspective of the search engines themselves. In our study, we take the perspective of end-users to analyze their perceptions of search engine biases and their satisfaction when the biases are regulated. We paired a real search page from the search engine Bing with a synthesized page that has more diversity in the results (i.e. less biased). Both pages show the top-10 search items for given search queries, and we asked participants which one they prefer and why. Statistical analyses revealed that, overall, participants prefer the original Bing pages, and that the locations where the diversity is introduced are also associated with users' preferences. We found that users prefer results that are more consistent and relevant to the search queries. Introducing diversity undermines the relevance of the search results and impairs users' satisfaction to some degree. Additionally, we confirmed that users tend to pay more attention to the top portion of the results than to the bottom.

NIJan 22, 2021
AI-Empowered VNF Migration as a Cost-Loss-Effective Solution for Network Resilience

Amina Lejla Ibrahimpasic, Bin Han, Hans D. Schotten

With the wide deployment of Multi-Access Edge Computing (MEC) in Fifth Generation (5G) mobile networks, virtual network functions (VNFs) can be flexibly migrated between different locations, which significantly enhances network resilience against the degradation in quality of service (QoS) caused by network function outages. A careful balance has to be struck between the loss reduced by VNF migration and the operations cost it generates. Achieving this in practical scenarios with realistic user behavior calls for models of both cost and user mobility. This paper proposes a novel cost model and an AI-empowered approach for a rational migration of stateful VNFs, which minimizes the sum of operations cost and potential loss caused by outages, and is capable of dealing with complex realistic user mobility patterns.

MLMay 8, 2020
In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

Caroline Wang, Bin Han, Bhrij Patel et al.

Objectives: We study interpretable recidivism prediction using machine learning (ML) models and analyze performance in terms of prediction ability, sparsity, and fairness. Unlike previous works, this study trains interpretable models that output probabilities rather than binary predictions, and uses quantitative fairness definitions to assess the models. This study also examines whether models can generalize across geographic locations. Methods: We generated black-box and interpretable ML models on two different criminal recidivism datasets from Florida and Kentucky. We compared predictive performance and fairness of these models against two methods that are currently used in the justice system to predict pretrial recidivism: the Arnold PSA and COMPAS. We evaluated predictive performance of all models on predicting six different types of crime over two time spans. Results: Several interpretable ML models can predict recidivism as well as black-box ML models and are more accurate than COMPAS or the Arnold PSA. These models are potentially useful in practice. Similar to the Arnold PSA, some of these interpretable models can be written down as a simple table. Others can be displayed using a set of visualizations. Our geographic analysis indicates that ML models should be trained separately for separate locations and updated over time. We also present a fairness analysis for the interpretable models. Conclusions: Interpretable machine learning models can perform just as well as non-interpretable methods and currently-used risk assessment scales, in terms of both prediction accuracy and fairness. Machine learning models might be more accurate when trained separately for distinct locations and kept up-to-date.

NIJan 22, 2020
Machine Learning for Network Slicing Resource Management: A Comprehensive Survey

Bin Han, Hans D. Schotten

The emerging technology of multi-tenancy network slicing is considered an essential feature of 5G cellular networks. It provides network slices as a new type of public cloud service, thereby increasing service flexibility and enhancing network resource efficiency. Meanwhile, it raises new challenges in network resource management. Various methods have been proposed in recent years, in which machine learning and artificial intelligence techniques are widely deployed. In this article, we provide a survey of existing approaches to network slicing resource management, highlighting the roles played by machine learning in them.

SPDec 17, 2018
AI-Aided Online Adaptive OFDM Receiver: Design and Experimental Results

Peiwen Jiang, Tianqi Wang, Bin Han et al.

Orthogonal frequency division multiplexing (OFDM) has been widely applied in current communication systems. The artificial intelligence (AI)-aided OFDM receivers are currently brought to the forefront to replace and improve the traditional OFDM receivers. In this study, we first compare two AI-aided OFDM receivers, namely, data-driven fully connected deep neural network and model-driven ComNet, through extensive simulation and real-time video transmission using a 5G rapid prototyping system for an over-the-air (OTA) test. We find a performance gap between the simulation and the OTA test caused by the discrepancy between the channel model for offline training and the real environment. We develop a novel online training system, which is called SwitchNet receiver, to address this issue. This receiver has a flexible and extendable architecture and can adapt to real channels by training only several parameters online. From the OTA test, the AI-aided OFDM receivers, especially the SwitchNet receiver, are robust to real environments and promising for future communication systems. We discuss potential challenges and future research inspired by our initial study in this paper.

NEFeb 13, 2018
Slice as an Evolutionary Service: Genetic Optimization for Inter-Slice Resource Management in 5G Networks

Bin Han, Lianghai Ji, Hans D. Schotten

In the context of Fifth Generation (5G) mobile networks, the concept of "Slice as a Service" (SlaaS) encourages mobile network operators to flexibly share infrastructures with mobile service providers and stakeholders. However, it also creates a demand for efficient online algorithms to optimize the request-and-decision-based inter-slice resource management strategy. Based on genetic algorithms, this paper presents a novel online optimizer that efficiently approaches the ideal slicing strategy with maximized long-term network utility. The proposed method encodes slicing strategies into binary sequences to cope with the request-and-decision mechanism. It requires no a priori knowledge about the traffic/utility models, and therefore supports heterogeneous slices, while providing solid effectiveness, good robustness against non-stationary service scenarios, and high scalability.
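
To illustrate optimization over binary-encoded strategies, here is a generic genetic algorithm sketch with elitism, single-point crossover, and bit-flip mutation. It is not the paper's online optimizer: the function name, all parameter defaults, and the idea that each bit could stand for an accept/reject decision are assumptions, and the fitness function is left to the caller.

```python
import random


def evolve(fitness, n_bits, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Maximise `fitness` over bit lists of length n_bits (n_bits >= 2, pop_size >= 4).

    Returns the best individual and the best fitness seen at each generation.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [parents[0][:]]              # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)      # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the best individual is always carried over unchanged, the recorded best fitness never decreases from one generation to the next.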

MMApr 10, 2017
Robust Audio Watermarking Algorithm Based on Moving Average and DCT

Jinquan Zhang, Bin Han

Noise is often introduced into host audio by common signal processing operations, and it usually changes the high-frequency components of an audio signal. Embedding a watermark by adjusting low-frequency coefficients can therefore improve the robustness of a watermarking scheme. The moving average sequence is a low-frequency feature of an audio signal. This work proposes a method that embeds the watermark into the maximal coefficient in the discrete cosine transform domain of a moving average sequence. Subjective and objective tests reveal that the proposed watermarking scheme maintains high audio quality and, at the same time, is highly robust to common digital signal processing operations, including additive noise, sampling rate change, bit resolution transformation, MP3 compression, and random cropping, and especially low-pass filtering.

ITJul 11, 2014
Image Inpainting Using Directional Tensor Product Complex Tight Framelets

Yi Shen, Bin Han, Elena Braverman

In this paper we are particularly interested in the image inpainting problem using directional complex tight wavelet frames. Under the assumption that frame coefficients of images are sparse, several iterative thresholding algorithms for the image inpainting problem have been proposed in the literature. The outputs of such iterative algorithms are closely linked to solutions of several convex minimization models using the balanced approach which simultaneously combines the $l_1$-regularization for sparsity of frame coefficients and the $l_2$-regularization for smoothness of the solution. Due to the redundancy of a tight frame, elements of a tight frame could be highly correlated and therefore, their corresponding frame coefficients of an image are expected to be close to each other. This is called the grouping effect in statistics. In this paper, we establish the grouping effect property for frame-based convex minimization models using the balanced approach. This result on grouping effect partially explains the effectiveness of models using the balanced approach for several image restoration problems. Inspired by recent development on directional tensor product complex tight framelets (TP-CTFs) and their impressive performance for the image denoising problem, in this paper we propose an iterative thresholding algorithm using a single tight frame derived from TP-CTFs for the image inpainting problem. Experimental results show that our proposed algorithm handles both cartoons and textures well simultaneously and performs comparably to, and often better than, several well-known frame-based iterative thresholding algorithms for the image inpainting problem without noise. For the image inpainting problem with additive zero-mean i.i.d. Gaussian noise, our proposed algorithm using TP-CTFs outperforms other known state-of-the-art frame-based image inpainting algorithms.

NADec 22, 2009
A Unitary Extension Principle for Shearlet Systems

Bin Han, Gitta Kutyniok, Zuowei Shen

In this paper, we first introduce the concept of an adaptive MRA (AMRA) structure which is a variant of the classical MRA structure suited to the main goal of a fast flexible decomposition strategy adapted to the data at each decomposition level. We then study this novel methodology for the general case of affine-like systems, and derive a Unitary Extension Principle (UEP) for filter design. Finally, we apply our results to the directional representation system of shearlets. This leads to a comprehensive theory for fast decomposition algorithms associated with shearlet systems which encompasses tight shearlet frames with spatially compactly supported generators within such an AMRA structure. Also shearlet-like systems associated with parabolic scaling and unimodular matrices optimally close to rotation as well as 3D shearlet systems are studied within this framework.