Qiang Li

h-index 27

6 papers

96 citations

Novelty 53%

AI Score 42

Ranked #60,517 of 194,257 authors (top 31%); #11,832 in CL (top 38%)

6 Papers

0.5 | CL | Feb 10, 2023

POSGen: Personalized Opening Sentence Generation for Online Insurance Sales

Yu Li, Yi Zhang, Weijia Wu et al.

The insurance industry is shifting its sales mode from offline to online, expecting to reach a massive pool of potential customers in the digitization era. Due to the complexity and nature of insurance products, a cost-effective online sales solution is to use a chatbot to attract customers' attention and pass interested customers to human agents for further sales. To achieve high customer response and conversion rates, it is crucial for the chatbot to initiate a conversation with personalized opening sentences, which are generated with user-specific topic selection and ordering. Such personalized opening sentence generation is challenging because (i) there are limited historical samples for conversation topic recommendation in online insurance sales and (ii) existing text generation schemes often fail to support customized topic ordering based on user preferences. We design POSGen, a personalized opening sentence generation scheme dedicated to online insurance sales. It transfers user embeddings learned from auxiliary online user behaviours to enhance conversation topic recommendation, and exploits a context management unit to arrange the recommended topics in user-specific order for opening sentence generation. POSGen is deployed on a real-world online insurance platform. It achieves a 2.33x improvement in total insurance premium in a two-month global test.
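As a rough illustration of user-specific topic selection and ordering, the sketch below scores each candidate topic by its dot product with a user embedding and returns the topics in descending order of score. This is not POSGen's actual method: the function name `rank_topics`, the dot-product scoring, and the example embeddings are all assumptions made for illustration.

```python
def rank_topics(user_embedding, topic_embeddings, top_k=3):
    """Return up to top_k topic names, ordered by similarity to the user.

    Similarity here is a plain dot product (an assumption for this sketch);
    a deployed system would learn both the embeddings and the scoring.
    """
    scores = {
        topic: sum(u * t for u, t in zip(user_embedding, embedding))
        for topic, embedding in topic_embeddings.items()
    }
    # Highest-scoring topics come first, which fixes the user-specific order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]


# Hypothetical two-dimensional embeddings, for demonstration only.
user = (1.0, 0.0)
topics = {
    "health": (0.9, 0.1),
    "travel": (0.2, 0.8),
    "auto": (0.5, 0.5),
}
ordered = rank_topics(user, topics)
```

The ordered list would then be handed to whatever generator produces the opening sentence, so that the most relevant topic is mentioned first.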

2.1 | AI | Aug 26, 2023 | Code

Reinforcement Learning Based Multi-modal Feature Fusion Network for Novel Class Discovery

Qiang Li, Qiuyang Ma, Weizhi Nie et al.

With the development of deep learning techniques, supervised learning has achieved performance surpassing that of humans. Researchers have designed numerous models for different data modalities, achieving excellent results in supervised tasks. However, with the exponential increase of data in multiple fields, the recognition and classification of unlabeled data have gradually become a hot topic. In this paper, we employ a Reinforcement Learning framework to simulate human cognitive processes and effectively address novel class discovery in the open-set domain. We deploy a Member-to-Leader Multi-Agent framework to extract and fuse features from multi-modal information, aiming to acquire a more comprehensive understanding of the feature space. Furthermore, this approach facilitates the incorporation of self-supervised learning to enhance model training. We employ a clustering method with varying constraint conditions, ranging from strict to loose, which generates dependable labels for a subset of unlabeled data during the training phase. This iterative process is similar to human exploratory learning of unknown data. These mechanisms collectively update the network parameters based on rewards received from environmental feedback. This process enables effective control over the extent of exploratory learning, ensuring the accuracy of learning on unknown data categories. We demonstrate the performance of our approach in both the 3D and 2D domains using the OS-MN40, OS-MN40-Miss, and CIFAR-10 datasets. Our approach achieves competitive results.
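The strict-to-loose clustering idea can be sketched as progressive pseudo-labelling: in each round, only points within a distance threshold of their nearest centroid receive a label, the centroids are then re-estimated, and the threshold is relaxed for the next round. This is a simplified stand-in, not the paper's algorithm: the nearest-centroid rule, the Euclidean threshold, and the names `progressive_pseudo_labels` and `update_centroids` are assumptions, and the reinforcement learning and multi-modal fusion components are omitted.

```python
import math


def nearest_centroid(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points currently assigned to it."""
    updated = []
    for k, centroid in enumerate(centroids):
        members = [points[i] for i, label in labels.items() if label == k]
        if members:
            dims = range(len(centroid))
            updated.append(tuple(sum(m[d] for m in members) / len(members) for d in dims))
        else:
            updated.append(tuple(centroid))  # no members yet: keep as is
    return updated


def progressive_pseudo_labels(points, centroids, thresholds):
    """Label points round by round; thresholds go from strict to loose."""
    labels = {}
    for threshold in thresholds:
        for i, point in enumerate(points):
            if i in labels:
                continue  # labels from stricter rounds are kept
            index, distance = nearest_centroid(point, centroids)
            if distance <= threshold:
                labels[i] = index
        centroids = update_centroids(points, labels, centroids)
    return labels


# Toy 2D data: two tight groups, one borderline point, one outlier.
points = [(0, 0), (0.1, 0), (1, 0), (5, 5), (5.2, 5), (9, 9)]
labels = progressive_pseudo_labels(points, [(0, 0), (5, 5)], thresholds=[0.25, 1.5])
```

In this toy run the borderline point `(1, 0)` is labelled only in the looser second round, while the outlier `(9, 9)` stays unlabelled, which mirrors the idea of producing dependable labels for a subset of the unlabeled data.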

25.4 | CL | Feb 5, 2024 | Code

Unified Hallucination Detection for Multimodal Large Language Models

Xiang Chen, Chenxi Wang, Yida Xue et al.

Despite significant strides in multimodal tasks, Multimodal Large Language Models (MLLMs) are plagued by the critical issue of hallucination. The reliable detection of such hallucinations in MLLMs has, therefore, become a vital aspect of model evaluation and the safeguarding of practical application deployment. Prior research in this domain has been constrained by a narrow focus on singular tasks, an inadequate range of hallucination categories addressed, and a lack of detailed granularity. In response to these challenges, our work expands the investigative horizons of hallucination detection. We present a novel meta-evaluation benchmark, MHaluBench, meticulously crafted to facilitate the evaluation of advancements in hallucination detection methods. Additionally, we unveil a novel unified multimodal hallucination detection framework, UNIHD, which leverages a suite of auxiliary tools to validate the occurrence of hallucinations robustly. We demonstrate the effectiveness of UNIHD through meticulous evaluation and comprehensive analysis. We also provide strategic insights on the application of specific tools for addressing various categories of hallucinations.

15.5 | CV | May 2, 2025

FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors

Chenxi Li, Weijie Wang, Qiang Li et al.

Text-driven object insertion in 3D scenes is an emerging task that enables intuitive scene editing through natural language. However, existing 2D editing-based methods often rely on spatial priors such as 2D masks or 3D bounding boxes, and they struggle to ensure consistency of the inserted object. These limitations hinder flexibility and scalability in real-world applications. In this paper, we propose FreeInsert, a novel framework that leverages foundation models including MLLMs, LGMs, and diffusion models to disentangle object generation from spatial placement. This enables unsupervised and flexible object insertion in 3D scenes without spatial priors. FreeInsert starts with an MLLM-based parser that extracts structured semantics, including object types, spatial relationships, and attachment regions, from user instructions. These semantics guide both the reconstruction of the inserted object for 3D consistency and the learning of its degrees of freedom. We leverage the spatial reasoning capabilities of MLLMs to initialize object pose and scale. A hierarchical, spatially aware refinement stage further integrates spatial semantics and MLLM-inferred priors to enhance placement. Finally, the appearance of the object is improved using the inserted-object image to enhance visual fidelity. Experimental results demonstrate that FreeInsert achieves semantically coherent, spatially precise, and visually realistic 3D insertions without relying on spatial priors, offering a user-friendly and flexible editing experience.

3.6 | CV | Oct 21, 2025

Moving Light Adaptive Colonoscopy Reconstruction via Illumination-Attenuation-Aware 3D Gaussian Splatting

Hao Wang, Ying Zhou, Haoyu Zhao et al.

3D Gaussian Splatting (3DGS) has emerged as a pivotal technique for real-time view synthesis in colonoscopy, enabling critical applications such as virtual colonoscopy and lesion tracking. However, vanilla 3DGS assumes static illumination and that observed appearance depends solely on viewing angle, which is incompatible with the photometric variations that a moving light source and camera induce in colonoscopic scenes. This mismatch forces most 3DGS methods to introduce structure-violating vaporous Gaussian blobs between the camera and tissues to compensate for illumination attenuation, ultimately degrading the quality of 3D reconstructions. Previous works only consider the illumination attenuation caused by light distance, ignoring the physical characteristics of the light source and camera. In this paper, we propose ColIAGS, an improved 3DGS framework tailored for colonoscopy. To mimic realistic appearance under varying illumination, we introduce an Improved Appearance Modeling with two types of illumination attenuation factors, which enables Gaussians to adapt to photometric variations while preserving geometric accuracy. To ensure the geometry approximation condition of appearance modeling, we propose an Improved Geometry Modeling using high-dimensional view embedding to enhance Gaussian geometry attribute prediction. Furthermore, another cosine embedding input is leveraged to generate illumination attenuation solutions in an implicit manner. Comprehensive experimental results on standard benchmarks demonstrate that our proposed ColIAGS achieves the dual capabilities of novel view synthesis and accurate geometric reconstruction. It notably outperforms other state-of-the-art methods by achieving superior rendering fidelity while significantly reducing Depth MSE. Code will be available.
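To make the notion of illumination attenuation concrete, the sketch below applies a textbook model for a point light co-located with the camera: observed intensity falls off with a power of the distance and with the cosine of the incidence angle. This is a generic shading formula, not the ColIAGS formulation; the abstract does not specify its two attenuation factors, so the inverse-power and cosine terms, the function name `attenuated_intensity`, and the default exponent are assumptions.

```python
def attenuated_intensity(albedo, distance, cos_incidence, falloff_exponent=2.0):
    """Lambertian intensity under a point light at the camera position.

    albedo           surface reflectance in [0, 1]
    distance         light-to-surface distance (must be positive)
    cos_incidence    cosine of the angle between surface normal and light ray
    falloff_exponent 2.0 gives the classic inverse-square law (assumed default)
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Surfaces facing away from the light receive no direct illumination.
    angular_factor = max(cos_incidence, 0.0)
    distance_factor = 1.0 / distance ** falloff_exponent
    return albedo * angular_factor * distance_factor


# The same tissue patch looks four times darker when the scope is twice as far.
near = attenuated_intensity(albedo=0.8, distance=1.0, cos_incidence=1.0)
far = attenuated_intensity(albedo=0.8, distance=2.0, cos_incidence=1.0)
```

A renderer that ignores these factors has to explain the brightness change some other way, which is the failure mode the abstract attributes to spurious Gaussians between the camera and the tissue.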

6.6 | CR | Apr 10, 2021

Practical Two-party Privacy-preserving Neural Network Based on Secret Sharing

Zhengqiang Ge, Zhipeng Zhou, Dong Guo et al.

Neural networks, with the capability to provide efficient predictive models, have been widely used in medical, financial, and other fields, bringing great convenience to our lives. However, the high accuracy of the model requires a large amount of data from multiple parties, raising public concerns about privacy. Privacy-preserving neural networks based on multi-party computation are one of the current methods for providing model training and inference while protecting data privacy. In this study, we propose a new two-party privacy-preserving neural network training and inference framework in which private data is distributed to two non-colluding servers. We construct a preprocessing protocol for mask generation, support and realize secret-sharing comparison in 2PC, and propose a new method to further reduce the communication rounds. Based on the comparison protocol, we construct building blocks such as division and exponentiation, and realize training and inference that no longer need to convert between different types of secret sharing and are based entirely on arithmetic secret sharing. Compared with previous works, our work obtains higher accuracy, which is very close to that of plaintext training. Along with the improved accuracy, the runtime is reduced: considering the online phase, our work is 5x faster than SecureML, 4.32-5.75x faster than SecureNN, and very close to the current optimal 3PC implementation, FALCON. For secure inference, to the best of our knowledge, ours is the current optimal 2PC implementation, 4-358x faster than other works.
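For readers unfamiliar with arithmetic secret sharing, the following minimal sketch shows the two-party additive scheme in its simplest form: a value is split into two random-looking shares that sum to it modulo a fixed ring size, and linear operations are computed locally on the shares. It does not reproduce the paper's comparison, division, or exponentiation protocols; the 64-bit modulus and the function names are assumptions made for this example.

```python
import secrets

MODULUS = 2 ** 64  # ring size, assumed for this sketch


def share(value):
    """Split value into two additive shares modulo MODULUS."""
    share0 = secrets.randbelow(MODULUS)  # uniformly random, reveals nothing alone
    share1 = (value - share0) % MODULUS
    return share0, share1


def reconstruct(share0, share1):
    """Recombine the two shares into the original value (mod MODULUS)."""
    return (share0 + share1) % MODULUS


def add_shares(a, b):
    """Add two shared values; each server adds only the shares it holds."""
    return (a[0] + b[0]) % MODULUS, (a[1] + b[1]) % MODULUS


def scale_shares(a, constant):
    """Multiply a shared value by a public constant, again without communication."""
    return (a[0] * constant) % MODULUS, (a[1] * constant) % MODULUS


# Server 0 holds x0 and y0, server 1 holds x1 and y1; neither sees x or y.
x_shares = share(25)
y_shares = share(17)
total = reconstruct(*add_shares(x_shares, y_shares))
tripled = reconstruct(*scale_shares(x_shares, 3))
```

Addition and scaling by public constants need no interaction between the servers. Multiplying two shared values, and non-linear operations such as comparison, do require interaction, which is why reducing communication rounds matters for runtime.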