Qiang Dong

IR
h-index: 4
9 papers
243 citations
Novelty: 48%
AI Score: 31

9 Papers

CV · Dec 22, 2024
Adaptive Dataset Quantization

Muquan Li, Dongyang Zhang, Qiang Dong et al.

Contemporary deep learning, characterized by the training of cumbersome neural networks on massive datasets, confronts substantial computational hurdles. To alleviate heavy data storage burdens on limited hardware resources, numerous dataset compression methods such as dataset distillation (DD) and coreset selection have emerged to obtain a compact but informative dataset through synthesis or selection for efficient training. However, DD involves an expensive optimization procedure and exhibits limited generalization across unseen architectures, while coreset selection is limited by its low data keep ratio and reliance on heuristics, hindering its practicality and feasibility. To address these limitations, we introduce a new, versatile framework for dataset compression, namely Adaptive Dataset Quantization (ADQ). Specifically, we first identify the sub-optimal performance of naive Dataset Quantization (DQ), which relies on uniform sampling and overlooks the varying importance of each generated bin. Subsequently, we propose a novel adaptive sampling strategy that evaluates each generated bin's representativeness score, diversity score, and importance score, where the former two scores are quantified by the texture level and contrastive learning-based techniques, respectively. Extensive experiments demonstrate that our method not only exhibits superior generalization capability across different architectures, but also attains state-of-the-art results, surpassing DQ by 3% on average across various datasets.
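The contrast between uniform and score-aware sampling across bins can be illustrated with a minimal sketch. It assumes each bin already has a single positive combined score; how ADQ combines its representativeness, diversity, and importance scores is not stated in the abstract. The names `allocate_budget` and `sample_bins` are hypothetical, and the proportional allocation rule is an illustration, not the paper's procedure.

```python
import random


def allocate_budget(bin_sizes, bin_scores, budget):
    """Split a sample budget across bins in proportion to their scores.

    Assumes all scores are positive. No bin is asked for more items than
    it holds; leftover slots go to bins with the largest fractional share.
    """
    total = sum(bin_scores)
    raw = [budget * s / total for s in bin_scores]
    alloc = [min(int(r), n) for r, n in zip(raw, bin_sizes)]
    # Visit bins by descending fractional part when handing out leftovers.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - int(raw[i]), reverse=True)
    remaining = budget - sum(alloc)
    while remaining > 0:
        progressed = False
        for i in order:
            if remaining > 0 and alloc[i] < bin_sizes[i]:
                alloc[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            break  # every bin is exhausted
    return alloc


def sample_bins(bins, bin_scores, budget, seed=0):
    """Draw items without replacement from each bin using the score-based allocation."""
    rng = random.Random(seed)
    alloc = allocate_budget([len(b) for b in bins], bin_scores, budget)
    return [rng.sample(b, k) for b, k in zip(bins, alloc)]
```

Passing equal scores for every bin reduces this to near-uniform sampling across bins, which corresponds to the uniform baseline the abstract describes for naive DQ.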

IR · Sep 30, 2020
User-item matching for recommendation fairness

Qiang Dong, Shuang-Shuang Xie, Wen-Jun Li

Users and item providers are the two main parties in recommender systems. However, most existing research on recommendation has focused on better serving users and overlooked the objectives of item providers. This paper aims to improve item exposure fairness for item providers while keeping recommendation accuracy for users unchanged or even improved. We propose to set stock volume constraints on items: specifically, we limit the maximum number of times an item may be recommended to be proportional to the frequency of its past interactions. This is validated to achieve better item exposure fairness than common recommenders and thus mitigates the Matthew Effect on item popularity. Under the two constraints of users' pre-existing recommendation list length and our item stock volumes, a heuristic strategy based on normalized scores and a Minimum Cost Maximum Flow (MCMF) based model are proposed to solve the optimal user-item matching problem. Their accuracy is even better than that of the baseline algorithm in the regular recommendation context, and in line with state-of-the-art enhancements of the baseline. Moreover, our MCMF-based strategy is parameter-free, whereas the counterpart algorithms must resort to parameter traversal to achieve their best performance.
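The stock-volume idea can be sketched with a simple greedy matcher. This is an illustration only: it is neither the paper's normalized-score heuristic nor its MCMF model, and the rounding rule in `stock_volumes` (ceiling of the proportional share of all recommendation slots) is an assumption. All function names are hypothetical.

```python
import math


def stock_volumes(interaction_counts, n_users, list_len):
    """Cap each item's exposure in proportion to its past interaction count.

    The ceiling rounding is an assumed choice, not taken from the paper.
    """
    total = sum(interaction_counts.values())
    slots = n_users * list_len  # total recommendation slots to fill
    return {item: math.ceil(slots * c / total) for item, c in interaction_counts.items()}


def greedy_match(scores, stock, list_len):
    """Assign the highest-scoring (user, item) pairs first, respecting both constraints.

    scores: {user: {item: predicted_score}}
    stock:  {item: maximum number of times the item may be recommended}
    """
    pairs = sorted(
        ((s, u, i) for u, row in scores.items() for i, s in row.items()),
        key=lambda t: -t[0],
    )
    remaining = dict(stock)
    recs = {u: [] for u in scores}
    for _, u, i in pairs:
        # A pair is accepted only if the user's list has room and the item has stock left.
        if len(recs[u]) < list_len and remaining.get(i, 0) > 0:
            recs[u].append(i)
            remaining[i] -= 1
    return recs
```

With three users, one slot each, and an item `"a"` that every user scores highest, the cap on `"a"` forces one user onto a less popular item. That is the exposure-spreading effect the abstract describes. A greedy pass like this is not guaranteed to be optimal, which is the gap a flow-based formulation addresses.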

IV · Jul 7, 2020
Automatic Ischemic Stroke Lesion Segmentation from Computed Tomography Perfusion Images by Image Synthesis and Attention-Based Deep Neural Networks

Guotai Wang, Tao Song, Qiang Dong et al.

Ischemic stroke lesion segmentation from Computed Tomography Perfusion (CTP) images is important for accurate diagnosis of stroke in acute care units. However, it is challenged by the low image contrast and resolution of the perfusion parameter maps, in addition to the complex appearance of the lesion. To deal with this problem, we propose a novel framework based on synthesized pseudo Diffusion-Weighted Imaging (DWI) from perfusion parameter maps to obtain better image quality for more accurate segmentation. Our framework consists of three components based on Convolutional Neural Networks (CNNs) and is trained end-to-end. First, a feature extractor is used to obtain both low-level and high-level compact representations of the raw spatiotemporal Computed Tomography Angiography (CTA) images. Second, a pseudo DWI generator takes as input the concatenation of CTP perfusion parameter maps and our extracted features to obtain the synthesized pseudo DWI. To achieve better synthesis quality, we propose a hybrid loss function that pays more attention to lesion regions and encourages high-level contextual consistency. Finally, we segment the lesion region from the synthesized pseudo DWI, where the segmentation network is based on switchable normalization and channel calibration for better performance. Experimental results showed that our framework achieved the top performance on the ISLES 2018 challenge and that: 1) our method using synthesized pseudo DWI outperformed methods segmenting the lesion from perfusion parameter maps directly; 2) the feature extractor exploiting additional spatiotemporal CTA images led to better synthesized pseudo DWI quality and higher segmentation accuracy; and 3) the proposed loss functions and network structure improved the pseudo DWI synthesis and lesion segmentation performance.

SI · Apr 24, 2020
Improving Recommendation Diversity by Highlighting the ExTrA Fabricated Experts

Ya-Hui An, Qiang Dong, Quan Yuan et al.

Recommender systems (RSes) are becoming increasingly important to individual users and business marketing, especially in online e-commerce scenarios. However, while the majority of recommendation algorithms proposed in the literature have focused on improving prediction accuracy, other important aspects of recommendation quality, such as the diversity of recommendations, have been more or less overlooked. In the last decade, recommendation diversity has drawn more research attention, especially in models based on user-item bipartite networks. In this paper, we introduce a family of approaches for extracting fabricated experts from users in RSes, named the Expert Tracking Approaches (ExTrA for short), and explore the capability of these fabricated experts to improve recommendation diversity by highlighting them in a well-known bipartite network-based method, the Mass Diffusion (MD for short) model. These ExTrA-based models are compared with two state-of-the-art MD-improved models, HHP and BHC, with respect to recommendation accuracy and diversity. Comprehensive empirical results on three real-world datasets, MovieLens, Netflix, and RYM, show that our proposed ExTrA-based models can achieve significant diversity gain while maintaining a comparable level of recommendation accuracy.

SI · Apr 22, 2020
Alleviating the recommendation bias via rank aggregation

Qiang Dong, Quan Yuan, Yang-Bo Shi

The primary goal of a recommender system is often described as "helping users find relevant items", and many recommendation algorithms have been proposed accordingly. However, these accuracy-oriented methods usually suffer from recommendation bias toward popular items, which is unwelcome to both users and item providers. To alleviate the recommendation bias problem, we propose a generic rank aggregation framework for the recommendation results of an existing algorithm, in which the user-oriented and item-oriented ranking results are linearly aggregated, with a parameter controlling the weight of the latter ranking process. Experimental results for a typical algorithm on two real-world data sets show that this framework is effective in improving the recommendation fairness of any existing accuracy-oriented algorithm while avoiding significant accuracy loss.
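The linear aggregation of user-oriented and item-oriented rankings can be sketched as follows. This is a minimal illustration under stated assumptions: ranks are normalized by list length before mixing, every user has a score for every item, and the names `ranks`, `aggregate`, `top_n`, and `alpha` are hypothetical. The abstract does not specify these details.

```python
def ranks(values):
    """Map each key to its 1-based rank, with rank 1 for the highest value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {k: r for r, k in enumerate(ordered, start=1)}


def aggregate(scores, alpha):
    """Mix user-oriented and item-oriented ranks; alpha weights the item-oriented side.

    scores: {user: {item: score}} from any existing recommender.
    Returns {user: {item: combined_rank}} where lower is better.
    """
    users = list(scores)
    items = list(next(iter(scores.values())))
    # User-oriented: how each item ranks within one user's list.
    user_rank = {u: ranks(scores[u]) for u in users}
    # Item-oriented: how each user ranks among all users for one item.
    item_rank = {i: ranks({u: scores[u][i] for u in users}) for i in items}
    return {
        u: {
            i: (1 - alpha) * user_rank[u][i] / len(items)
            + alpha * item_rank[i][u] / len(users)
            for i in items
        }
        for u in users
    }


def top_n(combined_row, n):
    """Return the n items with the smallest combined rank for one user."""
    return sorted(combined_row, key=combined_row.get)[:n]
```

With `alpha = 0` the output reproduces the original per-user ranking. As `alpha` grows, an item is pushed toward the users for whom it scores highest relative to other users, which can move a niche item ahead of a popular one that nearly everyone scores highly.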

CL · Jan 28, 2019
OpenHowNet: An Open Sememe-based Lexical Knowledge Base

Fanchao Qi, Chenghao Yang, Zhiyuan Liu et al.

In this paper, we present OpenHowNet, an open sememe-based lexical knowledge base. Based on the well-known HowNet, OpenHowNet comprises three components: core data, composed of more than 100 thousand senses annotated with sememes; OpenHowNet Web, which gives a brief introduction to OpenHowNet and provides an online exhibition of OpenHowNet information; and OpenHowNet API, which includes several useful APIs for tasks such as accessing OpenHowNet core data and drawing sememe tree structures of senses. In the main text, we first give some background, including the definition of sememe and details of HowNet. We then introduce previous HowNet and sememe-based research. Finally, we detail the constituents of OpenHowNet and their basic features and functionalities. Additionally, we give a brief summary and list some future work.

CL · Nov 21, 2018
Resource Mention Extraction for MOOC Discussion Forums

Ya-Hui An, Liangming Pan, Min-Yen Kan et al.

In discussions hosted on discussion forums for MOOCs, references to online learning resources are often of central importance. They contextualize the discussion, anchoring the discussion participants' presentation of the issues and their understanding. However, they are usually mentioned in free text, without appropriate hyperlinking to their associated resource. Automated learning resource mention hyperlinking and categorization will facilitate discussion and searching within MOOC forums, and also benefit the contextualization of such resources across disparate views. We propose the novel problem of learning resource mention identification in MOOC forums. As this is a novel task with no publicly available data, we first contribute a large-scale labeled dataset, dubbed the Forum Resource Mention (FoRM) dataset, to facilitate our current research and future research on this task. We then formulate this task as a sequence tagging problem and investigate solution architectures to address the problem. Importantly, we identify two major challenges that hinder the application of sequence tagging models to the task: (1) the diversity of resource mention expression, and (2) long-range contextual dependencies. We address these challenges by incorporating character-level and thread context information into an LSTM-CRF model. First, we incorporate a character encoder to address the out-of-vocabulary problem caused by the diversity of mention expressions. Second, to address the context dependency challenge, we encode thread contexts using an RNN-based context encoder, and apply the attention mechanism to selectively leverage useful context information during sequence tagging. Experiments on FoRM show that the proposed method notably improves on the baseline deep sequence tagging models, with significantly better performance on instances that exemplify the two challenges.

IR · Nov 11, 2015
Diffusion-like recommendation with enhanced similarity of objects

Ya-Hui An, Qiang Dong, Chong-Jing Sun et al.

In recent decades, diversity and accuracy have been regarded as two important measures for evaluating a recommendation model. However, a clear concern is that a model focusing excessively on one measure will put the other at risk, so it is not easy to greatly improve diversity and accuracy simultaneously. In this paper, we propose to enhance the Resource-Allocation (RA) similarity in the resource transfer equations of diffusion-like models by giving a tunable exponent to the RA similarity and traversing the value of the exponent to achieve the optimal recommendation results. In this way, we can increase the recommendation scores (allocated resource) of many unpopular objects. Experiments on three benchmark data sets, MovieLens, Netflix, and RateYourMusic, show that the modified models can yield remarkable performance improvement compared with the original ones.

IR · Feb 24, 2014
Information Filtering via Balanced Diffusion on Bipartite Networks

Da-Cheng Nie, Ya-Hui An, Qiang Dong et al.

The recent decade has witnessed the increasing popularity of recommender systems, which help users acquire relevant commodities and services from the overwhelming resources on the Internet. Some simple physical diffusion processes have been used to design effective recommendation algorithms for user-object bipartite networks, typically the mass diffusion (MD) and heat conduction (HC) algorithms, which have respective advantages in accuracy and diversity. In this paper, we investigate the effect of weight assignment in the hybrid of MD and HC, and find that a new hybrid algorithm of MD and HC with balanced weights achieves the optimal recommendation results; we name it the balanced diffusion (BD) algorithm. Numerical experiments on three benchmark data sets, MovieLens, Netflix, and RateYourMusic (RYM), show that the BD algorithm outperforms the existing diffusion-based methods on three important recommendation metrics: accuracy, diversity, and novelty. Specifically, it not only provides accurate recommendation results, but also yields higher diversity and novelty in recommendations by accurately recommending unpopular objects.
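A hybrid of MD and HC on a user-object bipartite network can be sketched using the widely used single-parameter form, in which the transfer weight from object b to object a is the degree-normalized overlap divided by k_a^(1-lam) * k_b^lam. Here `lam = 1` gives pure MD and `lam = 0` gives pure HC. Treating the "balanced" case as `lam = 0.5` is an assumption for illustration and may not match the exact definition of BD in the paper. The function name `hybrid_scores` is hypothetical.

```python
def hybrid_scores(adj, user, lam):
    """Score every object for one user with a hybrid of MD and HC.

    adj:  list of rows, adj[u][o] == 1 if user u has collected object o, else 0.
    lam:  1.0 -> mass diffusion, 0.0 -> heat conduction, values between mix the two.
    Returns a list with one score per object (collected objects are not filtered out).
    """
    n_users, n_objs = len(adj), len(adj[0])
    k_obj = [sum(adj[u][o] for u in range(n_users)) for o in range(n_objs)]
    k_user = [sum(row) for row in adj]
    scores = [0.0] * n_objs
    for a in range(n_objs):
        if k_obj[a] == 0:
            continue  # an object nobody collected receives no resource
        for b in range(n_objs):
            if adj[user][b] == 0:
                continue  # only the target user's collected objects carry initial resource
            # Overlap of a and b through shared users, each user weighted by 1 / degree.
            overlap = sum(
                adj[u][a] * adj[u][b] / k_user[u]
                for u in range(n_users)
                if k_user[u] > 0
            )
            scores[a] += overlap / (k_obj[a] ** (1 - lam) * k_obj[b] ** lam)
    return scores
```

To produce a recommendation list, one would sort the objects the user has not yet collected by descending score. With `lam = 1` the total resource is conserved, so the scores sum to the number of objects the user has collected.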