CLFeb 23, 2023
Simple and Scalable Nearest Neighbor Machine TranslationYuhan Dai, Zhirui Zhang, Qiuzhi Liu et al. · tencent-ai
$k$NN-MT is a straightforward yet powerful approach for fast domain adaptation, which directly plugs pre-trained neural machine translation (NMT) models with domain-specific token-level $k$-nearest-neighbor ($k$NN) retrieval to achieve domain adaptation without retraining. Despite being conceptually attractive, $k$NN-MT is burdened with massive storage requirements and high computational complexity since it conducts nearest neighbor searches over the entire reference corpus. In this paper, we propose a simple and scalable nearest neighbor machine translation framework to drastically promote the decoding and storage efficiency of $k$NN-based models while maintaining the translation performance. To this end, we dynamically construct an extremely small datastore for each input via sentence-level retrieval to avoid searching the entire datastore in vanilla $k$NN-MT, based on which we further introduce a distance-aware adapter to adaptively incorporate the $k$NN retrieval results into the pre-trained NMT models. Experiments on machine translation in two general settings, static domain adaptation and online learning, demonstrate that our proposed approach not only achieves almost 90% speed as the NMT model without performance degradation, but also significantly reduces the storage requirements of $k$NN-MT.
CVAug 15, 2022Code
Pyramidal Predictive Network: A Model for Visual-frame Prediction Based on Predictive Coding TheoryChaofan Ling, Junpei Zhong, Weihua Li
Visual-frame prediction is a pixel-dense prediction task that infers future frames from past frames. Lacking of appearance details, low prediction accuracy and high computational overhead are still major problems with current models or methods. In this paper, we propose a novel neural network model inspired by the well-known predictive coding theory to deal with the problems. Predictive coding provides an interesting and reliable computational framework, which will be combined with other theories such as the cerebral cortex at different level oscillates at different frequencies, to design an efficient and reliable predictive network model for visual-frame prediction. Specifically, the model is composed of a series of recurrent and convolutional units forming the top-down and bottom-up streams, respectively. The update frequency of neural units on each of the layer decreases with the increasing of network levels, which results in neurons of higher-level can capture information in longer time dimensions. According to the experimental results, this model shows better compactness and comparable predictive performance with existing works, implying lower computational cost and higher prediction accuracy. Code is available at https://github.com/Ling-CF/PPNet.
CVDec 22, 2022Code
Predictive Coding Based Multiscale Network with Encoder-Decoder LSTM for Video PredictionChaofan Ling, Junpei Zhong, Weihua Li
We present a multi-scale predictive coding model for future video frames prediction. Drawing inspiration on the ``Predictive Coding" theories in cognitive science, it is updated by a combination of bottom-up and top-down information flows, which can enhance the interaction between different network levels. However, traditional predictive coding models only predict what is happening hierarchically rather than predicting the future. To address the problem, our model employs a multi-scale approach (Coarse to Fine), where the higher level neurons generate coarser predictions (lower resolution), while the lower level generate finer predictions (higher resolution). In terms of network architecture, we directly incorporate the encoder-decoder network within the LSTM module and share the final encoded high-level semantic information across different network levels. This enables comprehensive interaction between the current input and the historical states of LSTM compared with the traditional Encoder-LSTM-Decoder architecture, thus learning more believable temporal and spatial dependencies. Furthermore, to tackle the instability in adversarial training and mitigate the accumulation of prediction errors in long-term prediction, we propose several improvements to the training strategy. Our approach achieves good performance on datasets such as KTH, Moving MNIST and Caltech Pedestrian. Code is available at https://github.com/Ling-CF/MSPN.
ASAug 21, 2023
Ultra Dual-Path Compression For Joint Echo Cancellation And Noise SuppressionHangting Chen, Jianwei Yu, Yi Luo et al.
Echo cancellation and noise reduction are essential for full-duplex communication, yet most existing neural networks have high computational costs and are inflexible in tuning model complexity. In this paper, we introduce time-frequency dual-path compression to achieve a wide range of compression ratios on computational cost. Specifically, for frequency compression, trainable filters are used to replace manually designed filters for dimension reduction. For time compression, only using frame skipped prediction causes large performance degradation, which can be alleviated by a post-processing network with full sequence modeling. We have found that under fixed compression ratios, dual-path compression combining both the time and frequency methods will give further performance improvement, covering compression ratios from 4x to 32x with little model size change. Moreover, the proposed models show competitive performance compared with fast FullSubNet and DeepFilterNet.
CVOct 18, 2022
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image GenerationRuijun Li, Weihua Li, Yi Yang et al.
Recently, diffusion models have been proven to perform remarkably well in text-to-image synthesis tasks in a number of studies, immediately presenting new study opportunities for image generation. Google's Imagen follows this research trend and outperforms DALLE2 as the best model for text-to-image generation. However, Imagen merely uses a T5 language model for text processing, which cannot ensure learning the semantic information of the text. Furthermore, the Efficient UNet leveraged by Imagen is not the best choice in image processing. To address these issues, we propose the Swinv2-Imagen, a novel text-to-image diffusion model based on a Hierarchical Visual Transformer and a Scene Graph incorporating a semantic layout. In the proposed model, the feature vectors of entities and relationships are extracted and involved in the diffusion model, effectively improving the quality of generated images. On top of that, we also introduce a Swin-Transformer-based UNet architecture, called Swinv2-Unet, which can address the problems stemming from the CNN convolution operations. Extensive experiments are conducted to evaluate the performance of the proposed model by using three real-world datasets, i.e., MSCOCO, CUB and MM-CelebA-HQ. The experimental results show that the proposed Swinv2-Imagen model outperforms several popular state-of-the-art methods.
SIMar 17, 2022
GAC: A Deep Reinforcement Learning Model Toward User Incentivization in Unknown Social NetworksShiqing Wu, Weihua Li, Quan Bai
In recent years, many applications have deployed incentive mechanisms to promote users' attention and engagement. Most incentive mechanisms determine specific incentive values based on users' attributes (e.g., preferences), while such information is unavailable in many real-world applications. Meanwhile, due to budget restrictions, realizing successful incentivization for all users can be challenging to complete. In this light, we consider leveraging social influence to maximize the incentivization result. We can directly incentivize influential users to affect more users, so the cost of incentivizing these users can be decreased. However, identifying influential users in a social network requires complete information about influence strength among users, which is impractical to acquire in real-world situations. In this research, we propose an end-to-end reinforcement learning-based framework, called Geometric Actor-Critic (GAC), to tackle the abovementioned problem. The proposed approach can realize effective incentive allocation without having prior knowledge about users' attributes. Three real-world social network datasets have been adopted in the experiments to evaluate the performance of GAC. The experimental results indicate that GAC can learn and apply effective incentive allocation policies in unknown social networks and outperform existing incentive allocation approaches.
HCNov 19, 2022
A Light-weight, Effective and Efficient Model for Label Aggregation in CrowdsourcingYi Yang, Zhong-Qiu Zhao, Quan Bai et al.
Due to the noises in crowdsourced labels, label aggregation (LA) has emerged as a standard procedure to post-process crowdsourced labels. LA methods estimate true labels from crowdsourced labels by modeling worker qualities. Most existing LA methods are iterative in nature. They need to traverse all the crowdsourced labels multiple times in order to jointly and iteratively update true labels and worker qualities until convergence. Consequently, these methods have high space and time complexities. In this paper, we treat LA as a dynamic system and model it as a Dynamic Bayesian network. From the dynamic model we derive two light-weight algorithms, LA\textsuperscript{onepass} and LA\textsuperscript{twopass}, which can effectively and efficiently estimate worker qualities and true labels by traversing all the labels at most twice. Due to the dynamic nature, the proposed algorithms can also estimate true labels online without re-visiting historical data. We theoretically prove the convergence property of the proposed algorithms, and bound the error of estimated worker qualities. We also analyze the space and time complexities of the proposed algorithms and show that they are equivalent to those of majority voting. Experiments conducted on 20 real-world datasets demonstrate that the proposed algorithms can effectively and efficiently aggregate labels in both offline and online settings even if they traverse all the labels at most twice.
LGSep 2, 2024
Towards General Industrial Intelligence: A Survey of Continual Large Models in Industrial IoTJiao Chen, Jiayi He, Fangfang Chen et al.
Industrial AI is transitioning from traditional deep learning models to large-scale transformer-based architectures, with the Industrial Internet of Things (IIoT) playing a pivotal role. IIoT evolves from a simple data pipeline to an intelligent infrastructure, enabling and enhancing these advanced AI systems. This survey explores the integration of IIoT with large models (LMs) and their potential applications in industrial environments. We focus on four primary types of industrial LMs: language-based, vision-based, time-series, and multimodal models. The lifecycle of LMs is segmented into four critical phases: data foundation, model training, model connectivity, and continuous evolution. First, we analyze how IIoT provides abundant and diverse data resources, supporting the training and fine-tuning of LMs. Second, we discuss how IIoT offers an efficient training infrastructure in low-latency and bandwidth-optimized environments. Third, we highlight the deployment advantages of LMs within IIoT, emphasizing IIoT's role as a connectivity nexus fostering emergent intelligence through modular design, dynamic routing, and model merging to enhance system scalability and adaptability. Finally, we demonstrate how IIoT supports continual learning mechanisms, enabling LMs to adapt to dynamic industrial conditions and ensure long-term effectiveness. This paper underscores IIoT's critical role in the evolution of industrial intelligence with large models, offering a theoretical framework and actionable insights for future research.
CVJan 13, 2023
Anti-aliasing Predictive Coding Network for Future Video Frame PredictionChaofan Ling, Weihua Li, Junpei Zhong
We introduce here a predictive coding based model that aims to generate accurate and sharp future frames. Inspired by the predictive coding hypothesis and related works, the total model is updated through a combination of bottom-up and top-down information flows, which can enhance the interaction between different network levels. Most importantly, We propose and improve several artifacts to ensure that the neural networks generate clear and natural frames. Different inputs are no longer simply concatenated or added, they are calculated in a modulated manner to avoid being roughly fused. The downsampling and upsampling modules have been redesigned to ensure that the network can more easily construct images from Fourier features of low-frequency inputs. Additionally, the training strategies are also explored and improved to generate believable results and alleviate inconsistency between the input predicted frames and ground truth. Our proposals achieve results that better balance pixel accuracy and visualization effect.
CLMar 1, 2023
Soft Prompt Guided Joint Learning for Cross-Domain Sentiment AnalysisJingli Shi, Weihua Li, Quan Bai et al.
Aspect term extraction is a fundamental task in fine-grained sentiment analysis, which aims at detecting customer's opinion targets from reviews on product or service. The traditional supervised models can achieve promising results with annotated datasets, however, the performance dramatically decreases when they are applied to the task of cross-domain aspect term extraction. Existing cross-domain transfer learning methods either directly inject linguistic features into Language models, making it difficult to transfer linguistic knowledge to target domain, or rely on the fixed predefined prompts, which is time-consuming to construct the prompts over all potential aspect term spans. To resolve the limitations, we propose a soft prompt-based joint learning method for cross domain aspect term extraction in this paper. Specifically, by incorporating external linguistic features, the proposed method learn domain-invariant representations between source and target domains via multiple objectives, which bridges the gap between domains with varied distributions of aspect terms. Further, the proposed method interpolates a set of transferable soft prompts consisted of multiple learnable vectors that are beneficial to detect aspect terms in target domain. Extensive experiments are conducted on the benchmark datasets and the experimental results demonstrate the effectiveness of the proposed method for cross-domain aspect terms extraction.
IRJul 6, 2023
BHEISR: Nudging from Bias to Balance -- Promoting Belief Harmony by Eliminating Ideological Segregation in Knowledge-based RecommendationsMengyan Wang, Yuxuan Hu, Zihan Yuan et al.
In the realm of personalized recommendation systems, the increasing concern is the amplification of belief imbalance and user biases, a phenomenon primarily attributed to the filter bubble. Addressing this critical issue, we introduce an innovative intermediate agency (BHEISR) between users and existing recommendation systems to attenuate the negative repercussions of the filter bubble effect in extant recommendation systems. The main objective is to strike a belief balance for users while minimizing the detrimental influence caused by filter bubbles. The BHEISR model amalgamates principles from nudge theory while upholding democratic and transparent principles. It harnesses user-specific category information to stimulate curiosity, even in areas users might initially deem uninteresting. By progressively stimulating interest in novel categories, the model encourages users to broaden their belief horizons and explore the information they typically overlook. Our model is time-sensitive and operates on a user feedback loop. It utilizes the existing recommendation algorithm of the model and incorporates user feedback from the prior time frame. This approach endeavors to transcend the constraints of the filter bubble, enrich recommendation diversity, and strike a belief balance among users while also catering to user preferences and system-specific business requirements. To validate the effectiveness and reliability of the BHEISR model, we conducted a series of comprehensive experiments with real-world datasets. These experiments compared the performance of the BHEISR model against several baseline models using nearly 200 filter bubble-impacted users as test subjects. Our experimental results conclusively illustrate the superior performance of the BHEISR model in mitigating filter bubbles and balancing user perspectives.
AIFeb 2, 2023
DOR: A Novel Dual-Observation-Based Approach for News Recommendation SystemsMengyan Wang, Weihua Li, Jingli Shi et al.
Online social media platforms offer access to a vast amount of information, but sifting through the abundance of news can be overwhelming and tiring for readers. personalised recommendation algorithms can help users find information that interests them. However, most existing models rely solely on observations of user behaviour, such as viewing history, ignoring the connections between the news and a user's prior knowledge. This can result in a lack of diverse recommendations for individuals. In this paper, we propose a novel method to address the complex problem of news recommendation. Our approach is based on the idea of dual observation, which involves using a deep neural network with observation mechanisms to identify the main focus of a news article as well as the focus of the user on the article. This is achieved by taking into account the user's belief network, which reflects their personal interests and biases. By considering both the content of the news and the user's perspective, our approach is able to provide more personalised and accurate recommendations. We evaluate the performance of our model on real-world datasets and show that our proposed method outperforms several popular baselines.
LGNov 12, 2025
Probing then Editing: A Push-Pull Framework for Retain-Free Machine Unlearning in Industrial IoTJiao Chen, Weihua Li, Jianhua Tang
In dynamic Industrial Internet of Things (IIoT) environments, models need the ability to selectively forget outdated or erroneous knowledge. However, existing methods typically rely on retain data to constrain model behavior, which increases computational and energy burdens and conflicts with industrial data silos and privacy compliance requirements. To address this, we propose a novel retain-free unlearning framework, referred to as Probing then Editing (PTE). PTE frames unlearning as a probe-edit process: first, it probes the decision boundary neighborhood of the model on the to-be-forgotten class via gradient ascent and generates corresponding editing instructions using the model's own predictions. Subsequently, a push-pull collaborative optimization is performed: the push branch actively dismantles the decision region of the target class using the editing instructions, while the pull branch applies masked knowledge distillation to anchor the model's knowledge on retained classes to their original states. Benefiting from this mechanism, PTE achieves efficient and balanced knowledge editing using only the to-be-forgotten data and the original model. Experimental results demonstrate that PTE achieves an excellent balance between unlearning effectiveness and model utility across multiple general and industrial benchmarks such as CWRU and SCUT-FD.
LGAug 27, 2025Code
InfraredGP: Efficient Graph Partitioning via Spectral Graph Neural Networks with Negative CorrectionsMeng Qin, Weihua Li, Jinqiang Cui et al.
Graph partitioning (GP), a.k.a. community detection, is a classic problem that divides nodes of a graph into densely-connected blocks. From a perspective of graph signal processing, we find that graph Laplacian with a negative correction can derive graph frequencies beyond the conventional range $[0, 2]$. To explore whether the low-frequency information beyond this range can encode more informative properties about community structures, we propose InfraredGP. It (\romannumeral1) adopts a spectral GNN as its backbone combined with low-pass filters and a negative correction mechanism, (\romannumeral2) only feeds random inputs to this backbone, (\romannumeral3) derives graph embeddings via one feed-forward propagation (FFP) without any training, and (\romannumeral4) obtains feasible GP results by feeding the derived embeddings to BIRCH. Surprisingly, our experiments demonstrate that based solely on the negative correction mechanism that amplifies low-frequency information beyond $[0, 2]$, InfraredGP can derive distinguishable embeddings for some standard clustering modules (e.g., BIRCH) and obtain high-quality results for GP without any training. Following the IEEE HPEC Graph Challenge benchmark, we evaluate InfraredGP for both static and streaming GP, where InfraredGP can achieve much better efficiency (e.g., 16x-23x faster) and competitive quality over various baselines. We have made our code public at https://github.com/KuroginQin/InfraredGP
CLDec 6, 2022
KATSum: Knowledge-aware Abstractive Text SummarizationGuan Wang, Weihua Li, Edmund Lai et al.
Text Summarization is recognised as one of the NLP downstream tasks and it has been extensively investigated in recent years. It can assist people with perceiving the information rapidly from the Internet, including news articles, social posts, videos, etc. Most existing research works attempt to develop summarization models to produce a better output. However, advent limitations of most existing models emerge, including unfaithfulness and factual errors. In this paper, we propose a novel model, named as Knowledge-aware Abstractive Text Summarization, which leverages the advantages offered by Knowledge Graph to enhance the standard Seq2Seq model. On top of that, the Knowledge Graph triplets are extracted from the source text and utilised to provide keywords with relational information, producing coherent and factually errorless summaries. We conduct extensive experiments by using real-world data sets. The results reveal that the proposed framework can effectively utilise the information from Knowledge Graph and significantly reduce the factual errors in the summary.
CLFeb 19, 2024
Detecting misinformation through Framing Theory: the Frame Element-based ModelGuan Wang, Rebecca Frederick, Jinglong Duan et al.
In this paper, we delve into the rapidly evolving challenge of misinformation detection, with a specific focus on the nuanced manipulation of narrative frames - an under-explored area within the AI community. The potential for Generative AI models to generate misleading narratives underscores the urgency of this problem. Drawing from communication and framing theories, we posit that the presentation or 'framing' of accurate information can dramatically alter its interpretation, potentially leading to misinformation. We highlight this issue through real-world examples, demonstrating how shifts in narrative frames can transmute fact-based information into misinformation. To tackle this challenge, we propose an innovative approach leveraging the power of pre-trained Large Language Models and deep neural networks to detect misinformation originating from accurate facts portrayed under different frames. These advanced AI techniques offer unprecedented capabilities in identifying complex patterns within unstructured data critical for examining the subtleties of narrative frames. The objective of this paper is to bridge a significant research gap in the AI domain, providing valuable insights and methodologies for tackling framing-induced misinformation, thus contributing to the advancement of responsible and trustworthy AI technologies. Several experiments are intensively conducted and experimental results explicitly demonstrate the various impact of elements of framing theory proving the rationale of applying framing theory to increase the performance in misinformation detection.
LGMar 9, 2025
Towards Superior Quantization Accuracy: A Layer-sensitive ApproachFeng Zhang, Yanbin Liu, Weihua Li et al.
Large Vision and Language Models have exhibited remarkable human-like intelligence in tasks such as natural language comprehension, problem-solving, logical reasoning, and knowledge retrieval. However, training and serving these models require substantial computational resources, posing a significant barrier to their widespread application and further research. To mitigate this challenge, various model compression techniques have been developed to reduce computational requirements. Nevertheless, existing methods often employ uniform quantization configurations, failing to account for the varying difficulties across different layers in quantizing large neural network models. This paper tackles this issue by leveraging layer-sensitivity features, such as activation sensitivity and weight distribution Kurtosis, to identify layers that are challenging to quantize accurately and allocate additional memory budget. The proposed methods, named SensiBoost and KurtBoost, respectively, demonstrate notable improvement in quantization accuracy, achieving up to 9% lower perplexity with only a 2% increase in memory budget on LLama models compared to the baseline.
LGApr 23, 2024
Time-aware Heterogeneous Graph Transformer with Adaptive Attention Merging for Health Event PredictionShibo Li, Hengliang Cheng, Weihua Li
The widespread application of Electronic Health Records (EHR) data in the medical field has led to early successes in disease risk prediction using deep learning methods. These methods typically require extensive data for training due to their large parameter sets. However, existing works do not exploit the full potential of EHR data. A significant challenge arises from the infrequent occurrence of many medical codes within EHR data, limiting their clinical applicability. Current research often lacks in critical areas: 1) incorporating disease domain knowledge; 2) heterogeneously learning disease representations with rich meanings; 3) capturing the temporal dynamics of disease progression. To overcome these limitations, we introduce a novel heterogeneous graph learning model designed to assimilate disease domain knowledge and elucidate the intricate relationships between drugs and diseases. This model innovatively incorporates temporal data into visit-level embeddings and leverages a time-aware transformer alongside an adaptive attention mechanism to produce patient representations. When evaluated on two healthcare datasets, our approach demonstrated notable enhancements in both prediction accuracy and interpretability over existing methodologies, signifying a substantial advancement towards personalized and proactive healthcare management.
NEApr 9, 2024
An Enhanced Grey Wolf Optimizer with Elite Inheritance and Balance Search MechanismsJianhua Jiang, Ziying Zhao, Weihua Li et al.
The Grey Wolf Optimizer (GWO) is recognized as a novel meta-heuristic algorithm inspired by the social leadership hierarchy and hunting mechanism of grey wolves. It is well-known for its simple parameter setting, fast convergence speed, and strong optimization capability. In the original GWO, there are two significant design flaws in its fundamental optimization mechanisms. Problem (1): the algorithm fails to inherit from elite positions from the last iteration when generating the next positions of the wolf population, potentially leading to suboptimal solutions. Problem (2): the positions of the population are updated based on the central position of the three leading wolves (alpha, beta, delta), without a balanced mechanism between local and global search. To tackle these problems, an enhanced Grey Wolf Optimizer with Elite Inheritance Mechanism and Balance Search Mechanism, named as EBGWO, is proposed to improve the effectiveness of the position updating and the quality of the convergence solutions. The IEEE CEC 2014 benchmark functions suite and a series of simulation tests are employed to evaluate the performance of the proposed algorithm. The simulation tests involve a comparative study between EBGWO, three GWO variants, GWO and two well-known meta-heuristic algorithms. The experimental results demonstrate that the proposed EBGWO algorithm outperforms other meta-heuristic algorithms in both accuracy and convergence speed. Three engineering optimization problems are adopted to prove its capability in processing real-world problems. The results indicate that the proposed EBGWO outperforms several popular algorithms.
AINov 13, 2024
PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music GenerationYungang Yi, Weihua Li, Matthew Kuo et al.
AI-based music generation has made significant progress in recent years. However, generating symbolic music that is both long-structured and expressive remains a significant challenge. In this paper, we propose PerceiverS (Segmentation and Scale), a novel architecture designed to address this issue by leveraging both Effective Segmentation and Multi-Scale attention mechanisms. Our approach enhances symbolic music generation by simultaneously learning long-term structural dependencies and short-term expressive details. By combining cross-attention and self-attention in a Multi-Scale setting, PerceiverS captures long-range musical structure while preserving performance nuances. The proposed model has been evaluated using the Maestro dataset and has demonstrated improvements in generating coherent and diverse music, characterized by both structural consistency and expressive variation. The project demos and the generated music samples can be accessed through the link: https://perceivers.github.io.
SIJan 12
Ideological Isolation in Online Social Networks: A Survey of Computational Definitions, Metrics, and Mitigation StrategiesXiaodan Wang, Yanbin Liu, Shiqing Wu et al.
The proliferation of online social networks has significantly reshaped the way individuals access and engage with information. While these platforms offer unprecedented connectivity, they may foster environments where users are increasingly exposed to homogeneous content and like-minded interactions. Such dynamics are associated with selective exposure and the emergence of filter bubbles, echo chambers, tunnel vision, and polarization, which together can contribute to ideological isolation and raise concerns about information diversity and public discourse. This survey provides a comprehensive computational review of existing studies that define, analyze, quantify, and mitigate ideological isolation in online social networks. We examine the mechanisms underlying content personalization, user behavior patterns, and network structures that reinforce content-exposure concentration and narrowing dynamics. This paper also systematically reviews methodological approaches for detecting and measuring these isolation-related phenomena, covering network-, content-, and behavior-based metrics. We further organize computational mitigation strategies, including network-topological interventions and recommendation-level controls, and discuss their trade-offs and deployment considerations. By integrating definitions, metrics, and interventions across structural/topological, content-based, interactional, and cognitive isolation, this survey provides a unified computational framework. It serves as a reference for understanding and addressing the key challenges and opportunities in promoting information diversity and reducing ideological fragmentation in the digital age.
COMP-PHOct 3, 2025
Fully automated inverse co-optimization of templates and block copolymer blending recipes for DSA lithographyYuhao Zhou, Huangyan Shen, Qingliang Song et al.
The directed self-assembly (DSA) of block copolymers (BCPs) offers a highly promising approach for the fabrication of contact holes or vertical interconnect access at sub-7nm technology nodes. To fabricate circular holes with precisely controlled size and positions, the self-assembly of block copolymers requires guidance from a properly designed template. Effectively parameterizing the template shape to enable efficient optimization remains a critical yet challenging problem. Moreover, the optimized template must possess excellent manufacturability for practical applications. In this work, we propose a Gaussian descriptor for characterizing the template shape with only two parameters. We further propose to use AB/AB binary blends instead of pure diblock copolymer to improve the adaptability of the block copolymer system to the template shape. The Bayesian optimization (BO) is applied to co-optimize the binary blend and the template shape. Our results demonstrate that BO based on the Gaussian descriptor can efficiently yield the optimal templates for diverse multi-hole patterns, all leading to highly matched self-assembled morphologies. Moreover, by imposing constraints on the variation of curvature of the template during optimization, superior manufacturability is ensured for each optimized template. It is noteworthy that each key parameter of the blend exhibits a relatively wide tunable window under the requirement of rather high precision. Our work provides valuable insights for advancing DSA technology, and thus potentially propels its practical applications forward.
CLMay 21, 2025
FedSEA-LLaMA: A Secure, Efficient and Adaptive Federated Splitting Framework for Large Language ModelsZishuai Zhang, Hainan zhang, Weihua Li et al.
Private data holds promise for improving LLMs due to its high quality, but its scattered distribution across data silos and the high computational demands of LLMs limit their deployment in federated environments. To address this, the transformer-based federated split models are proposed, which offload most model parameters to the server (or distributed clients) while retaining only a small portion on the client to ensure data privacy. Despite this design, they still face three challenges: 1) Peer-to-peer key encryption struggles to secure transmitted vectors effectively; 2) The auto-regressive nature of LLMs means that federated split learning can only train and infer sequentially, causing high communication overhead; 3) Fixed partition points lack adaptability to downstream tasks. In this paper, we introduce FedSEA-LLaMA, a Secure, Efficient, and Adaptive Federated splitting framework based on LLaMA2. First, we inject Gaussian noise into forward-pass hidden states to enable secure end-to-end vector transmission. Second, we employ attention-mask compression and KV cache collaboration to reduce communication costs, accelerating training and inference. Third, we allow users to dynamically adjust the partition points for input/output blocks based on specific task requirements. Experiments on natural language understanding, summarization, and conversational QA tasks show that FedSEA-LLaMA maintains performance comparable to centralized LLaMA2 and achieves up to 8x speedups in training and inference. Further analysis of privacy attacks and different partition points also demonstrates the effectiveness of FedSEA-LLaMA in security and adaptability.
CLMay 26, 2023
AaKOS: Aspect-adaptive Knowledge-based Opinion SummarizationGuan Wang, Weihua Li, Edmund M-K. Lai et al.
The rapid growth of information on the Internet has led to an overwhelming amount of opinions and comments on various activities, products, and services. This makes it difficult and time-consuming for users to process all the available information when making decisions. Text summarization, a Natural Language Processing (NLP) task, has been widely explored to help users quickly retrieve relevant information by generating short and salient content from long or multiple documents. Recent advances in pre-trained language models, such as ChatGPT, have demonstrated the potential of Large Language Models (LLMs) in text generation. However, LLMs require massive amounts of data and resources and are challenging to implement as offline applications. Furthermore, existing text summarization approaches often lack the ``adaptive" nature required to capture diverse aspects in opinion summarization, which is particularly detrimental to users with specific requirements or preferences. In this paper, we propose an Aspect-adaptive Knowledge-based Opinion Summarization model for product reviews, which effectively captures the adaptive nature required for opinion summarization. The model generates aspect-oriented summaries given a set of reviews for a particular product, efficiently providing users with useful information on specific aspects they are interested in, ensuring the generated summaries are more personalized and informative. Extensive experiments have been conducted using real-world datasets to evaluate the proposed model. The results demonstrate that our model outperforms state-of-the-art approaches and is adaptive and efficient in generating summaries that focus on particular aspects, enabling users to make well-informed decisions and catering to their diverse interests and preferences.
SIJul 13, 2021
Identifying Influential Users in Unknown Social Networks for Adaptive Incentive Allocation Under Budget RestrictionShiqing Wu, Weihua Li, Hao Shen et al.
In recent years, recommendation systems have been widely applied in many domains. These systems are impotent in affecting users to choose the behavior that the system expects. Meanwhile, providing incentives has been proven to be a more proactive way to affect users' behaviors. Due to the budget limitation, the number of users who can be incentivized is restricted. In this light, we intend to utilize social influence existing among users to enhance the effect of incentivization. Through incentivizing influential users directly, their followers in the social network are possibly incentivized indirectly. However, in many real-world scenarios, the topological structure of the network is usually unknown, which makes identifying influential users difficult. To tackle the aforementioned challenges, in this paper, we propose a novel algorithm for exploring influential users in unknown networks, which can estimate the influential relationships among users based on their historical behaviors and without knowing the topology of the network. Meanwhile, we design an adaptive incentive allocation approach that determines incentive values based on users' preferences and their influence ability. We evaluate the performance of the proposed approaches by conducting experiments on both synthetic and real-world datasets. The experimental results demonstrate the effectiveness of the proposed approaches.
CLJun 18, 2021
Graph-based Joint Pandemic Concern and Relation Extraction on TwitterJingli Shi, Weihua Li, Sira Yongchareon et al.
Public concern detection provides potential guidance to the authorities for crisis management before or during a pandemic outbreak. Detecting people's concerns and attention from online social media platforms has been widely acknowledged as an effective approach to relieve public panic and prevent a social crisis. However, detecting concerns in time from massive information in social media turns out to be a big challenge, especially when sufficient manually labeled data is in the absence of public health emergencies, e.g., COVID-19. In this paper, we propose a novel end-to-end deep learning model to identify people's concerns and the corresponding relations based on Graph Convolutional Network and Bi-directional Long Short Term Memory integrated with Concern Graph. Except for the sequential features from BERT embeddings, the regional features of tweets can be extracted by the Concern Graph module, which not only benefits the concern detection but also enables our model to be high noise-tolerant. Thus, our model can address the issue of insufficient manually labeled data. We conduct extensive experiments to evaluate the proposed model by using both manually labeled tweets and automatically labeled tweets. The experimental results show that our model can outperform the state-of-art models on real-world datasets.
LGMay 20, 2021
Bidirectional LSTM-CRF Attention-based Model for Chinese Word SegmentationChen Jin, Zhuangwei Shi, Weihua Li et al.
Chinese word segmentation (CWS) is the basic of Chinese natural language processing (NLP). The quality of word segmentation will directly affect the rest of NLP tasks. Recently, with the artificial intelligence tide rising again, Long Short-Term Memory (LSTM) neural network, as one of easily modeling in sequence, has been widely utilized in various kinds of NLP tasks, and functions well. Attention mechanism is an ingenious method to solve the memory compression problem on LSTM. Furthermore, inspired by the powerful abilities of bidirectional LSTM models for modeling sequence and CRF model for decoding, we propose a Bidirectional LSTM-CRF Attention-based Model in this paper. Experiments on PKU and MSRA benchmark datasets show that our model performs better than the baseline methods modeling by other neural networks.
SIApr 14, 2021
ABEM: An Adaptive Agent-based Evolutionary Approach for Mining Influencers in Online Social NetworksWeihua Li, Yuxuan Hu, Shiqing Wu et al.
A key step in influence maximization in online social networks is the identification of a small number of users, known as influencers, who are able to spread influence quickly and widely to other users. The evolving nature of the topological structure of these networks makes it difficult to locate and identify these influencers. In this paper, we propose an adaptive agent-based evolutionary approach to address this problem in the context of both static and dynamic networks. This approach is shown to be able to adapt the solution as the network evolves. It is also applicable to large-scale networks due to its distributed framework. Evaluation of our approach is performed by using both synthetic networks and real-world datasets. Experimental results demonstrate that the proposed approach outperforms state-of-the-art seeding algorithms in terms of maximizing influence.
AIFeb 1, 2021
The 4th International Workshop on Smart Simulation and Modelling for Complex SystemsXing Su, Yan Kong, Weihua Li
Computer-based modelling and simulation have become useful tools to facilitate humans to understand systems in different domains, such as physics, astrophysics, chemistry, biology, economics, engineering and social science. A complex system is featured with a large number of interacting components (agents, processes, etc.), whose aggregate activities are nonlinear and self-organized. Complex systems are hard to be simulated or modelled by using traditional computational approaches due to complex relationships among system components, distributed features of resources, and dynamics of environments. Meanwhile, smart systems such as multi-agent systems have demonstrated advantages and great potentials in modelling and simulating complex systems.
AIJan 27, 2021
Privacy Information Classification: A Hybrid ApproachJiaqi Wu, Weihua Li, Quan Bai et al.
A large amount of information has been published to online social networks every day. Individual privacy-related information is also possibly disclosed unconsciously by the end-users. Identifying privacy-related data and protecting the online social network users from privacy leakage turn out to be significant. Under such a motivation, this study aims to propose and develop a hybrid privacy classification approach to detect and classify privacy information from OSNs. The proposed hybrid approach employs both deep learning models and ontology-based models for privacy-related information extraction. Extensive experiments are conducted to validate the proposed hybrid approach, and the empirical results demonstrate its superiority in assisting online social network users against privacy leakage.
CVAug 12, 2017
Kill Two Birds With One Stone: Boosting Both Object Detection Accuracy and Speed With adaptive Patch-of-Interest CompositionShihao Zhang, Weiyao Lin, Ping Lu et al.
Object detection is an important yet challenging task in video understanding & analysis, where one major challenge lies in the proper balance between two contradictive factors: detection accuracy and detection speed. In this paper, we propose a new adaptive patch-of-interest composition approach for boosting both the accuracy and speed for object detection. The proposed approach first extracts patches in a video frame which have the potential to include objects-of-interest. Then, an adaptive composition process is introduced to compose the extracted patches into an optimal number of sub-frames for object detection. With this process, we are able to maintain the resolution of the original frame during object detection (for guaranteeing the accuracy), while minimizing the number of inputs in detection (for boosting the speed). Experimental results on various datasets demonstrate the effectiveness of the proposed approach.