Sijia Zhou

CL
h-index28
3papers
36citations
Novelty53%
AI Score29

3 Papers

CLDec 15, 2024
Feature engineering vs. deep learning for paper section identification: Toward applications in Chinese medical literature

Sijia Zhou, Xin Li

Section identification is an important task for library science, especially knowledge management. Identifying the sections of a paper would help filter noise in entity and relation extraction. In this research, we studied the paper section identification problem in the context of Chinese medical literature analysis, where the subjects, methods, and results are more valuable from a physician's perspective. Based on previous studies on English literature section identification, we experiment with the effective features to use with classic machine learning algorithms to tackle the problem. It is found that Conditional Random Fields, which consider sentence interdependency, is more effective in combining different feature sets, such as bag-of-words, part-of-speech, and headings, for Chinese literature section identification. Moreover, we find that classic machine learning algorithms are more effective than generic deep learning models for this problem. Based on these observations, we design a novel deep learning model, the Structural Bidirectional Long Short-Term Memory (SLSTM) model, which models word and sentence interdependency together with the contextual information. Experiments on a human-curated asthma literature dataset show that our approach outperforms the traditional machine learning methods and other deep learning methods and achieves close to 90% precision and recall in the task. The model shows good potential for use in other text mining tasks. The research has significant methodological and practical implications.

LGApr 3, 2025
Randomized Pairwise Learning with Adaptive Sampling: A PAC-Bayes Analysis

Sijia Zhou, Yunwen Lei, Ata Kabán

We study stochastic optimization with data-adaptive sampling schemes to train pairwise learning models. Pairwise learning is ubiquitous, and it covers several popular learning tasks such as ranking, metric learning and AUC maximization. A notable difference of pairwise learning from pointwise learning is the statistical dependencies among input pairs, for which existing analyses have not been able to handle in the general setting considered in this paper. To this end, we extend recent results that blend together two algorithm-dependent frameworks of analysis -- algorithmic stability and PAC-Bayes -- which allow us to deal with any data-adaptive sampling scheme in the optimizer. We instantiate this framework to analyze (1) pairwise stochastic gradient descent, which is a default workhorse in many machine learning problems, and (2) pairwise stochastic gradient descent ascent, which is a method used in adversarial training. All of these algorithms make use of a stochastic sampling from a discrete distribution (sample indices) before each update. Non-uniform sampling of these indices has been already suggested in the recent literature, to which our work provides generalization guarantees in both smooth and non-smooth convex problems.

NIJun 9, 2020
Reinforcement Learning-Based Joint Self-Optimisation Method for the Fuzzy Logic Handover Algorithm in 5G HetNets

Qianyu Liu, Chiew Foong Kwong, Sun Wei et al.

5G heterogeneous networks (HetNets) can provide higher network coverage and system capacity to the user by deploying massive small base stations (BSs) within the 4G macro system. However, the large-scale deployment of small BSs significantly increases the complexity and workload of network maintenance and optimisation. The current handover (HO) triggering mechanism A3 event was designed only for mobility management in the macro system. Directly implementing A3 in 5G-HetNets may degrade the user mobility robustness. Motivated by the concept of self-organisation networks (SON), this study developed a self-optimised triggering mechanism to enable automated network maintenance and enhance user mobility robustness in 5G-HetNets. The proposed method integrates the advantages of subtractive clustering and Q-learning frameworks into the conventional fuzzy logic-based HO algorithm (FLHA). Subtractive clustering is first adopted to generate a membership function (MF) for the FLHA to enable FLHA with the self-configuration feature. Subsequently, Q-learning is utilised to learn the optimal HO policy from the environment as fuzzy rules that empower the FLHA with a self-optimisation function. The FLHA with SON functionality also overcomes the limitations of the conventional FLHA that must rely heavily on professional experience to design. The simulation results show that the proposed self-optimised FLHA can effectively generate MF and fuzzy rules for the FLHA. By comparing with conventional triggering mechanisms, the proposed approach can decrease the HO, ping-pong HO, and HO failure ratios by approximately 91%, 49%, and 97.5% while improving network throughput and latency by 8% and 35%, respectively.