CL Nov 22, 2021
Reinforcement Learning for Few-Shot Text Generation Adaptation
Pengsen Cheng, Jinqiao Dai, Jiamiao Liu et al.
Controlling a generative model so that it adapts to a new domain with limited samples is a difficult challenge that is receiving increasing attention. Recently, methods based on meta-learning have shown promising results for few-shot domain adaptation. However, meta-learning-based methods usually suffer from overfitting, which results in a lack of diversity in the generated texts. To avoid this problem, this study proposes a novel framework based on reinforcement learning (RL). In this framework, maximum likelihood estimation learning is incorporated into the RL process to increase the sample utilization of RL and decrease its sample requirement. Experimental results on five target domains in two few-shot configurations show that this framework performs better than baselines when only a few in-domain samples are available.
CL Jul 12, 2021
CatVRNN: Generating Category Texts via Multi-task Learning
Pengsen Cheng, Jinqiao Dai, Jiayong Liu
Controlling a model to generate texts of different categories is a challenging task that is receiving increasing attention. Recently, generative adversarial networks (GANs) have shown promising results for category text generation. However, the texts generated by GANs usually suffer from mode collapse and training instability. To avoid these problems, this study proposes a novel model inspired by multi-task learning, called the category-aware variational recurrent neural network (CatVRNN). In this model, generation and classification tasks are trained simultaneously to generate texts of different categories. Multi-task learning can improve the quality of the generated texts when the classification task is appropriate. In addition, a function is proposed to initialize the hidden state of the CatVRNN to force the model to generate texts of a specific category. Experimental results on three datasets demonstrate that the model can outperform state-of-the-art GAN-based text generation methods in terms of the diversity of generated texts.
CR Aug 4, 2019
A systematic review of fuzzing based on machine learning techniques
Yan Wang, Peng Jia, Luping Liu et al.
Security vulnerabilities play a vital role in network security systems. Fuzzing is widely used as a vulnerability discovery technique to reduce damage in advance. However, traditional fuzzing techniques face many challenges, such as how to mutate input seed files, how to increase code coverage, and how to effectively bypass verification. Machine learning has been introduced into fuzz testing as a new method to alleviate these challenges. This paper reviews the research progress of using machine learning for fuzz testing in recent years, analyzes how machine learning improves the fuzzing process and results, and sheds light on future work in fuzzing. First, this paper discusses the reasons why machine learning techniques can be used in fuzzing scenarios and identifies six different stages in which machine learning has been used. Then, this paper systematically studies machine-learning-based fuzzing models in terms of the selection of machine learning algorithms, pre-processing methods, datasets, evaluation metrics, and hyperparameter settings. Next, this paper assesses the performance of the machine learning models based on the frequently used evaluation metrics. The results of the evaluation show that machine learning has an acceptable predictive capability for fuzzing. Finally, the capability of discovering vulnerabilities is compared between traditional fuzzing tools and machine-learning-based fuzzing tools. The results show that the introduction of machine learning can improve the performance of fuzzing. However, there are still some limitations, such as unbalanced training samples and the difficulty of extracting characteristics related to vulnerabilities.
CV May 26, 2016
Domain Transfer Multi-Instance Dictionary Learning
Ke Wang, Jiayong Liu, Daniel González
In this paper, we investigate the domain transfer learning problem with multi-instance data. We assume that we already have a well-trained multi-instance dictionary and its corresponding classifier from the source domain, which can be used to represent and classify the bags, but which cannot be applied directly to the target domain. We therefore propose to adapt them to the target domain by adding an adaptive term to the source domain classifier. The adaptive function is a linear function based on a domain transfer multi-instance dictionary. Given a target domain bag, we first map it to a bag-level feature space using the domain transfer dictionary, and then apply the linear adaptive function to its bag-level feature vector. To learn the domain transfer dictionary and the adaptive function parameter, we simultaneously minimize the average classification error of the target domain classifier over the target domain training set and the complexities of both the adaptive function parameter and the domain transfer dictionary. The minimization problem is solved by an iterative algorithm that updates the dictionary and the function parameter alternately. Experiments on several benchmark data sets show the advantage of the proposed method over existing state-of-the-art domain transfer multi-instance learning methods.
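The formulation described in this abstract can be sketched roughly as follows. The notation is assumed here for illustration and is not taken from the paper: $B$ is a bag, $D$ the domain transfer dictionary, $\phi(B; D)$ the bag-level feature vector, $w$ the adaptive function parameter, $f^{s}$ the source domain classifier, $\ell$ an unspecified classification loss, and $\lambda_1, \lambda_2$ trade-off weights. The choice of squared norms as the complexity terms is likewise an assumption.

```latex
% Target classifier: source classifier plus a linear adaptive term
% (symbols are illustrative assumptions, not the paper's notation)
f^{t}(B) = f^{s}(B) + w^{\top} \phi(B; D)

% Joint objective over n target-domain training bags (B_i, y_i):
% average classification loss plus complexity penalties on w and D
\min_{w,\, D} \;
  \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(y_i,\, f^{t}(B_i)\bigr)
  + \lambda_1 \lVert w \rVert_2^{2}
  + \lambda_2 \lVert D \rVert_F^{2}

% Alternating scheme: fix D and update w, then fix w and update D,
% repeating until the objective stops decreasing
```

Under these assumptions, each alternating step optimizes over one variable while the other is held fixed, which matches the iterative update of the dictionary and the function parameter mentioned above.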