AIDec 20, 2022
DocAsRef: An Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-FreelyForrest Sheng Bao, Ruixuan Tu, Ge Luo et al.
Automated summary quality assessment falls into two categories: reference-based and reference-free. Reference-based metrics, historically deemed more accurate due to the additional information provided by human-written references, are limited by their reliance on human input. In this paper, we hypothesize that the comparison methodologies used by some reference-based metrics to evaluate a system summary against its corresponding reference can be effectively adapted to assess it against its source document, thereby transforming these metrics into reference-free ones. Experimental results support this hypothesis. After being repurposed reference-freely, the zero-shot BERTScore using the pretrained DeBERTa-large-MNLI model of <0.5B parameters consistently outperforms its original reference-based version across various aspects on the SummEval and Newsroom datasets. It also excels in comparison to most existing reference-free metrics and closely competes with zero-shot summary evaluators based on GPT-3.5.
LGJun 8, 2020
Supervised Whole DAG Causal DiscoveryHebi Li, Qi Xiao, Jin Tian
We propose to address the task of causal structure learning from data in a supervised manner. Existing work on learning causal directions by supervised learning is restricted to learning pairwise relation, and not well suited for whole DAG discovery. We propose a novel approach of modeling the whole DAG structure discovery as a supervised learning. To fit the problem in hand, we propose to use permutation equivariant models that align well with the problem domain. We evaluate the proposed approach extensively on synthetic graphs of size 10,20,50,100 and real data, and show promising results compared with a variety of previous approaches.
CLMay 13, 2020
SueNes: A Weakly Supervised Approach to Evaluating Single-Document Summarization via Negative SamplingForrest Sheng Bao, Hebi Li, Ge Luo et al.
Canonical automatic summary evaluation metrics, such as ROUGE, focus on lexical similarity which cannot well capture semantics nor linguistic quality and require a reference summary which is costly to obtain. Recently, there have been a growing number of efforts to alleviate either or both of the two drawbacks. In this paper, we present a proof-of-concept study to a weakly supervised summary evaluation approach without the presence of reference summaries. Massive data in existing summarization datasets are transformed for training by pairing documents with corrupted reference summaries. In cross-domain tests, our strategy outperforms baselines with promising improvements, and show a great advantage in gauging linguistic qualities over all metrics.
LGMay 26, 2019
Purifying Adversarial Perturbation with Adversarially Trained Auto-encodersHebi Li, Qi Xiao, Shixin Tian et al.
Machine learning models are vulnerable to adversarial examples. Iterative adversarial training has shown promising results against strong white-box attacks. However, adversarial training is very expensive, and every time a model needs to be protected, such expensive training scheme needs to be performed. In this paper, we propose to apply iterative adversarial training scheme to an external auto-encoder, which once trained can be used to protect other models directly. We empirically show that our model outperforms other purifying-based methods against white-box attacks, and transfers well to directly protect other base models with different architectures.