Jun Cao

h-index 62

6 papers

532 citations

Novelty 40%

AI Score 28

Ranked #151,375 of 194,257 authors (top 78%) · #26,509 in CL (top 86%)

6 Papers

18.1 · LG · Oct 15, 2022 · Code

Augmentation-Free Graph Contrastive Learning of Invariant-Discriminative Representations

Haifeng Li, Jun Cao, Jiawei Zhu et al.

The pretasks are mainly built on mutual information estimation, which requires data augmentation to construct positive samples with similar semantics to learn invariant signals and negative samples with dissimilar semantics in order to empower representation discriminability. However, an appropriate data augmentation configuration depends heavily on many empirical trials, such as choosing the compositions of data augmentation techniques and the corresponding hyperparameter settings. We propose an augmentation-free graph contrastive learning method, invariant-discriminative graph contrastive learning (iGCL), that does not intrinsically require negative samples. iGCL designs the invariant-discriminative loss (ID loss) to learn invariant and discriminative representations. On the one hand, ID loss learns invariant signals by directly minimizing the mean square error between the target samples and positive samples in the representation space. On the other hand, ID loss ensures that the representations are discriminative by an orthonormal constraint forcing the different dimensions of representations to be independent of each other. This prevents representations from collapsing to a point or subspace. Our theoretical analysis explains the effectiveness of ID loss from the perspectives of the redundancy reduction criterion, canonical correlation analysis, and information bottleneck principle. The experimental results demonstrate that iGCL outperforms all baselines on 5 node classification benchmark datasets. iGCL also shows superior performance for different label ratios and is capable of resisting graph attacks, which indicates that iGCL has excellent generalization and robustness. The source code is available at https://github.com/lehaifeng/T-GCN/tree/master/iGCL.
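
The two terms of the ID loss described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `id_loss`, the weight `lam`, and the exact form of the orthonormal penalty (squared Frobenius distance between the scaled Gram matrix and the identity) are assumptions made for illustration only.

```python
import numpy as np

def id_loss(z, z_pos, lam=1.0):
    """Illustrative invariant-discriminative loss (hypothetical form).

    z     : (n, d) array of target-sample representations
    z_pos : (n, d) array of positive-sample representations
    lam   : assumed weight balancing the two terms
    """
    n, d = z.shape
    # Invariance term: mean squared error between targets and positives.
    invariance = np.mean((z - z_pos) ** 2)
    # Orthonormal penalty (assumed form): push Z^T Z / n toward the identity
    # so that the representation dimensions are decorrelated and unit-scaled.
    gram = z.T @ z / n
    ortho = np.sum((gram - np.eye(d)) ** 2)
    return invariance + lam * ortho
```

Under this assumed form, representations whose dimensions are already orthonormal and that match their positives give a loss of zero, while a collapsed representation (every row identical) is penalised by the second term even though its invariance term is zero.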

23.9 · CL · Oct 20, 2022 · Code

The VolcTrans System for WMT22 Multilingual Machine Translation Task

Xian Qian, Kai Hu, Jiaqiang Wang et al. · bytedance

This report describes our VolcTrans system for the WMT22 shared task on large-scale multilingual machine translation. We participated in the unconstrained track, which allows the use of external resources. Our system is a transformer-based multilingual model trained on data from multiple sources including the public training set from the data track, NLLB data provided by Meta AI, self-collected parallel corpora, and pseudo bitext from back-translation. A series of heuristic rules clean both bilingual and monolingual texts. On the official test set, our system achieves 17.3 BLEU, 21.9 spBLEU, and 41.9 chrF2++ on average over all language pairs. The average inference speed is 11.5 sentences per second using a single Nvidia Tesla V100 GPU. Our code and trained models are available at https://github.com/xian8/wmt22.

4.5 · CL · Apr 8, 2022

GigaST: A 10,000-hour Pseudo Speech Translation Corpus

Rong Ye, Chengqi Zhao, Tom Ko et al. · bytedance

This paper introduces GigaST, a large-scale pseudo speech translation (ST) corpus. We create the corpus by translating the text in GigaSpeech, an English ASR corpus, into German and Chinese. The training set is translated by a strong machine translation system and the test set is translated by humans. ST models trained with the addition of our corpus obtain new state-of-the-art results on the MuST-C English-German benchmark test set. We provide a detailed description of the translation process and verify its quality. We make the translated text data public and hope to facilitate research in speech translation. We also release the training scripts on NeurST to make it easy to replicate our systems. The GigaST dataset is available at https://st-benchmark.github.io/resources/GigaST.

2.3 · CL · Sep 23, 2022

Zero-shot Domain Adaptation for Neural Machine Translation with Retrieved Phrase-level Prompts

Zewei Sun, Qingnan Jiang, Shujian Huang et al. · bytedance

Domain adaptation is an important challenge for neural machine translation. However, the traditional fine-tuning solution requires multiple additional training runs and incurs a high cost. In this paper, we propose a non-tuning paradigm, resolving domain adaptation with a prompt-based method. Specifically, we construct a bilingual phrase-level database and retrieve relevant pairs from it as a prompt for the input sentences. By utilizing Retrieved Phrase-level Prompts (RePP), we effectively boost the translation quality. Experiments show that, without additional training, our method improves domain-specific machine translation by 6.2 BLEU and the accuracy of translation constraints by 11.5%.
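
The retrieve-then-prompt idea in this abstract can be sketched roughly as follows. The abstract does not specify how phrases are retrieved or how the prompt is formatted, so everything here is a stand-in: substring matching replaces the paper's retrieval step, and the function names, the `=>` and `|||` separators, and the `max_pairs` parameter are all hypothetical.

```python
def retrieve_phrase_pairs(sentence, phrase_db, max_pairs=3):
    """Return up to max_pairs (source, target) phrase pairs found in the sentence.

    phrase_db maps source-language phrases to target-language phrases.
    Plain substring matching is an assumption made for illustration.
    """
    lowered = sentence.lower()
    hits = [(src, tgt) for src, tgt in phrase_db.items() if src.lower() in lowered]
    # Prefer longer source phrases, which tend to be more specific.
    hits.sort(key=lambda pair: len(pair[0]), reverse=True)
    return hits[:max_pairs]


def build_prompted_input(sentence, phrase_db, max_pairs=3):
    """Prepend retrieved phrase pairs to the input as a prompt (hypothetical format)."""
    pairs = retrieve_phrase_pairs(sentence, phrase_db, max_pairs)
    if not pairs:
        return sentence
    prompt = " ; ".join(f"{src} => {tgt}" for src, tgt in pairs)
    return f"{prompt} ||| {sentence}"
```

For example, with a small medical phrase table containing `"heart attack": "Herzinfarkt"`, the sentence "The patient had a heart attack." would be passed to the translation model with that pair prepended, and a sentence with no matches would be passed through unchanged.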

1.2 · SY · Aug 26, 2022

Battery and Hydrogen Energy Storage Control in a Smart Energy Network with Flexible Energy Demand using Deep Reinforcement Learning

Cephas Samende, Zhong Fan, Jun Cao

Smart energy networks provide an effective means to accommodate high penetrations of variable renewable energy sources like solar and wind, which are key for deep decarbonisation of energy production. However, given the variability of the renewables as well as the energy demand, it is imperative to develop effective control and energy storage schemes to manage the variable energy generation and achieve desired system economics and environmental goals. In this paper, we introduce a hybrid energy storage system composed of battery and hydrogen energy storage to handle the uncertainties related to electricity prices, renewable energy production and consumption. We aim to improve renewable energy utilisation and minimise energy costs and carbon emissions while ensuring energy reliability and stability within the network. To achieve this, we propose a multi-agent deep deterministic policy gradient approach, which is a deep reinforcement learning-based control strategy to optimise the scheduling of the hybrid energy storage system and energy demand in real time. The proposed approach is model-free and does not require explicit knowledge and rigorous mathematical models of the smart energy network environment. Simulation results based on real-world data show that: (i) integration and optimised operation of the hybrid energy storage system and energy demand reduces carbon emissions by 78.69%, improves cost savings by 23.5% and renewable energy utilisation by over 13.2% compared to other baseline models and (ii) the proposed algorithm outperforms state-of-the-art self-learning algorithms such as the deep Q-network.

6.6 · MA · Nov 21, 2021

Renewable energy integration and microgrid energy trading using multi-agent deep reinforcement learning

Daniel J. B. Harrold, Jun Cao, Zhong Fan

In this paper, multi-agent reinforcement learning is used to control a hybrid energy storage system working collaboratively to reduce the energy costs of a microgrid through maximising the value of renewable energy and trading. The agents must learn to control three different types of energy storage system suited for short, medium, and long-term storage under fluctuating demand, dynamic wholesale energy prices, and unpredictable renewable energy generation. Two case studies are considered: the first looking at how the energy storage systems can better integrate renewable energy generation under dynamic pricing, and the second at how those same agents can be used alongside an aggregator agent to sell energy to self-interested external microgrids looking to reduce their own energy bills. This work found that the centralised learning with decentralised execution of the multi-agent deep deterministic policy gradient and its state-of-the-art variants allowed the multi-agent methods to perform significantly better than the control from a single global agent. It was also found that using separate reward functions in the multi-agent approach performed much better than using a single control agent. Being able to trade with the other microgrids, rather than just selling back to the utility grid, was also found to greatly increase the grid's savings.