LG Feb 9
Trapped by simplicity: When Transformers fail to learn from noisy features
Evan Peters, Ando Deng, Matheus H. Zambianco et al.
Noise is ubiquitous in data used to train large language models, but it is not well understood whether these models are able to correctly generalize to inputs generated without noise. Here, we study noise-robust learning: are transformers trained on data with noisy features able to find a target function that correctly predicts labels for noiseless features? We show that transformers succeed at noise-robust learning for a selection of $k$-sparse parity and majority functions, whereas LSTMs fail at this task for even modest feature noise. However, we find that transformers typically fail at noise-robust learning of random $k$-juntas, especially when the boolean sensitivity of the optimal solution is smaller than that of the target function. We argue that this failure is due to a combination of two factors: transformers' bias toward simpler functions, and the observation that, for random boolean functions, the optimal function for noise-robust learning typically has lower sensitivity than the target function. We test this hypothesis by exploiting transformers' simplicity bias to trap them in an incorrect solution, but show that transformers can escape this trap by training with an additional loss term penalizing high-sensitivity solutions. Overall, we find that transformers are particularly ineffective for learning boolean functions in the presence of feature noise.
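To make the sensitivity argument concrete, here is a minimal brute-force sketch (not the authors' code). It assumes uniform inputs and independent bit flips with probability `p`, computes the average sensitivity of a small boolean function, and builds the predictor that best guesses the clean label from noisy features. The function names and the choice of a 2-bit AND as the target are illustrative assumptions.

```python
import itertools

def average_sensitivity(f, k):
    """Mean number of single-bit flips that change f, averaged over all 2^k inputs."""
    total = 0
    for x in itertools.product((0, 1), repeat=k):
        for i in range(k):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            total += f(x) != f(flipped)
    return total / 2**k

def noise_optimal(f, k, p):
    """Best predictor of f(x) given noisy features z.

    Assumes x is uniform on {0,1}^k and each bit of z is an independent
    flip of the matching bit of x with probability p (illustrative noise model).
    """
    inputs = list(itertools.product((0, 1), repeat=k))
    table = {}
    for z in inputs:
        score = 0.0  # posterior weight for label 1 minus weight for label 0
        for x in inputs:
            d = sum(a != b for a, b in zip(x, z))  # Hamming distance
            w = p**d * (1 - p)**(k - d)            # P(z | x), proportional to P(x | z)
            score += w if f(x) else -w
        table[z] = int(score > 0)
    return lambda z: table[z]

def parity(x):
    return sum(x) % 2

def and2(x):
    return int(x[0] and x[1])

# Parity stays optimal under noise; AND collapses to a lower-sensitivity function.
parity_opt = noise_optimal(parity, 3, 0.1)
and_opt = noise_optimal(and2, 2, 0.3)
print(average_sensitivity(parity, 3), average_sensitivity(parity_opt, 3))
print(average_sensitivity(and2, 2), average_sensitivity(and_opt, 2))
```

In this toy setting the noise-optimal predictor for parity is parity itself, while for the 2-bit AND at `p = 0.3` it is the constant-zero function, which has strictly lower sensitivity than the target. This mirrors, in miniature, the gap between the optimal and target functions that the abstract describes.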
QUANT-PH Jun 8, 2020
Learning to Utilize Correlated Auxiliary Noise: A Possible Quantum Advantage
Aida Ahmadzadegan, Petar Simidzija, Ming Li et al.
This paper has two messages. First, we demonstrate that neural networks that process noisy data can learn to exploit, when available, access to auxiliary noise that is correlated with the noise on the data. In effect, the network learns to use the correlated auxiliary noise as an approximate key to decipher its noisy input data. Second, we show that, for this task, the scaling behavior with increasing noise is such that future quantum machines could possess an advantage. In particular, decoherence generates correlated auxiliary noise in the environment. The new approach could, therefore, help enable future quantum machines by providing machine-learned quantum error correction.
CLASS-PH Nov 5, 2019
Random number generation & distribution out of thin (or thick) air
Nicholas Bornman, Andrew Forbes, Achim Kempf
Much scientific work has focused on the generation of random numbers as well as the distribution of said random numbers for use as a cryptographic key. However, emphasis is often placed on one of the two to the exclusion of the other, even though both are often important simultaneously. Here we present a simple hybrid free-space link scheme for both the generation and secure distribution of (pseudo-)random numbers between two remote parties, drawing the randomness from the stochastic nature of atmospheric turbulence. The atmosphere is simulated using digital micro-mirror devices for efficient, all-digital control. After outlining one potential algorithm for extracting random numbers based on finding the centre-of-mass (COM) of turbulent beam intensity profiles, the statistics of our experimental COM measurements are studied and found to agree well with the literature. After implementing the scheme in the laboratory, Alice and Bob are able to establish a string of correlated random bits with an 84% fidelity. Finally, we make a simple modification to the original setup in an attempt to thwart the hacking attempts of an eavesdropper, Eve, who has access to the free-space portion of the link. We find that the fidelity between Eve's key and that of Alice/Bob is 54%, only slightly above the theoretical minimum. Atmospheric turbulence could hence be leveraged as an added security measure, rather than being seen as a drawback.
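The abstract does not spell out the extraction algorithm, so the following is only a hedged sketch of the general idea. It computes the intensity-weighted centre of mass of a beam image, turns a sequence of COM positions into bits by thresholding, and compares two keys. The thresholding rule and the reading of "fidelity" as the fraction of matching bits are assumptions made for illustration, not details taken from the paper.

```python
def centre_of_mass(image):
    """Intensity-weighted centroid (x, y) of a 2D intensity profile given as a list of rows."""
    total = sum(sum(row) for row in image)
    cx = sum(v * j for row in image for j, v in enumerate(row)) / total
    cy = sum(v * i for i, row in enumerate(image) for v in row) / total
    return cx, cy

def bits_from_com(positions, threshold):
    """Hypothetical rule: emit 1 when the COM coordinate lies above the threshold, else 0."""
    return [int(p > threshold) for p in positions]

def fidelity(key_a, key_b):
    """Fraction of positions where two equal-length bit strings agree (assumed definition)."""
    return sum(a == b for a, b in zip(key_a, key_b)) / len(key_a)

# Toy frames: a bright pixel wandering left and right of centre.
frames = [
    [[0, 0, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 1], [0, 0, 0]],
]
xs = [centre_of_mass(f)[0] for f in frames]
alice = bits_from_com(xs, threshold=1.0)
```

Under this reading, two independent unbiased keys would agree on about half of their bits, which is consistent with the abstract calling 54% "only slightly above the theoretical minimum".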