Aakash Sarkar

6.5LGJul 9, 2021Code

A deep convolutional neural network that is invariant to time rescaling

Brandon G. Jacques, Zoran Tiganj, Aakash Sarkar et al.

Human learners can readily understand speech, or a melody, when it is presented slower or faster than usual. Although deep convolutional neural networks (CNNs) are extremely powerful in extracting information from time series, they require explicit training to generalize to different time scales. This paper presents a deep CNN that incorporates a temporal representation inspired by recent findings from neuroscience. In the mammalian brain, time is represented by populations of neurons with temporal receptive fields. Critically, the peaks of the receptive fields form a geometric series, such that the population codes a set of temporal basis functions over log time. Because memory for the recent past is a function of log time, rescaling the input results in translation of the memory. The Scale-Invariant Temporal History Convolution network (SITHCon) builds a convolutional layer over this logarithmically-distributed temporal memory. A max-pool operation results in a network that is invariant to rescalings of time modulo edge effects. We compare performance of SITHCon to a Temporal Convolution Network (TCN). Although both networks can learn classification and regression problems on both univariate and multivariate time series f(t), only SITHCon generalizes to rescalings f(at). This property, inspired by findings from contemporary neuroscience and consistent with findings from cognitive psychology, may enable networks that learn with fewer training examples, fewer weights and that generalize more robustly to out of sample data.

0.3CLDec 16, 2019

Scale-dependent Relationships in Natural Language

Aakash Sarkar, Marc Howard

Natural language exhibits statistical dependencies at a wide range of scales. For instance, the mutual information between words in natural language decays like a power law with the temporal lag between them. However, many statistical learning models applied to language impose a sampling scale while extracting statistical structure. For instance, Word2Vec constructs a vector embedding that maximizes the prediction between a target word and the context words that appear nearby in the corpus. The size of the context is chosen by the user and defines a strong scale; relationships over much larger temporal scales would be invisible to the algorithm. This paper examines the family of Word2Vec embeddings generated while systematically manipulating the sampling scale used to define the context around each word. The primary result is that different linguistic relationships are preferentially encoded at different scales. Different scales emphasize different syntactic and semantic relations between words.Moreover, the neighborhoods of a given word in the embeddings change significantly depending on the scale. These results suggest that any individual scale can only identify a subset of the meaningful relationships a word might have, and point toward the importance of developing scale-free models of semantic meaning.

Aakash Sarkar

2 Papers