LGFeb 2
Self-Supervised Learning from Structural InvarianceYipeng Zhang, Hafez Ghaemi, Jungyoon Lee et al.
Joint-embedding self-supervised learning (SSL), the key paradigm for unsupervised representation learning from visual data, learns from invariances between semantically-related data pairs. We study the one-to-many mapping problem in SSL, where each datum may be mapped to multiple valid targets. This arises when data pairs come from naturally occurring generative processes, e.g., successive video frames. We show that existing methods struggle to flexibly capture this conditional uncertainty. As a remedy, we introduce a latent variable to account for this uncertainty and derive a variational lower bound on the mutual information between paired embeddings. Our derivation yields a simple regularization term for standard SSL objectives. The resulting method, which we call AdaSSL, applies to both contrastive and distillation-based SSL objectives, and we empirically show its versatility in causal representation learning, fine-grained image understanding, and world modeling on videos.
LGFeb 8, 2024
Risk-Sensitive Multi-Agent Reinforcement Learning in Network Aggregative Markov GamesHafez Ghaemi, Hamed Kebriaei, Alireza Ramezani Moghaddam et al.
Classical multi-agent reinforcement learning (MARL) assumes risk neutrality and complete objectivity for agents. However, in settings where agents need to consider or model human economic or social preferences, a notion of risk must be incorporated into the RL optimization problem. This will be of greater importance in MARL where other human or non-human agents are involved, possibly with their own risk-sensitive policies. In this work, we consider risk-sensitive and non-cooperative MARL with cumulative prospect theory (CPT), a non-convex risk measure and a generalization of coherent measures of risk. CPT is capable of explaining loss aversion in humans and their tendency to overestimate/underestimate small/large probabilities. We propose a distributed sampling-based actor-critic (AC) algorithm with CPT risk for network aggregative Markov games (NAMGs), which we call Distributed Nested CPT-AC. Under a set of assumptions, we prove the convergence of the algorithm to a subjective notion of Markov perfect Nash equilibrium in NAMGs. The experimental results show that subjective CPT policies obtained by our algorithm can be different from the risk-neutral ones, and agents with a higher loss aversion are more inclined to socially isolate themselves in an NAMG.
CVMay 6, 2025
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World ModelsHafez Ghaemi, Eilif Muller, Shahab Bakhtiari
Current self-supervised algorithms commonly rely on transformations such as data augmentation and masking to learn visual representations. This is achieved by enforcing invariance or equivariance with respect to these transformations after encoding two views of an image. This dominant two-view paradigm often limits the flexibility of learned representations for downstream adaptation by creating performance trade-offs between high-level invariance-demanding tasks such as image classification and more fine-grained equivariance-related tasks. In this work, we proposes \emph{seq-JEPA}, a world modeling framework that introduces architectural inductive biases into joint-embedding predictive architectures to resolve this trade-off. Without relying on dual equivariance predictors or loss terms, seq-JEPA simultaneously learns two architecturally segregated representations: one equivariant to specified transformations and another invariant to them. To do so, our model processes short sequences of different views (observations) of inputs. Each encoded view is concatenated with an embedding of the relative transformation (action) that produces the next observation in the sequence. These view-action pairs are passed through a transformer encoder that outputs an aggregate representation. A predictor head then conditions this aggregate representation on the upcoming action to predict the representation of the next observation. Empirically, seq-JEPA demonstrates strong performance on both equivariant and invariant benchmarks without sacrificing one for the other. Furthermore, it excels at tasks that inherently require aggregating a sequence of observations, such as path integration across actions and predictive learning across eye movements.
GTJun 10, 2024
Risk Sensitivity in Markov Games and Multi-Agent Reinforcement Learning: A Systematic ReviewHafez Ghaemi, Shirin Jamshidi, Mohammad Mashreghi et al.
Markov games (MGs) and multi-agent reinforcement learning (MARL) are studied to model decision making in multi-agent systems. Traditionally, the objective in MG and MARL has been risk-neutral, i.e., agents are assumed to optimize a performance metric such as expected return, without taking into account subjective or cognitive preferences of themselves or of other agents. However, ignoring such preferences leads to inaccurate models of decision making in many real-world scenarios in finance, operations research, and behavioral economics. Therefore, when these preferences are present, it is necessary to incorporate a suitable measure of risk into the optimization objective of agents, which opens the door to risk-sensitive MG and MARL. In this paper, we systemically review the literature on risk sensitivity in MG and MARL that has been growing in recent years alongside other areas of reinforcement learning and game theory. We define and mathematically describe different risk measures used in MG and MARL and individually for each measure, discuss articles that incorporate it. Finally, we identify recent trends in theoretical and applied works in the field and discuss possible directions of future research.
NESep 12, 2021
BioLCNet: Reward-modulated Locally Connected Spiking Neural NetworksHafez Ghaemi, Erfan Mirzaei, Mahbod Nouri et al.
Brain-inspired computation and information processing alongside compatibility with neuromorphic hardware have made spiking neural networks (SNN) a promising method for solving learning tasks in machine learning (ML). Spiking neurons are only one of the requirements for building a bio-plausible learning model. Network architecture and learning rules are other important factors to consider when developing such artificial agents. In this work, inspired by the human visual pathway and the role of dopamine in learning, we propose a reward-modulated locally connected spiking neural network, BioLCNet, for visual learning tasks. To extract visual features from Poisson-distributed spike trains, we used local filters that are more analogous to the biological visual system compared to convolutional filters with weight sharing. In the decoding layer, we applied a spike population-based voting scheme to determine the decision of the network. We employed Spike-timing-dependent plasticity (STDP) for learning the visual features, and its reward-modulated variant (R-STDP) for training the decoder based on the reward or punishment feedback signal. For evaluation, we first assessed the robustness of our rewarding mechanism to varying target responses in a classical conditioning experiment. Afterwards, we evaluated the performance of our network on image classification tasks of MNIST and XOR MNIST datasets.
LGDec 25, 2020
Towards Real-World BCI: CCSPNet, A Compact Subject-Independent Motor Imagery FrameworkMahbod Nouri, Faraz Moradi, Hafez Ghaemi et al.
A conventional brain-computer interface (BCI) requires a complete data gathering, training, and calibration phase for each user before it can be used. In recent years, a number of subject-independent (SI) BCIs have been developed. Many of these methods yield a weaker performance compared to the subject-dependent (SD) approach, and some are computationally expensive. A potential real-world application would greatly benefit from a more accurate, compact, and computationally efficient subject-independent BCI. In this work, we propose a novel subject-independent BCI framework, named CCSPNet (Convolutional Common Spatial Pattern Network) that is trained on the motor imagery (MI) paradigm of a large-scale electroencephalography (EEG) signals database consisting of 400 trials for every 54 subjects who perform two-class hand-movement MI tasks. The proposed framework applies a wavelet kernel convolutional neural network (WKCNN) and a temporal convolutional neural network (TCNN) in order to represent and extract the spectral features of EEG signals. A common spatial pattern (CSP) algorithm is implemented for spatial feature extraction, and the number of CSP features is reduced by a dense neural network. Finally, the class label is determined by a linear discriminant analysis (LDA) classifier. The CCSPNet evaluation results show that it is possible to have a compact BCI that achieves both SD and SI state-of-the-art performance comparable to complex and computationally expensive models.