Hyeryung Jang

LG
h-index: 1
23 papers
300 citations
Novelty: 40%
AI Score: 45

23 Papers

NE Aug 29, 2022
Bayesian Continual Learning via Spiking Neural Networks

Nicolas Skatchkovsky, Hyeryung Jang, Osvaldo Simeone

Among the main features of biological intelligence are energy efficiency, capacity for continual adaptation, and risk management via uncertainty quantification. Neuromorphic engineering has been thus far mostly driven by the goal of implementing energy-efficient machines that take inspiration from the time-based computing paradigm of biological brains. In this paper, we take steps towards the design of neuromorphic systems that are capable of adaptation to changing learning tasks, while producing well-calibrated uncertainty quantification estimates. To this end, we derive online learning rules for spiking neural networks (SNNs) within a Bayesian continual learning framework. In it, each synaptic weight is represented by parameters that quantify the current epistemic uncertainty resulting from prior knowledge and observed data. The proposed online rules update the distribution parameters in a streaming fashion as data are observed. We instantiate the proposed approach for both real-valued and binary synaptic weights. Experimental results using Intel's Lava platform show the merits of Bayesian over frequentist learning in terms of capacity for adaptation and uncertainty quantification.

18.1 CV Apr 14
MAST: Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer

Dongkyung Kang, Jaeyeon Hwang, Junseo Park et al.

Style transfer aims to render a content image with the visual characteristics of a reference style while preserving its underlying semantic layout and structural geometry. While recent diffusion-based models demonstrate strong stylization capabilities by leveraging powerful generative priors and controllable internal representations, they typically assume a single global style. Extending them to multi-style scenarios often leads to boundary artifacts, unstable stylization, and structural inconsistency due to interference between multiple style representations. To overcome these limitations, we propose MAST (Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer), a novel training-free framework that explicitly controls content-style interactions within the diffusion attention mechanism. To achieve artifact-free and structure-preserving stylization, MAST integrates four connected modules. First, Layout-preserving Query Anchoring prevents global layout collapse by firmly anchoring the semantic structure using content queries. Second, Logit-level Attention Mass Allocation deterministically distributes attention probability mass across spatial regions, seamlessly fusing multiple styles without boundary artifacts. Third, Sharpness-aware Temperature Scaling restores the attention sharpness degraded by multi-style expansion. Finally, Discrepancy-aware Detail Injection adaptively compensates for localized high-frequency detail losses by measuring structural discrepancies. Extensive experiments demonstrate that MAST effectively mitigates boundary artifacts and maintains structural consistency, preserving texture fidelity and spatial coherence even as the number of applied styles increases.

CV Jul 17, 2024
I2AM: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps

Junseo Park, Hyeryung Jang

Large-scale diffusion models have made significant advances in image generation, particularly through cross-attention mechanisms. While cross-attention has been well-studied in text-to-image tasks, its interpretability in image-to-image (I2I) diffusion models remains underexplored. This paper introduces Image-to-Image Attribution Maps (I2AM), a method that enhances the interpretability of I2I models by visualizing bidirectional attribution maps, from the reference image to the generated image and vice versa. I2AM aggregates cross-attention scores across time steps, attention heads, and layers, offering insights into how critical features are transferred between images. We demonstrate the effectiveness of I2AM across object detection, inpainting, and super-resolution tasks. Our results demonstrate that I2AM successfully identifies key regions responsible for generating the output, even in complex scenes. Additionally, we introduce the Inpainting Mask Attention Consistency Score (IMACS) as a novel evaluation metric to assess the alignment between attribution maps and inpainting masks, which correlates strongly with existing performance metrics. Through extensive experiments, we show that I2AM enables model debugging and refinement, providing practical tools for improving I2I models' performance and interpretability.
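
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract only states that scores are aggregated across time steps, layers, and heads; the plain unweighted mean, the nesting order `maps[step][layer][head]`, and the function name `aggregate_attention` are assumptions made here for illustration, not the authors' implementation.

```python
def aggregate_attention(maps):
    """Average attention maps over time steps, layers and heads.

    maps[step][layer][head] is a flat list of per-location scores
    (hypothetical layout). Returns one averaged map of the same length.
    """
    total = None
    count = 0
    for step in maps:
        for layer in step:
            for head in layer:
                if total is None:
                    total = [0.0] * len(head)
                for i, score in enumerate(head):
                    total[i] += score
                count += 1
    return [value / count for value in total]


# Two time steps, one layer, two heads, two spatial locations.
example = [
    [[[0.2, 0.8], [0.4, 0.6]]],
    [[[0.0, 1.0], [0.6, 0.4]]],
]
attribution = aggregate_attention(example)
```

A weighted average (for example, emphasizing later time steps) would fit the same interface; the uniform mean is simply the least committal choice.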

CV Sep 30, 2025
ART-VITON: Measurement-Guided Latent Diffusion for Artifact-Free Virtual Try-On

Junseo Park, Hyeryung Jang

Virtual try-on (VITON) aims to generate realistic images of a person wearing a target garment, requiring precise garment alignment in try-on regions and faithful preservation of identity and background in non-try-on regions. While latent diffusion models (LDMs) have advanced alignment and detail synthesis, preserving non-try-on regions remains challenging. A common post-hoc strategy directly replaces these regions with original content, but abrupt transitions often produce boundary artifacts. To overcome this, we reformulate VITON as a linear inverse problem and adopt trajectory-aligned solvers that progressively enforce measurement consistency, reducing abrupt changes in non-try-on regions. However, existing solvers still suffer from semantic drift during generation, leading to artifacts. We propose ART-VITON, a measurement-guided diffusion framework that ensures measurement adherence while maintaining artifact-free synthesis. Our method integrates residual prior-based initialization to mitigate training-inference mismatch and artifact-free measurement-guided sampling that combines data consistency, frequency-level correction, and periodic standard denoising. Experiments on VITON-HD, DressCode, and SHHQ-1.0 demonstrate that ART-VITON effectively preserves identity and background, eliminates boundary artifacts, and consistently improves visual fidelity and robustness over state-of-the-art baselines.

CV Jun 20, 2025
PQCAD-DM: Progressive Quantization and Calibration-Assisted Distillation for Extremely Efficient Diffusion Model

Beomseok Ko, Hyeryung Jang

Diffusion models excel in image generation but are computationally and resource-intensive due to their reliance on iterative Markov chain processes, leading to error accumulation and limiting the effectiveness of naive compression techniques. In this paper, we propose PQCAD-DM, a novel hybrid compression framework combining Progressive Quantization (PQ) and Calibration-Assisted Distillation (CAD) to address these challenges. PQ employs a two-stage quantization with adaptive bit-width transitions guided by a momentum-based mechanism, reducing excessive weight perturbations at low precision. CAD leverages full-precision calibration datasets during distillation, enabling the student to match full-precision performance even with a quantized teacher. As a result, PQCAD-DM achieves a balance between computational efficiency and generative quality, halving inference time while maintaining competitive performance. Extensive experiments validate PQCAD-DM's superior generative capabilities and efficiency across diverse datasets, outperforming fixed-bit quantization methods.

CV Apr 8, 2024
StyleForge: Enhancing Text-to-Image Synthesis for Any Artistic Styles with Dual Binding

Junseo Park, Beomseok Ko, Hyeryung Jang

Recent advancements in text-to-image models, such as Stable Diffusion, have showcased their ability to create visual images from natural language prompts. However, existing methods like DreamBooth struggle with capturing arbitrary art styles due to the abstract and multifaceted nature of stylistic attributes. We introduce Single-StyleForge, a novel approach for personalized text-to-image synthesis across diverse artistic styles. Using approximately 15 to 20 images of the target style, Single-StyleForge establishes a foundational binding of a unique token identifier with a broad range of attributes of the target style. Additionally, auxiliary images are incorporated for dual binding that guides the consistent representation of crucial elements such as people within the target style. Furthermore, we present Multi-StyleForge, which enhances image quality and text alignment by binding multiple tokens to partial style attributes. Experimental evaluations across six distinct artistic styles demonstrate significant improvements in image quality and perceptual fidelity, as measured by FID, KID, and CLIP scores.

NE Jun 2, 2021
Learning to Time-Decode in Spiking Neural Networks Through the Information Bottleneck

Nicolas Skatchkovsky, Osvaldo Simeone, Hyeryung Jang

One of the key challenges in training Spiking Neural Networks (SNNs) is that target outputs typically come in the form of natural signals, such as labels for classification or images for generative models, and need to be encoded into spikes. This is done by handcrafting target spiking signals, which in turn implicitly fixes the mechanisms used to decode spikes into natural signals, e.g., rate decoding. The arbitrary choice of target signals and decoding rule generally impairs the capacity of the SNN to encode and process information in the timing of spikes. To address this problem, this work introduces a hybrid variational autoencoder architecture, consisting of an encoding SNN and a decoding Artificial Neural Network (ANN). The role of the decoding ANN is to learn how to best convert the spiking signals output by the SNN into the target natural signal. A novel end-to-end learning rule is introduced that optimizes a directed information bottleneck training criterion via surrogate gradients. We demonstrate the applicability of the technique in experimental settings on various tasks, including real-life datasets.

LG Feb 5, 2021
Multi-Sample Online Learning for Spiking Neural Networks based on Generalized Expectation Maximization

Hyeryung Jang, Osvaldo Simeone

Spiking Neural Networks (SNNs) offer a novel computational paradigm that captures some of the efficiency of biological brains by processing through binary neural dynamic activations. Probabilistic SNN models are typically trained to maximize the likelihood of the desired outputs by using unbiased estimates of the log-likelihood gradients. While prior work used single-sample estimators obtained from a single run of the network, this paper proposes to leverage multiple compartments that sample independent spiking signals while sharing synaptic weights. The key idea is to use these signals to obtain more accurate statistical estimates of the log-likelihood training criterion, as well as of its gradient. The approach is based on generalized expectation-maximization (GEM), which optimizes a tighter approximation of the log-likelihood using importance sampling. The derived online learning algorithm implements a three-factor rule with global per-compartment learning signals. Experimental results on a classification task on the neuromorphic MNIST-DVS data set demonstrate significant improvements in terms of log-likelihood, accuracy, and calibration when increasing the number of compartments used for training and inference.
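
As a rough illustration of why several samples can tighten a likelihood estimate, the sketch below computes the standard multi-sample importance-weighted quantity, the log of the average importance weight, from per-sample log-weights. This is the generic estimator, assumed here for illustration; it is not the paper's GEM-based learning rule, and the name `multi_sample_bound` is hypothetical.

```python
import math


def multi_sample_bound(log_weights):
    """Return log((1/K) * sum_k exp(log_w_k)) in a numerically stable way."""
    peak = max(log_weights)
    mean_weight = sum(math.exp(w - peak) for w in log_weights) / len(log_weights)
    return peak + math.log(mean_weight)


# Two hypothetical samples with importance weights 1 and 3.
log_w = [0.0, math.log(3.0)]
multi = multi_sample_bound(log_w)        # log of the mean weight
single = sum(log_w) / len(log_w)         # average of single-sample terms
```

By Jensen's inequality the multi-sample value is never smaller than the average of the single-sample terms, which is the sense in which using more samples gives a tighter approximation.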

LG Dec 15, 2020
BiSNN: Training Spiking Neural Networks with Binary Weights via Bayesian Learning

Hyeryung Jang, Nicolas Skatchkovsky, Osvaldo Simeone

Artificial Neural Network (ANN)-based inference on battery-powered devices can be made more energy-efficient by restricting the synaptic weights to be binary, hence eliminating the need to perform multiplications. An alternative, emerging, approach relies on the use of Spiking Neural Networks (SNNs), biologically inspired, dynamic, event-driven models that enhance energy efficiency via the use of binary, sparse, activations. In this paper, an SNN model is introduced that combines the benefits of temporally sparse binary activations and of binary weights. Two learning rules are derived, the first based on the combination of straight-through and surrogate gradient techniques, and the second based on a Bayesian paradigm. Experiments validate the performance loss with respect to full-precision implementations, and demonstrate the advantage of the Bayesian paradigm in terms of accuracy and calibration.
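
The claim that binary weights remove the need for multiplications can be seen in a small sketch: with weights restricted to +1 or -1, a weighted sum reduces to additions and subtractions. The function name `binary_dot` and the sign encoding are assumptions for illustration only.

```python
def binary_dot(inputs, weight_signs):
    """Weighted sum with weights in {+1, -1}, using only add and subtract."""
    total = 0.0
    for x, sign in zip(inputs, weight_signs):
        # A +1 weight adds the input; a -1 weight subtracts it.
        total += x if sign > 0 else -x
    return total


result = binary_dot([1.0, 2.0, 3.0], [1, -1, 1])
```

When the inputs are themselves binary spikes, as in an SNN, each term is either skipped or contributes a fixed increment, which is where the combined savings described in the abstract come from.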

NE Oct 27, 2020
Spiking Neural Networks -- Part III: Neuromorphic Communications

Nicolas Skatchkovsky, Hyeryung Jang, Osvaldo Simeone

Synergies between wireless communications and artificial intelligence are increasingly motivating research at the intersection of the two fields. On the one hand, the presence of more and more wirelessly connected devices, each with its own data, is driving efforts to export advances in machine learning (ML) from high performance computing facilities, where information is stored and processed in a single location, to distributed, privacy-minded, processing at the end user. On the other hand, ML can address algorithm and model deficits in the optimization of communication protocols. However, implementing ML models for learning and inference on battery-powered devices that are connected via bandwidth-constrained channels remains challenging. This paper explores two ways in which Spiking Neural Networks (SNNs) can help address these open problems. First, we discuss federated learning for the distributed training of SNNs, and then describe the integration of neuromorphic sensing, SNNs, and impulse radio technologies for low-power remote inference.

NE Oct 27, 2020
Spiking Neural Networks -- Part II: Detecting Spatio-Temporal Patterns

Nicolas Skatchkovsky, Hyeryung Jang, Osvaldo Simeone

Inspired by the operation of biological brains, Spiking Neural Networks (SNNs) have the unique ability to detect information encoded in spatio-temporal patterns of spiking signals. Examples of data types requiring spatio-temporal processing include logs of time stamps, e.g., of tweets, and outputs of neural prostheses and neuromorphic sensors. In this paper, the second of a series of three review papers on SNNs, we first review models and training algorithms for the dominant approach that considers SNNs as a Recurrent Neural Network (RNN) and adapts learning rules based on backpropagation through time to the requirements of SNNs. In order to tackle the non-differentiability of the spiking mechanism, state-of-the-art solutions use surrogate gradients that approximate the threshold activation function with a differentiable function. Then, we describe an alternative approach that relies on probabilistic models for spiking neurons, allowing the derivation of local learning rules via stochastic estimates of the gradient. Finally, experiments are provided for neuromorphic data sets, yielding insights on accuracy and convergence under different SNN models.
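
The surrogate-gradient idea mentioned above can be sketched as follows: the forward pass keeps the hard threshold, while the backward pass uses the derivative of a smooth stand-in. The sigmoid surrogate, the `slope` parameter, and the function names are illustrative assumptions; the review covers several surrogate choices.

```python
import math


def spike(potential, threshold=1.0):
    """Forward pass: hard threshold, which has zero gradient almost everywhere."""
    return 1.0 if potential >= threshold else 0.0


def surrogate_grad(potential, threshold=1.0, slope=5.0):
    """Backward pass: derivative of a sigmoid centred on the threshold."""
    s = 1.0 / (1.0 + math.exp(-slope * (potential - threshold)))
    return slope * s * (1.0 - s)


at_threshold = surrogate_grad(1.0)   # largest value, slope / 4
far_above = surrogate_grad(3.0)      # small, since the neuron is far from threshold
```

The surrogate is largest near the threshold, so weight updates concentrate on neurons whose membrane potential was close to firing.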

NE Oct 27, 2020
Spiking Neural Networks -- Part I: Detecting Spatial Patterns

Hyeryung Jang, Nicolas Skatchkovsky, Osvaldo Simeone

Spiking Neural Networks (SNNs) are biologically inspired machine learning models that build on dynamic neuronal models processing binary and sparse spiking signals in an event-driven, online, fashion. SNNs can be implemented on neuromorphic computing platforms that are emerging as energy-efficient co-processors for learning and inference. This is the first of a series of three papers that introduce SNNs to an audience of engineers by focusing on models, algorithms, and applications. In this first paper, we first cover neural models used for conventional Artificial Neural Networks (ANNs) and SNNs. Then, we review learning algorithms and applications for SNNs that aim at mimicking the functionality of ANNs by detecting or generating spatial patterns in rate-encoded spiking signals. We specifically discuss ANN-to-SNN conversion and neural sampling. Finally, we validate the capabilities of SNNs for detecting and generating spatial patterns through experiments.

NE Sep 3, 2020
End-to-End Learning of Neuromorphic Wireless Systems for Low-Power Edge Artificial Intelligence

Nicolas Skatchkovsky, Hyeryung Jang, Osvaldo Simeone

This paper introduces a novel "all-spike" low-power solution for remote wireless inference that is based on neuromorphic sensing, Impulse Radio (IR), and Spiking Neural Networks (SNNs). In the proposed system, event-driven neuromorphic sensors produce asynchronous time-encoded data streams that are encoded by an SNN, whose output spiking signals are pulse modulated via IR and transmitted over general frequency-selective channels; while the receiver's inputs are obtained via hard detection of the received signals and fed to an SNN for classification. We introduce an end-to-end training procedure that treats the cascade of encoder, channel, and decoder as a probabilistic SNN-based autoencoder that implements Joint Source-Channel Coding (JSCC). The proposed system, termed NeuroJSCC, is compared to conventional synchronous frame-based and uncoded transmissions in terms of latency and accuracy. The experiments confirm that the proposed end-to-end neuromorphic edge architecture provides a promising framework for efficient and low-latency remote sensing, communication, and inference.

LG Jul 23, 2020
Multi-Sample Online Learning for Probabilistic Spiking Neural Networks

Hyeryung Jang, Osvaldo Simeone

Spiking Neural Networks (SNNs) capture some of the efficiency of biological brains for inference and learning via the dynamic, online, event-driven processing of binary time series. Most existing learning algorithms for SNNs are based on deterministic neuronal models, such as leaky integrate-and-fire, and rely on heuristic approximations of backpropagation through time that enforce constraints such as locality. In contrast, probabilistic SNN models can be trained directly via principled online, local, update rules that have proven to be particularly effective for resource-constrained systems. This paper investigates another advantage of probabilistic SNNs, namely their capacity to generate independent outputs when queried over the same input. It is shown that the multiple generated output samples can be used during inference to robustify decisions and to quantify uncertainty -- a feature that deterministic SNN models cannot provide. Furthermore, they can be leveraged for training in order to obtain more accurate statistical estimates of the log-loss training criterion, as well as of its gradient. Specifically, this paper introduces an online learning rule based on generalized expectation-maximization (GEM) that follows a three-factor form with global learning signals and is referred to as GEM-SNN. Experimental results on structured output memorization and classification on a standard neuromorphic data set demonstrate significant improvements in terms of log-likelihood, accuracy, and calibration when increasing the number of samples used for inference and training.

LG Apr 20, 2020
VOWEL: A Local Online Learning Rule for Recurrent Networks of Probabilistic Spiking Winner-Take-All Circuits

Hyeryung Jang, Nicolas Skatchkovsky, Osvaldo Simeone

Networks of spiking neurons and Winner-Take-All spiking circuits (WTA-SNNs) can detect information encoded in spatio-temporal multi-valued events. These are described by the timing of events of interest, e.g., clicks, as well as by categorical numerical values assigned to each event, e.g., like or dislike. Other use cases include object recognition from data collected by neuromorphic cameras, which produce, for each pixel, signed bits at the times of sufficiently large brightness variations. Existing schemes for training WTA-SNNs are limited to rate-encoding solutions, and are hence able to detect only spatial patterns. Developing more general training algorithms for arbitrary WTA-SNNs inherits the challenges of training (binary) Spiking Neural Networks (SNNs). These amount, most notably, to the non-differentiability of threshold functions, to the recurrent behavior of spiking neural models, and to the difficulty of implementing backpropagation in neuromorphic hardware. In this paper, we develop a variational online local training rule for WTA-SNNs, referred to as VOWEL, that leverages only local pre- and post-synaptic information for visible circuits, and an additional common reward signal for hidden circuits. The method is based on probabilistic generalized linear neural models, control variates, and variational regularization. Experimental results on real-world neuromorphic datasets with multi-valued events demonstrate the advantages of WTA-SNNs over conventional binary SNNs trained with state-of-the-art methods, especially in the presence of limited computing resources.

LG Oct 21, 2019
Federated Neuromorphic Learning of Spiking Neural Networks for Low-Power Edge Intelligence

Nicolas Skatchkovsky, Hyeryung Jang, Osvaldo Simeone

Spiking Neural Networks (SNNs) offer a promising alternative to conventional Artificial Neural Networks (ANNs) for the implementation of on-device low-power online learning and inference. On-device training is, however, constrained by the limited amount of data available at each device. In this paper, we propose to mitigate this problem via cooperative training through Federated Learning (FL). To this end, we introduce an online FL-based learning rule for networked on-device SNNs, which we refer to as FL-SNN. FL-SNN leverages local feedback signals within each SNN, in lieu of backpropagation, and global feedback through communication via a base station. The scheme demonstrates significant advantages over separate training and features a flexible trade-off between communication load and accuracy via the selective exchange of synaptic weights.
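
The global feedback step can be pictured with a generic federated-averaging sketch, in which a base station averages the weight vectors reported by the devices. This is plain element-wise averaging, assumed here for illustration; FL-SNN's actual rule, including its selective exchange of synaptic weights, is not reproduced, and `federated_average` is a hypothetical name.

```python
def federated_average(device_weights):
    """Element-wise mean of equally sized weight vectors, one per device."""
    num_devices = len(device_weights)
    return [sum(column) / num_devices for column in zip(*device_weights)]


# Two devices, each holding two synaptic weights.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]])
```

Exchanging only a subset of the weights would lower the communication load at some cost in accuracy, which is the trade-off the abstract refers to.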

LG Oct 2, 2019
An Introduction to Probabilistic Spiking Neural Networks: Probabilistic Models, Learning Rules, and Applications

Hyeryung Jang, Osvaldo Simeone, Brian Gardner et al.

Spiking neural networks (SNNs) are distributed trainable systems whose computing elements, or neurons, are characterized by internal analog dynamics and by digital and sparse synaptic communications. The sparsity of the synaptic spiking inputs and the corresponding event-driven nature of neural processing can be leveraged by energy-efficient hardware implementations, which can offer significant energy reductions as compared to conventional artificial neural networks (ANNs). The design of training algorithms lags behind the hardware implementations. Most existing training algorithms for SNNs have been designed either for biological plausibility or through conversion from pretrained ANNs via rate encoding. This article provides an introduction to SNNs by focusing on a probabilistic signal processing methodology that enables the direct derivation of learning rules by leveraging the unique time-encoding capabilities of SNNs. We adopt discrete-time probabilistic models for networked spiking neurons and derive supervised and unsupervised learning rules from first principles via variational inference. Examples and open research problems are also provided.

LG Sep 9, 2019
Solving Continual Combinatorial Selection via Deep Reinforcement Learning

Hyungseok Song, Hyeryung Jang, Hai H. Tran et al.

We consider the Markov Decision Process (MDP) of selecting a subset of items at each step, termed the Select-MDP (S-MDP). The large state and action spaces of S-MDPs make them intractable to solve with typical reinforcement learning (RL) algorithms, especially when the number of items is huge. In this paper, we present a deep RL algorithm to solve this issue by adopting the following key ideas. First, we convert the original S-MDP into an Iterative Select-MDP (IS-MDP), which is equivalent to the S-MDP in terms of optimal actions. IS-MDP decomposes a joint action of selecting K items simultaneously into K iterative selections, resulting in the decrease of actions at the expense of an exponential increase of states. Second, we overcome this state space explosion by exploiting a special symmetry in IS-MDPs with novel weight-shared Q-networks, which provably maintain sufficient expressive power. Various experiments demonstrate that our approach works well even when the item space is large and that it scales to environments with item spaces different from those used in training.
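
A short count shows how much the iterative decomposition shrinks the action space. The sketch assumes items are selected without replacement and that a joint action is an unordered subset of K items out of N; both assumptions, and the function names, are made here for illustration.

```python
from math import comb


def joint_action_count(num_items, k):
    """Number of ways to pick K of N items in a single joint action."""
    return comb(num_items, k)


def iterative_action_counts(num_items, k):
    """Number of choices available at each of the K iterative selections."""
    return [num_items - step for step in range(k)]


joint = joint_action_count(100, 5)           # 75287520 joint actions
per_step = iterative_action_counts(100, 5)   # [100, 99, 98, 97, 96]
```

Each iterative step faces at most N actions instead of tens of millions, but the partially built selection has to be carried in the state, which is the state growth the abstract mentions.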

SP Dec 10, 2018
An Introduction to Spiking Neural Networks: Probabilistic Models, Learning Rules, and Applications

Hyeryung Jang, Osvaldo Simeone, Brian Gardner et al.

Spiking Neural Networks (SNNs) are distributed trainable systems whose computing elements, or neurons, are characterized by internal analog dynamics and by digital and sparse synaptic communications. The sparsity of the synaptic spiking inputs and the corresponding event-driven nature of neural processing can be leveraged by hardware implementations that have demonstrated significant energy reductions as compared to conventional Artificial Neural Networks (ANNs). Most existing training algorithms for SNNs have been designed either for biological plausibility or through conversion from pre-trained ANNs via rate encoding. This paper aims at providing an introduction to SNNs by focusing on a probabilistic signal processing methodology that enables the direct derivation of learning rules leveraging the unique time encoding capabilities of SNNs. To this end, the paper adopts discrete-time probabilistic models for networked spiking neurons, and it derives supervised and unsupervised learning rules from first principles by using variational inference. Examples and open research problems are also provided.

LG Oct 21, 2018
Training Dynamic Exponential Family Models with Causal and Lateral Dependencies for Generalized Neuromorphic Computing

Hyeryung Jang, Osvaldo Simeone

Neuromorphic hardware platforms, such as Intel's Loihi chip, support the implementation of Spiking Neural Networks (SNNs) as an energy-efficient alternative to Artificial Neural Networks (ANNs). SNNs are networks of neurons with internal analogue dynamics that communicate by means of binary time series. In this work, a probabilistic model is introduced for a generalized set-up in which the synaptic time series can take values in an arbitrary alphabet and are characterized by both causal and instantaneous statistical dependencies. The model, which can be considered as an extension of exponential family harmoniums to time series, is introduced by means of a hybrid directed-undirected graphical representation. Furthermore, distributed learning rules are derived for Maximum Likelihood and Bayesian criteria under the assumption of fully observed time series in the training set.

SY Sep 13, 2018
Simulation-based Distributed Coordination Maximization over Networks

Hyeryung Jang, Jinwoo Shin, Yung Yi

In various online/offline multi-agent networked environments, a system can often benefit from coordinating the actions of two interacting agents at some cost of coordination. In this paper, we first formulate an optimization problem that captures the amount of coordination gain at the cost of node activation over networks. This problem is challenging to solve in a distributed manner, since the target gain is a function of the long-term time portion of the inter-coupled activations of two adjacent nodes, and thus standard Lagrange duality theory is hard to apply to obtain a distributed decomposition as in standard Network Utility Maximization. We propose three simulation-based distributed algorithms, each having different update rules, all of which require only one-hop message passing and locally-observed information. The key to making the algorithms distributed is a stochastic approximation method that runs a Markov chain simulation incompletely over time, but provably guarantees convergence to the optimal solution. Next, we provide a game-theoretic framework to interpret our proposed algorithms from a different perspective. We artificially select the payoff function so that the game's Nash equilibrium is asymptotically equal to the socially optimal point, i.e., no Price-of-Anarchy. We show that two stochastically-approximated variants of standard game-learning dynamics overlap with two algorithms developed from the optimization perspective. Finally, we demonstrate our theoretical findings on convergence, optimality, and further features such as a trade-off between efficiency and convergence speed through extensive simulations.

ML Apr 29, 2018
Learning Data Dependency with Communication Cost

Hyeryung Jang, HyungSeok Song, Yung Yi

In this paper, we consider the problem of recovering a graph that represents the statistical data dependency among nodes for a set of data samples generated by nodes, which provides the basic structure to perform an inference task, such as MAP (maximum a posteriori). This problem is referred to as structure learning. When nodes are spatially separated in different locations, running an inference algorithm requires a non-negligible amount of message passing, incurring some communication cost. We inevitably face a trade-off between the accuracy of structure learning and the cost we need to pay to perform a given message-passing based inference task, because the learnt edge structures of data dependency and the physical connectivity graph are often highly different. In this paper, we formalize this trade-off in an optimization problem which outputs the data dependency graph that jointly considers learning accuracy and message-passing costs. We focus on a distributed MAP as the target inference task, and consider two different implementations, ASYNC-MAP and SYNC-MAP, that have different message-passing mechanisms and thus different cost structures. In ASYNC-MAP, we propose a polynomial time learning algorithm that is optimal, motivated by the problem of finding a maximum weight spanning tree. In SYNC-MAP, we first prove that it is NP-hard and propose a greedy heuristic. For both implementations, we then quantify how the probability that the resulting data graphs from those learning algorithms differ from the ideal data graph decays as the number of data samples grows, using the large deviation principle, where the decaying rate is characterized by some topological structures of both original data dependency and physical connectivity graphs as well as the degree of the trade-off. We validate our theoretical findings through extensive simulations, which confirm a good match between theory and simulation.
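
Since the ASYNC-MAP algorithm is described as motivated by maximum weight spanning trees, a minimal Kruskal-style sketch of that underlying problem may help. The edge weights below are hypothetical (in the paper's setting they would combine a dependency score with a communication cost, but the exact form is not given in the abstract), and this is the textbook algorithm, not the authors' learning algorithm.

```python
def max_weight_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm on (weight, u, v) edges, heaviest edge first."""
    parent = list(range(num_nodes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # adding the edge creates no cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical weights on a four-node graph.
edges = [(0.9, 0, 1), (0.5, 1, 2), (0.7, 0, 2), (0.1, 2, 3)]
tree = max_weight_spanning_tree(4, edges)
```

The edge between nodes 1 and 2 is skipped because both endpoints are already connected through node 0 by heavier edges.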

LG May 26, 2016
Adiabatic Persistent Contrastive Divergence Learning

Hyeryung Jang, Hyungwon Choi, Yung Yi et al.

This paper studies the problem of parameter learning in probabilistic graphical models having latent variables, where the standard approach is the expectation maximization algorithm alternating expectation (E) and maximization (M) steps. However, both E and M steps are computationally intractable for high dimensional data, while substituting a faster surrogate for one step to combat intractability can often cause failure in convergence. We propose a new learning algorithm which is computationally efficient and provably ensures convergence to a correct optimum. Its key idea is to run only a few cycles of Markov Chains (MC) in both E and M steps. Such an idea of running incomplete MC has been well studied only for the M step in the literature, where it is called Contrastive Divergence (CD) learning. While such known CD-based schemes find approximated gradients of the log-likelihood via the mean-field approach in the E step, our proposed algorithm computes exact ones via MC algorithms in both steps due to the multi-time-scale stochastic approximation theory. Despite its theoretical guarantee in convergence, the proposed scheme might suffer from the slow mixing of MC in the E step. To tackle this, we also propose a hybrid approach applying both mean-field and MC approximation in the E step, where the hybrid approach outperforms the bare mean-field CD scheme in our experiments on real-world datasets.