CV · Aug 28, 2023 · Code
LatentDR: Improving Model Generalization Through Sample-Aware Latent Degradation and Restoration
Ran Liu, Sahil Khose, Jingyun Xiao et al.
Despite significant advances in deep learning, models often struggle to generalize well to new, unseen domains, especially when training data is limited. To address this challenge, we propose a novel approach for distribution-aware latent augmentation that leverages the relationships across samples to guide the augmentation procedure. Our approach first degrades the samples stochastically in the latent space, mapping them to augmented labels, and then restores the samples from their corrupted versions during training. This process confuses the classifier in the degradation step and restores the overall class distribution of the original samples, promoting diverse intra-class/cross-domain variability. We extensively evaluate our approach on a diverse set of datasets and tasks, including domain generalization benchmarks and medical imaging datasets with strong domain shift, where we show our approach achieves significant improvements over existing methods for latent space augmentation. We further show that our method can be flexibly adapted to long-tail recognition tasks, demonstrating its versatility in building more generalizable models. Code is available at https://github.com/nerdslab/LatentDR.
LG · Aug 17, 2023
Half-Hop: A graph upsampling approach for slowing down message passing
Mehdi Azabou, Venkataramana Ganesh, Shantanu Thakoor et al. · gatech
Message passing neural networks have shown a lot of success on graph-structured data. However, there are many instances where message passing can lead to over-smoothing or fail when neighboring nodes belong to different classes. In this work, we introduce a simple yet general framework for improving learning in message passing neural networks. Our approach essentially upsamples edges in the original graph by adding "slow nodes" at each edge that can mediate communication between a source and a target node. Our method only modifies the input graph, making it plug-and-play and easy to use with existing models. To understand the benefits of slowing down message passing, we provide theoretical and empirical analyses. We report results on several supervised and self-supervised benchmarks, and show improvements across the board, notably in heterophilic conditions where adjacent nodes are more likely to have different labels. Finally, we show how our approach can be used to generate augmentations for self-supervised learning, where slow nodes are randomly introduced into different edges in the graph to generate multi-scale views with variable path lengths.
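The edge-upsampling idea described above can be illustrated with a minimal sketch. It assumes each selected directed edge (u, v) is simply replaced by the path u -> s -> v through a new "slow node" s, and that edges are selected independently with probability `p`; the function name `half_hop`, the parameter `p`, and the return layout are hypothetical and chosen for illustration only. The paper's actual construction (for example, how slow-node features and edge directions are defined) may differ.

```python
import random


def half_hop(num_nodes, edges, p=1.0, seed=0):
    """Insert a slow node on each directed edge with probability p.

    Illustrative assumption: a selected edge (u, v) becomes u -> s -> v,
    where s is a newly created node. Unselected edges are kept unchanged.
    Returns (new_node_count, new_edge_list, slow_node_map).
    """
    rng = random.Random(seed)
    new_edges = []
    slow_of = {}          # slow node id -> the original edge it sits on
    next_id = num_nodes   # new nodes are numbered after the original ones
    for u, v in edges:
        if rng.random() < p:
            s = next_id
            next_id += 1
            slow_of[s] = (u, v)
            new_edges.append((u, s))
            new_edges.append((s, v))
        else:
            new_edges.append((u, v))
    return next_id, new_edges, slow_of


if __name__ == "__main__":
    n, e, slow = half_hop(3, [(0, 1), (1, 2)], p=1.0)
    print(n, e, slow)
```

Because only the input graph changes, the output edge list could be passed to an existing message passing model as-is. Calling the function several times with `p < 1` and different seeds would give differently upsampled views of the same graph, in the spirit of the augmentation use described in the abstract.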
NC · Jun 10, 2022
Seeing the forest and the tree: Building representations of both individual and collective dynamics with transformers
Ran Liu, Mehdi Azabou, Max Dabagia et al. · gatech
Complex time-varying systems are often studied by abstracting away from the dynamics of individual components to build a model of the population-level dynamics from the start. However, when building a population-level description, it can be easy to lose sight of each individual and how they contribute to the larger picture. In this paper, we present a novel transformer architecture for learning from time-varying data that builds descriptions of both the individual as well as the collective population dynamics. Rather than combining all of our data into our model at the onset, we develop a separable architecture that operates on individual time-series first before passing them forward; this induces a permutation-invariance property and can be used to transfer across systems of different size and order. After demonstrating that our model can be applied to successfully recover complex interactions and dynamics in many-body systems, we apply our approach to populations of neurons in the nervous system. On neural activity datasets, we show that our model not only yields robust decoding performance, but also provides impressive performance in transfer across recordings of different animals without any neuron-level correspondence. By enabling flexible pre-training that can be transferred to neural recordings of different size and order, our work provides a first step towards creating a foundation model for neural decoding.
LG · Sep 12, 2023 · Code
Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals
Ran Liu, Ellen L. Zippi, Hadi Pouransari et al.
Leveraging multimodal information from biosignals is vital for building a comprehensive representation of people's physical and mental states. However, multimodal biosignals often exhibit substantial distributional shifts between pretraining and inference datasets, stemming from changes in task specification or variations in modality compositions. To achieve effective pretraining in the presence of potential distributional shifts, we propose a frequency-aware masked autoencoder (bioFAME) that learns to parameterize the representation of biosignals in the frequency space. bioFAME incorporates a frequency-aware transformer, which leverages a fixed-size Fourier-based operator for global token mixing, independent of the length and sampling rate of inputs. To maintain the frequency components within each input channel, we further employ a frequency-maintain pretraining strategy that performs masked autoencoding in the latent space. The resulting architecture effectively utilizes multimodal information during pretraining, and can be seamlessly adapted to diverse tasks and modalities at test time, regardless of input size and order. We evaluated our approach on a diverse set of transfer experiments on unimodal time series, achieving an average improvement of 5.5% in classification accuracy over the previous state-of-the-art. Furthermore, we demonstrated that our architecture is robust in modality mismatch scenarios, including unpredicted modality dropout or substitution, proving its practical utility in real-world applications. Code is available at https://github.com/apple/ml-famae.
52.0 · LG · Jun 1
Repurposing Adversarial Perturbations for Continual Learning: From Defense to Active Alignment
Ran Liu, Min Yu, Mingqi Liu et al.
In dynamic environments, large language models need to keep adapting to new tasks, but continual learning often suffers from forgetting, limited transfer, and vulnerability to adversarial perturbations. To address this, we present AdvCL, which repurposes adversarial perturbations as a geometric control signal for stable continual adaptation. AdvCL combines three plug-in modules: Intra-Smooth promotes local smoothness via small adversarial perturbations; Proto-Clip uses similarity clipping to prevent excessive alignment to current task prototype; and Inter-Align applies directional alignment toward previous task prototype to reduce representational gaps. Experiments show consistent gains in both standard performance and robustness, with lower forgetting and stronger transfer. We further analyze key mechanisms by quantifying the sensitivity of Intra-Smooth to perturbation settings and the effect of Inter-Align on task similarity and geometric distance. In summary, the modules provide complementary gains when combined, and each can also be integrated individually into diverse CL paradigms, including replay, regularization, and dynamic architectures, thereby offering a geometric control mechanism for continual learning.
LG · Dec 1, 2022
Clustering and Analysis of GPS Trajectory Data using Distance-based Features
Zann Koh, Yuren Zhou, Billy Pik Lik Lau et al.
The proliferation of smartphones has accelerated mobility studies by greatly increasing the type and volume of mobility data available. One such source of mobility data is GPS technology, which is becoming increasingly common and helps the research community understand mobility patterns of people. However, a standardized framework is lacking for studying, with machine learning methods, the different mobility patterns created by the non-Work, non-Home locations of Working and Nonworking users on Workdays and Offdays. We propose a new mobility metric, Daily Characteristic Distance, and use it to generate features for each user together with Origin-Destination matrix features. We then use those features with an unsupervised machine learning method, $k$-means clustering, and obtain three clusters of users for each type of day (Workday and Offday). Finally, we propose two new metrics for the analysis of the clustering results, namely User Commonality and Average Frequency. Using the proposed metrics, we can discern interesting user behaviors and better understand the mobility patterns of the users.
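The clustering step mentioned above can be sketched with plain Lloyd iterations for $k$-means. This is only an illustration under stated assumptions: the two-dimensional feature vectors in the usage example are made up, the initialization (the first k points) is a simplification chosen for determinism, and the paper's Daily Characteristic Distance and Origin-Destination features are not reproduced here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Basic k-means (Lloyd's algorithm) on a list of feature vectors.

    Assumption for illustration: centroids start at the first k points.
    Returns (labels, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical per-user feature vectors, three well-separated groups.
    users = [(0, 0), (10, 10), (20, 0), (0.5, 0.2), (10.2, 9.8), (19.5, 0.3)]
    labels, centroids = kmeans(users, k=3)
    print(labels)
```

In practice a library implementation with multiple random restarts would usually be preferable; the sketch only shows the assign-then-update loop that produces the user clusters.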
CV · Jan 1, 2023
MTNeuro: A Benchmark for Evaluating Representations of Brain Structure Across Multiple Levels of Abstraction
Jorge Quesada, Lakshmi Sathidevi, Ran Liu et al. · gatech
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to readout information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
CL · Aug 19, 2024 · Code
GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization
Ran Liu, Ming Liu, Min Yu et al.
Pre-trained language models are increasingly being used in multi-document summarization tasks. However, these models need large-scale corpora for pre-training and are domain-dependent. Other non-neural unsupervised summarization approaches mostly rely on key sentence extraction, which can lead to information loss. To address these challenges, we propose a lightweight yet effective unsupervised approach called GLIMMER: a Graph and LexIcal features based unsupervised Multi-docuMEnt summaRization approach. It first constructs a sentence graph from the source documents, then automatically identifies semantic clusters by mining low-level features from raw texts, thereby improving intra-cluster correlation and the fluency of generated sentences. Finally, it summarizes clusters into natural sentences. Experiments conducted on Multi-News, Multi-XScience and DUC-2004 demonstrate that our approach outperforms existing unsupervised approaches. Furthermore, it surpasses state-of-the-art pre-trained multi-document summarization models (e.g. PEGASUS and PRIMERA) under zero-shot settings in terms of ROUGE scores. Additionally, human evaluations indicate that summaries generated by GLIMMER achieve high readability and informativeness scores. Our code is available at https://github.com/Oswald1997/GLIMMER.
SD · Jul 25, 2024
Model-driven Heart Rate Estimation and Heart Murmur Detection based on Phonocardiogram
Jingping Nie, Ran Liu, Behrooz Mahasseni et al.
Acoustic signals are crucial for health monitoring, particularly heart sounds, which provide essential data such as heart rate and help detect cardiac anomalies such as murmurs. This study utilizes a publicly available phonocardiogram (PCG) dataset to estimate heart rate using model-driven methods and extends the best-performing model to a multi-task learning (MTL) framework for simultaneous heart rate estimation and murmur detection. Heart rate estimates are derived using a sliding window technique on heart sound snippets, analyzed with a combination of acoustic features (Mel spectrogram, cepstral coefficients, power spectral density, root mean square energy). Our findings indicate that a 2D convolutional neural network (2dCNN) is most effective for heart rate estimation, achieving a mean absolute error (MAE) of 1.312 bpm. We systematically investigate the impact of different feature combinations and find that utilizing all four features yields the best results. The MTL model (2dCNN-MTL) achieves accuracy over 95% in murmur detection, surpassing existing models, while maintaining an MAE of 1.636 bpm in heart rate estimation, satisfying the requirements stated by the Association for the Advancement of Medical Instrumentation (AAMI).
36.3 · AI · Mar 14
Multimodal Emotion Regression with Multi-Objective Optimization and VAD-Aware Audio Modeling for the 10th ABAW EMI Track
Jiawen Huang, Chenxi Huang, Zhuofan Wen et al.
We participated in the 10th ABAW Challenge, focusing on the Emotional Mimicry Intensity (EMI) Estimation track on the Hume-Vidmimic2 dataset. This task aims to predict six continuous emotion dimensions: Admiration, Amusement, Determination, Empathic Pain, Excitement, and Joy. Through systematic multimodal exploration of pretrained high-level features, we found that, under our pretrained feature setting, direct feature concatenation outperformed the more complex fusion strategies we tested. This empirical finding motivated us to design a systematic approach built upon three core principles: (i) preserving modality-specific attributes through feature-level concatenation; (ii) improving training stability and metric alignment via multi-objective optimization; and (iii) enriching acoustic representations with a VAD-inspired latent prior. Our final framework integrates concatenation-based multimodal fusion, a shared six-dimensional regression head, multi-objective optimization with MSE, Pearson-correlation, and auxiliary branch supervision, EMA for parameter stabilization, and a VAD-inspired latent prior for the acoustic branch. On the official validation set, the proposed scheme achieved our best mean Pearson Correlation Coefficient of 0.478567.
CR · Aug 9, 2023
A Feature Set of Small Size for the PDF Malware Detection
Ran Liu, Charles Nicholas
Machine learning (ML)-based malware detection systems are becoming increasingly important as malware threats increase and get more sophisticated. PDF files are often used as vectors for phishing attacks because they are widely regarded as trustworthy data resources, and are accessible across different platforms. Therefore, researchers have developed many different PDF malware detection methods. Performance in detecting PDF malware is greatly influenced by feature selection. In this research, we propose a small feature set that does not require much domain knowledge of the PDF file. We evaluate the proposed features with six different machine learning models. We report a best accuracy of 99.75% when using a Random Forest model. Our proposed feature set, which consists of just 12 features, is among the most concise in the field of PDF malware detection. Despite its modest size, we obtain results comparable to state-of-the-art methods that employ a much larger set of features.
CV · Mar 21, 2025 · Code
Feature-Based Dual Visual Feature Extraction Model for Compound Multimodal Emotion Recognition
Ran Liu, Fengyu Zhang, Cong Yu et al.
This article presents our results for the eighth Affective Behavior Analysis in-the-wild (ABAW) competition. Multimodal emotion recognition (ER) has important applications in affective computing and human-computer interaction. However, in the real world, compound emotion recognition faces greater issues of uncertainty and modal conflicts. For the Compound Expression (CE) Recognition Challenge, this paper proposes a multimodal emotion recognition method that fuses the features of Vision Transformer (ViT) and Residual Network (ResNet). We conducted experiments on the C-EXPR-DB and MELD datasets. The results show that in scenarios with complex visual and audio cues (such as C-EXPR-DB), the model that fuses the features of ViT and ResNet exhibits superior performance. Our code is available at https://github.com/MyGitHub-ax/8th_ABAW
LG · Oct 11, 2024
Context-Aware Adapter Tuning for Few-Shot Relation Learning in Knowledge Graphs
Ran Liu, Zhongzhou Liu, Xiaoli Li et al.
Knowledge graphs (KGs) are instrumental in various real-world applications, yet they often suffer from incompleteness due to missing relations. To predict instances for novel relations with limited training examples, few-shot relation learning approaches have emerged, utilizing techniques such as meta-learning. However, the assumption is that novel relations in meta-testing and base relations in meta-training are independently and identically distributed, which may not hold in practice. To address the limitation, we propose RelAdapter, a context-aware adapter for few-shot relation learning in KGs designed to enhance the adaptation process in meta-learning. First, RelAdapter is equipped with a lightweight adapter module that facilitates relation-specific, tunable adaptation of meta-knowledge in a parameter-efficient manner. Second, RelAdapter is enriched with contextual information about the target relation, enabling enhanced adaptation to each distinct relation. Extensive experiments on three benchmark KGs validate the superiority of RelAdapter over state-of-the-art methods.
CL · Apr 22, 2024
A Survey on the Real Power of ChatGPT
Ming Liu, Ran Liu, Ye Zhu et al.
ChatGPT has changed the AI community, and an active line of research is the performance evaluation of ChatGPT. A key challenge for the evaluation is that ChatGPT is still closed-source and traditional benchmark datasets may have been used by ChatGPT as training data. In this paper, we (i) survey recent studies which uncover the real performance levels of ChatGPT in seven categories of NLP tasks, (ii) review the social implications and safety issues of ChatGPT, and (iii) emphasize key challenges and opportunities for its evaluation. We hope our survey can shed some light on its black-box nature, so that researchers are not misled by its surface generation.
LG · Feb 27, 2025
Your contrastive learning problem is secretly a distribution alignment problem
Zihao Chen, Chi-Heng Lin, Ran Liu et al.
Despite the success of contrastive learning (CL) in vision and language, its theoretical foundations and mechanisms for building representations remain poorly understood. In this work, we build connections between noise contrastive estimation losses widely used in CL and distribution alignment with entropic optimal transport (OT). This connection allows us to develop a family of different losses and multistep iterative variants for existing CL methods. Intuitively, by using more information from the distribution of latents, our approach allows a more distribution-aware manipulation of the relationships within augmented sample sets. We provide theoretical insights and experimental evidence demonstrating the benefits of our approach for generalized contrastive alignment. Through this framework, it is possible to leverage tools in OT to build unbalanced losses to handle noisy views and customize the representation space by changing the constraints on alignment. By reframing contrastive learning as an alignment problem and leveraging existing optimization tools for OT, our work provides new insights and connections between different self-supervised learning models in addition to new tools that can be more easily adapted to incorporate domain knowledge into learning.
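One way to see the kind of connection described above is the following minimal sketch, which is an illustration under stated assumptions rather than the paper's formulation. It takes a similarity matrix between two sets of views, forms the Gibbs kernel exp(sim / tau), and applies alternating row and column normalizations (Sinkhorn-style scaling, as used for entropic OT with uniform marginals up to a constant factor). With a single row normalization, the negative log of the diagonal is exactly the InfoNCE loss; more steps give a multistep variant. The function names, `tau`, and `steps` are hypothetical.

```python
import math


def row_normalize(K):
    """Scale each row to sum to 1."""
    out = []
    for row in K:
        total = sum(row)
        out.append([k / total for k in row])
    return out


def col_normalize(K):
    """Scale each column to sum to 1."""
    n_cols = len(K[0])
    col_sums = [sum(row[j] for row in K) for j in range(n_cols)]
    return [[row[j] / col_sums[j] for j in range(n_cols)] for row in K]


def alignment_loss(sim, tau=0.5, steps=1):
    """Negative log of the matched (diagonal) mass after `steps` scalings.

    Even-numbered steps (0, 2, ...) normalize rows, odd ones columns.
    """
    P = [[math.exp(s / tau) for s in row] for row in sim]  # Gibbs kernel
    for step in range(steps):
        P = row_normalize(P) if step % 2 == 0 else col_normalize(P)
    n = len(P)
    return -sum(math.log(P[i][i]) for i in range(n)) / n


def info_nce(sim, tau=0.5):
    """Standard InfoNCE with positives on the diagonal, for comparison."""
    n = len(sim)
    total = 0.0
    for i in range(n):
        logits = [s / tau for s in sim[i]]
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum_exp - logits[i]
    return total / n


if __name__ == "__main__":
    sim = [[1.0, 0.2], [0.1, 0.9]]  # made-up similarities
    print(alignment_loss(sim, steps=1), info_nce(sim))
    print(alignment_loss(sim, steps=5))
```

The one-step case matching InfoNCE follows directly from the definition of a row-wise softmax; how the additional steps relate to the specific losses proposed in the paper is not claimed here.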
AI · Oct 1, 2025
FusionAdapter for Few-Shot Relation Learning in Multimodal Knowledge Graphs
Ran Liu, Yuan Fang, Xiaoli Li
Multimodal Knowledge Graphs (MMKGs) incorporate various modalities, including text and images, to enhance entity and relation representations. Notably, different modalities for the same entity often present complementary and diverse information. However, existing MMKG methods primarily align modalities into a shared space, which tends to overlook the distinct contributions of specific modalities, limiting their performance particularly in low-resource settings. To address this challenge, we propose FusionAdapter for the learning of few-shot relationships (FSRL) in MMKG. FusionAdapter introduces (1) an adapter module that enables efficient adaptation of each modality to unseen relations and (2) a fusion strategy that integrates multimodal entity representations while preserving diverse modality-specific characteristics. By effectively adapting and fusing information from diverse modalities, FusionAdapter improves generalization to novel relations with minimal supervision. Extensive experiments on two benchmark MMKG datasets demonstrate that FusionAdapter achieves superior performance over state-of-the-art methods.
LG · Sep 4, 2025
CPEP: Contrastive Pose-EMG Pre-training Enhances Gesture Generalization on EMG Signals
Wenhui Cui, Christopher Sandino, Hadi Pouransari et al.
Hand gesture classification using high-quality structured data such as videos, images, and hand skeletons is a well-explored problem in computer vision. Leveraging low-power, cost-effective biosignals, e.g. surface electromyography (sEMG), allows for continuous gesture prediction on wearables. In this paper, we demonstrate that learning representations from weak-modality data that are aligned with those from structured, high-quality data can improve representation quality and enable zero-shot classification. Specifically, we propose a Contrastive Pose-EMG Pre-training (CPEP) framework to align EMG and pose representations, where we learn an EMG encoder that produces high-quality and pose-informative representations. We assess the gesture classification performance of our model through linear probing and zero-shot setups. Our model outperforms emg2pose benchmark models by up to 21% on in-distribution gesture classification and 72% on unseen (out-of-distribution) gesture classification.
LG · May 3, 2023
Can Feature Engineering Help Quantum Machine Learning for Malware Detection?
Ran Liu, Maksim Eren, Charles Nicholas
With the increasing number and sophistication of malware attacks, malware detection systems based on machine learning (ML) grow in importance. At the same time, many popular ML models used in malware classification are supervised solutions. These supervised classifiers often do not generalize well to novel malware. Therefore, they need to be re-trained frequently to detect new malware specimens, which can be time-consuming. Our work addresses this problem in a hybrid framework of theoretical Quantum ML, combined with feature selection strategies to reduce the data size and malware classifier training time. The preliminary results show that VQC with XGBoost-selected features can reach a 78.91% test accuracy on the simulator. The average accuracy for the model trained using the features selected with XGBoost was 74% (± 11.35%) on the IBM 5-qubit machines.
LG · Nov 3, 2021
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity
Ran Liu, Mehdi Azabou, Max Dabagia et al.
Meaningful and simplified representations of neural activity can yield insights into how and what information is being processed within a neural circuit. However, without labels, finding representations that reveal the link between the brain and behavior can be challenging. Here, we introduce a novel unsupervised approach for learning disentangled representations of neural activity called Swap-VAE. Our approach combines a generative modeling framework with an instance-specific alignment loss that tries to maximize the representational similarity between transformed views of the input (brain state). These transformed (or augmented) views are created by dropping out neurons and jittering samples in time, which intuitively should lead the network to a representation that maintains both temporal consistency and invariance to the specific neurons used to represent the neural state. Through evaluations on both synthetic data and neural recordings from hundreds of neurons in different primate brains, we show that it is possible to build representations that disentangle neural datasets along relevant latent dimensions linked to behavior.
RO · Oct 13, 2021
Collaborative Radio SLAM for Multiple Robots based on WiFi Fingerprint Similarity
Ran Liu, Zhenghong Qin, Hua Zhang et al.
Simultaneous Localization and Mapping (SLAM) enables autonomous robots to navigate and execute their tasks through unknown environments. However, performing SLAM in large environments with a single robot is not efficient, and visual or LiDAR-based SLAM requires feature extraction and matching algorithms, which are computationally expensive. In this paper, we present a collaborative SLAM approach with multiple robots using the pervasive WiFi radio signals. A centralized solution is proposed to optimize the trajectory based on the odometry and radio fingerprints collected from multiple robots. To improve the localization accuracy, a novel similarity model is introduced that combines received signal strength (RSS) and detection likelihood of an access point (AP). We perform extensive experiments to demonstrate the effectiveness of the proposed similarity model and collaborative SLAM framework.
RO · Jul 19, 2021
Relative Localization of Mobile Robots with Multiple Ultra-WideBand Ranging Measurements
Zhiqiang Cao, Ran Liu, Chau Yuen et al.
Relative localization between autonomous robots without infrastructure is crucial to achieve their navigation, path planning, and formation in many applications, such as emergency response, where acquiring a prior knowledge of the environment is not possible. The traditional Ultra-WideBand (UWB)-based approach provides a good estimation of the distance between the robots, but obtaining the relative pose (including the displacement and orientation) remains challenging. We propose an approach to estimate the relative pose between a group of robots by equipping each robot with multiple UWB ranging nodes. We determine the pose between two robots by minimizing the residual error of the ranging measurements from all UWB nodes. To improve the localization accuracy, we propose to utilize the odometry constraints through a sliding window-based optimization. The optimized pose is then fused with the odometry in a particle filtering for pose tracking among a group of mobile robots. We have conducted extensive experiments to validate the effectiveness of the proposed approach.
LG · Jul 9, 2021
Offline reinforcement learning with uncertainty for treatment strategies in sepsis
Ran Liu, Joseph L. Greenstein, James C. Fackler et al.
Guideline-based treatment for sepsis and septic shock is difficult because sepsis is a disparate range of life-threatening organ dysfunctions whose pathophysiology is not fully understood. Early intervention in sepsis is crucial for patient outcome, yet those interventions have adverse effects and are frequently overadministered. Greater personalization is necessary, as no single action is suitable for all patients. We present a novel application of reinforcement learning in which we identify optimal recommendations for sepsis treatment from data, estimate their confidence level, and identify treatment options infrequently observed in training data. Rather than a single recommendation, our method can present several treatment options. We examine learned policies and discover that reinforcement learning is biased against aggressive intervention due to the confounding relationship between mortality and level of treatment received. We mitigate this bias using subspace learning, and develop methodology that can yield more accurate learning policies across healthcare applications.
RO · Jun 7, 2021
Cost-effective Mapping of Mobile Robot Based on the Fusion of UWB and Short-range 2D LiDAR
Ran Liu, Yongping He, Chau Yuen et al.
Environment mapping is an essential prerequisite for mobile robots to perform different tasks such as navigation and mission planning. With the availability of low-cost 2D LiDARs, there are increasing applications of such 2D LiDARs in industrial environments. However, environment mapping in an unknown and feature-less environment with such low-cost 2D LiDARs remains a challenge. The challenge mainly originates from the short-range of LiDARs and complexities in performing scan matching in these environments. In order to resolve these shortcomings, we propose to fuse the ultra-wideband (UWB) with 2D LiDARs to improve the mapping quality of a mobile robot. The optimization-based approach is utilized for the fusion of UWB ranging information and odometry to first optimize the trajectory. Then the LiDAR-based loop closures are incorporated to improve the accuracy of the trajectory estimation. Finally, the optimized trajectory is combined with the LiDAR scans to produce the occupancy map of the environment. The performance of the proposed approach is evaluated in an indoor feature-less environment with a size of 20m*20m. Obtained results show that the mapping error of the proposed scheme is 85.5% less than that of the conventional GMapping algorithm with short-range LiDAR (for example Hokuyo URG-04LX in our experiment with a maximum range of 5.6m).
LG · Jun 6, 2021
Understand and Improve Contrastive Learning Methods for Visual Representation: A Review
Ran Liu
Traditional supervised learning methods are hitting a bottleneck because of their dependency on expensive manually labeled data and their weaknesses such as limited generalization ability and vulnerability to adversarial attacks. A promising alternative, self-supervised learning, as a type of unsupervised learning, has gained popularity because of its potential to learn effective data representations without manual labeling. Among self-supervised learning algorithms, contrastive learning has achieved state-of-the-art performance in several fields of research. This literature review aims to provide an up-to-date analysis of the efforts of researchers to understand the key components and the limitations of self-supervised learning.
LG · May 4, 2021
WiFi Fingerprint Clustering for Urban Mobility Analysis
Sumudu Hasala Marakkalage, Billy Pik Lik Lau, Yuren Zhou et al.
In this paper, we present an unsupervised learning approach to identify the user points of interest (POI) by exploiting WiFi measurements from smartphone application data. Due to the lack of GPS positioning accuracy in indoor, sheltered, and high rise building environments, we rely on widely available WiFi access points (AP) in contemporary urban areas to accurately identify POI and mobility patterns, by comparing the similarity in the WiFi measurements. We propose a system architecture to scan the surrounding WiFi AP, and perform unsupervised learning to demonstrate that it is possible to identify three major insights, namely the indoor POI within a building, neighbourhood activity, and micro-mobility of the users. Our results show that it is possible to identify the aforementioned insights, with the fusion of WiFi and GPS, which are not possible to identify by only using GPS.
LG · Feb 19, 2021
Mine Your Own vieW: Self-Supervised Learning Through Across-Sample Prediction
Mehdi Azabou, Mohammad Gheshlaghi Azar, Ran Liu et al.
State-of-the-art methods for self-supervised learning (SSL) build representations by maximizing the similarity between different transformed "views" of a sample. Without sufficient diversity in the transformations used to create views, however, it can be difficult to overcome nuisance variables in the data and build rich representations. This motivates the use of the dataset itself to find similar, yet distinct, samples to serve as views for one another. In this paper, we introduce Mine Your Own vieW (MYOW), a new approach for self-supervised learning that looks within the dataset to define diverse targets for prediction. The idea behind our approach is to actively mine views, finding samples that are neighbors in the representation space of the network, and then predict, from one sample's latent representation, the representation of a nearby sample. After showing the promise of MYOW on benchmarks used in computer vision, we highlight the power of this idea in a novel application in neuroscience where SSL has yet to be applied. When tested on multi-unit neural recordings, we find that MYOW outperforms other self-supervised approaches in all examples (in some cases by more than 10%), and often surpasses the supervised baseline. With MYOW, we show that it is possible to harness the diversity of the data to build rich views and leverage self-supervision in new domains where augmentations are limited or unknown.
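The view-mining step described above can be illustrated with a minimal sketch: for each sample, pick the closest other sample in representation space to serve as its mined view. The function name `mine_views`, the use of cosine similarity, and the toy embeddings are assumptions made for illustration; the abstract does not specify the distance measure or the mining procedure in detail.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mine_views(embeddings):
    """For each sample, return the index of its most similar *other* sample.

    Hypothetical helper: cosine similarity is an assumed choice of metric.
    """
    mined = []
    for i, anchor in enumerate(embeddings):
        best_j, best_sim = None, -math.inf
        for j, candidate in enumerate(embeddings):
            if j == i:
                continue  # a sample may not be its own mined view
            sim = cosine_similarity(anchor, candidate)
            if sim > best_sim:
                best_j, best_sim = j, sim
        mined.append(best_j)
    return mined


# Toy latent representations (illustrative values, not from the paper).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
pairs = mine_views(embeddings)
print(pairs)
```

In a training loop, each pair (i, pairs[i]) would then provide a prediction target: the representation of sample pairs[i] is predicted from the representation of sample i.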
LGFeb 9, 2020
A Physiology-Driven Computational Model for Post-Cardiac Arrest Outcome PredictionHan B. Kim, Hieu Nguyen, Qingchu Jin et al.
Patients resuscitated from cardiac arrest (CA) face a high risk of neurological disability and death; however, pragmatic methods for accurate and reliable prognostication are lacking. The aim of this study was to build computational models to predict post-CA outcome by leveraging high-dimensional patient data available early after admission to the intensive care unit (ICU). We hypothesized that model performance could be enhanced by integrating physiological time series (PTS) data and by training machine learning (ML) classifiers. We compared three models: models integrating features extracted from the electronic health records (EHR) alone, models integrating features derived from PTS collected in the first 24 hours after ICU admission (PTS24), and models integrating both PTS24 and EHR. Outcomes of interest were survival and neurological outcome at ICU discharge. Combined EHR-PTS24 models had higher discrimination (area under the receiver operating characteristic curve [AUC]) than models which used either EHR or PTS24 alone, for the prediction of survival (AUC 0.85, 0.80 and 0.68 respectively) and neurological outcome (0.87, 0.83 and 0.78). The best ML classifier achieved higher discrimination than the reference logistic regression model (APACHE III) for survival (AUC 0.85 vs 0.70) and neurological outcome prediction (AUC 0.87 vs 0.75). Feature analysis revealed previously unknown factors to be associated with post-CA recovery. Results attest to the effectiveness of ML models for post-CA predictive modeling and suggest that PTS recorded in the very early phase after resuscitation encode short-term outcome probabilities.
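For readers unfamiliar with the discrimination metric reported above, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (made-up values): 1 = outcome occurred, 0 = it did not.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the values above should be read.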
NINov 30, 2019
Collaborative SLAM based on Wifi Fingerprint Similarity and Motion InformationRan Liu, Sumudu Hasala Marakkalage, Madhushanka Padmal et al.
Simultaneous localization and mapping (SLAM) has been extensively researched in past years, particularly with regard to range-based or visual-based sensors. Instead of deploying dedicated devices that use visual features, it is more pragmatic to exploit radio features to achieve this task, due to their ubiquitous nature and the widespread deployment of Wi-Fi wireless networks. This paper presents a novel approach for collaborative simultaneous localization and radio fingerprint mapping (C-SLAM-RF) in large unknown indoor environments. The proposed system uses received signal strengths (RSS) from Wi-Fi access points (AP) in the existing infrastructure and pedestrian dead reckoning (PDR) from a smart phone, without prior knowledge of the map or the distribution of AP in the environment. We claim a loop closure based on the similarity of two radio fingerprints. To further improve the performance, we incorporate the turning motion and assign a small uncertainty value to a loop closure if a matched turning is identified. The experiment was done in an area of 130 meters by 70 meters and the results show that our proposed system is capable of estimating the tracks of four users with an accuracy of 0.6 meters with Tango-based PDR and 4.76 meters with a step counter-based PDR.
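The idea of claiming a loop closure from fingerprint similarity can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the cosine similarity measure, the -100 dBm floor for unseen access points, the 0.9 threshold, and the names `fingerprint_similarity` and `is_loop_closure` are all hypothetical choices, since the abstract does not state the similarity model used.

```python
import math

# Assumed RSS value (dBm) for an access point that was not observed.
RSS_FLOOR_DBM = -100.0


def fingerprint_similarity(fp_a, fp_b):
    """Cosine similarity between two fingerprints given as {ap_id: rss_dbm}.

    RSS values are shifted by the floor so stronger signals map to larger
    positive numbers and unseen access points map to zero.
    """
    aps = sorted(set(fp_a) | set(fp_b))
    va = [fp_a.get(ap, RSS_FLOOR_DBM) - RSS_FLOOR_DBM for ap in aps]
    vb = [fp_b.get(ap, RSS_FLOOR_DBM) - RSS_FLOOR_DBM for ap in aps]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    if norm == 0.0:
        return 0.0  # at least one fingerprint carries no usable signal
    return dot / norm


def is_loop_closure(fp_a, fp_b, threshold=0.9):
    """Claim a loop closure when similarity meets the assumed threshold."""
    return fingerprint_similarity(fp_a, fp_b) >= threshold


# Toy fingerprints (made-up AP identifiers and RSS values).
corridor_first_visit = {"ap1": -40.0, "ap2": -70.0, "ap3": -85.0}
corridor_second_visit = {"ap1": -42.0, "ap2": -68.0, "ap3": -88.0}
other_floor = {"ap4": -45.0, "ap5": -60.0}

revisit = is_loop_closure(corridor_first_visit, corridor_second_visit)
unrelated = is_loop_closure(corridor_first_visit, other_floor)
print(revisit, unrelated)
```

In a pose-graph setting, each claimed loop closure would become a constraint between the two poses at which the fingerprints were recorded, with an uncertainty that could be reduced when a matched turning is also observed, as the abstract describes.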
ROApr 26, 2019
Crowd-sensing Simultaneous Localization and Radio Fingerprint Mapping based on Probabilistic Similarity ModelsRan Liu, Sumudu Hasala Marakkalage, Madhushanka Padmal et al.
Simultaneous localization and mapping (SLAM) has been extensively researched in past years, particularly with regard to range-based or visual-based sensors. Instead of deploying dedicated devices that use visual features, it is more pragmatic to exploit radio features to achieve this task, due to their ubiquitous nature and the wide deployment of Wifi wireless networks. In this paper, we present a novel approach for crowd-sensing simultaneous localization and radio fingerprint mapping (C-SLAM-RF) in large unknown indoor environments. The proposed system makes use of the received signal strength (RSS) from surrounding Wifi access points (AP) and the motion tracking data from a smart phone (Tango as an example). These measurements are captured while multiple users walk through unknown environments, without map information or the locations of the AP. The experiments were done in a university building with a dynamic environment and the results show that the proposed system is capable of estimating the tracks of a group of users with an accuracy of 1.74 meters when compared to the ground truth acquired from a point cloud-based SLAM.