11.2LGMay 8
SparseRL-Sync: Lossless Weight Synchronization with ~100x Less CommunicationLucas Hu, Ranchi Zhao, Isaac Zhu et al.
In large-scale reinforcement learning (RL) systems with decoupled Trainer-Rollout execution, the Trainer must regularly synchronize policy weights to the Rollout side to limit policy staleness. When inter-node bandwidth is abundant, such synchronization is usually only a small fraction of end-to-end cost. As model size grows, however, the communication demand rises rapidly. In bandwidth-constrained or network-variable deployments -- for example, cross-datacenter or cross-cluster settings, heterogeneous resource pools, and online RL -- weight synchronization can become a dominant bottleneck for throughput and tail latency. We observe that, in mainstream large-model RL training, the locations where parameters actually change are highly sparse at the element level (often 99%+ sparsity). Building on this observation, we propose and implement SparseRL-Sync, which replaces full-weight transfers with a lossless sparse update payload (indices and values) that can be exactly reconstructed on the inference side, thereby preserving 100% fidelity. Under a simplified cost model, sparse synchronization reduces the per-update communication volume from S to approximately S/X; with 99% sparsity (X ~ 100), this yields about a 100x reduction in transmitted data. Combined with appropriate bucketing, SparseRL-Sync also reduces launch and control-plane overhead, significantly improving scalability and end-to-end efficiency in bandwidth-limited and highly asynchronous RL settings.
LGOct 12, 2020
A Skew-Sensitive Evaluation Framework for Imbalanced Data ClassificationMin Du, Nesime Tatbul, Brian Rivers et al.
Class distribution skews in imbalanced datasets may lead to models with prediction bias towards majority classes, making fair assessment of classifiers a challenging task. Metrics such as Balanced Accuracy are commonly used to evaluate a classifier's prediction performance under such scenarios. However, these metrics fall short when classes vary in importance. In this paper, we propose a simple and general-purpose evaluation framework for imbalanced data classification that is sensitive to arbitrary skews in class cardinalities and importances. Experiments with several state-of-the-art classifiers tested on real-world datasets from three different domains show the effectiveness of our framework - not only in evaluating and ranking classifiers, but also training them.
CVAug 9, 2020
Model Generalization in Deep Learning Applications for Land Cover MappingLucas Hu, Caleb Robinson, Bistra Dilkina
Recent work has shown that deep learning models can be used to classify land-use data from geospatial satellite imagery. We show that when these deep learning models are trained on data from specific continents/seasons, there is a high degree of variability in model performance on out-of-sample continents/seasons. This suggests that just because a model accurately predicts land-use classes in one continent or season does not mean that the model will accurately predict land-use classes in a different continent or season. We then use clustering techniques on satellite imagery from different continents to visualize the differences in landscapes that make geospatial generalization particularly difficult, and summarize our takeaways for future satellite imagery-related applications.