Rong Ou

CR
h-index25
4papers
123citations
Novelty40%
AI Score39

4 Papers

CVMar 17, 2025Code
Training Video Foundation Models with NVIDIA NeMo

Zeeshan Patel, Ethan He, Parth Mannan et al.

Video Foundation Models (VFMs) have recently been used to simulate the real world to train physical AI systems and develop creative visual experiences. However, there are significant challenges in training large-scale, high quality VFMs that can generate high-quality videos. We present a scalable, open-source VFM training pipeline with NVIDIA NeMo, providing accelerated video dataset curation, multimodal data loading, and parallelized video diffusion model training and inference. We also provide a comprehensive performance analysis highlighting best practices for efficient VFM training and inference.

ROFeb 25
Iterative Closed-Loop Motion Synthesis for Scaling the Capabilities of Humanoid Control

Weisheng Xu, Qiwei Wu, Jiaxi Zhang et al.

Physics-based humanoid control relies on training with motion datasets that have diverse data distributions. However, the fixed difficulty distribution of datasets limits the performance ceiling of the trained control policies. Additionally, the method of acquiring high-quality data through professional motion capture systems is constrained by costs, making it difficult to achieve large-scale scalability. To address these issues, we propose a closed-loop automated motion data generation and iterative framework. It can generate high-quality motion data with rich action semantics, including martial arts, dance, combat, sports, gymnastics, and more. Furthermore, our framework enables difficulty iteration of policies and data through physical metrics and objective evaluations, allowing the trained tracker to break through its original difficulty limits. On the PHC single-primitive tracker, using only approximately 1/10 of the AMASS dataset size, the average failure rate on the test set (2201 clips) is reduced by 45\% compared to the baseline. Finally, we conduct comprehensive ablation and comparative experiments to highlight the rationality and advantages of our framework.

LGMay 19, 2020
Out-of-Core GPU Gradient Boosting

Rong Ou

GPU-based algorithms have greatly accelerated many machine learning methods; however, GPU memory is typically smaller than main memory, limiting the size of training data. In this paper, we describe an out-of-core GPU gradient boosting algorithm implemented in the XGBoost library. We show that much larger datasets can fit on a given GPU, without degrading model accuracy or training time. To the best of our knowledge, this is the first out-of-core GPU implementation of gradient boosting. Similar approaches can be applied to other machine learning algorithms

CRMar 25, 2012
Breaking a novel colour image encryption algorithm based on chaos

Chengqing Li, Yu Zhang, Rong Ou et al.

Recently, a colour image encryption algorithm based on chaos was proposed by cascading two position permutation operations and one substitution operation, which are all determined by some pseudo-random number sequences generated by iterating the Logistic map. This paper evaluates the security level of the encryption algorithm and finds that the position permutation-only part and the substitution part can be separately broken with only $\lceil (\log_2(3MN))/8 \rceil$ and 2 chosen plain-images, respectively, where $MN$ is the size of the plain-image. Concise theoretical analyses are provided to support the chosen-plaintext attack, which are verified by experimental results also.