Zhonghao Shi

h-index9

3papers

10citations

Novelty43%

AI Score24

Ranked #170,294 of 194,257 authors (top 88%)#37,052 in LG (top 92%)

3 Papers

7.8LGOct 18, 2022

MaSS: Multi-attribute Selective Suppression

Chun-Fu Chen, Shaohan Hu, Zhonghao Shi et al.

The recent rapid advances in machine learning technologies largely depend on the vast richness of data available today, in terms of both the quantity and the rich content contained within. For example, biometric data such as images and voices could reveal people's attributes like age, gender, sentiment, and origin, whereas location/motion data could be used to infer people's activity levels, transportation modes, and life habits. Along with the new services and applications enabled by such technological advances, various governmental policies are put in place to regulate such data usage and protect people's privacy and rights. As a result, data owners often opt for simple data obfuscation (e.g., blur people's faces in images) or withholding data altogether, which leads to severe data quality degradation and greatly limits the data's potential utility. Aiming for a sophisticated mechanism which gives data owners fine-grained control while retaining the maximal degree of data utility, we propose Multi-attribute Selective Suppression, or MaSS, a general framework for performing precisely targeted data surgery to simultaneously suppress any selected set of attributes while preserving the rest for downstream machine learning tasks. MaSS learns a data modifier through adversarial games between two sets of networks, where one is aimed at suppressing selected attributes, and the other ensures the retention of the rest of the attributes via general contrastive loss as well as explicit classification metrics. We carried out an extensive evaluation of our proposed method using multiple datasets from different domains including facial images, voice audio, and video clips, and obtained promising results in MaSS' generalizability and capability of suppressing targeted attributes without negatively affecting the data's usability in other downstream ML tasks.

4.6LGSep 19, 2024

Examining Test-Time Adaptation for Personalized Child Speech Recognition

Zhonghao Shi, Xuan Shi, Anfeng Xu et al.

Automatic speech recognition (ASR) models often experience performance degradation due to data domain shifts introduced at test time, a challenge that is further amplified for child speakers. Test-time adaptation (TTA) methods have shown great potential in bridging this domain gap. However, the use of TTA to adapt ASR models to the individual differences in each child's speech has not yet been systematically studied. In this work, we investigate the effectiveness of two widely used TTA methods-SUTA, SGEM-in adapting off-the-shelf ASR models and their fine-tuned versions for child speech recognition, with the goal of enabling continuous, unsupervised adaptation at test time. Our findings show that TTA significantly improves the performance of both off-the-shelf and fine-tuned ASR models, both on average and across individual child speakers, compared to unadapted baselines. However, while TTA helps adapt to individual variability, it may still be limited with non-linguistic child speech.

7.3ROJan 26, 2021

Toward Personalized Affect-Aware Socially Assistive Robot Tutors in Long-Term Interventions for Children with Autism

Zhonghao Shi, Thomas R Groechel, Shomik Jain et al.

Affect-aware socially assistive robotics (SAR) has shown great potential for augmenting interventions for children with autism spectrum disorders (ASD). However, current SAR cannot yet perceive the unique and diverse set of atypical cognitive-affective behaviors from children with ASD in an automatic and personalized fashion in long-term (multi-session) real-world interactions. To bridge this gap, this work designed and validated personalized models of arousal and valence for children with ASD using a multi-session in-home dataset of SAR interventions. By training machine learning (ML) algorithms with supervised domain adaptation (s-DA), the personalized models were able to trade off between the limited individual data and the more abundant less personal data pooled from other study participants. We evaluated the effects of personalization on a long-term multimodal dataset consisting of 4 children with ASD with a total of 19 sessions, and derived inter-rater reliability (IR) scores for binary arousal (IR = 83%) and valence (IR = 81%) labels between human annotators. Our results show that personalized Gradient Boosted Decision Trees (XGBoost) models with s-DA outperformed two non-personalized individualized and generic model baselines not only on the weighted average of all sessions, but also statistically (p < .05) across individual sessions. This work paves the way for the development of personalized autonomous SAR systems tailored toward individuals with atypical cognitive-affective and socio-emotional needs.