Minghui Sun

LG
h-index: 3
6 papers
4 citations
Novelty: 51%
AI Score: 53

6 Papers

SY | May 15
Functional requirements decomposition in set-based design

Minghui Sun, Zhaoyang Chen, Georgios Bakirtzis et al.

Designing systems is typically uncertain and ambiguous at early stages. Set-based design supports the exploration of alternatives and gradual uncertainty reduction during the early lifecycle, making it practical for complex systems design. In parallel, functional requirements decomposition helps to advance the design incrementally. However, current literature on set-based design lacks formal guidance on how to decompose functional requirements. To bridge this gap, we introduce a four-step method to hierarchically decompose functional requirements for set-based design. We systematically define, reason about, and narrow the sets, breaking down the functional requirements into formal sub-requirements. This method allows parallel abstraction, ensuring the resulting system satisfies the top-level functional requirements.
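As a rough illustration of what "narrowing the sets" can mean in practice, the sketch below filters a set of design alternatives against sub-requirements expressed as predicates. It is not the paper's four-step method; the `narrow` helper, the candidate designs, and the thresholds are all hypothetical and chosen only for the example.

```python
def narrow(candidates, sub_requirements):
    """Keep only the candidates that satisfy every sub-requirement."""
    return [c for c in candidates if all(check(c) for check in sub_requirements)]


# Hypothetical design alternatives (illustrative values, not from the paper).
candidates = [
    {"name": "A", "mass_kg": 1.2, "endurance_min": 25},
    {"name": "B", "mass_kg": 0.8, "endurance_min": 18},
    {"name": "C", "mass_kg": 1.0, "endurance_min": 32},
]

# Hypothetical sub-requirements derived from a top-level requirement.
sub_requirements = [
    lambda c: c["mass_kg"] <= 1.1,       # mass budget
    lambda c: c["endurance_min"] >= 20,  # minimum endurance
]

remaining = narrow(candidates, sub_requirements)
print([c["name"] for c in remaining])
```

Each added sub-requirement can only shrink or preserve the remaining set, which is the gradual uncertainty reduction the abstract describes.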

LG | May 14
NEST: Nested Event Stream Transformer for Sequences of Multisets

Minghui Sun, Haoyu Gong, Xingyu You et al.

Event stream data often exhibit hierarchical structure in which multiple events co-occur, resulting in a sequence of multisets (i.e., bags of events). In electronic health records (EHRs), for example, medical events are grouped into a sequence of clinical encounters with well-defined temporal structure, but the order and timing of events within each encounter may be unknown or unreliable. Most existing foundation models (FMs) for event stream data flatten this hierarchy into a one-dimensional sequence, leading to (i) computational inefficiency associated with dense attention and the learning of spurious within-set relationships, and (ii) lower-quality set-level representations from heuristic post-training pooling for downstream tasks. Here, we show that preserving the original hierarchy in the FM architecture provides a useful inductive bias that improves both computational efficiency and representation quality. We then introduce Nested Event Stream Transformer (NEST), an FM for event streams composed of sequences of multisets. Building on this architecture, we formulate Masked Set Modeling (MSM), an efficient paradigm that promotes improved set-level representation learning. Experiments on real-world multiset sequence data show that NEST captures real-world dynamics while improving both pretraining efficiency and downstream performance.
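To make the data shape concrete, the sketch below represents a sequence of multisets with `collections.Counter` and hides a random subset of events inside each set, in the spirit of a masked-modeling objective. The abstract does not specify how MSM selects or predicts masked events, so the `mask_sets` function, the `[MASK]` token, the masking probability, and the event codes are assumptions made for illustration only.

```python
import random
from collections import Counter

MASK = "[MASK]"  # hypothetical placeholder token


def mask_sets(sequence, mask_prob=0.3, seed=0):
    """Hide events inside each multiset; return (masked_sequence, targets)."""
    rng = random.Random(seed)
    masked_sequence, targets = [], []
    for events in sequence:
        kept, hidden = Counter(), Counter()
        # Sorted iteration keeps the result reproducible for a given seed.
        for event in sorted(events.elements()):
            if rng.random() < mask_prob:
                hidden[event] += 1
            else:
                kept[event] += 1
        if hidden:
            # One placeholder per hidden event; within-set order is never used.
            kept[MASK] = sum(hidden.values())
        masked_sequence.append(kept)
        targets.append(hidden)
    return masked_sequence, targets


# Two hypothetical clinical encounters, each an unordered bag of event codes.
encounters = [
    Counter({"dx:asthma": 1, "rx:albuterol": 2}),
    Counter({"lab:cbc": 1, "dx:asthma": 1, "proc:spirometry": 1}),
]
masked, targets = mask_sets(encounters, mask_prob=0.5, seed=1)
```

The outer list preserves the order of encounters, while each `Counter` deliberately carries no ordering among its events, mirroring the hierarchy the abstract argues should not be flattened.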

CV | Apr 24, 2023
NoiseTrans: Point Cloud Denoising with Transformers

Guangzhe Hou, Guihe Qin, Minghui Sun et al.

Point clouds obtained from capture devices or 3D reconstruction techniques are often noisy, which interferes with downstream tasks. The paper aims to recover the underlying surface of noisy point clouds. We design a novel model, NoiseTrans, which uses a transformer encoder architecture for point cloud denoising. Specifically, we obtain the structural similarity of point-based point clouds with the assistance of the transformer's core self-attention mechanism. By expressing the noisy point cloud as a set of unordered vectors, we convert point clouds into point embeddings and employ a Transformer to generate clean point clouds. To make the Transformer preserve details when sensing the point cloud, we design Local Point Attention to prevent the point cloud from being over-smoothed. In addition, we propose sparse encoding, which enables the Transformer to better perceive the structural relationships of the point cloud and improves denoising performance. Experiments show that our model outperforms state-of-the-art methods on various datasets and in various noise environments.

LG | Aug 15, 2025 | Code
Borrowing From the Future: Enhancing Early Risk Assessment through Contrastive Learning

Minghui Sun, Matthew M. Engelhard, Benjamin A. Goldstein

Risk assessments for a pediatric population are often conducted across multiple stages. For example, clinicians may evaluate risks prenatally, at birth, and during Well-Child visits. Although predictions made at later stages typically achieve higher precision, it is clinically desirable to make reliable risk assessments as early as possible. Therefore, this study focuses on improving prediction performance in early-stage risk assessments. Our solution, Borrowing From the Future (BFF), is a contrastive multi-modal framework that treats each time window as a distinct modality. In BFF, a model is trained on all available data across time windows while performing each risk assessment using the information available to date. This contrastive framework allows the model to "borrow" informative signals from later stages (e.g., Well-Child visits) to implicitly supervise learning at earlier stages (e.g., prenatal/birth stages). We validate BFF on two real-world pediatric outcome prediction tasks, demonstrating consistent improvements in early risk assessments. The code is available at https://github.com/scotsun/bff.
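The idea of aligning early-stage and later-stage representations of the same patient can be sketched with a standard InfoNCE-style contrastive loss. The abstract does not state which contrastive objective BFF uses, so InfoNCE is swapped in here as a common choice; the `info_nce` function, the temperature, and the toy embeddings are assumptions, and the repository linked above is the reference for the actual method.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def info_nce(early, late, temperature=0.1):
    """Mean InfoNCE loss; early[i] and late[i] come from the same patient."""
    early = [normalize(v) for v in early]
    late = [normalize(v) for v in late]
    total = 0.0
    for i, anchor in enumerate(early):
        # Cosine similarity to every later-stage embedding in the batch.
        logits = [dot(anchor, other) / temperature for other in late]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # The matching patient (index i) is the positive; the rest are negatives.
        total += log_denominator - logits[i]
    return total / len(early)


# Toy embeddings for two hypothetical patients.
early = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(early, [[0.9, 0.1], [0.1, 0.9]])
shuffled = info_nce(early, [[0.1, 0.9], [0.9, 0.1]])
print(aligned < shuffled)
```

The loss is small when each early-stage embedding is closest to its own later-stage counterpart and large when the pairing is scrambled, which is the mechanism by which later windows can supervise earlier ones.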

LG | Jul 24, 2025 | Code
CLEAR: Unlearning Spurious Style-Content Associations with Contrastive LEarning with Anti-contrastive Regularization

Minghui Sun, Benjamin A. Goldstein, Matthew M. Engelhard

Learning representations unaffected by superficial characteristics is important to ensure that shifts in these characteristics at test time do not compromise downstream prediction performance. For instance, in healthcare applications, we might like to learn features that contain information about pathology yet are unaffected by race, sex, and other sources of physiologic variability, thereby ensuring predictions are equitable and generalizable across all demographics. Here we propose Contrastive LEarning with Anti-contrastive Regularization (CLEAR), an intuitive and easy-to-implement framework that effectively separates essential (i.e., task-relevant) characteristics from superficial (i.e., task-irrelevant) characteristics during training, leading to better performance when superficial characteristics shift at test time. We begin by supposing that data representations can be semantically separated into task-relevant content features, which contain information relevant to downstream tasks, and task-irrelevant style features, which encompass superficial attributes that are irrelevant to these tasks, yet may degrade performance due to associations with content present in training data that do not generalize. We then prove that our anti-contrastive penalty, which we call Pair-Switching (PS), minimizes the Mutual Information between the style attributes and content labels. Finally, we instantiate CLEAR in the latent space of a Variational Auto-Encoder (VAE), then perform experiments to quantitatively and qualitatively evaluate the resulting CLEAR-VAE over several image datasets. Our results show that CLEAR-VAE allows us to: (a) swap and interpolate content and style between any pair of samples, and (b) improve downstream classification performance in the presence of previously unseen combinations of content and style. Our code will be made publicly available.

LG | Mar 23
Multimodal Training to Unimodal Deployment: Leveraging Unstructured Data During Training to Optimize Structured Data Only Deployment

Zigui Wang, Minghui Sun, Jiang Shu et al.

Unstructured Electronic Health Record (EHR) data, such as clinical notes, contain clinical contextual observations that are not directly reflected in structured data fields. This additional information can substantially improve model learning. However, due to their unstructured nature, these data are often unavailable or impractical to use when deploying a model. We introduce a multimodal learning framework that leverages unstructured EHR data during training while producing a model that can be deployed using only structured EHR data. Using a cohort of 3,466 children evaluated for late talking, we generated note embeddings with BioClinicalBERT and encoded structured embeddings from demographics and medical codes. A note-based teacher model and a structured-only student model were jointly trained using contrastive learning and contrastive knowledge distillation loss, producing a strong classifier (AUROC = 0.985). Our proposed model reached an AUROC of 0.705, outperforming the structured-only baseline of 0.656. These results demonstrate that incorporating unstructured data during training enhances the model's capacity to identify task-relevant information within structured EHR data, enabling a deployable structured-only phenotype model.
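For readers unfamiliar with the metric reported above, AUROC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pair counting; the `auroc` function name and the toy labels and scores are illustrative and unrelated to the study's data.

```python
def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
value = auroc(labels, scores)
print(value)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for comparing the 0.705 and 0.656 figures quoted in the abstract.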