Shan Shan

h-index41

3papers

12citations

Novelty43%

AI Score31

Ranked #134,680 of 194,257 authors (top 69%)#2,112 in ML (top 63%)

3 Papers

2.3MLJun 19, 2023Code

Human Limits in Machine Learning: Prediction of Plant Phenotypes Using Soil Microbiome Data

Rosa Aghdam, Xudong Tang, Shan Shan et al.

The preservation of soil health is a critical challenge in the 21st century due to its significant impact on agriculture, human health, and biodiversity. We provide the first deep investigation of the predictive potential of machine learning models to understand the connections between soil and biological phenotypes. We investigate an integrative framework performing accurate machine learning-based prediction of plant phenotypes from biological, chemical, and physical properties of the soil via two models: random forest and Bayesian neural network. We show that prediction is improved when incorporating environmental features like soil physicochemical properties and microbial population density into the models, in addition to the microbiome information. Exploring various data preprocessing strategies confirms the significant impact of human decisions on predictive performance. We show that the naive total sum scaling normalization that is commonly used in microbiome research is not the optimal strategy to maximize predictive power. Also, we find that accurately defined labels are more important than normalization, taxonomic level or model characteristics. In cases where humans are unable to classify samples accurately, machine learning model performance is limited. Lastly, we provide domain scientists via a full model selection decision tree to identify the human choices that optimize model prediction power. Our work is accompanied by open source reproducible scripts (https://github.com/solislemuslab/soil-microbiome-nn) for maximum outreach among the microbiome research community.

6.7MLMar 6, 2022

Diffusion Maps : Using the Semigroup Property for Parameter Tuning

Shan Shan, Ingrid Daubechies

Diffusion maps (DM) constitute a classic dimension reduction technique, for data lying on or close to a (relatively) low-dimensional manifold embedded in a much larger dimensional space. The DM procedure consists in constructing a spectral parametrization for the manifold from simulated random walks or diffusion paths on the data set. However, DM is hard to tune in practice. In particular, the task to set a diffusion time t when constructing the diffusion kernel matrix is critical. We address this problem by using the semigroup property of the diffusion operator. We propose a semigroup criterion for picking t. Experiments show that this principled approach is effective and robust.

5.8AIJun 4, 2025

Computational Architects of Society: Quantum Machine Learning for Social Rule Genesis

Shan Shan

The quantification of social science remains a longstanding challenge, largely due to the philosophical nature of its foundational theories. Although quantum computing has advanced rapidly in recent years, its relevance to social theory remains underexplored. Most existing research focuses on micro-cognitive models or philosophical analogies, leaving a gap in system-level applications of quantum principles to the analysis of social systems. This study addresses that gap by proposing a theoretical and computational framework that combines quantum mechanics with Generative AI to simulate the emergence and evolution of social norms. Drawing on core quantum concepts--such as superposition, entanglement, and probabilistic measurement--this research models society as a dynamic, uncertain system and sets up five ideal-type experiments. These scenarios are simulated using 25 generative agents, each assigned evolving roles as compliers, resistors, or enforcers. Within a simulated environment monitored by a central observer (the Watcher), agents interact, respond to surveillance, and adapt to periodic normative disruptions. These interactions allow the system to self-organize under external stress and reveal emergent patterns. Key findings show that quantum principles, when integrated with generative AI, enable the modeling of uncertainty, emergence, and interdependence in complex social systems. Simulations reveal patterns including convergence toward normative order, the spread of resistance, and the spontaneous emergence of new equilibria in social rules. In conclusion, this study introduces a novel computational lens that lays the groundwork for a quantum-informed social theory. It offers interdisciplinary insights into how society can be understood not just as a structure to observe but as a dynamic system to simulate and redesign through quantum technologies.