SYSep 24, 2014
An Upper Bound on the Convergence Time for Quantized Consensus of Arbitrary Static GraphsShang Shang, Paul Cuff, Pan Hui et al.
We analyze a class of distributed quantized consensus algorithms for arbitrary static networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimation by communicating with their neighbors in a limited capacity channel in an asynchronous clock setting. Eventually, all nodes reach consensus with quantized precision. We analyze the expected convergence time for the general quantized consensus algorithm proposed by Kashyap et al \cite{Kashyap}. We use the theory of electric networks, random walks, and couplings of Markov chains to derive an $O(N^3\log N)$ upper bound for the expected convergence time on an arbitrary graph of size $N$, improving on the state of art bound of $O(N^5)$ for quantized consensus algorithms. Our result is not dependent on graph topology. Example of complete graphs is given to show how to extend the analysis to graphs of given topology.
LGFeb 14, 2025Code
AffinityFlow: Guided Flows for Antibody Affinity MaturationCan Chen, Karla-Luise Herpoldt, Chenchao Zhao et al.
Antibodies are widely used as therapeutics, but their development requires costly affinity maturation, involving iterative mutations to enhance binding affinity.This paper explores a sequence-only scenario for affinity maturation, using solely antibody and antigen sequences. Recently AlphaFlow wraps AlphaFold within flow matching to generate diverse protein structures, enabling a sequence-conditioned generative model of structure. Building on this, we propose an alternating optimization framework that (1) fixes the sequence to guide structure generation toward high binding affinity using a structure-based affinity predictor, then (2) applies inverse folding to create sequence mutations, refined by a sequence-based affinity predictor for post selection. A key challenge is the lack of labeled data for training both predictors. To address this, we develop a co-teaching module that incorporates valuable information from noisy biophysical energies into predictor refinement. The sequence-based predictor selects consensus samples to teach the structure-based predictor, and vice versa. Our method, AffinityFlow, achieves state-of-the-art performance in affinity maturation experiments. We plan to open-source our code after acceptance.
32.3CRMay 12
PhishSigma++: Malicious Email Detection with Typed Entity RelationsShang Shang, Ruiqi Wang, Ruijie Qi et al.
Here is a further shortened version (pure text, no extra formatting, academic style preserved, no content change): Abstract. With the rise of AI-generated content (AIGC), phishing actors now possess richer linguistic capabilities and evasion techniques. Most existing detectors over-rely on mutable textual features, achieving high accuracy on clean data but degrading severely under text-focused adversarial manipulation. This mirrors the lab-to-real performance gap. We investigate invariant signals in phishing emails: even when attackers modify surface text, functional intent constrains relations among typed entities. Threat-actor tradecraft is described via high-level TTPs, but rule-based systems like Sigma express invariants only through manually curated, field-specific patterns, limiting flexibility. We introduce PhishSigma++, an entity-relation-based malicious email detector for RFC822 messages that generalizes Sigma's design. It extracts 40 typed entity classes, computes 5 cross-type relations to build a typed email graph, and uses particle swarm optimization (PSO) to select a sparse discriminative mask, supporting classification and type-level evidence summary. On 29,142 messages, PhishSigma++ achieves 0.9675 F1 on clean data and outperforms text-centric baselines under non-adaptive Good Word padding at \r{ho}=0.8. It maintains 0.9579 F1, while a token-based Bayesian filter collapses to 0.0243 and a DistilBERT phishing checkpoint falls to 0.7284. Compared with traditional Sigma rules, PhishSigma++ offers higher detection, broader relational invariance coverage, and data-driven feature selection. We also show that thresholded typed relation scores encode a useful fragment of Sigma-style field conditions, unifying hand-crafted rule logic and learned relation masks in a single-email framework.
CRMay 6, 2024
Can LLMs Deeply Detect Complex Malicious Queries? A Framework for Jailbreaking via Obfuscating IntentShang Shang, Xinqiang Zhao, Zhongjiang Yao et al.
To demonstrate and address the underlying maliciousness, we propose a theoretical hypothesis and analytical approach, and introduce a new black-box jailbreak attack methodology named IntentObfuscator, exploiting this identified flaw by obfuscating the true intentions behind user prompts.This approach compels LLMs to inadvertently generate restricted content, bypassing their built-in content security measures. We detail two implementations under this framework: "Obscure Intention" and "Create Ambiguity", which manipulate query complexity and ambiguity to evade malicious intent detection effectively. We empirically validate the effectiveness of the IntentObfuscator method across several models, including ChatGPT-3.5, ChatGPT-4, Qwen and Baichuan, achieving an average jailbreak success rate of 69.21\%. Notably, our tests on ChatGPT-3.5, which claims 100 million weekly active users, achieved a remarkable success rate of 83.65\%. We also extend our validation to diverse types of sensitive content like graphic violence, racism, sexism, political sensitivity, cybersecurity threats, and criminal skills, further proving the substantial impact of our findings on enhancing 'Red Team' strategies against LLM content security frameworks.
AISep 24, 2014
The Application of Differential Privacy for Rank Aggregation: Privacy and AccuracyShang Shang, Tiance Wang, Paul Cuff et al.
The potential risk of privacy leakage prevents users from sharing their honest opinions on social platforms. This paper addresses the problem of privacy preservation if the query returns the histogram of rankings. The framework of differential privacy is applied to rank aggregation. The error probability of the aggregated ranking is analyzed as a result of noise added in order to achieve differential privacy. Upper bounds on the error rates for any positional ranking rule are derived under the assumption that profiles are uniformly distributed. Simulation results are provided to validate the probabilistic analysis.
IRMay 2, 2013
Privacy Preserving Recommendation System Based on GroupsShang Shang, Yuk Hui, Pan Hui et al.
Recommendation systems have received considerable attention in the recent decades. Yet with the development of information technology and social media, the risk in revealing private data to service providers has been a growing concern to more and more users. Trade-offs between quality and privacy in recommendation systems naturally arise. In this paper, we present a privacy preserving recommendation framework based on groups. The main idea is to use groups as a natural middleware to preserve users' privacy. A distributed preference exchange algorithm is proposed to ensure the anonymity of data, wherein the effective size of the anonymity set asymptotically approaches the group size with time. We construct a hybrid collaborative filtering model based on Markov random walks to provide recommendations and predictions to group members. Experimental results on the MovieLens and Epinions datasets show that our proposed methods outperform the baseline methods, L+ and ItemRank, two state-of-the-art personalized recommendation algorithms, for both recommendation precision and hit rate despite the absence of personal preference information.
IRAug 3, 2012
A Random Walk Based Model Incorporating Social Information for RecommendationsShang Shang, Sanjeev R. Kulkarni, Paul W. Cuff et al.
Collaborative filtering (CF) is one of the most popular approaches to build a recommendation system. In this paper, we propose a hybrid collaborative filtering model based on a Makovian random walk to address the data sparsity and cold start problems in recommendation systems. More precisely, we construct a directed graph whose nodes consist of items and users, together with item content, user profile and social network information. We incorporate user's ratings into edge settings in the graph model. The model provides personalized recommendations and predictions to individuals and groups. The proposed algorithms are evaluated on MovieLens and Epinions datasets. Experimental results show that the proposed methods perform well compared with other graph-based methods, especially in the cold start case.
IRAug 3, 2012
Wisdom of the Crowd: Incorporating Social Influence in Recommendation ModelsShang Shang, Pan Hui, Sanjeev R. Kulkarni et al.
Recommendation systems have received considerable attention recently. However, most research has been focused on improving the performance of collaborative filtering (CF) techniques. Social networks, indispensably, provide us extra information on people's preferences, and should be considered and deployed to improve the quality of recommendations. In this paper, we propose two recommendation models, for individuals and for groups respectively, based on social contagion and social influence network theory. In the recommendation model for individuals, we improve the result of collaborative filtering prediction with social contagion outcome, which simulates the result of information cascade in the decision-making process. In the recommendation model for groups, we apply social influence network theory to take interpersonal influence into account to form a settled pattern of disagreement, and then aggregate opinions of group members. By introducing the concept of susceptibility and interpersonal influence, the settled rating results are flexible, and inclined to members whose ratings are "essential".