62.9 OC May 27
Implicit Regularization in Perturbed Deep Matrix Factorization: Spectral Conditions and Stability
Jingzhe Wang, Hung-Hsu Chou
This paper studies the stability of low-rank implicit regularization in perturbed deep matrix factorization, where the target matrix is corrupted by a noise matrix. We first derive sufficient spectral conditions under which gradient descent exhibits a low-rank phase in the noiseless setting. These conditions show how the target spectrum, initialization, and step size jointly determine the existence of a nonempty low-rank interval. We then analyze the perturbed gradient descent dynamics, proving convergence guarantees and quantifying how the perturbation affects iteration complexity and eigenvalue recovery. Finally, we show that the low-rank phase persists under perturbation, with explicit dependence on the perturbation size. Numerical experiments support the theoretical findings.
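The low-rank phase described in this abstract can be illustrated with a small numerical sketch. The setup below is an assumption made for illustration, not the paper's exact construction: a symmetric 3×3 target with eigenvalues 5, 1, and 0.2, a symmetric perturbation of spectral norm 0.05, depth 3, identical factors initialized at `alpha * I`, and a hand-picked step size. The names `run_gd` and `effective_rank`, and the 0.1 threshold, are hypothetical.

```python
import numpy as np

def effective_rank(M, tol=0.1):
    """Count eigenvalues of the (symmetrized) matrix M that exceed tol."""
    return int(np.sum(np.linalg.eigvalsh((M + M.T) / 2) > tol))

def run_gd(target, depth=3, alpha=0.1, step=0.01, iters=30000, checkpoints=(400,)):
    """Gradient descent on 0.5 * ||W^depth - target||_F^2 with identical factors.

    Assumption of this sketch: all factors start at alpha * I, so they stay
    equal to one matrix W that commutes with the symmetric target.
    """
    n = target.shape[0]
    W = alpha * np.eye(n)
    ranks = {}
    for k in range(1, iters + 1):
        residual = np.linalg.matrix_power(W, depth) - target
        # Gradient with respect to each (identical) factor.
        W = W - step * np.linalg.matrix_power(W, depth - 1) @ residual
        if k in checkpoints:
            ranks[k] = effective_rank(np.linalg.matrix_power(W, depth))
    return np.linalg.matrix_power(W, depth), ranks

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
clean = Q @ np.diag([5.0, 1.0, 0.2]) @ Q.T
clean = (clean + clean.T) / 2                 # remove rounding asymmetry
noise = rng.standard_normal((3, 3))
noise = (noise + noise.T) / 2                 # symmetric perturbation
noise *= 0.05 / np.linalg.norm(noise, 2)      # rescale to spectral norm 0.05
perturbed = clean + noise

product, ranks = run_gd(perturbed)
final_error = np.linalg.norm(product - perturbed)
final_rank = effective_rank(product)
print("effective rank after 400 steps:", ranks[400])
print("final effective rank:", final_rank, "final error:", final_error)
```

In this toy setting the largest eigenvalue is fitted first, so the product has effective rank 1 for a stretch of iterations before the smaller eigenvalues of the perturbed target are recovered.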
16.8 OC May 20
Distributed and Decentralized Optimization Algorithms via Consensus ALADIN
Xu Du, Jingzhe Wang, Karl H. Johansson et al.
Distributed optimization has found widespread applications in smart grids, optimal control, and machine learning. This paper studies distributed consensus optimization. We extend the Augmented Lagrangian-based Alternating Direction Inexact Newton (ALADIN) framework to propose Consensus ALADIN (C-ALADIN) with a central coordinator, which directly handles consensus constraints. Our C-ALADIN algorithm admits both a first-order variant and a second-order variant that employs a Hessian approximation, avoiding direct transmission of second-order information while preserving fast local convergence. We then develop a decentralized version of C-ALADIN that operates over directed graphs with quantized communication, using a finite-time coordination protocol. For both versions, we establish global convergence guarantees for convex problems and local convergence guarantees for non-convex problems. For the decentralized case, the iterates converge to a neighborhood of the optimum determined by the quantization level. Numerical results demonstrate that our methods retain fast convergence while substantially reducing communication and computational costs compared to existing decentralized approaches.
CL Dec 19, 2024
Eliciting Causal Abilities in Large Language Models for Reasoning Tasks
Yajing Wang, Zongwei Luo, Jingzhe Wang et al.
Prompt optimization automatically refines prompting expressions, unlocking the full potential of LLMs in downstream tasks. However, current prompt optimization methods are costly to train and lack sufficient interpretability. This paper proposes enhancing LLMs' reasoning performance by eliciting their causal inference ability from prompting instructions to correct answers. Specifically, we introduce the Self-Causal Instruction Enhancement (SCIE) method, which enables LLMs to generate high-quality, low-quantity observational data, then estimates the causal effect based on these data, and ultimately generates instructions with the optimized causal effect. In SCIE, the instructions are treated as the treatment, and textual features are used to process natural language, establishing causal relationships through treatments between instructions and downstream tasks. Additionally, we propose applying Object-Relational (OR) principles, where the uncovered causal relationships are treated as the inheritable class across task objects, ensuring low-cost reusability. Extensive experiments demonstrate that our method effectively generates instructions that enhance reasoning performance with reduced training cost of prompts, leveraging interpretable textual features to provide actionable insights.
CR Jan 30, 2021
SteemOps: Extracting and Analyzing Key Operations in Steemit Blockchain-based Social Media Platform
Chao Li, Balaji Palanisamy, Runhua Xu et al.
Advancements in distributed ledger technologies are driving the rise of blockchain-based social media platforms such as Steemit, where users interact with each other in ways similar to conventional social networks. These platforms are autonomously managed by users using decentralized consensus protocols in a cryptocurrency ecosystem. The deep integration of social networks and blockchains in these platforms creates potential for numerous cross-domain research studies that are of interest to both research communities. However, it is challenging to process and analyze large volumes of raw Steemit data, as doing so requires specialized skills in both software engineering and blockchain systems and involves substantial effort in extracting and filtering various types of operations. To tackle this challenge, we collect over 38 million blocks generated in Steemit during a 45-month period from 2016/03 to 2019/11 and extract ten key types of operations performed by the users. The result is SteemOps, a new dataset that organizes more than 900 million operations from Steemit into three sub-datasets, namely (i) the social-network operation dataset (SOD), (ii) the witness-election operation dataset (WOD), and (iii) the value-transfer operation dataset (VOD). We describe the dataset schema and its usage in detail and outline possible future research studies using SteemOps. SteemOps is designed to facilitate future research aimed at providing deeper insights into emerging blockchain-based social media platforms.