Trustworthy AI: Ensuring Reliability and Accountability from Models to Agents
For researchers and practitioners building reliable and accountable ML systems, this work provides theoretical foundations and practical algorithms for bias, watermarking, and agent evaluation, though the contributions are incremental in nature.
This thesis develops algorithms with theoretical guarantees for trustworthy AI, addressing bias mitigation, predictive multiplicity, watermarking for LLMs, and evaluation of LLM agents. Key results include optimal watermarking strategies that achieve a superior detection-quality trade-off, and evidence that LLM agents reduce supply chain costs by up to 67% while introducing systemic risks.
In this thesis, we develop algorithms with theoretical guarantees for ensuring the reliability and accountability of Machine Learning (ML) systems. As ML systems evolve from predictive models to generative models and autonomous agents, the landscape of trustworthy AI has shifted. This thesis introduces tools grounded in information theory, optimization, and statistical learning to mitigate bias, reduce arbitrary decisions, ensure content provenance, and evaluate agents driven by large language models (LLMs) in autonomous settings. To mitigate bias and arbitrariness in traditional ML models, we introduce a kernel-based method that achieves multiaccuracy across complex subpopulations that traditional demographic categories may overlook. We also develop methods to address predictive multiplicity, where equally accurate models yield conflicting predictions for the same individual. We ensure accountability in generative AI by watermarking LLMs. We characterize the information-theoretic trade-off between watermark detection and text distortion and derive optimal watermarking strategies by leveraging optimal transport and coding theory. Empirical evaluations show our watermarks achieve a superior detection-quality trade-off across language generation and coding tasks. Finally, we evaluate autonomous LLM agents in multi-agent environments through the first simulator of a fully LLM-driven supply chain. LLM agents offer significant performance gains, outperforming human teams and reducing costs by up to 67%, but also introduce systemic risks, including costly tail events.
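To make the notion of predictive multiplicity concrete, the following minimal sketch measures how often near-equally accurate classifiers disagree on individual examples. It is an illustration only, not the method developed in the thesis: the function names, the tolerance parameter epsilon, and the toy predictions are all assumptions introduced here, and the disagreement rate is a simplified variant of the quantity often called ambiguity.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def ambiguity(model_preds, labels, epsilon=0.0):
    """Fraction of examples on which near-optimal models disagree.

    model_preds: list of prediction lists, one per candidate model (hypothetical input).
    epsilon: accuracy tolerance defining the set of "equally good" models (assumed parameter).
    """
    accs = [accuracy(p, labels) for p in model_preds]
    best = max(accs)
    # Keep only models whose accuracy is within epsilon of the best one.
    good = [p for p, a in zip(model_preds, accs) if a >= best - epsilon]
    # An example is "conflicted" if the retained models assign it more than one label.
    conflicts = sum(len({p[i] for p in good}) > 1 for i in range(len(labels)))
    return conflicts / len(labels)


# Toy data (made up for illustration): model_a and model_b both reach 75% accuracy
# but make their mistakes on different individuals.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
model_a = [1, 0, 1, 0, 1, 0, 0, 1]
model_b = [1, 0, 1, 0, 0, 1, 1, 0]
model_c = [0, 1, 0, 1, 1, 0, 1, 0]  # 50% accuracy, excluded when epsilon = 0

rate = ambiguity([model_a, model_b, model_c], labels)
print(f"Conflicted examples among equally accurate models: {rate:.0%}")
```

In this toy case the two best models have identical accuracy, yet half of the individuals would receive a different prediction depending on which model is deployed; this is the kind of arbitrariness the abstract refers to.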