Mihir Bellare

IT
h-index104
3papers
151citations
Novelty75%
AI Score37

3 Papers

CRApr 7, 2025
Pr$εε$mpt: Sanitizing Sensitive Prompts for LLMs

Amrita Roy Chowdhury, David Glukhov, Divyam Anshumaan et al.

The rise of large language models (LLMs) has introduced new privacy challenges, particularly during inference where sensitive information in prompts may be exposed to proprietary LLM APIs. In this paper, we address the problem of formally protecting the sensitive information contained in a prompt while maintaining response quality. To this end, first, we introduce a cryptographically inspired notion of a prompt sanitizer which transforms an input prompt to protect its sensitive tokens. Second, we propose Pr$εε$mpt, a novel system that implements a prompt sanitizer. Pr$εε$mpt categorizes sensitive tokens into two types: (1) those where the LLM's response depends solely on the format (such as SSNs, credit card numbers), for which we use format-preserving encryption (FPE); and (2) those where the response depends on specific values, (such as age, salary) for which we apply metric differential privacy (mDP). Our evaluation demonstrates that Pr$εε$mpt is a practical method to achieve meaningful privacy guarantees, while maintaining high utility compared to unsanitized prompts, and outperforming prior methods

ITJan 16, 2012
Polynomial-Time, Semantically-Secure Encryption Achieving the Secrecy Capacity

Mihir Bellare, Stefano Tessaro

In the wiretap channel setting, one aims to get information-theoretic privacy of communicated data based only on the assumption that the channel from sender to receiver is noisier than the one from sender to adversary. The secrecy capacity is the optimal (highest possible) rate of a secure scheme, and the existence of schemes achieving it has been shown. For thirty years the ultimate and unreached goal has been to achieve this optimal rate with a scheme that is polynomial-time. (This means both encryption and decryption are proven polynomial time algorithms.) This paper finally delivers such a scheme. In fact it does more. Our scheme not only meets the classical notion of security from the wiretap literature, called MIS-R (mutual information security for random messages) but achieves the strictly stronger notion of semantic security, thus delivering more in terms of security without loss of rate.

ITJan 10, 2012
A Cryptographic Treatment of the Wiretap Channel

Mihir Bellare, Stefano Tessaro, Alexander Vardy

The wiretap channel is a setting where one aims to provide information-theoretic privacy of communicated data based solely on the assumption that the channel from sender to adversary is "noisier" than the channel from sender to receiver. It has been the subject of decades of work in the information and coding (I&C) community. This paper bridges the gap between this body of work and modern cryptography with contributions along two fronts, namely metrics (definitions) of security, and schemes. We explain that the metric currently in use is weak and insufficient to guarantee security of applications and propose two replacements. One, that we call mis-security, is a mutual-information based metric in the I&C style. The other, semantic security, adapts to this setting a cryptographic metric that, in the cryptography community, has been vetted by decades of evaluation and endorsed as the target for standards and implementations. We show that they are equivalent (any scheme secure under one is secure under the other), thereby connecting two fundamentally different ways of defining security and providing a strong, unified and well-founded target for designs. Moving on to schemes, results from the wiretap community are mostly non-constructive, proving the existence of schemes without necessarily yielding ones that are explicit, let alone efficient, and only meeting their weak notion of security. We apply cryptographic methods based on extractors to produce explicit, polynomial-time and even practical encryption schemes that meet our new and stronger security target.