99.4 AI Apr 9
Emotion Concepts and their Function in a Large Language Model
Nicholas Sofroniew, Isaac Kauvar, William Saunders et al.
Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior. We find internal representations of emotion concepts, which encode the broad concept of a particular emotion and generalize across contexts and behaviors it might be linked to. These representations track the operative emotion concept at a given token position in a conversation, activating in accordance with that emotion's relevance to processing the present context and predicting upcoming text. Our key finding is that these representations causally influence the LLM's outputs, including Claude's preferences and its rate of exhibiting misaligned behaviors such as reward hacking, blackmail, and sycophancy. We refer to this phenomenon as the LLM exhibiting functional emotions: patterns of expression and behavior modeled after humans under the influence of an emotion, which are mediated by underlying abstract representations of emotion concepts. Functional emotions may work quite differently from human emotions, and do not imply that LLMs have any subjective experience of emotions, but appear to be important for understanding the model's behavior.
CR Sep 25, 2020
Target Privacy Threat Modeling for COVID-19 Exposure Notification Systems
Ananya Gangavarapu, Ellie Daw, Abhishek Singh et al.
The adoption of digital contact tracing (DCT) technology during the COVID-19 pandemic has shown multiple benefits, including helping to slow the spread of infectious disease and to improve the dissemination of accurate information. However, to support both ethical technology deployment and user adoption, privacy must be at the forefront. With the loss of privacy being a critical threat, thorough threat modeling will help us to strategize and protect privacy as digital contact tracing technologies advance. Various threat modeling frameworks exist today, such as LINDDUN, STRIDE, PASTA, and NIST, which focus on software system privacy, system security, application security, and data-centric risk, respectively. When applied to the exposure notification system (ENS) context, these models provide a thorough view of the software side but fall short in addressing the integrated nature of hardware, humans, regulations, and software involved in such systems. Our approach addresses ENSs as a whole and provides a model that addresses the privacy complexities of a multi-faceted solution. We define privacy principles, privacy threats, attacker capabilities, and a comprehensive threat model. Finally, we outline threat mitigation strategies that address the various threats defined in our model.
CR Jul 1, 2020
Adding Location and Global Context to the Google/Apple Exposure Notification Bluetooth API
Ramesh Raskar, Abhishek Singh, Sam Zimmerman et al.
Contact tracing requires a strong understanding of a user's context, and location combined with other sensory data could provide context for any infection encounter. Although Bluetooth technology gives good insight into the proximity aspect of an encounter, it does not provide the location context that would help in making better decisions. Using the ideas presented in this paper, one can obtain this valuable information, which could address the problem of false positives and false negatives to a certain extent. All of this is achieved within the purview of the Google/Apple Exposure Notification (GAEN) specification while preserving complete user privacy. There are four ways of propagating context between any two users. Two such methods allow private location logging without revealing the location history within an app. The other two are encryption-based methods. The first encryption method is a variant of Apple's FindMy protocol, which allows nearby Apple devices to capture the GPS location of a lost Apple device. The second encryption method is a minor modification of the existing GAEN protocol so that global context is available to a healthy phone only when it is exposed, which is comparatively the better option. It will still be the role of the public health smartphone app to decide how to use the location-time context to build a full-fledged contact tracing and public health solution. Lastly, we highlight the benefits and potential privacy issues of each of the context propagation methods proposed here.