AISep 8, 2023
Inferring physical laws by artificial intelligence based causal modelsJorawar Singh, Kishor Bharti, Arvind
The advances in Artificial Intelligence (AI) and Machine Learning (ML) have opened up many avenues for scientific research, and are adding new dimensions to the process of knowledge creation. However, even the most powerful and versatile of ML applications till date are primarily in the domain of analysis of associations and boil down to complex data fitting. Judea Pearl has pointed out that Artificial General Intelligence must involve interventions involving the acts of doing and imagining. Any machine assisted scientific discovery thus must include casual analysis and interventions. In this context, we propose a causal learning model of physical principles, which not only recognizes correlations but also brings out casual relationships. We use the principles of causal inference and interventions to study the cause-and-effect relationships in the context of some well-known physical phenomena. We show that this technique can not only figure out associations among data, but is also able to correctly ascertain the cause-and-effect relations amongst the variables, thereby strengthening (or weakening) our confidence in the proposed model of the underlying physical process.
CRJan 13, 2025Code
Leveraging ASIC AI Chips for Homomorphic EncryptionJianming Tong, Tianhao Huang, Leo de Castro et al.
Cloud-based services are making the outsourcing of sensitive client data increasingly common. Although homomorphic encryption (HE) offers strong privacy guarantee, it requires substantially more resources than computing on plaintext, often leading to unacceptably large latencies in getting the results. HE accelerators have emerged to mitigate this latency issue, but with the high cost of ASICs. In this paper we show that HE primitives can be converted to AI operators and accelerated on existing ASIC AI accelerators, like TPUs, which are already widely deployed in the cloud. Adapting such accelerators for HE requires (1) supporting modular multiplication, (2) high-precision arithmetic in software, and (3) efficient mapping on matrix engines. We introduce the CROSS compiler (1) to adopt Barrett reduction to provide modular reduction support using multiplier and adder, (2) Basis Aligned Transformation (BAT) to convert high-precision multiplication as low-precision matrix-vector multiplication, (3) Matrix Aligned Transformation (MAT) to covert vectorized modular operation with reduction into matrix multiplication that can be efficiently processed on 2D spatial matrix engine. Our evaluation of CROSS on a Google TPUv4 demonstrates significant performance improvements, with up to 161x and 5x speedup compared to the previous work on many-core CPUs and V100. The kernel-level codes are open-sourced at https://github.com/google/jaxite/tree/main/jaxite_word.
CVJan 10, 2025Code
A Multimodal Dataset for Enhancing Industrial Task Monitoring and Engagement PredictionNaval Kishore Mehta, Arvind, Himanshu Kumar et al.
Detecting and interpreting operator actions, engagement, and object interactions in dynamic industrial workflows remains a significant challenge in human-robot collaboration research, especially within complex, real-world environments. Traditional unimodal methods often fall short of capturing the intricacies of these unstructured industrial settings. To address this gap, we present a novel Multimodal Industrial Activity Monitoring (MIAM) dataset that captures realistic assembly and disassembly tasks, facilitating the evaluation of key meta-tasks such as action localization, object interaction, and engagement prediction. The dataset comprises multi-view RGB, depth, and Inertial Measurement Unit (IMU) data collected from 22 sessions, amounting to 290 minutes of untrimmed video, annotated in detail for task performance and operator behavior. Its distinctiveness lies in the integration of multiple data modalities and its emphasis on real-world, untrimmed industrial workflows-key for advancing research in human-robot collaboration and operator monitoring. Additionally, we propose a multimodal network that fuses RGB frames, IMU data, and skeleton sequences to predict engagement levels during industrial tasks. Our approach improves the accuracy of recognizing engagement states, providing a robust solution for monitoring operator performance in dynamic industrial environments. The dataset and code can be accessed from https://github.com/navalkishoremehta95/MIAM/.
LGApr 29, 2024
The Shape of Money Laundering: Subgraph Representation Learning on the Blockchain with the Elliptic2 DatasetClaudio Bellei, Muhua Xu, Ross Phillips et al.
Subgraph representation learning is a technique for analyzing local structures (or shapes) within complex networks. Enabled by recent developments in scalable Graph Neural Networks (GNNs), this approach encodes relational information at a subgroup level (multiple connected nodes) rather than at a node level of abstraction. We posit that certain domain applications, such as anti-money laundering (AML), are inherently subgraph problems and mainstream graph techniques have been operating at a suboptimal level of abstraction. This is due in part to the scarcity of annotated datasets of real-world size and complexity, as well as the lack of software tools for managing subgraph GNN workflows at scale. To enable work in fundamental algorithms as well as domain applications in AML and beyond, we introduce Elliptic2, a large graph dataset containing 122K labeled subgraphs of Bitcoin clusters within a background graph consisting of 49M node clusters and 196M edge transactions. The dataset provides subgraphs known to be linked to illicit activity for learning the set of "shapes" that money laundering exhibits in cryptocurrency and accurately classifying new criminal activity. Along with the dataset we share our graph techniques, software tooling, promising early experimental results, and new domain insights already gleaned from this approach. Taken together, we find immediate practical value in this approach and the potential for a new standard in anti-money laundering and forensic analytics in cryptocurrencies and other financial networks.
CVJan 9, 2025
Optimizing Multitask Industrial Processes with Predictive Action GuidanceNaval Kishore Mehta, Arvind, Shyam Sunder Prasad et al.
Monitoring complex assembly processes is critical for maintaining productivity and ensuring compliance with assembly standards. However, variability in human actions and subjective task preferences complicate accurate task anticipation and guidance. To address these challenges, we introduce the Multi-Modal Transformer Fusion and Recurrent Units (MMTFRU) Network for egocentric activity anticipation, utilizing multimodal fusion to improve prediction accuracy. Integrated with the Operator Action Monitoring Unit (OAMU), the system provides proactive operator guidance, preventing deviations in the assembly process. OAMU employs two strategies: (1) Top-5 MMTF-RU predictions, combined with a reference graph and an action dictionary, for next-step recommendations; and (2) Top-1 MMTF-RU predictions, integrated with a reference graph, for detecting sequence deviations and predicting anomaly scores via an entropy-informed confidence mechanism. We also introduce Time-Weighted Sequence Accuracy (TWSA) to evaluate operator efficiency and ensure timely task completion. Our approach is validated on the industrial Meccano dataset and the largescale EPIC-Kitchens-55 dataset, demonstrating its effectiveness in dynamic environments.
CRJul 9, 2019
An Energy-Efficient Reconfigurable DTLS Cryptographic Engine for Securing Internet-of-Things ApplicationsUtsav Banerjee, Andrew Wright, Chiraag Juvekar et al.
This paper presents the first hardware implementation of the Datagram Transport Layer Security (DTLS) protocol to enable end-to-end security for the Internet of Things (IoT). A key component of this design is a reconfigurable prime field elliptic curve cryptography (ECC) accelerator, which is 238x and 9x more energy-efficient compared to software and state-of-the-art hardware respectively. Our full hardware implementation of the DTLS 1.3 protocol provides 438x improvement in energy-efficiency over software, along with code size and data memory usage as low as 8 KB and 3 KB respectively. The cryptographic accelerators are coupled with an on-chip low-power RISC-V processor to benchmark applications beyond DTLS with up to two orders of magnitude energy savings. The test chip, fabricated in 65 nm CMOS, demonstrates hardware-accelerated DTLS sessions while consuming 44.08 uJ per handshake, and 0.89 nJ per byte of encrypted data at 16 MHz and 0.8 V.
CRMar 11, 2019
An Energy-Efficient Reconfigurable DTLS Cryptographic Engine for End-to-End Security in IoT ApplicationsUtsav Banerjee, Chiraag Juvekar, Andrew Wright et al.
This paper presents a reconfigurable cryptographic engine that implements the DTLS protocol to enable end-to-end security for IoT. This implementation of the DTLS engine demonstrates 10x reduction in code size and 438x improvement in energy-efficiency over software. Our ECC primitive is 237x and 9x more energy-efficient compared to software and state-of-the-art hardware respectively. Pairing the DTLS engine with an on-chip RISC-V allows us to demonstrate applications beyond DTLS with up to 2 orders of magnitude energy savings.
CRDec 24, 2018
MI6: Secure Enclaves in a Speculative Out-of-Order ProcessorThomas Bourgeat, Ilia Lebedev, Andrew Wright et al.
Recent attacks have broken process isolation by exploiting microarchitectural side channels that allow indirect access to shared microarchitectural state. Enclaves strengthen the process abstraction to restore isolation guarantees. We propose MI6, an aggressive, speculative out-of-order processor capable of providing secure enclaves under a threat model that includes an untrusted OS and an attacker capable of mounting any software attack currently considered practical, including control flow speculation attacks. MI6 is inspired by Sanctum [16] and extends its isolation guarantee to more realistic memory hierarchies. It also introduces a purge instruction, which is used only when a secure process is scheduled, and implements it for a complex processor microarchitecture. We model the performance impact of enclaves in MI6 through FPGA emulation on AWS F1 FPGAs by running SPEC CINT2006 benchmarks on top of an untrusted Linux OS. Security comes at the cost of approximately 16.4% average slowdown for protected programs.