36.2ARMar 31
PC-Indexed Data Address TranslationShyam Murthy, Gurindar S Sohi
This paper proposes a novel way to assist conventional data address translation. The approach, PC-Indexed Data Address Translation (PCAX), uses the PC of a load instruction, and not a data virtual address, to obtain the page table entry (PTE) for the data accessed by a load instruction. PCAX is intended to be used for a small subset of the static loads in a program. We observe that: (i) a small subset of static loads is responsible for most of the misses in a data translation lookaside buffer (DTLB), and (ii) often a dynamic instance of a static load instruction accesses the same PTE as the last dynamic instance, and consider PCAX for this subset. With PCAX the effective miss rate of a conventional DTLB can be cut down by a factor of 2-3X in many cases, and even more in some cases. PCAX is also beneficial in reducing the number of secondary TLB (STLB) misses. Since the tables used for PCAX can be accessed alongside instruction fetch, they can be slow, yet frequently provide a valid PTE even before the data address calculation. This yields an average performance improvement of 1.7% while reducing data address translation energy by 7% across 84 server traces.
22.5ARMar 31
Improving Instruction Fetch Efficiency via High-Level Program Map TraversalShyam Murthy, Gurindar S. Sohi
Efficiency in instruction fetching is critical to performance, and this requires the primary structures--L1 instruction caches (L1i), branch target buffers (BTB) and instruction TLBs (iTLB)--to have the requisite information when needed. This paper proposes instruction presending, which traverses a high-level program map to identify and move instruction cache blocks, BTB entries, and iTLB entries from the secondary to the primary structures in a "just in time" fashion. Empirical results are presented to demonstrate the efficacy of the proposed presending scheme. Presending reduces the number of cycles where the instruction fetch is waiting by an order of magnitude as compared to state-of-the-art instruction prefetching schemes while operating with small-sized primary BTBs. It is especially effective for benchmarks with a high base MPKI, where movement from secondary to primary structures is frequent. This improvement in fetch efficiency results in performance improvements in cases where this efficiency is important.
CRJan 16, 2021
Revisiting Driver Anonymity in ORideDeepak Kumaraswamy, Shyam Murthy, Srinivas Vivek
Ride Hailing Services (RHS) have become a popular means of transportation, and with its popularity comes the concerns of privacy of riders and drivers. ORide is a privacy-preserving RHS proposed at the USENIX Security Symposium 2017 and uses Somewhat Homomorphic Encryption (SHE). In their protocol, a rider and all drivers in a zone send their encrypted coordinates to the RHS Service Provider (SP) who computes the squared Euclidean distances between them and forwards them to the rider. The rider decrypts these and selects the optimal driver with least Euclidean distance. In this work, we demonstrate a location-harvesting attack where an honest-but-curious rider, making only a single ride request, can determine the exact coordinates of about half the number of responding drivers even when only the distance between the rider and drivers are given. The significance of our attack lies in inferring locations of other drivers in the zone, which are not (supposed to be) revealed to the rider as per the protocol. We validate our attack by running experiments on zones of varying sizes in arbitrarily selected big cities. Our attack is based on enumerating lattice points on a circle of sufficiently small radius and eliminating solutions based on conditions imposed by the application scenario. Finally, we propose a modification to ORide aimed at thwarting our attack and show that this modification provides sufficient driver anonymity while preserving ride matching accuracy.