Saavan Patel

3 papers

80 citations

Novelty 55%

AI Score 26

Ranked #169,309 of 205,806 authors (top 82%); #36,813 in LG (top 87%)

3 Papers

AR · Mar 14, 2022

Distributed On-Sensor Compute System for AR/VR Devices: A Semi-Analytical Simulation Framework for Power Estimation

Jorge Gomez, Saavan Patel, Syed Shakib Sarwar et al.

Augmented Reality/Virtual Reality (AR/VR) glasses are widely foreseen as the next-generation computing platform. AR/VR glasses are a complex "system of systems" that must satisfy stringent form-factor, computing, power, and thermal requirements. In this paper, we show that a novel distributed on-sensor compute architecture, coupled with new semiconductor technologies (such as dense 3D-IC interconnects and Spin-Transfer Torque Magnetoresistive Random Access Memory, STT-MRAM) and, most importantly, full hardware-software co-optimization, is the route to attractive and socially acceptable AR/VR glasses. To this end, we developed a semi-analytical simulation framework to estimate the power consumption of novel AR/VR distributed on-sensor computing architectures. The model allows optimization of the main technological features of the system modules, as well as of the strategy for partitioning the computer-vision algorithm across the distributed compute architecture. We show that, for the compute-intensive, machine-learning-based Hand Tracking algorithm, the distributed on-sensor compute architecture can reduce system power consumption compared to a centralized system, with additional benefits in latency and privacy.

LG · Jun 16, 2020

Logically Synthesized, Hardware-Accelerated, Restricted Boltzmann Machines for Combinatorial Optimization and Integer Factorization

Saavan Patel, Philip Canoza, Sayeef Salahuddin

The Restricted Boltzmann Machine (RBM) is a stochastic neural network capable of solving a variety of difficult tasks such as NP-Hard combinatorial optimization problems and integer factorization. The RBM architecture is also very compact, requiring very few weights and biases. This, along with its simple, parallelizable sampling algorithm for finding the ground state of such problems, makes the RBM amenable to hardware acceleration. However, training the RBM on these problems can pose a significant challenge, as the training algorithm tends to fail for large problem sizes and efficient mappings can be hard to find. Here, we propose a method of combining RBMs that avoids the need to train large problems in their full form. We also propose methods for making the RBM more amenable to hardware, allowing the algorithm to be efficiently mapped to an FPGA-based accelerator. Using this accelerator, we show hardware-accelerated factorization of 16-bit numbers with high accuracy, with a speed improvement of 10,000x and a power improvement of 32x.
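
The "simple, parallelizable sampling algorithm" mentioned in this abstract is presumably block Gibbs sampling, the standard sampler for RBMs. The following is a minimal sketch of that generic technique, not the authors' implementation: the weights `W`, biases `a` and `b`, and the helper names (`gibbs_search`, `sample_hidden`, `sample_visible`) are all hypothetical and chosen only for illustration. It alternates between sampling the hidden and visible layers and keeps the lowest-energy state it has seen.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def energy(v, h, W, a, b):
    """Standard RBM energy E(v, h) = -a.v - b.h - v^T W h for binary v, h."""
    e = -sum(ai * vi for ai, vi in zip(a, v))
    e -= sum(bj * hj for bj, hj in zip(b, h))
    e -= sum(v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h)))
    return e


def sample_hidden(v, W, b, rng):
    # Hidden units are conditionally independent given the visible layer,
    # so in hardware they could all be updated in parallel.
    return [
        1 if rng.random() < sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v)))) else 0
        for j in range(len(b))
    ]


def sample_visible(h, W, a, rng):
    # Likewise, visible units are conditionally independent given the hidden layer.
    return [
        1 if rng.random() < sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h)))) else 0
        for i in range(len(a))
    ]


def gibbs_search(W, a, b, steps=200, seed=0):
    """Run block Gibbs sampling and return the lowest-energy (e, v, h) visited."""
    rng = random.Random(seed)
    v = [rng.randint(0, 1) for _ in a]
    h = sample_hidden(v, W, b, rng)
    best = (energy(v, h, W, a, b), v, h)
    for _ in range(steps):
        v = sample_visible(h, W, a, rng)
        h = sample_hidden(v, W, b, rng)
        e = energy(v, h, W, a, b)
        if e < best[0]:
            best = (e, v, h)
    return best


# Toy, hand-picked parameters (3 visible units, 2 hidden units); purely illustrative.
W = [[2.0, -1.0], [-1.0, 2.0], [1.0, 1.0]]
a = [0.0, 0.0, 0.0]
b = [0.0, 0.0]
best_energy, best_v, best_h = gibbs_search(W, a, b)
```

In an optimization setting, the weights would be chosen so that low-energy visible states encode solutions of the problem; the sketch above only shows the sampling loop itself.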

LG · Sep 9, 2019

Combining Learned Representations for Combinatorial Optimization

Saavan Patel, Sayeef Salahuddin

We propose a new approach to combining Restricted Boltzmann Machines (RBMs) that can be used to solve combinatorial optimization problems. This allows synthesis of larger models from smaller, pretrained RBMs, effectively bypassing the problem of learning in large RBMs and creating a system able to model a large, complex multi-modal space. We validate this approach by using learned representations to create "invertible Boolean logic", where we can use Markov chain Monte Carlo (MCMC) approaches to find the solution to large-scale Boolean satisfiability problems, and we show viability for other combinatorial optimization problems. Using this method, we are able to solve 64-bit addition-based problems and factorize 16-bit numbers. We find that these combined representations can provide a more accurate result than a fully trained model at the same sample size.