Abhishek Chakraborty

CR
5papers
31citations
Novelty53%
AI Score44

5 Papers

CRJul 27, 2022
DynaMarks: Defending Against Deep Learning Model Extraction Using Dynamic Watermarking

Abhishek Chakraborty, Daniel Xing, Yuntao Liu et al.

The functionality of a deep learning (DL) model can be stolen via model extraction where an attacker obtains a surrogate model by utilizing the responses from a prediction API of the original model. In this work, we propose a novel watermarking technique called DynaMarks to protect the intellectual property (IP) of DL models against such model extraction attacks in a black-box setting. Unlike existing approaches, DynaMarks does not alter the training process of the original model but rather embeds watermark into a surrogate model by dynamically changing the output responses from the original model prediction API based on certain secret parameters at inference runtime. The experimental outcomes on Fashion MNIST, CIFAR-10, and ImageNet datasets demonstrate the efficacy of DynaMarks scheme to watermark surrogate models while preserving the accuracies of the original models deployed in edge devices. In addition, we also perform experiments to evaluate the robustness of DynaMarks against various watermark removal strategies, thus allowing a DL model owner to reliably prove model ownership.

LGMay 18
Distance-Aware Muon: Adaptive Step Scaling for Normalized Optimization

Yury Demidovich, Abhishek Chakraborty, Grigory Malinovsky et al.

Muon and related normalized optimizers decouple the choice of update direction from the choice of step scale, but their practical performance remains sensitive to the scale of the normalized step. We study adaptive scaling rules for Muon in general norm geometries and develop three complementary algorithms. For smooth non-convex objectives, we introduce Distance-Adaptive Muon, whose trust-region radius is set from the radius explored by the trajectory, and prove a stationarity guarantee under a bounded-trajectory assumption. We then turn to star-convex objectives, a tractable model of the favorable global geometry often used to reason about the empirical loss landscapes of deep neural networks, where objective-gap guarantees are possible. In this setting, we first introduce Scale-Calibrated Muon, which keeps Muon's exponential moving average but sets the step length from a local descent certificate computed from the current gradient and momentum. For this method, we prove a last-iterate O(1/T) objective-gap bound under a bounded initial sublevel-set assumption, where the corresponding radius parameter appears only in the analysis and not in the algorithm. Finally, we develop Distance-Free Muon, a recentered trust-region method that uses a scalar distance certificate and a majorized one-dimensional search to select the trust-region radius without requiring the unknown distance from the initialization to a global minimizer. Experiments on Transformer language modeling (GPT-124M/WikiText-103) and image classification (ViT-Tiny/CIFAR-100) show that the proposed adaptive scaling rules reduce sensitivity to manual scale tuning and match or improve tuned fixed-scale Muon baselines under the tested budgets.

OCJan 27
Randomized Feasibility Methods for Constrained Optimization with Adaptive Step Sizes

Abhishek Chakraborty, Angelia Nedić

We consider minimizing an objective function subject to constraints defined by the intersection of lower-level sets of convex functions. We study two cases: (i) strongly convex and Lipschitz-smooth objective function and (ii) convex but possibly nonsmooth objective function. To deal with the constraints that are not easy to project on, we use a randomized feasibility algorithm with Polyak steps and a random number of sampled constraints per iteration, while taking (sub)gradient steps to minimize the objective function. For case (i), we prove linear convergence in expectation of the objective function values to any prescribed tolerance using an adaptive stepsize. For case (ii), we develop a fully problem parameter-free and adaptive stepsize scheme that yields an $O(1/\sqrt{T})$ worst-case rate in expectation. The infeasibility of the iterates decreases geometrically with the number of feasibility updates almost surely, while for the averaged iterates, we establish an expected lower bound on the function values relative to the optimal value that depends on the distribution for the random number of sampled constraints. For certain choices of sample-size growth, optimal rates are achieved. Finally, simulations on a Quadratically Constrained Quadratic Programming (QCQP) problem and Support Vector Machines (SVM) demonstrate the computational efficiency of our algorithm compared to other state-of-the-art methods.

CRJan 7, 2021
Robust and Attack Resilient Logic Locking with a High Application-Level Impact

Yuntao Liu, Michael Zuzak, Yang Xie et al.

Logic locking is a hardware security technique to intellectual property (IP) against security threats in the IC supply chain, especially untrusted fabs. Such techniques incorporate additional locking circuitry within an IC that induces incorrect functionality when an incorrect key is provided. The amount of error induced is known as the effectiveness of the locking technique. "SAT attacks" provide a strong mathematical formulation to find the correct key of locked circuits. In order to achieve high SAT resilience(i.e. complexity of SAT attacks), many conventional logic locking schemes fail to inject sufficient error into the circuit. For example, in the case of SARLock and Anti-SAT, there are usually very few (or only one) input minterms that cause any error at the circuit output. The state-of-the-art stripped functionality logic locking (SFLL) technique introduced a trade-off between SAT resilience and effectiveness. In this work, we prove that such a trade-off is universal in logic locking. In order to attain high effectiveness of locking without compromising SAT resilience, we propose a novel logic locking scheme, called Strong Anti-SAT (SAS). In addition to SAT attacks, removal-based attacks are also popular against logic locking. Based on SAS, we propose Robust SAS (RSAS) which is resilient to removal attacks and maintains the same SAT resilience and as effectiveness as SAS. SAS and RSAS have the following significant improvements over existing techniques. (1) SAT resilience of SAS and RSAS against SAT attack is not compromised by increase in effectiveness. (2) In contrast to prior work focusing solely on the circuit-level locking impact, we integrate SAS-locked modules into a processor and show that SAS has a high application-level impact. (3) Our experiments show that SAS and RSAS exhibit better SAT resilience than SFLL and have similar effectiveness.

MLNov 13, 2020
Sparse Representations of Positive Functions via First and Second-Order Pseudo-Mirror Descent

Abhishek Chakraborty, Ketan Rajawat, Alec Koppel

We consider expected risk minimization problems when the range of the estimator is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. To facilitate nonlinear interpolation, we hypothesize that the search space is a Reproducing Kernel Hilbert Space (RKHS). We develop first and second-order variants of stochastic mirror descent employing (i) \emph{pseudo-gradients} and (ii) complexity-reducing projections. Compressive projection in the first-order scheme is executed via kernel orthogonal matching pursuit (KOMP), which overcomes the fact that the vanilla RKHS parameterization grows unbounded with the iteration index in the stochastic setting. Moreover, pseudo-gradients are needed when gradient estimates for cost are only computable up to some numerical error, which arise in, e.g., integral approximations. Under constant step-size and compression budget, we establish tradeoffs between the radius of convergence of the expected sub-optimality and the projection budget parameter, as well as non-asymptotic bounds on the model complexity. To refine the solution's precision, we develop a second-order extension which employs recursively averaged pseudo-gradient outer-products to approximate the Hessian inverse, whose convergence in mean is established under an additional eigenvalue decay condition on the Hessian of the optimal RKHS element, which is unique to this work. Experiments demonstrate favorable performance on inhomogeneous Poisson Process intensity estimation in practice.