Jiwoon Lee

LGFeb 22, 2023

Debiased Distillation by Transplanting the Last Layer

Jiwoon Lee, Jaeho Lee

Deep models are susceptible to learning spurious correlations, even during the post-processing. We take a closer look at the knowledge distillation -- a popular post-processing technique for model compression -- and find that distilling with biased training data gives rise to a biased student, even when the teacher is debiased. To address this issue, we propose a simple knowledge distillation algorithm, coined DeTT (Debiasing by Teacher Transplanting). Inspired by a recent observation that the last neural net layer plays an overwhelmingly important role in debiasing, DeTT directly transplants the teacher's last layer to the student. Remaining layers are distilled by matching the feature map outputs of the student and the teacher, where the samples are reweighted to mitigate the dataset bias. Importantly, DeTT does not rely on the availability of extensive annotations on the bias-related attribute, which is typically not available during the post-processing phase. Throughout our experiments, DeTT successfully debiases the student model, consistently outperforming the baselines in terms of the worst-group accuracy.

3.5ARApr 24

Hardware-Software Co-Design for Event-Driven SNN Deployment on Low-Cost Neuromorphic FPGAs

Jiwoon Lee, Souvik Chakraborty, Syed Bahauddin Alam et al.

Low-cost FPGA platforms can broaden access to neuromorphic systems research, but current spiking neural network (SNN) workflows remain divided between hardware-first implementations, which are difficult to integrate with PyTorch-style development, and software-first frameworks, which often stop at simulation or GPU execution. This paper presents a semantics-preserving hardware-software co-design framework for the deterministic deployment of PyTorch-defined SNNs to event-driven FPGA execution. A single exported artifact carries weights, thresholds, connectivity descriptors, and grouped time-to-first-spike (TTFS) decoding metadata from software definition to board execution and is reused unchanged by both the software reference and the board runtime. A 10-class MNIST TTFS classifier implemented in the routed 80 MHz design achieves 87.40\% accuracy and matches the software reference on all 10,000 test images. The programmable-logic path delivers a service latency of 0.1375 μs/image and an estimated dynamic energy of 31.6 nJ/image, while scope-aware comparisons with matched GPU and CPU baselines keep accelerator-only and system-level measurements distinct. These results show that low-cost event-driven FPGA hardware can provide a direct and reproducible software-to-board path for software-defined SNN models.

Jiwoon Lee

2 Papers