LG · Apr 28, 2021

PyTorch Tabular: A Framework for Deep Learning with Tabular Data

arXiv:2104.13638v1 · 49 citations
AI Analysis

The paper gives researchers, practitioners, and industry users a ready-to-use library intended to narrow the gap with gradient boosting methods in popularity and ease of use. The contribution is incremental, since it builds on existing models and tools.

The authors address the lack of an easy-to-use deep learning library for tabular data with PyTorch Tabular, a framework built on PyTorch and PyTorch Lightning that exposes state-of-the-art models such as NODE and TabNet through a unified API, with the aim of making deep learning on tabular data more accessible and efficient.

Despite its unreasonable effectiveness in modalities like text and images, deep learning has always lagged behind gradient boosting on tabular data, both in popularity and in performance. Recently, newer models created specifically for tabular data have been pushing the performance bar. Popularity remains a challenge, however, because there is no easy, ready-to-use library like scikit-learn for deep learning. PyTorch Tabular is a new deep learning library that makes working with deep learning and tabular data easy and fast. It is built on top of PyTorch and PyTorch Lightning and works directly on pandas DataFrames. Many state-of-the-art (SOTA) models, such as NODE and TabNet, are already integrated and implemented in the library with a unified API. PyTorch Tabular is designed to be easily extensible for researchers, simple for practitioners, and robust in industrial deployments.

Code Implementations: 1 repo