By: Rachel Thomas
Originally published on fast.ai, April 29, 2018
There is a powerful technique that is winning Kaggle competitions and is widely used at Google (according to Jeff Dean), Pinterest, and Instacart, yet one that many people don’t even realize is possible: the use of deep learning for tabular data, and in particular, the creation of embeddings for categorical variables.
Despite what you may have heard, you can use deep learning for the type of data you might keep in a SQL database, a Pandas DataFrame, or an Excel spreadsheet (including time-series data). I will refer to this as tabular data, although it is also known as relational data, structured data, or other terms (see my Twitter poll and comments for more discussion).
From the Pinterest blog post ‘Applying deep learning to Related Pins’
Tabular data is the most commonly used type of data in industry, but deep learning on tabular data receives far less attention than deep learning for computer vision and natural language processing. This post covers some key concepts from applying neural networks to tabular data, in particular the idea of creating embeddings for categorical variables, and highlights two relevant modules of the fastai library.
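To make the core idea concrete: an embedding for a categorical variable is just a trainable lookup table that replaces each category with a dense vector, which the network then learns jointly with the rest of the model. The sketch below is not the fastai implementation; it is a minimal illustration using numpy, with an assumed categorical column (day of week) and an arbitrarily chosen embedding size of 4.

```python
import numpy as np

# Hypothetical categorical column: day of week.
categories = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
cat_to_idx = {c: i for i, c in enumerate(categories)}

# An embedding is a trainable matrix with one dense row per category.
# (In a real model these values are learned; here they are random.)
emb_dim = 4
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(categories), emb_dim))

def embed(values):
    """Replace each categorical value with its dense embedding vector."""
    idxs = [cat_to_idx[v] for v in values]
    return embedding_matrix[idxs]

# A batch of three rows becomes a (3, emb_dim) array of dense features,
# which can be concatenated with continuous columns and fed to a network.
batch = embed(["Mon", "Sat", "Sat"])
```

Unlike one-hot encoding, which treats every pair of categories as equally distant, learned embeddings can place related categories (e.g. Saturday and Sunday) close together in the vector space.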
To continue reading this article in fast.ai, click here.
About the Author:
Rachel Thomas was selected by Forbes as one of 20 Incredible Women in AI, earned her math PhD at Duke, and was an early engineer at Uber. She is a professor at the University of San Francisco and co-founder of fast.ai, which created the “Practical Deep Learning for Coders” course that over 100,000 students have taken. Rachel is a popular writer and keynote speaker. Her writing has been read by over half a million people; has been translated into Chinese, Spanish, Korean, and Portuguese; and has made the front page of Hacker News seven times.