Learning higher-level abstractions/representations from data

Motivation: how the brain represents and processes sensory information in a hierarchical manner.

Deep learning is based on neural networks.

- Complex models with a large number of parameters
  - Hierarchical representations
  - More parameters → higher accuracy on the training data
  - Simple learning rule for training (gradient descent)
- Lots of data
  - Needed for better generalization performance
  - High-dimensional inputs require exponentially many training samples (curse of dimensionality)
- Lots of computing power: GPGPU
  - Training is time-consuming

## History

### Long history

- Fukushima’s Neocognitron (1980)
- LeCun’s Convolutional neural networks (1989)
- Schmidhuber’s work on stacked recurrent neural networks (1993), which ran into the vanishing gradient problem

### Rise

- Geoffrey Hinton (2006)
- Andrew Ng & Jeff Dean (2012)
- Schmidhuber, Recurrent neural network using LSTM (2011-)
- Google Deepmind (2015,2016)
- ICLR conference, held since 2013
- Deep Learning textbook (Goodfellow, Bengio & Courville, 2016)

## Current trend

- Deep belief networks (based on Boltzmann machine), Hinton
- Convolutional neural networks, LeCun
- Deep Q-learning Network (extension to reinforcement learning)
- Deep recurrent neural network using LSTM
- Representation learning
- Reinforcement learning
- Extended memory

## Boltzmann machine

Given test data, return the closest data in the training set

## Deep belief net

A deep belief net is trained layer by layer using RBMs.

- Overcomes issues with the logistic belief net (Hinton)
- Based on the Restricted Boltzmann Machine (RBM): a Boltzmann machine with no within-layer connections
- RBM back-and-forth update: update the hidden units given the visible units, then update the visible units given the hidden units
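The back-and-forth update can be sketched as one Gibbs sampling step for a binary RBM. This is a minimal NumPy sketch; the layer sizes, random initialization, and variable names are illustrative, not from the notes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Illustrative sizes: 6 visible units, 4 hidden units.
n_visible, n_hidden = 6, 4
W = rng.normal(0, 0.1, size=(n_visible, n_hidden))  # visible-hidden weights
b = np.zeros(n_visible)                             # visible biases
c = np.zeros(n_hidden)                              # hidden biases

v = rng.integers(0, 2, size=n_visible).astype(float)  # a binary visible vector

# Back-and-forth update:
# 1) update (sample) hidden units given visible units
p_h = sigmoid(v @ W + c)
h = (rng.random(n_hidden) < p_h).astype(float)
# 2) update (sample) visible units given hidden units
p_v = sigmoid(h @ W.T + b)
v_new = (rng.random(n_visible) < p_v).astype(float)
```

Because the RBM has no within-layer connections, all hidden units can be sampled in parallel given the visible layer, and vice versa.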

### Training

1. Train an RBM on the input data to form a hidden representation
2. Use that hidden representation as the input to train another RBM
3. Repeat step 2 for each additional layer
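The greedy layer-by-layer recipe can be sketched as follows. The `train_rbm` helper is a simplified CD-1 trainer written for this sketch (biases omitted for brevity); the toy data and layer sizes are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.1, seed=0):
    """Train one RBM with a simplified CD-1 update; return its weights."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = rng.normal(0, 0.1, size=(n_visible, n_hidden))
    for _ in range(epochs):
        v0 = data
        p_h0 = sigmoid(v0 @ W)                       # hidden given visible
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        p_v1 = sigmoid(h0 @ W.T)                     # visible given hidden
        p_h1 = sigmoid(p_v1 @ W)
        W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(data)
    return W

def propagate(data, W):
    """Hidden representation that becomes the next RBM's input."""
    return sigmoid(data @ W)

# Greedy layer-by-layer stacking on toy binary data.
rng = np.random.default_rng(1)
data = rng.integers(0, 2, size=(100, 20)).astype(float)
layer_sizes = [16, 8]
weights, x = [], data
for n_hidden in layer_sizes:
    W = train_rbm(x, n_hidden)
    weights.append(W)
    x = propagate(x, W)  # repeat: hidden layer is the next visible layer
```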

## Deep convolutional neural networks

- Convolution layer (kernels)
- Stride of n (the window slides by n pixels)
- Max pooling
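These three ingredients can be sketched in NumPy: a strided convolution (cross-correlation, as typically implemented in CNNs) followed by non-overlapping max pooling. The toy image and averaging kernel are illustrative.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid 2D cross-correlation, sliding the window by `stride` pixels."""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling over size x size windows."""
    oh, ow = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 "image"
kernel = np.ones((3, 3)) / 9.0                    # 3x3 averaging kernel
fmap = conv2d(image, kernel, stride=1)            # 4x4 feature map
pooled = max_pool(fmap, size=2)                   # 2x2 after pooling
```

A stride of 2 in `conv2d` would halve the output resolution, which is why striding and pooling both act as downsampling steps in deep CNNs.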

## Deep Q-Network

Google DeepMind, Atari 2600

- Input: video screen
- Output: Q(s, a)
  - Action-value function
  - Value of taking action a when in state s

- Reward: game score
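DQN approximates Q(s, a) with a deep network trained on screen input; the update it performs is the standard Q-learning rule, which can be sketched in tabular form. The state/action counts, learning rate, and discount below are illustrative.

```python
import numpy as np

# Tabular Q-learning sketch (DQN replaces this table with a deep network
# that maps screen pixels to Q-values, one output per action).
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))   # Q(s, a): value of action a in state s
alpha, gamma = 0.5, 0.9               # learning rate, discount factor

def q_update(s, a, reward, s_next):
    """Move Q(s, a) toward the target reward + gamma * max_a' Q(s', a')."""
    target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

# One step: in state 0, take action 1, receive reward 1.0 (e.g. game score),
# land in state 2.
q_update(s=0, a=1, reward=1.0, s_next=2)
```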

## Deep recurrent neural networks

- Feedforward: no memory of past input
- Recurrent:
  - Good: past input affects present output
  - Bad: cannot remember far into the past

Backprop in time:

- Can unfold recurrent loop: make it into a feedforward net
- Use the same backprop algorithm for training
- Cannot remember too far into the past
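Unfolding can be sketched as a plain forward pass with one copy of the weights applied per time step; backprop then runs through this chain exactly as in a feedforward net. The layer sizes and random weights below are illustrative.

```python
import numpy as np

def rnn_forward(xs, W_in, W_rec, h0):
    """Unfold the recurrent loop: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    h = h0
    hs = []
    for x in xs:                 # one "layer" of the unfolded net per step
        h = np.tanh(W_in @ x + W_rec @ h)
        hs.append(h)
    return hs                    # backprop in time runs back through this list

rng = np.random.default_rng(0)
W_in = rng.normal(0, 0.5, size=(3, 2))       # input-to-hidden weights
W_rec = rng.normal(0, 0.5, size=(3, 3))      # hidden-to-hidden (the recurrence)
xs = [rng.normal(size=2) for _ in range(5)]  # a sequence of 5 inputs
hs = rnn_forward(xs, W_in, W_rec, np.zeros(3))
```

Note that gradients flowing back through this chain are multiplied by `W_rec` (and the tanh derivative) once per step, which is why they vanish over long sequences.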

## Long Short-Term Memory

- LSTM to the rescue (Hochreiter & Schmidhuber, 1997)
- Built-in recurrent memory that can be written (Input gate), reset (Forget gate), and outputted (Output gate)
- Long-term retention possible with LSTM
- Unfold in time and use backprop as usual
- Application: Sequence classification, sequence translation, sequence prediction