
TAMU Neural Network 6 Deep Learning Overview

2017-04-15

Learning higher-level abstractions/representations from data

Motivation: how the brain represents and processes sensory information in a hierarchical manner.

Deep learning is based on neural networks.

  • Complex models with a large number of parameters
    • Hierarchical representations
    • More parameters allow a closer fit to the training data
    • Simple learning rule for training (gradient)
  • Lots of data
    • Needed to get better generalization performance
    • High-dimensional inputs require exponentially many training samples (curse of dimensionality)
  • Lots of computing power: GPGPU
    • Time consuming

History

Long history

  • Fukushima’s Neocognitron (1980)
  • LeCun’s Convolutional neural networks (1989)
  • Schmidhuber’s work on stacked recurrent neural networks (1993); highlighted the vanishing gradient problem

Rise

  • Geoffrey Hinton (2006)
  • Andrew Ng & Jeff Dean (2012)
  • Schmidhuber, Recurrent neural network using LSTM (2011-)
  • Google DeepMind (2015, 2016)
  • ICLR, held annually since 2013
  • Deep Learning textbook (Goodfellow, Bengio, and Courville, 2016)

Current trend

  • Deep belief networks (based on Boltzmann machine), Hinton
  • Convolutional neural networks, LeCun
  • Deep Q-learning Network (extension to reinforcement learning)
  • Deep recurrent neural network using LSTM
  • Representation learning
  • Reinforcement learning
  • Extended memory

Boltzmann machine

Given test data, it returns the closest data from the training set

Deep belief net

A deep belief net is trained layer-by-layer using RBMs

  • Overcomes issues with logistic belief nets (Hinton)
  • Based on the Restricted Boltzmann Machine (RBM), which has no within-layer connections
  • RBM back-and-forth update: update the hidden units given the visible units, then update the visible units given the hidden units (see the update equations below)
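
The notes do not give the formulas; for a binary RBM with visible units \(v\), hidden units \(h\), weights \(W\), and biases \(a\), \(b\), the standard conditionals used in this back-and-forth (Gibbs) update are:

$$
P(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big), \qquad
P(v_i = 1 \mid \mathbf{h}) = \sigma\Big(a_i + \sum_j h_j w_{ij}\Big)
$$

where \(\sigma\) is the logistic sigmoid.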

Training

  1. Train RBM based on input to form hidden representation
  2. Use hidden representation as input to train another RBM
  3. Repeat step 2 to stack additional layers (a minimal sketch follows)
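
A minimal NumPy sketch (not from the notes) of this greedy layer-wise procedure, using CD-1 (one-step contrastive divergence) to train each RBM; the function names and hyperparameters here are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.1, rng=np.random.default_rng(0)):
    """Train one binary RBM with CD-1 (one back-and-forth Gibbs step)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a = np.zeros(n_visible)   # visible biases
    b = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data
        h_prob = sigmoid(data @ W + b)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct visible, then hidden again
        v_recon = sigmoid(h_sample @ W.T + a)
        h_recon = sigmoid(v_recon @ W + b)
        # CD-1 gradient approximation and parameter update
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
        a += lr * (data - v_recon).mean(axis=0)
        b += lr * (h_prob - h_recon).mean(axis=0)
    return W, a, b

def train_dbn(data, layer_sizes):
    """Greedy layer-by-layer training: each layer's hidden activations
    become the 'visible' data for the next RBM."""
    layers = []
    x = data
    for n_hidden in layer_sizes:
        W, a, b = train_rbm(x, n_hidden)
        layers.append((W, a, b))
        x = sigmoid(x @ W + b)   # representation fed to the next RBM
    return layers

# Example: stack two RBMs on random binary data
X = (np.random.default_rng(1).random((100, 20)) > 0.5).astype(float)
dbn = train_dbn(X, layer_sizes=[16, 8])
```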

Deep convolutional neural networks

  • Stride of n (sliding window by n pixels)
  • Convolution layer (kernels)
  • Max pooling
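
To make stride, convolution, and max pooling concrete, here is a small illustrative NumPy sketch (not part of the original notes); the function names and sizes are arbitrary:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """'Valid' 2-D convolution as used in CNN layers: slide the kernel
    over the image by `stride` pixels and take a dot product per window."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size block."""
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(h // size):
        for j in range(w // size):
            out[i, j] = feature_map[i * size:(i + 1) * size, j * size:(j + 1) * size].max()
    return out

# Example: 8x8 image, 3x3 vertical-edge kernel, stride 1, then 2x2 max pooling
image = np.arange(64, dtype=float).reshape(8, 8)
kernel = np.array([[1., 0., -1.], [1., 0., -1.], [1., 0., -1.]])
features = max_pool(conv2d(image, kernel, stride=1))  # 6x6 conv output -> 3x3 pooled
```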

Deep Q-Network

Google DeepMind, Atari 2600

  • Input: video screen
  • Output: Q(s, a)
    • Action-value function
    • Value of taking action a when in state s
  • Reward: game score
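
The notes list only the inputs and outputs; for reference, DQN fits Q(s, a; θ) by minimizing the squared one-step Q-learning error (in the Nature version, against a periodically updated target network with parameters θ⁻):

$$
L(\theta) = \Big( r + \gamma \max_{a'} Q(s', a'; \theta^-) - Q(s, a; \theta) \Big)^2
$$

where \(r\) is the game-score reward and \(\gamma\) is the discount factor.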

Deep recurrent neural networks

  • Feedforward: No memory of past input
  • Recurrent:
    • Good: Past input affects present output
    • Bad: Cannot remember far into the past

Backprop in time:

  • Can unfold recurrent loop: make it into a feedforward net
  • Use the same backprop algorithm for training
  • Cannot remember too far into the past
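
As a small illustration of unfolding (not in the original notes), the forward pass of a simple recurrent net over a sequence is just a feedforward chain that reuses the same weights at every step; backprop in time then runs ordinary backprop over this unrolled chain:

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Unfold a simple recurrent net in time: one feedforward step per input,
    with the same weights reused at every step."""
    h = np.zeros(W_hh.shape[0])
    hs = []
    for x in xs:                              # xs: sequence of input vectors
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        hs.append(h)
    return hs                                 # hidden state at each time step

# Example: 5-step sequence of 3-dim inputs, 4 hidden units
rng = np.random.default_rng(0)
xs = [rng.standard_normal(3) for _ in range(5)]
hs = rnn_forward(xs, rng.standard_normal((4, 3)), rng.standard_normal((4, 4)), np.zeros(4))
```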

Long Short-Term Memory

  • LSTM to the rescue (Hochreiter & Schmidhuber, 1997)
  • Built-in recurrent memory that can be written (Input gate), reset (Forget gate), and outputted (Output gate)
  • Long-term retention possible with LSTM
  • Unfold in time and use backprop as usual
  • Applications: sequence classification, sequence translation, sequence prediction
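
The gates mentioned above correspond to the standard (modern, forget-gate) LSTM equations, added here for reference:

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &&\text{(input gate: write)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) &&\text{(forget gate: reset)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) &&\text{(output gate: read out)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) &&\text{(cell memory)}\\
h_t &= o_t \odot \tanh(c_t) &&\text{(hidden output)}
\end{aligned}
$$

Because the cell state \(c_t\) is carried forward additively (gated by \(f_t\)), gradients can survive over many time steps, which is what makes long-term retention possible.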
