Read "Streaming Systems" 1&2, Streaming 101 Read "F1, a distributed SQL database that scales" Read "Zanzibar, Google’s Consistent, Global Authorization System" Read "Spanner, Google's Globally-Distributed Database" Read "Designing Data-intensive applications" 12, The Future of Data Systems IOS development with Swift Read "Designing Data-intensive applications" 10&11, Batch and Stream Processing Read "Designing Data-intensive applications" 9, Consistency and Consensus Read "Designing Data-intensive applications" 8, Distributed System Troubles Read "Designing Data-intensive applications" 7, Transactions Read "Designing Data-intensive applications" 6, Partitioning Read "Designing Data-intensive applications" 5, Replication Read "Designing Data-intensive applications" 3&4, Storage, Retrieval, Encoding Read "Designing Data-intensive applications" 1&2, Foundation of Data Systems Three cases of binary search TAMU Operating System 2 Memory Management TAMU Operating System 1 Introduction Overview in cloud computing 2 TAMU Operating System 7 Virtualization TAMU Operating System 6 File System TAMU Operating System 5 I/O and Disk Management TAMU Operating System 4 Synchronization TAMU Operating System 3 Concurrency and Threading TAMU Computer Networks 5 Data Link Layer TAMU Computer Networks 4 Network Layer TAMU Computer Networks 3 Transport Layer TAMU Computer Networks 2 Application Layer TAMU Computer Networks 1 Introduction Overview in distributed systems and cloud computing 1 A well-optimized Union-Find implementation, in Java A heap implementation supporting deletion TAMU Advanced Algorithms 3, Maximum Bandwidth Path (Dijkstra, MST, Linear) TAMU Advanced Algorithms 2, B+ tree and Segment Intersection TAMU Advanced Algorithms 1, BST, 2-3 Tree and Heap TAMU AI, Searching problems Factorization Machine and Field-aware Factorization Machine for CTR prediction TAMU Neural Network 10 Information-Theoretic Models TAMU Neural Network 9 Principal Component Analysis TAMU Neural Network 8 Neurodynamics TAMU Neural Network 7 Self-Organizing Maps TAMU Neural Network 6 Deep Learning Overview TAMU Neural Network 5 Radial-Basis Function Networks TAMU Neural Network 4 Multi-Layer Perceptrons TAMU Neural Network 3 Single-Layer Perceptrons Princeton Algorithms P1W6 Hash Tables & Symbol Table Applications Stanford ML 11 Application Example Photo OCR Stanford ML 10 Large Scale Machine Learning Stanford ML 9 Anomaly Detection and Recommender Systems Stanford ML 8 Clustering & Principal Component Analysis Princeton Algorithms P1W5 Balanced Search Trees TAMU Neural Network 2 Learning Processes TAMU Neural Network 1 Introduction Stanford ML 7 Support Vector Machine Stanford ML 6 Evaluate Algorithms Princeton Algorithms P1W4 Priority Queues and Symbol Tables Stanford ML 5 Neural Networks Learning Princeton Algorithms P1W3 Mergesort and Quicksort Stanford ML 4 Neural Networks Basics Princeton Algorithms P1W2 Stack and Queue, Basic Sorts Stanford ML 3 Classification Problems Stanford ML 2 Multivariate Regression and Normal Equation Princeton Algorithms P1W1 Union and Find Stanford ML 1 Introduction and Parameter Learning

TAMU Neural Network 8 Neurodynamics

2017-04-16

Inclusion of feedback gives temporal characteristics to neural networks: recurrent networks

Recurrent networks can become unstable or stable.

Main interest is in recurrent network’s stability: neurodynamics

Preliminaries: Dynamic systems

  • A dynamic system is a system whose state varies with time
  • State-space model: values of state variables change over time

Example:

  • A system of order
  • are state variables that hold different values under independent variable .
  • is called the state vector
  • The dynamics of the system is \[\frac{d}{dt}x_j(t) = F_j(x_j(t)), j=1,2,…,N\]

Or
\[\frac{d}{dt}\textbf{x}(t) = \textbf{F}(\textbf{x}(t))\]

System type

  • Autonomous: does not explicitly depend on time.
  • Non-autonomous: explicitly depend on time.

can be seen as a velocity vector field. In this field, every point is associated with a vector.

State space

Can view the state-space equation as describing the motion of a point in N-dimensional space.

The points traversed over time is called the trajectory or the orbit. The tangent vector on it is instantanous velocity.

Solution of the equation

  • Existing solution: is continuous in all of its args.
  • Uniqueness: it must meet the Lipschitz condition.

Lipschitz condition

Let and be a pair of vectors in an open set in a normal vector space. For some constant , a vector function that satisfies:

\[\Vert \textbf{F}(\textbf{x}) - \textbf{F}(\textbf{u}) \Vert \le K \Vert \textbf{x} - \textbf{u} \Vert\]

  • K is called Lipschitz constant for

If are finite everywhere, meet the Lipschitz condition.

Equilibrium states

is an equilibrium state when:

\[\left.\frac{d\textbf{x}}{dt} \right\vert_{\textbf{x}= \bar{\textbf{x}}} = \textbf{F}(\bar{\textbf{x}})=0\]

Linearize using Taylor expansion around

\[\textbf{F}(\textbf{x}) \approx \textbf{F}(\bar{\textbf{x}}) + \textbf{A} \Delta \textbf{x}(t)\]

where is the Jacobian:

\[\textbf{A} = \left.\frac{\partial}{\partial \mathbf{x}} \textbf{F}(\textbf{x}) \right\vert_{\textbf{x} =\bar{\textbf{x}}}\]

Stability of linearized system

Jacobian matrix determine the behavior near equilibrium points. (Eigenvalue of )

\[\frac{d}{dt}\Delta \textbf{x}(t) \approx \textbf{A} \Delta \textbf{x}(t)\]

Eigenvalue:

\[(\textbf{A}-\lambda \textbf{I}) \textbf{x} = 0\]

Example

2nd-Order System

  • Stable node (both eigenvalue real, negative)
  • Unstable focus (complex, positive real part)
  • Saddle point (real, + -)
  • Stable focus (complex, - real)
  • Ustable node (real, +)
  • Center (complex, 0 real)

Definition of stability

  • Uniformly stable: for an arbitrary , if there exists a postive such that implies for all .
  • Convergent: if there exists a positive such that implies
  • Asymptotically stable: if both stable and convergence
  • Globally asymptotically stable: if stable and all trajectories of the system converge to as time approaches infinity.

Lyapunov’s Theorem

  • Theorem 1: The equilibrium state is stable if in a small neighborhood of there exists a positive definite function such that its derivative with respect to time is negative semidefinite in that region.
  • Theorem 2: The equilibrium state is asymptotically stable if in a small neighborhood of there exists a positive definite function such that its derivative with respect to time is negative definite in that region.
  • A scalar function that satisfies these conditions is called a Lyapunov function for the equilibrium state

Attractors

  • Dissipative systems are characterized by attracting sets or manifolds of dimensionality lower than that of the embedding space. These are called attractors.
  • Regions of initial conditions of nonzero state space volume converge to these attractors as time increases.

Type

  • Point attractors
  • Limit cycle attractors
  • Strange (chaotic) attractors

Neurodynamical models

We will focus on state variables that are continuous-valued.

Propeties:

  • Large number of DOF
  • Nonlinearity
  • Dissipative (opposed to conservative), i.e. open system
  • Noise

Manipulation of attractors

As a recuccrent Nnet Paradigm

  • Can identify attractors with computational objects
  • In order to do so, must exercise control over the location of the attractors in the state space of the system
  • A learning algorithm will manipulate the equations governing the dynamical behavior so that a desired location of attractors are set
  • One good way to do this is to use the energy minimization paradigm. (e.g. Hopfield)

Hopfield model

  • units with full connection among every node(no self-feedback)
  • Given input patterns, each having the same dimensionality as the network, can be memorized in attractors of the network
  • Starting with an initial pattern, the dynamic will converge toward the attractor of the basin of attraction where the inital pattern was placed.

Discrete Hopfield model

Based on McCulloch-Pitts model

Energy function is defined as

\[E = - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ji} x_i x_j (i \ne j)\]

Network dynamics will evolve in the direction that minimizes

Content-Addressable Memory

  • Map a set of patterns to be memorized onto fixed point in the dynamical system realized by the recurrent network
  • Encoding: to
  • Decoding: to

Storage

Learning is similar to Hebbian

\[w_{ji}=\frac{1}{N}\sum_{\mu=1}^{M}\xi_{\mu,j} \xi_{\mu,i}\]

with

Matrix form:

\[\textbf{W}=\frac{1}{N}\sum_{\mu=1}^{M} \xi_\mu \xi_\mu^T -M \textbf{I}\]

Result is symmetric.

Activation

  • Initialize network with probe pattern \[x_j(0) = \xi_{probe,j}\]
  • Update output of each neuron(picking them by random) as \[x_j(n+1)=sgn\left( \sum_{i=1}^{N} w_{ji}x_i(n) \right)\] until reaches a fixed point.

Spurious states

  • is symmetric, thus eigenvalues of it are all real.
  • For large number of patters , the matrix is degenerate, i.e., several eigenvectors can have the same eigenvalue.
  • These eigenvectors form a subspace, and when the associated eigenvalue is 0, it is called a null space.
  • This is due to being smaller than the number of neurons .
  • Hopfield network as content addressable memory:
    • Discrete Hopfield network acts as a vector projector (project probe vector onto subspace spanned by training patterns).
    • Underlying dynamics drive the network to converge to one of the corners of the unit hypercube.
  • Spurious states are those corners of the hypercube that do not belong to the training pattern set.

Storage capacity

Given a probe equal to the stored pattern , the activation of the th neuron can be decomposed into the signal term and the noise term.

\[v_j = \xi_{v,j}+\frac{1}{N}\sum_{\mu=1,\mu \ne v}^{M}\xi_{v,j} \sum_{i=1}^{N}\xi_{\mu,j} \xi_{v,i}\]

signal-to-noise ratio is defined as:

\[\rho = \frac{\mbox{variance of signal}}{\mbox{variance of noise}} = \frac{1}{(M-1)/N} \approx \frac{N}{M}\]

load parameter is the reciprocal of . should be less than 0.14.

For almost error-free performance, storage capacity is

\[M_c = \frac{N}{2\log_e N}\]

Thus the storage capacity of Hopfield network scales less than linearly with size N of the network. It’s a major limitation of the Hopfield model.

Cohen-Grossberg Theorem

Cohen and Grossberg (1983) showed how to access the stability of a certain class of neural networks.

\[\frac{d}{dt} \mu_j = a_j(\mu_j) \left[b_j(\mu_j)- \sum_{i=1}^{N}c_{ji}\varphi_i(\mu_i)\right], j= 1,2,…,N\]

Neural network with above dynamics admits a Lyapunov function defined as:

\[E = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}c_{ji}\varphi_i(\mu_i)\varphi_j(\mu_j)-\sum_{j=1}^{N}\int_0^{\mu_j}b-j(\lambda)\varphi_j’(\lambda)d\lambda\] where \[\varphi_j’(\lambda) = \frac{d}{d\lambda}(\varphi_j(\lambda))\]

Conditions to be met:

  • Synaptic weights are symmetric
  • Function satisfies the condition for nonnegativity
  • The nonlinear activation function needs to follow the monotonicity condition: \[\varphi_j’(\mu_j) = \frac{d}{d\mu_j}\varphi_j(\mu_j) \ge 0\]
  • With the above \[\frac{dE}{dt} \le 0\] ensuring global stability of the system
  • Hopfield model can be seen as a special case of the Cohen-Grossberg Theorem.

Creative Commons License
Melon blog is created by melonskin. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2016-2019. All rights reserved by melonskin. Powered by Jekyll.