Read "Dynamo, Amazon’s Highly Available Key-value Store" Read "Bigtable, A Distributed Storage System for Structured Data" Read "Streaming Systems" 3, Watermarks Read "Streaming Systems" 1&2, Streaming 101 Read "F1, a distributed SQL database that scales" Read "Zanzibar, Google’s Consistent, Global Authorization System" Read "Spanner, Google's Globally-Distributed Database" Read "Designing Data-intensive applications" 12, The Future of Data Systems IOS development with Swift Read "Designing Data-intensive applications" 10&11, Batch and Stream Processing Read "Designing Data-intensive applications" 9, Consistency and Consensus Read "Designing Data-intensive applications" 8, Distributed System Troubles Read "Designing Data-intensive applications" 7, Transactions Read "Designing Data-intensive applications" 6, Partitioning Read "Designing Data-intensive applications" 5, Replication Read "Designing Data-intensive applications" 3&4, Storage, Retrieval, Encoding Read "Designing Data-intensive applications" 1&2, Foundation of Data Systems Three cases of binary search TAMU Operating System 2 Memory Management TAMU Operating System 1 Introduction Overview in cloud computing 2 TAMU Operating System 7 Virtualization TAMU Operating System 6 File System TAMU Operating System 5 I/O and Disk Management TAMU Operating System 4 Synchronization TAMU Operating System 3 Concurrency and Threading TAMU Computer Networks 5 Data Link Layer TAMU Computer Networks 4 Network Layer TAMU Computer Networks 3 Transport Layer TAMU Computer Networks 2 Application Layer TAMU Computer Networks 1 Introduction Overview in distributed systems and cloud computing 1 A well-optimized Union-Find implementation, in Java A heap implementation supporting deletion TAMU Advanced Algorithms 3, Maximum Bandwidth Path (Dijkstra, MST, Linear) TAMU Advanced Algorithms 2, B+ tree and Segment Intersection TAMU Advanced Algorithms 1, BST, 2-3 Tree and Heap TAMU AI, Searching problems Factorization Machine and Field-aware Factorization Machine for CTR prediction TAMU Neural Network 10 Information-Theoretic Models TAMU Neural Network 9 Principal Component Analysis TAMU Neural Network 8 Neurodynamics TAMU Neural Network 7 Self-Organizing Maps TAMU Neural Network 6 Deep Learning Overview TAMU Neural Network 5 Radial-Basis Function Networks TAMU Neural Network 4 Multi-Layer Perceptrons TAMU Neural Network 3 Single-Layer Perceptrons Princeton Algorithms P1W6 Hash Tables & Symbol Table Applications Stanford ML 11 Application Example Photo OCR Stanford ML 10 Large Scale Machine Learning Stanford ML 9 Anomaly Detection and Recommender Systems Stanford ML 8 Clustering & Principal Component Analysis Princeton Algorithms P1W5 Balanced Search Trees TAMU Neural Network 2 Learning Processes TAMU Neural Network 1 Introduction Stanford ML 7 Support Vector Machine Stanford ML 6 Evaluate Algorithms Princeton Algorithms P1W4 Priority Queues and Symbol Tables Stanford ML 5 Neural Networks Learning Princeton Algorithms P1W3 Mergesort and Quicksort Stanford ML 4 Neural Networks Basics Princeton Algorithms P1W2 Stack and Queue, Basic Sorts Stanford ML 3 Classification Problems Stanford ML 2 Multivariate Regression and Normal Equation Princeton Algorithms P1W1 Union and Find Stanford ML 1 Introduction and Parameter Learning

Overview in distributed systems and cloud computing 1


Distributed systems

Start of distributed system:

  • Development of powerful microprocessors
  • High-speed networks, LAN, WAN


A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.


  • A collection of autonomous computing elements, nodes
    • No global clock
    • Manage the group membership and organization
      • Authentication, indeed communication with group member or not, confidentiality
      • Open group: any node can join and send messages
      • Closed group: communicate with in the group
  • Users think they are dealing with a single system
    • Nodes needs to collaborate


As the organization of the collection, overlay network should be connected. There should always be a path between two nodes to route messages. It can be formed by P2P networks.

  • Structured overlay
    • Nodes have a well defined set of neighbors to communicate
  • Unstructured overlay
    • randomly selected other nodes

Single coherent system

Ask the system to appear to be coherent.

Distribution transparency: End users would not know which computer a process is running, where the data is stored.

But partial failures are difficult to hide.


Middleware is a distributed-system layer placed on top of OS of nodes, as the OS of the distributed system. It’s a container of commonly used components and functions that now provided by the system rather than applications. Provide services:

  • Resource management
  • Facilities for inter-application communication
  • Security
  • Accounting
  • Masking of and recovery from failures

Differences from OS:

  • Offered in a networked environment
  • Most services are useful to many applications

Typical services:

  • Communication
    • e.g. Remote Procedure Call (RPC): an application can invoke functions implemented and executed on a remote computer as if locally available
  • Transactions
    • Support in an all-or-nothing fashion, atomic
  • Service composition
    • Develop new application by combining existing ones. e.g. mashups.
  • Reliability
    • Enhanced functions for building reliable applications
    • e.g. Horus, messages are received in all-or-no fashion

Design goals

Supporting resource sharing

  • Cheaper to have a single high-end reliable storage facility be shared.
  • Best illustrated by P2P networks.

Making distribution transparent

Hide the fact that its processes and resources are physically distributed across multiple computers (possibly separated by large distances). Invisible to end users and applications.

Types of transparency:

  • Access
    • data representation
    • the access way
  • Location
    • Where the objects are stored
  • Relocation
    • Objects may be moved while in use
  • Migration
    • Objects may be moved
  • Replication
    • Replicated
  • Concurrency
    • Shared by several users
  • Failure
    • Failure and recovery

The price for achieving full transparency may be surprisingly high.

Being Open

System offers components that can be easily used by or integrated into other systems.

Define services with interface, using Interface Definition Language.

  • Interoperability
    • Two components can work together as specified by a common standard
  • Portability
    • An application running on system A, can be run on system B if implements the same interface as A
  • Extensibility
    • Easy to add parts
Separate policy from mechanism

It is crucial that the system be organized as a collection of small, easily replaceable components.

Being scalable


  • Size scalability
    • Add users and resources without losing performance
    • CPU, storage and I/O rate, network
  • Geographical scalability
    • Users and resources may lie far apart, with small communication delays
    • Synchronous communication, block until response. Slow on WAN
    • Communication less reliable on WAN
    • Limited bandwidth
    • WAN system has very limited facilities for multi-point communications
  • Administrative scalability
    • Easily managed even if it spans many admin orgs
    • Conflicting policies of resource usage, management and security
    • System needs to protect itself against attacks from the new domain
    • New domain users may only have read access
Scaling techniques
  • Scaling up: improve CPU, memory, network capacity
  • Scaling out: expanding by deploying more machines

Method for scaling out:

  • Hiding communication latencies
    • In the case of geographical scalability
    • Try to avoid waiting for remote response as much as possible
    • Asynchronous communication
    • Move job from server to client
  • Distribution of work
    • e.g. DNS lookup
  • Replication
    • Distribute copies across the system
    • Caching
    • Drawback: consistency
    • Global synchronization is hard to scale


False assumptions while developing systems:

  • Reliable network
  • Secure network
  • Homogeneous network
  • Topology does not change
  • Zero latency
  • Infinite bandwidth
  • Zero transport cost
  • One administrator

Types on distributed systems

High performance distributed computing

  • Cluster computing
    • Collection of similar workstations with the same OS and LAN
    • Super computer
    • Homogeneity
  • Grid computing
    • Federation of computer systems under different admin domains, software and hardware
    • Collaboration: virtual organization
    • Layers: fabric; connectivity & resource; collective; applications
  • Cloud computing
    • Constructed dynamically
    • Provides a lot of resources
    • Open resources to customers, virtualized
    • Layers: hardware, infrastructure, platform, application
    • Services: infrastructure, platform, software -as-a-service

Distributed information systems

Interoperability: Make it easier to integrate applications into an enterprise-wide information system.

Enterprise Application Integration: letting applications communicate directly with each other.

Distributed transaction (operation on a database) properties:

  • Atomic
  • Consistent
    • Do not violate system invariant
  • Isolated
    • Concurrent transactions do not interfere with each other
  • Durable
    • Once committed, changes are permanent

Nested transactions may consist of many sub transactions. Permanence only applys to the top-level transactions.

Transaction processing monitor handles distributed transactions.

Enterprise Application Integration

Applications communicate directly with each other.

Among several types of communication middleware, RPC allows an application can send request to and receive response from other application components by doing a local procedure call.

Remote method invocations (RMI) is similar to RPC, but on objects instead of functions.

RPC and RMI drawback: caller and callee need to be up and running for communication, and need to know how to approach each other. This leads to message-oriented middleware (MOM). Messages are sent to logical contact points. publish/subscribe is used; applications can indicate the messages they are interested.

Pervasive systems

By the introduction of mobile and embedded computing devices, pervasive systems are naturally distributed.

  • Have many sensors and actuators
  • Devices: small, battery, mobile, wireless


  • Ubiquitous computing systems
    • System is pervasive and continuously present
    • Requirement:
      • Distribution, in a transparent manner
      • Interaction, highly unobtrusive
      • Context awareness: System knows user context
      • Autonomy
        • Address allocation
        • Adding devices
        • Auto updates
      • Intelligence, handles a wide range of dynamic actions.
  • Mobile systems
    • Devices vary widely
    • Locations are changing over time
      • Flooding-based send messages to spread on a part of network
      • Disruption-tolerant network, intermediate node can store messages until finding the next node
  • Sensor networks
    • A collection of input devices

Work load

3-tier on cloud infrastructure:

  • Web (Client) tier
  • Business Logic tier
  • Data tier


From 1996 to 2013,

  • Server spending keeps steady.
  • Management and administration part are increasing.
  • Number of physical servers keeps going up until 2008 then has been constant.
  • Number of logical servers keeps increasing.

20 years ago,

  • Very low server utilization, 7%
  • Mainframe optimization very high
    • Run multiple OSs in one mainframe
    • Virtualizing the hardware
    • Hypervisors enable multiple OSs to run in the same hardware
    • But server-based industry didn’t do this

Virtualization is regarding a single server.

  • Hypervisor provides virtualized access to hardware for multiple OSs.
  • Virtual Machine Monitor is an efficient, isolated duplicate of the real machine.
    • fidelity, performance, safety and isolation
  • Until 2005, x86 was not virtualizable
    • 9 instructions break things
    • Evolution of virtual support: CPU, memory, I/O
  • Itanium, MIPS, ARM are mostly virtualizable

VMware software VMM:

  • Software VMMs runs very slow due to interpretation
  • Replace sensitive instructions in guest on-the-fly
    • Binary input, not source code


  • Present software interface to virtual machines that is similar but not identical to that of the underlying hardware
  • Allow the guests to handle difficult portion of code to VMM
  • Guest OS should be explicitly ported for para-API
    • A convention OS may not work on p-VMM


  • VMware
  • Xen
  • KVM
  • HyperV, MS

Nested virtualization

  • Turtle Project in IBM
  • Hypervisor running on top of Hypervisor
  • Ship hardware with a modified hypervisor. Other hypervisors can be run on top of this hypervisor.

Difference between cloud computing and virtualization

  • Virtualization is part of cloud computing
  • Human is needed to interact with hypervisors to create virtual machines
  • Cloud computing: API handles all automation to communicate with hypervisors


Container is user-level virtualization.

Now applications are:

  • constantly developed
  • New versions are being deployed often
  • built from loosely coupled components
  • deployed to a multitude of servers

Containers make apps portable

  • looks the same everywhere
  • No matter where it is running
  • Don’t need to install all dependencies on the host


  • contains project code and provisioning script
  • Can be pushed to VM, Github and various server
  • Put code in environment


  • Dockerfile: define dependencies
  • Docker image: A complete wrap of application and dependencies
  • Build environment

Distributed computing

Distributed system:

  • Multiple computers connected with each other
  • Working together
  • Seems like a single coherent system.
    • Failures are normal and tolerant


  • Failure
    • Expected to fail
  • Concurrency
    • Accessing the same data
  • Synchronization
    • No common lock
  • Scalability

Example of problems

  • Consensus
    • Nodes agree on a value of a piece of data
  • Leader election
  • Group management
  • Failure detection
  • Clock synchronization


  • Key-value
    • Redis
  • Column-oriented
    • Cassandra
  • Graph
    • Social network data
  • Document
    • MongoDB, XML

Workload nowadays:

  • Volume: large amounts of data
  • Unstructured: hard to fit into a schema
    • Images, videos
  • Random reads and writes, write heavy


  • Speed: queries quickly
  • Distributed: availability, no single point of failure
  • Low cost: include cost of system admin
  • Scalable: demand changes quickly

NoSQL features:

  • Non relational
  • No schema required
  • Easily distributed
  • Horizontal scaling, scale out (add more machines)

In-memory caching

Scenario to use:

  • Speed
    • Fetching data from database or remote API call is slow
  • Data access pattern
    • Frequency: high
  • Staleness
    • Mostly static data


  • Amazon’s ElastiCache
  • Redis
  • Memcached
  • DynamoDB


  • In-memory key-value store
  • Flexible
    • Values can be anything, webpages, numbers
  • Items in it have an expiration time
  • Items replaced in LRU fashion when cache is full
  • Operations
    • get, set, delete

Failure detection

  • Failures are the norm
  • Need failure detector to determine if a process has failed


  • ping-based
    • Send ping and receive ACK
    • Randomly pick target
    • If no ACK, ask other processes to check the target
  • Gossip protocol
    • Messages are like infection, go to neighboring nodes (or just a subset)
      • Every node maintains a membership list
      • Gossip messages may include membership list
      • On receiving msg, nodes update membership list
      • Mark a node as failed if no update after a certain period
  • Heartbeat
    • Nodes periodically send messages with sequence numbers
    • Centralized
      • Leader may fail
    • Or ring based
    • Or all to all

CAP theorem

  • C: Consistency
    • Operations on the distributed system act as if on a single machine
    • Every successful read receives the most recent write.
  • A: Availability
    • Every request to a non-failing node must result in a response
    • Translate to revenue
    • Read and write should complete with a low latency
  • P: Partition tolerance
    • Good even when all msg from one part of the system to other are cut off
    • Consistency: read the most recent write (the written node may be blocked)
    • Availability: read the latest copy (the node with that copy may be blocked)

Partition can happen when:

  • Internet outages
    • Between data centers
    • Between racks within a data center
  • Data centers may be blocked
    • Governments
    • Firewalls


A distributed data store can satisfy only at most two of CAP guarantees.

Consistency levels

  • Strong consistency
    • Read always see the most recent write
  • Eventual consistency
    • Read may see older copy sometimes
    • Eventually, the most recent
  • Causal consistency
    • Only causally related reads wait for the most recent write
    • Read y does not wait to update x value (unrelated)



  • Gnutella
    • Discover neighbors
      • Ping
      • Pong
    • Request files
      • Query
      • Query hit
  • Chord
    • Consistent hashing
    • Tradeoff between routing table size and lookup time
      • A node knows every other node
        • Fast lookup, large table
      • A node only knows neighbors
        • Slow lookup, small table
      • Finger table: A node knows m other nodes
        • Maps nodes at distance , , …,

Videos about

Creative Commons License
Melon blog is created by melonskin. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2016-2020. All rights reserved by melonskin. Powered by Jekyll.