Distributed systems
Origins of distributed systems:
- Development of powerful microprocessors
- High-speed networks, LAN, WAN
Definition
A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.
Features
- A collection of autonomous computing elements, nodes
- No global clock
- Manage the group membership and organization
- Authentication: check whether you are communicating with a group member or not; confidentiality
- Open group: any node can join and send messages
- Closed group: communication only within the group
- Users think they are dealing with a single system
- Nodes need to collaborate
Collections
To organize the collection, the overlay network should stay connected: there should always be a path between any two nodes along which messages can be routed. It can be formed as a P2P network.
- Structured overlay
- Nodes have a well defined set of neighbors to communicate
- Unstructured overlay
- randomly selected other nodes
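The two overlay styles can be contrasted in a minimal sketch; `ring_overlay` and `random_overlay` are invented names, and a real system would build these links through a join protocol rather than from global knowledge.

```python
import random

def ring_overlay(node_ids):
    """Structured overlay: each node's neighbors are a well-defined
    set, here its ring predecessor and successor."""
    n = len(node_ids)
    return {node_ids[i]: {node_ids[(i - 1) % n], node_ids[(i + 1) % n]}
            for i in range(n)}

def random_overlay(node_ids, degree, seed=0):
    """Unstructured overlay: each node links to `degree` randomly
    selected other nodes."""
    rng = random.Random(seed)
    return {nid: set(rng.sample([m for m in node_ids if m != nid], degree))
            for nid in node_ids}

nodes = list(range(8))
structured = ring_overlay(nodes)
unstructured = random_overlay(nodes, degree=3)
```

The structured overlay gives predictable routing; the unstructured one is cheaper to maintain but routing degenerates to search.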
Single coherent system
The system should appear coherent to its users.
Distribution transparency: end users do not know on which computer a process is running, or where data is stored.
But partial failures are difficult to hide.
Middleware
Middleware is a distributed-system layer placed on top of the OS of each node, acting as the OS of the distributed system. It is a container of commonly used components and functions that are now provided by the system rather than by individual applications. It provides services:
- Resource management
- Facilities for inter-application communication
- Security
- Accounting
- Masking of and recovery from failures
Differences from OS:
- Offered in a networked environment
- Most services are useful to many applications
Typical services:
- Communication
- e.g. Remote Procedure Call (RPC): an application can invoke functions implemented and executed on a remote computer as if locally available
- Transactions
- Support in an all-or-nothing fashion, atomic
- Service composition
- Develop new application by combining existing ones. e.g. mashups.
- Reliability
- Enhanced functions for building reliable applications
- e.g. Horus: messages are received in an all-or-none fashion
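The RPC idea above can be sketched in miniature: a client-side stub marshals a call into a message, and a server-side dispatcher unmarshals it and runs the named function. All names here are invented, and the "transport" is an in-process function standing in for the network.

```python
import json

def handle_request(registry, raw_request):
    """Server side: decode a request message, run the named
    function, and encode the reply."""
    req = json.loads(raw_request)
    result = registry[req["method"]](*req["args"])
    return json.dumps({"result": result})

class RpcStub:
    """Client side: calling stub.add(1, 2) looks like a local call,
    but is marshalled into a message and sent over the transport."""
    def __init__(self, transport):
        self.transport = transport  # function: request str -> reply str
    def __getattr__(self, method):
        def call(*args):
            raw = json.dumps({"method": method, "args": list(args)})
            return json.loads(self.transport(raw))["result"]
        return call

# Wire both sides together in-process for illustration.
registry = {"add": lambda a, b: a + b, "upper": str.upper}
stub = RpcStub(lambda raw: handle_request(registry, raw))
```

Real RPC systems add service discovery, timeouts, and failure handling on top of this marshalling idea.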
Design goals
Supporting resource sharing
- It is cheaper to share a single high-end reliable storage facility.
- Best illustrated by P2P networks.
Making distribution transparent
Hide the fact that its processes and resources are physically distributed across multiple computers (possibly separated by large distances). Invisible to end users and applications.
Types of transparency:
- Access
- Hide differences in data representation and in how objects are accessed
- Location
- Hide where objects are stored
- Relocation
- Hide that objects may be moved while in use
- Migration
- Hide that objects may be moved
- Replication
- Hide that objects are replicated
- Concurrency
- Hide that objects may be shared by several users
- Failure
- Hide failure and recovery of objects
The price for achieving full transparency may be surprisingly high.
Being Open
The system offers components that can be easily used by or integrated into other systems.
Define services with interface, using Interface Definition Language.
- Interoperability
- Two components can work together as specified by a common standard
- Portability
- An application developed for system A can run on system B if B implements the same interface as A
- Extensibility
- Easy to add parts
Separate policy from mechanism
It is crucial that the system be organized as a collection of small, easily replaceable components.
Being scalable
Dimensions:
- Size scalability
- Add users and resources without losing performance
- CPU, storage and I/O rate, network
- Geographical scalability
- Users and resources may lie far apart; communication delays are large
- Synchronous communication, block until response. Slow on WAN
- Communication less reliable on WAN
- Limited bandwidth
- WAN system has very limited facilities for multi-point communications
- Administrative scalability
- Easily managed even if it spans many admin orgs
- Conflicting policies of resource usage, management and security
- System needs to protect itself against attacks from the new domain
- New domain users may only have read access
Scaling techniques
- Scaling up: improve CPU, memory, network capacity
- Scaling out: expanding by deploying more machines
Method for scaling out:
- Hiding communication latencies
- In the case of geographical scalability
- Try to avoid waiting for remote response as much as possible
- Asynchronous communication
- Move job from server to client
- Distribution of work
- e.g. DNS: the name space is partitioned into zones handled by separate servers
- Replication
- Distribute copies across the system
- Caching
- Drawback: consistency
- Global synchronization is hard to scale
Pitfalls
False assumptions while developing systems:
- Reliable network
- Secure network
- Homogeneous network
- Topology does not change
- Zero latency
- Infinite bandwidth
- Zero transport cost
- One administrator
Types of distributed systems
High performance distributed computing
- Cluster computing
- Collection of similar workstations with the same OS and LAN
- Super computer
- Homogeneity
- Grid computing
- Federation of computer systems under different admin domains, software and hardware
- Collaboration: virtual organization
- Layers: fabric; connectivity & resource; collective; applications
- Cloud computing
- Constructed dynamically
- Provides a large pool of resources
- Resources are virtualized and opened to customers
- Layers: hardware, infrastructure, platform, application
- Services: Infrastructure-, Platform-, and Software-as-a-Service
Distributed information systems
Interoperability: Make it easier to integrate applications into an enterprise-wide information system.
Enterprise Application Integration: letting applications communicate directly with each other.
Distributed transaction (operation on a database) properties:
- Atomic
- Consistent
- Does not violate system invariants
- Isolated
- Concurrent transactions do not interfere with each other
- Durable
- Once committed, changes are permanent
Nested transactions may consist of many sub-transactions. Permanence applies only to the top-level transaction.
Transaction processing monitor handles distributed transactions.
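The atomic (all-or-nothing) property can be illustrated with a toy key-value store that stages writes and applies them only on commit; `KVStore` is an invented name, not a real TP-monitor API.

```python
class KVStore:
    """Toy store with atomic transactions: writes are buffered on a
    private copy and applied only on commit; any error aborts them all."""
    def __init__(self):
        self.data = {}

    def transaction(self, ops):
        staged = dict(self.data)       # work on a private copy
        try:
            for key, value in ops:
                if value is None:
                    del staged[key]    # deleting a missing key aborts
                else:
                    staged[key] = value
        except KeyError:
            return False               # abort: original data untouched
        self.data = staged             # commit: all writes become visible
        return True

store = KVStore()
store.transaction([("a", 1), ("b", 2)])
store.transaction([("a", 10), ("missing", None)])  # aborts as a whole
```

After the failed transaction the earlier values survive unchanged; a distributed version must coordinate this commit decision across nodes (e.g. via a TP monitor).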
Enterprise Application Integration
Applications communicate directly with each other.
Among several types of communication middleware, RPC allows an application to send a request to, and receive a response from, another application component by making what looks like a local procedure call.
Remote Method Invocation (RMI) is similar to RPC, but operates on objects instead of functions.
RPC and RMI drawback: caller and callee need to be up and running to communicate, and they need to know how to reach each other. This leads to message-oriented middleware (MOM): messages are sent to logical contact points, and publish/subscribe lets applications indicate which messages they are interested in.
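A minimal publish/subscribe broker in the spirit of MOM might look like the sketch below: publishers and subscribers share only a logical topic name, never each other's addresses. The `Broker` class is hypothetical.

```python
from collections import defaultdict

class Broker:
    """Minimal message-oriented middleware sketch: messages go to a
    logical topic, and the broker fans them out to subscribers."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
received = []
broker.subscribe("orders", received.append)
broker.publish("orders", {"id": 1})
broker.publish("payments", {"id": 2})   # no subscriber: dropped
```

Unlike RPC, the publisher does not need the subscriber to be known or even running at send time; a real MOM would also queue messages for offline receivers.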
Pervasive systems
With the introduction of mobile and embedded computing devices, pervasive systems are naturally distributed.
- Have many sensors and actuators
- Devices: small, battery-powered, mobile, wireless
Types:
- Ubiquitous computing systems
- System is pervasive and continuously present
- Requirement:
- Distribution, in a transparent manner
- Interaction, highly unobtrusive
- Context awareness: System knows user context
- Autonomy
- Address allocation
- Adding devices
- Auto updates
- Intelligence, handles a wide range of dynamic actions.
- Mobile systems
- Devices vary widely
- Locations are changing over time
- Flooding-based messaging: messages spread over (part of) the network
- Disruption-tolerant networks: an intermediate node can store messages until it finds the next node
- Sensor networks
- A collection of input devices
Work load
3-tier on cloud infrastructure:
- Web (Client) tier
- Business Logic tier
- Data tier
Virtualization
From 1996 to 2013,
- Server spending stayed roughly steady.
- The management and administration share kept increasing.
- The number of physical servers went up until 2008, then stayed constant.
- The number of logical servers kept increasing.
20 years ago,
- Server utilization was very low, around 7%
- Mainframe utilization was very high
- Multiple OSs ran in one mainframe
- By virtualizing the hardware
- Hypervisors enable multiple OSs to run on the same hardware
- But the server industry did not adopt this
Virtualization concerns a single server.
- Hypervisor provides virtualized access to hardware for multiple OSs.
- Virtual Machine Monitor is an efficient, isolated duplicate of the real machine.
- fidelity, performance, safety and isolation
- Until 2005, x86 was not virtualizable
- 9 instructions break things
- Evolution of virtual support: CPU, memory, I/O
- Itanium, MIPS, ARM are mostly virtualizable
VMware software VMM:
- Software VMMs run very slowly due to interpretation
- Replace sensitive instructions in guest on-the-fly
- Works on binary input, not source code
Paravirtualization
- Presents a software interface to virtual machines that is similar but not identical to that of the underlying hardware
- Lets the guest hand difficult portions of code over to the VMM
- The guest OS must be explicitly ported to the para-API
- A conventional OS may not work on a para-VMM
Hypervisors:
- VMware
- Xen
- KVM
- Hyper-V (Microsoft)
Nested virtualization
- The Turtles project at IBM
- Hypervisor running on top of Hypervisor
- Ship hardware with a modified hypervisor. Other hypervisors can be run on top of this hypervisor.
Difference between cloud computing and virtualization
- Virtualization is part of cloud computing
- A human is needed to interact with hypervisors to create virtual machines
- Cloud computing: API handles all automation to communicate with hypervisors
Container
A container is user-level virtualization.
Now applications are:
- constantly developed
- New versions are being deployed often
- built from loosely coupled components
- deployed to a multitude of servers
Containers make apps portable:
- The app looks the same everywhere
- No matter where it is running
- No need to install all dependencies on the host
Vagrant:
- contains project code and provisioning script
- Can be pushed to a VM, GitHub, and various servers
- Put code in environment
Docker:
- Dockerfile: define dependencies
- Docker image: A complete wrap of application and dependencies
- Build environment
Distributed computing
Distributed system:
- Multiple computers connected with each other
- Working together
- Seems like a single coherent system.
- Failures are normal and must be tolerated
Challenges:
- Failure
- Expected to fail
- Concurrency
- Accessing the same data
- Synchronization
- No common lock
- Scalability
Example of problems
- Consensus
- Nodes agree on a value of a piece of data
- Leader election
- Group management
- Failure detection
- Clock synchronization
NoSQL
- Key-value
- Redis
- Column-oriented
- Cassandra
- Graph
- Social network data
- Document
- MongoDB, XML
Workload nowadays:
- Volume: large amounts of data
- Unstructured: hard to fit into a schema
- Images, videos
- Random reads and writes, write heavy
Needs:
- Speed: queries quickly
- Distributed: availability, no single point of failure
- Low cost: include cost of system admin
- Scalable: demand changes quickly
NoSQL features:
- Non relational
- No schema required
- Easily distributed
- Horizontal scaling, scale out (add more machines)
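Horizontal scaling typically means sharding: hashing each key to one of several machines. The sketch below uses simple modulo hashing (names invented); its drawback is that changing the number of shards moves almost every key, which is what consistent hashing, covered under Chord in these notes, avoids.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a key to one of `num_shards` machines by hashing, so data
    and load spread across servers (scale out, not up)."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Each dict stands in for one machine's local store.
shards = [{} for _ in range(4)]
for key, value in [("user:1", "alice"), ("user:2", "bob"), ("user:3", "eve")]:
    shards[shard_for(key, 4)][key] = value
```

A client can route any read or write with the same `shard_for` computation, with no central coordinator.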
In-memory caching
Scenario to use:
- Speed
- Fetching data from database or remote API call is slow
- Data access pattern
- Frequency: high
- Staleness
- Mostly static data
Example:
- Amazon’s ElastiCache
- Redis
- Memcached
- DynamoDB
Memcached
- In-memory key-value store
- Flexible
- Values can be anything, webpages, numbers
- Items in it have an expiration time
- Items replaced in LRU fashion when cache is full
- Operations
- get, set, delete
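The behavior above can be sketched as a tiny Memcached-like store with per-item expiration and LRU eviction; `MiniCache` is an invented name, not the real Memcached API.

```python
import time
from collections import OrderedDict

class MiniCache:
    """Memcached-like in-memory key-value store: per-item expiration,
    LRU eviction when full, and get/set/delete operations."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()     # key -> (value, expires_at)

    def set(self, key, value, ttl=60):
        self.items.pop(key, None)
        self.items[key] = (value, time.monotonic() + ttl)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self.items[key]              # expired: treat as a miss
            return None
        self.items.move_to_end(key)          # mark as recently used
        return value

    def delete(self, key):
        self.items.pop(key, None)

cache = MiniCache(capacity=2)
cache.set("a", 1); cache.set("b", 2)
cache.get("a")                 # touch "a" so "b" becomes the LRU item
cache.set("c", 3)              # full: evicts "b"
```

The `OrderedDict` doubles as the LRU list: moving a key to the end on access keeps the least recently used key at the front.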
Failure detection
- Failures are the norm
- Need failure detector to determine if a process has failed
Types:
- ping-based
- Send ping and receive ACK
- Randomly pick target
- If no ACK, ask other processes to check the target
- Gossip protocol
- Messages are like infection, go to neighboring nodes (or just a subset)
- Every node maintains a membership list
- Gossip messages may include membership list
- On receiving msg, nodes update membership list
- Mark a node as failed if no update after a certain period
- Heartbeat
- Nodes periodically send messages with sequence numbers
- Centralized
- Leader may fail
- Or ring based
- Or all to all
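The heartbeat variant can be sketched as follows: nodes report increasing sequence numbers, and a node whose heartbeat has not advanced within a timeout is suspected of failure. The class name and the explicit `now` clock parameter are for illustration.

```python
class HeartbeatDetector:
    """Heartbeat-style failure detector: each node periodically reports
    a growing sequence number; a node whose number has not advanced
    within `timeout` time units is suspected to have failed."""
    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # node -> (sequence number, local time)

    def heartbeat(self, node, seq, now):
        prev = self.last_seen.get(node)
        if prev is None or seq > prev[0]:   # ignore stale/duplicate beats
            self.last_seen[node] = (seq, now)

    def suspected(self, now):
        return [n for n, (_, t) in self.last_seen.items()
                if now - t > self.timeout]

detector = HeartbeatDetector(timeout=3)
detector.heartbeat("n1", seq=1, now=0)
detector.heartbeat("n2", seq=1, now=0)
detector.heartbeat("n1", seq=2, now=4)   # n1 is still alive
```

Note the detector can only *suspect* failure: a slow network is indistinguishable from a crashed node, which is why the timeout is a tuning knob between detection speed and false alarms.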
CAP theorem
- C: Consistency
- Operations on the distributed system act as if on a single machine
- Every successful read receives the most recent write.
- A: Availability
- Every request to a non-failing node must result in a response
- Availability translates directly to revenue
- Read and write should complete with a low latency
- P: Partition tolerance
- The system keeps working even when all messages from one part of the system to another are cut off
- Choosing consistency: a read returns the most recent write, but the node holding that write may be cut off, so the read can block
- Choosing availability: a read returns the latest reachable copy, which may be stale
Partition can happen when:
- Internet outages
- Between data centers
- Between racks within a data center
- Data centers may be blocked
- Governments
- Firewalls
Theorem
A distributed data store can satisfy at most two of the three CAP guarantees.
Consistency levels
- Strong consistency
- Read always see the most recent write
- Eventual consistency
- Read may see older copy sometimes
- Eventually, the most recent
- Causal consistency
- Only causally related reads wait for the most recent write
- A read of y does not wait for an update of x (unrelated)
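Eventual consistency is often reached through anti-entropy: replicas that accepted writes independently periodically merge their state. The sketch below assumes a last-writer-wins policy with (timestamp, value) pairs; real systems may use vector clocks or other conflict-resolution rules instead.

```python
def merge(replica_a, replica_b):
    """One anti-entropy step: for each key, keep the version with the
    highest timestamp (last writer wins)."""
    merged = {}
    for key in replica_a.keys() | replica_b.keys():
        candidates = [r[key] for r in (replica_a, replica_b) if key in r]
        merged[key] = max(candidates, key=lambda v: v[0])  # v = (ts, value)
    return merged

# Two replicas accepted writes independently; until they merge,
# a read against replica "a" sees the stale value of x.
a = {"x": (1, "old"), "y": (5, "yes")}
b = {"x": (2, "new")}
converged = merge(a, b)
```

After both replicas adopt `converged`, reads anywhere see the most recent write, which is exactly the "eventually" in eventual consistency.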
P2P
Protocol:
- Gnutella
- Discover neighbors
- Ping
- Pong
- Request files
- Query
- Query hit
- Chord
- Consistent hashing
- Tradeoff between routing table size and lookup time
- A node knows every other node
- Fast lookup, large table
- A node only knows neighbors
- Slow lookup, small table
- Finger table: a node knows m other nodes
- Entry i points to the first node at distance at least 2^(i-1) around the ring, for i = 1, 2, ..., m
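Chord's consistent hashing can be sketched as a sorted ring of node hashes, where a key belongs to the first node clockwise from its own hash. `HashRing` is an invented name, and this sketch omits finger tables and multi-hop routing.

```python
import bisect
import hashlib

def _h(s):
    """Hash a string onto the ring's identifier space."""
    return int(hashlib.sha256(s.encode()).hexdigest(), 16)

class HashRing:
    """Consistent hashing: nodes and keys hash onto the same ring, and
    a key is stored on the first node clockwise from it, so adding a
    node only takes keys from that node's immediate neighbor."""
    def __init__(self, nodes):
        self.ring = sorted((_h(n), n) for n in nodes)

    def lookup(self, key):
        hashes = [h for h, _ in self.ring]
        i = bisect.bisect_right(hashes, _h(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner_before = ring.lookup("some-key")
ring2 = HashRing(["node-a", "node-b", "node-c", "node-d"])  # one node joins
```

With plain modulo hashing, adding a node would remap almost every key; here only the keys that now fall in the new node's arc of the ring move, which is the tradeoff Chord builds on.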