Read "Streaming Systems" 1&2, Streaming 101 Read "F1, a distributed SQL database that scales" Read "Zanzibar, Google’s Consistent, Global Authorization System" Read "Spanner, Google's Globally-Distributed Database" Read "Designing Data-intensive applications" 12, The Future of Data Systems IOS development with Swift Read "Designing Data-intensive applications" 10&11, Batch and Stream Processing Read "Designing Data-intensive applications" 9, Consistency and Consensus Read "Designing Data-intensive applications" 8, Distributed System Troubles Read "Designing Data-intensive applications" 7, Transactions Read "Designing Data-intensive applications" 6, Partitioning Read "Designing Data-intensive applications" 5, Replication Read "Designing Data-intensive applications" 3&4, Storage, Retrieval, Encoding Read "Designing Data-intensive applications" 1&2, Foundation of Data Systems Three cases of binary search TAMU Operating System 2 Memory Management TAMU Operating System 1 Introduction Overview in cloud computing 2 TAMU Operating System 7 Virtualization TAMU Operating System 6 File System TAMU Operating System 5 I/O and Disk Management TAMU Operating System 4 Synchronization TAMU Operating System 3 Concurrency and Threading TAMU Computer Networks 5 Data Link Layer TAMU Computer Networks 4 Network Layer TAMU Computer Networks 3 Transport Layer TAMU Computer Networks 2 Application Layer TAMU Computer Networks 1 Introduction Overview in distributed systems and cloud computing 1 A well-optimized Union-Find implementation, in Java A heap implementation supporting deletion TAMU Advanced Algorithms 3, Maximum Bandwidth Path (Dijkstra, MST, Linear) TAMU Advanced Algorithms 2, B+ tree and Segment Intersection TAMU Advanced Algorithms 1, BST, 2-3 Tree and Heap TAMU AI, Searching problems Factorization Machine and Field-aware Factorization Machine for CTR prediction TAMU Neural Network 10 Information-Theoretic Models TAMU Neural Network 9 Principal Component Analysis TAMU Neural Network 8 Neurodynamics TAMU Neural Network 7 Self-Organizing Maps TAMU Neural Network 6 Deep Learning Overview TAMU Neural Network 5 Radial-Basis Function Networks TAMU Neural Network 4 Multi-Layer Perceptrons TAMU Neural Network 3 Single-Layer Perceptrons Princeton Algorithms P1W6 Hash Tables & Symbol Table Applications Stanford ML 11 Application Example Photo OCR Stanford ML 10 Large Scale Machine Learning Stanford ML 9 Anomaly Detection and Recommender Systems Stanford ML 8 Clustering & Principal Component Analysis Princeton Algorithms P1W5 Balanced Search Trees TAMU Neural Network 2 Learning Processes TAMU Neural Network 1 Introduction Stanford ML 7 Support Vector Machine Stanford ML 6 Evaluate Algorithms Princeton Algorithms P1W4 Priority Queues and Symbol Tables Stanford ML 5 Neural Networks Learning Princeton Algorithms P1W3 Mergesort and Quicksort Stanford ML 4 Neural Networks Basics Princeton Algorithms P1W2 Stack and Queue, Basic Sorts Stanford ML 3 Classification Problems Stanford ML 2 Multivariate Regression and Normal Equation Princeton Algorithms P1W1 Union and Find Stanford ML 1 Introduction and Parameter Learning

A tour of C++, Basics

2019-01-06

This is my notes taken while reading A tour of C++, to refresh C++ knowledge for future job.

Very basics

For a program to run, its source text has to be processed by a compiler, producing object files, which are combined by a linker yielding an executable program.

  • int main() {} is needed for every program.
    • Return value 0 indicates success.
  • #include <iostream>
    • instructs the compiler to include the declarations of the standard stream I/O facilities as found in iostream.
  • using namespace std;
    • make names from std visible without std::.

Functions

// def
Elem* next_elem(); // no argument; return a pointer to Elem (an Elem*) 
void exit(int); // int argument; return nothing
double sqrt(double); // double argument; return a double
double get(const vector<double>& vec, int index); // type: double(const vector<double>&,int) 
char& String::operator[](int index); // type: char& String::(int)


// usage
double s2 = sqrt(2); // call sqrt() with the argument double{2}
double s3 = sqrt("three"); // error: sqrt() requires an argument of type double

Function overloading:

  • Use it to perform similar tasks.
  • If two functions are defined with the same name, but with different argument types, the compiler will choose the most appropriate function to invoke for each call.
  • If two alternative functions could be called, but neither is better than the other, the call is deemed ambiguous and the compiler gives an error.

Types, variables and constants

  • bool
    • 1B
  • char
    • 1B
  • int
    • 4B
  • double
    • 8B
  • unsigned
    • 4B
  • sizeof(char) is often 4.
  • Conversions that lose information, narrowing conversions, such as double to int and int to char are allowed and implicitly applied.
    • Avoid this.

Operators:

// arithmetic
x+y // plus
+x // unary plus
xy // minus
x // unary minus
xy // multiply
x/y // divide
x%y // remainder (modulus) for integers

x+=y // x=x+y
++x // increment: x = x+1 
x=y // x=x-y
−−x // decrement: x = x-1 
x*=y // scaling: x = x*y 
x/=y // scaling: x = x/y 
x%=y // x=x%y


// comparison
x==y //equal
x!=y // not equal
x<y // less than
x>y // greater than
x<=y // less than or equal 
x>=y // greater than or equal


// logical
x&y // bitwise and
x|y // bitwise or
x^y // bitwise exclusive or 
~x // bitwise complement 
x&&y // logical and
x||y // logical or

Variables: Don’t declare a variable until you have a value to initialize it with.

double d1 = 2.3; // initialize d1 to 2.3 
double d2 {2.3}; // initialize d2 to 2.3
complex<double> z = 1; // a complex number with double-precision floating-point scalars
complex<double> z2 {d1,d2}; 
complex<double> z3 = {1,2}; // the = is optional with { ... }
vector<int> v {1,2,3,4,5,6}; // a vector of ints

// auto
auto ch = 'x'; // a char
auto z = sqrt(y); // z has the type of whatever sqrt(y) returns

Constants:

  • const
    • I promise not to change this value.
  • constexpr
    • to be evaluated at compile time.
    • to be placed in read-only memory.
      • unlikely to be corrupted;
      • for performance.
    • For a function to be usable in a constant expression, it must be defined constexpr.
      • To be constexpr, a function must be rather simple: just a return-statement computing a value.
      • A constexpr function can be used for non-constant arguments, but when that is done the result is not a constant expression.
constexpr double square(double x) { return x*x; }

const int dmv = 17;
constexpr double max1 = 1.4square(dmv); // dmv must be const

Scope

Types of scopes:

  • Local scope
    • A name declared in a function or lambda.
    • Its scope extends from its point of declaration to the end of the block {} in which its declaration occurs.
    • Function argument names are considered local names.
  • Class scope
    • It is defined in a class, outside any function, lambda, or enum class.
    • Its scope extends from the opening { of its enclosing declaration to the end of that declaration.
  • Namespace scope
    • It is defined in a namespace, outside any function, lambda, class, or enum class.
    • Its scope extends from the point of declaration to the end of its namespace.
  • Global namespace
    • A name not declared inside any other construct.

An object must be constructed (initialized) before it is used and will be destroyed at the end of its scope.

  • For a namespace object the point of destruction is the end of the program.
  • For a member, the point of destruction is determined by the point of destruction of the object of which it is a member.
  • An object created by new lives until destroyed by delete.
  • Keep common and local names short, and keep uncommon and non-local names longer.

Pointers, arrays and references

char v[6]; // array of 6 characters
char* p = &v[3]; // p points to v's fourth element
char x = *p; // *p is the object that p points to
  • Size of an array must be constexpr.
  • In an expression, prefix unary * means “contents of”.
  • In an expression, prefix unary & means “address of”.
  • In a declaration, the unary suffix & means “reference to”.
    • Don’t need to use prefix *.
    • Do not copy the value for function argument.
      • void sort(vector<double>& v);
    • Can change the value.
      • If we don’t want to change the value.
        • Use const double sum(const vector<double>&)
  • nullptr stands for “no object available”.
    • double* pd = nullptr;
    • Wise to check null pointer for function arguments.
    • nullptr is falsy.
  • We can move a pointer to point to the next element in an array with ++.
// have x refer to an element
for (auto& x : v) ++x;

Declarations:


T a[n]; // T[n]: array of n Ts
T* p; // T*: pointer to T
T& r; // T&: reference to T
T f(A); // T(A): function taking an argument of type A returning a result of type T

User-defined types

Structures

struct MyVector {
  int size;
  double* elem;
}

// create and access
Vector v;
v.elem = new double[10];
v.size = 10;
Vector* pv = &v;
int s1 = pv->size;

Classes

class Vector { 
  public:
    Vector(int s) :elem{new double[s]}, sz{s} { } // construct a Vector with member initializer list
    double& operator[](int i) { return elem[i]; } // element access: subscripting
    int size() { return sz; }
  private:
    double* elem; // pointer to the elements 
    int sz; // the number of elements
};

// create and access
Vector v(6);
double d = v[2];

Unions

A union is a struct in which all members are allocated at the same address so that the union occupies only as much space as its largest member. To avoid errors, one can encapsulate a union so that the correspondence between a type field and access to the union members is guaranteed. (wrap them in a class together with a type field)

// use either v.s or v.i
union Value { 
  char* s;
  int i; 
};

Enumerations

Enumerations are used to represent small sets of integer values. They are used to make code more readable and less error-prone.

enum class Color { red, blue, green };
enum class Traffic_light { green, yellow, red };

Color col = Color::red;
Traffic_light light = Traffic_light::red;

// define operators
Traffic_light& operator++(Traffic_light& t) 
  // prefix increment: ++
{
  switch (t) {
    case Traffic_light::green: return t=Traffic_light::yellow;
    case Traffic_light::yellow: return t=Traffic_light::red;
    case Traffic_light::red: return t=Traffic_light::green;
  }
}
Traffic_light next = ++light; // next becomes Traffic_light::green

If we don’t use class after enum, we get a plain enum, but less behaved.

enum Color { red, green, blue }; 
int col = green;

Modularity

Interface is represented by declarations. It’s differed from its implementation.

// declaration
double sqrt(double); // the square root function takes a double and returns a double

class Vector { 
public:
  Vector(int s); 
  double& operator[](int i); 
  int size();
private:
  double elem; // elem points to an array of sz doubles 
  int sz;
};

// implementation
double sqrt(double d) // definition of sqrt() 
{
  // ... algorithm as found in math textbook ... 
}

Separate compilation

Separate library functions from the application codes in different files.

In library header file, we define the declarations. In library implementation file, we need to include the header file.

// Vector.h:
class Vector { 
  public:
    explicit Vector(int s); // no implicit conversion from int to Vector
    ~Vector(); // destructor
    double& operator[](int i); 
    int size();
  private:
    double elem; // elem points to an array of sz doubles 
    int sz;
}

Implementation:

// Vector.cpp:

#include "Vector.h" // get the interface 

Vector::Vector(int s)
  :elem{new double[s]}, sz{s}
{
}

Vector::~Vector() { delete[] elem; } // free spaces allocated by new

double& Vector::operator[](int i) 
{
  return elem[i]; 
}

int Vector::size() 
{
  return sz; 
}

In user code, we include useful files as header file to access that interface.

We should avoid naked new and delete; use constructor and destructor instead.

// user.cpp:
#include "Vector.h" // get Vector’s interface
#include <cmath> // get the the standard-library math function interface including sqrt()

using namespace std; // make std members visible (§3.3)

double sqrt_sum(Vector& v) 
{
  double sum = 0;
  for (int i=0; i!=v.size(); ++i)
    sum+=sqrt(v[i]); // sum of square roots
  return sum;
}

Namespaces

We can put self-defined functions and classes in a self-defined namespace to avoid conflicts between them and std.

namespace My_code 
{ 
  class complex 
  {
    // ... 
  };
  complex sqrt(complex); 
  // ...
  int main(); 
}
int My_code::main() {
  complex z {1,2};
  auto z2 = sqrt(z);
  std::cout << '{' << z2.real() << ',' << z2.imag() << "}\n"; 
  // ...
};

We can access functions inside with std::cout or My_code::main(). std:: is not needed if we do using namespace std (using-directive). Don’t put a using-directive in a header file

Error handling

Exceptions:

The throw transfers control to a handler for exceptions of type out_of_range. To do that, the implementation will unwind the function call stack as needed to get back to the context of that caller. The out_of_range type is defined in the standard library (in <stdexcept>) and is in fact used by some standard-library container access functions.

double& Vector::operator[](int i) 
{
  if (i<0 || size()<=i)
    throw out_of_range{"Vector::operator[]"};
  return elem[i]; 
}

A function that should never throw an exception can be declared noexcept. If all good intent and planning fails, so that user() still throws, the standard-library function terminate() is called to immediately terminate the program.

void user(int sz) noexcept 
{
  Vector v(sz);
  iota(&v[0],&v[sz],1); // fill v with 1,2,3,4... 
  // ...
}

We can rethrow the exception after it was caught.

void test() 
{
  try {
    Vector v(27);
  }
  catch (std::length_error) {
    cout << "test failed: length error\n";
    throw; // rethrow 
  }
  catch (std::bad_alloc) {
    // Ouch! test() is not designed to handle memory exhaustion 
    std::terminate(); // terminate the program
  } 
}

Invariant: e.g., make sure the input arguments reasonable.

Vector::Vector(int s) 
{
  if (s<0)
    throw length_error{};
  elem = new double[s];
  sz = s; 
}

Static assertions: we can also perform simple checks on other properties that are known at compile time and report failures as compiler error messages. The static_assert mechanism can be used for anything that can be expressed in terms of constant expressions. The most important uses of static_assert come when we make assertions about types used as parameters in generic programming.

static_assert(4<=sizeof(int), "integers are too small"); // check integer size

Classes

The central language feature of C++ is the class.

Concrete types

The basic idea of concrete classes is that they behave just like built-in types. For example, a complex number type and an infinite-precision integer.

We can:

  • place objects of concrete types on the stack, in statically allocated memory, and in other objects.
  • refer to objects directly (and not just through pointers or references).
  • initialize objects immediately and completely.
  • copy objects.
class complex 
{
  double re, im; // representation: two doubles
public:
  complex(double r, double i) :re{r}, im{i} {} 
  complex(double r) :re{r}, im{0} {} 
  complex() :re{0}, im{0} {}

  double real() const { return re; }  // const indicates this function do not modify the object
  void real(double d) { re=d; } 
  double imag() const { return im; } 
  void imag(double d) { im=d; }

  complex& operator+=(complex z) { re+=z.re, im+=z.im; return *this; }
  complex& operator=(complex z) { re=z.re, im=z.im; return *this; }
  complex& operator*=(complex); // defined out-of-class somewhere
  complex& operator/=(complex); // defined out-of-class somewhere 
};

complex operator+(complex a, complex b) { return a+=b; } 
complex operator(complex a, complex b) { return a=b; } 
complex operator(complex a) { return {a.real(), a.imag()}; } 

Abstract types

An abstract type is a type that completely insulates a user from implementation details.

class Container 
{ 
public:
  virtual double& operator[](int) = 0; // pure virtual function
  virtual int size() const = 0;
  virtual ~Container() {}
};

class Vector_container : public Container 
{ // Vector_container implements Container 
  Vector v;
public:
  Vector_container(int s) : v(s) { } // Vector of s elements  
  ~Vector_container() {}

  double& operator[](int i) { return v[i]; } // override
  int size() const { return v.size(); } 
};

void use(Container& c) 
{
  const int sz = c.size();
  for (int i=0; i!=sz; ++i) 
    cout << c[i] << '\n';
}

void g() 
{
  Vector_container vc {10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
  use(vc); // use only knows Container
}
  • virtual: may be redefined later in a class derived from this one.
  • Pure virtual function is indicated by =0. It means some class derived this class must define this function.

Virtual functions

Each class with virtual functions has its own virtual function table identifying its virtual functions. The compiler can convert the name of a virtual function into an index in this table, thus know which function to use.

Class Hierarchies

A class hierarchy is a set of classes ordered in a lattice created by derivation (e.g., :public).

  • Keyword override requires some base of this class must have a function with the same name to be override.
    • void move() override;
  • We can use the objects of the derived class wherever an object of a base object is required.
  • Functions from base classes simplifies implementations of derived classes.
  • unique_ptr from std can be used to own object and delete object when it is not needed. (when the unique_ptr object is out of scope.
    • unique_ptr<Container>

Copy and move

Copying of an object is defined by two members: a copy constructor and a copy assignment.

class Vector 
{
  // ...
public:
  Vector(const Vector& a); // copy constructor, const indicate that we don't modify a; & to avoid a copy
  Vector& operator=(const Vector& a); // copy assignment
  // ...
}

Vector::Vector(const Vector& a) // copy constructor 
  :elem{new double[a.sz]}, // allocate space for elements
  sz{a.sz} 
{
  for (int i=0; i!=sz; ++i) // copy elements 
    elem[i] = a.elem[i];
}

Vector& Vector::operator=(const Vector& a) 
{
  double* p = new double[a.sz]; 
  for (int i=0; i!=a.sz; ++i)
    p[i] = a.elem[i];
  }
  delete[] elem; // delete old elements
  elem = p;
  sz = a.sz; 
  return *this; // avoid a copy
}

Moving objects by reference, not copying:

class Vector 
{
  // ...
  Vector(Vector&& a); // move constructor
  Vector& operator=(Vector&& a); // move assignment
}

Vector::Vector(Vector&& a)
  :elem{a.elem}, // "grab the elements" from a
  sz{a.sz} 
{
  a.elem = nullptr; // now a has no elements
  a.sz = 0; 
}

Vector& Vector::operator=(Vector&& a) 
{
  delete[] elem; // delete old elements
  elem = a.elem;
  sz = a.sz; 
  a.elem = nullptr;
  a.sz = 0;
  return *this; 
}

Vector x(1000); 
Vector y(1000); 
Vector z(1000); 
z = x; // we get a copy, x is lvalue
y = std::move(x); // we get a move 
return z; // we get a move
  • && means rvalue reference, something on the right-hand side of an assignment.
  • We can easily move object out of scope.
  • If a default constructor, assignment, or destructor is appropriate, let the compiler generate it.
  • Copy and move operators are needed when a destructor is needed.
  • Initialization with = doesn’t use copy-assignment operator. It uses copy constructor instead.

Let each resource have an owner in some scope and by default be released at the end of its owners scope, e.g, memory, locks, sockets, file handles, and thread handles.

Using the default copy or move for a class in a hierarchy is typically a disaster: given only a pointer to a base, we simply don’t know what members the derived class has. So, the best thing to do is usually to delete the default copy and move operations, that is, to eliminate the default definitions of those two operations:

class Shape { 
public:
  Shape(const Shape&) =delete; // no copy operations
  Shape& operator=(const Shape&) =delete;

  Shape(Shape&&) =delete; // no move operations
  Shape& operator=(Shape&&) =delete;
  ~Shape();
// ... 
};

=delete is general; can be used to suppress other operations.

Templates

A template is a class or a function that we parameterize with a set of types or values.

  • Use templates to express algorithms that apply to many argument type.
  • Use templates to express containers.
  • When defining a template, first design and debug a non-template version; later generalize by adding parameters.
  • Templates are type-safe, but checking happens too late.
  • A template can pass argument types without loss of information.
  • A virtual function member cannot be a template member function.
  • There is no separate compilation of templates: #include template definitions in every translaion unit that uses them.
template<typename T> 
class Vector {
private:
  T* elem; // elem points to an array of sz elements of type T
  int sz; 
public:
  explicit Vector(int s); // constructor: establish invariant, acquire resources 
   ~Vector() { delete[] elem; } // destructor: release resources
  
  // ... copy and move operations ...

  T& operator[](int i);
  const T& operator[](int i) const; 
  int size() const { return sz; }
};

// member functions
template<typename T> 
Vector<T>::Vector(int s) 
{
  if (s<0)
    throw Negative_size{};
  elem = new T[s];
  sz = s; 
}

template<typename T>
const T& Vector<T>::operator[](int i) const 
{
  if (i<0 || size()<=i)
    throw out_of_range{"Vector::operator[]"};
  return elem[i]; 
}

// for range-for
template<typename T> 
T* begin(Vector<T>& x) {
  return x.size() ? &x[0] : nullptr;
}

template<typename T> 
T* end(Vector<T>& x) {
  return begin(x)+x.size();
}
  • template<typename T> prefix makes T a parameter of the declaration it prefixes.
  • Templates are a compile-time mechanism, so their use incurs no run-time overhead.

We can define different Vector like:

Vector<char> vc(200); 
Vector<string> vs(17); 
Vector<list<int>> vli(45);

// range-for
void f2(Vector<string>& vs) 
{
  for (auto& s : vs) 
    cout << s << '\n';
}

A template can take value arguments.

template<typename T, int N> 
struct Buffer {
  using value_type = T; 
  constexpr int size() { return N; } 
  T[N];
  // ...
};

Buffer<char,1024> glob; // global buffer of characters (statically allocated)

void fct() 
{
  Buffer<int,10> buf; // local buffer of integers (on the stack)
  // ... 
}
  • The alias (value_type) and the constexpr function are provided to allow users (read-only) access to the template arguments.
  • Buffer allows us to create arbitrarily sized buffers in static stack, instead of dynamic memory.
  • A template value argument must be a constant expression.

Function templates

template<typename Container, typename Value> 
Value sum(const Container& c, Value v)
{
  for (auto x : c) 
    v+=x;
  return v; 
}

// use
void user(Vector<int>& vi, std::list<double>& ld, std::vector<complex<double>>& vc) {
  int x = sum(vi,0); // the sum of a vector of ints (add ints) double d = sum(vi,0.0); // the sum of a vector of ints (add doubles) 
  double dd = sum(ld,0.0); // the sum of a list of doubles
  auto z = sum(vc,complex<double>{});
}

Function objects

Objects that can be called like functions. It can be a comparator or some operation that can modify a value.

template<typename T> 
class Less_than {
  const T val; // value to compare against 
public:
  Less_than(const T& v) :val(v) { }
  bool operator()(const T& x) const { return x<val; } // call operator 
};

// initialize
Less_than<int> lti {42}; // lti(i) will compare i to 42 using < (i<42) 
Less_than<string> lts {"Backus"}; // lts(s) will compare s to "Backus" using < (s<"Backus")

// use
bool b1 = lti(n); // true if n<42
bool b2 = lts(s); // true if s<"Backus"


// operator example
template<typename C, typename Oper>
void for_all(C& c, Oper op) // assume that C is a container of pointers 
{
  for (auto& x : c)
    op(*x); // pass op() a reference to each element pointed to
}

// use
vector<unique_ptr<Shape>> v; 
while (cin)
  v.push_back(read_shape(cin)); 
for_all(v,[](Shape& s){ s.draw(); }); 
for_all(v,[](Shape& s){ s.rotate(45); });
  • The function called operator() implements the “function call”, “call”, or “application” operator ().
  • Lambda expression: [&](const string& a){ return a<s; }
    • [&]: capture all local names such as x.
    • [&x]: capture only x.
    • [=x]: capture a copy of x.
    • [=]: capture all by value.
    • [] is ok, nothing captured.

Variadic templates

A template can be defined to accept an arbitrary number of arguments of arbitrary types. Such a template is called a variadic template. The key to implementing a variadic template is to note that when you pass a list of arguments to it, you can separate the first argument from the rest.

void f() { } // do nothing

template<typename T, typename... Tail> 
void f(T head, Tail... tail)
{
  g(head); // do something to head
  f(tail...); // try again with tail 
}

f(1,2.2,"hello"); // call f(2.2, "hello") which calls f("hello") which calls f()
  • Variadic templates can accept anh arguments you can give them.
  • But the type checking of the interface is a possibly elaborate template program.

Aliases

We can use alias to get the value type of a generic type.

using size_t = unsigned int; // size_t is an alias to unsigned int

Template<typename T> 
class Vector {
public:
  using value_type = T;
  // ... 
};


// every std container provides value_type as the name of its value type
template<typename C>
using Element_type = typename C::value_type;

template<typename Container> 
void algo(Container& c)
{
  Vector<Element_type<Container>> vec;
  // ... 
}


// we can bind some or all template arguments
template<typename Key, typename Value> 
class Map {
  // ... 
};

template<typename Value>
using String_map = Map<string,Value>;
String_map<int> m; // m is a Map<string,int>

Creative Commons License
Melon blog is created by melonskin. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2016-2019. All rights reserved by melonskin. Powered by Jekyll.