Public speaking course notes Read "Dynamo, Amazon’s Highly Available Key-value Store" Read "Bigtable, A Distributed Storage System for Structured Data" Read "Streaming Systems" 3, Watermarks Read "Streaming Systems" 1&2, Streaming 101 Read "F1, a distributed SQL database that scales" Read "Zanzibar, Google’s Consistent, Global Authorization System" Read "Spanner, Google's Globally-Distributed Database" Read "Designing Data-intensive applications" 12, The Future of Data Systems IOS development with Swift Read "Designing Data-intensive applications" 10&11, Batch and Stream Processing Read "Designing Data-intensive applications" 9, Consistency and Consensus Read "Designing Data-intensive applications" 8, Distributed System Troubles Read "Designing Data-intensive applications" 7, Transactions Read "Designing Data-intensive applications" 6, Partitioning Read "Designing Data-intensive applications" 5, Replication Read "Designing Data-intensive applications" 3&4, Storage, Retrieval, Encoding Read "Designing Data-intensive applications" 1&2, Foundation of Data Systems Three cases of binary search TAMU Operating System 2 Memory Management TAMU Operating System 1 Introduction Overview in cloud computing 2 TAMU Operating System 7 Virtualization TAMU Operating System 6 File System TAMU Operating System 5 I/O and Disk Management TAMU Operating System 4 Synchronization TAMU Operating System 3 Concurrency and Threading TAMU Computer Networks 5 Data Link Layer TAMU Computer Networks 4 Network Layer TAMU Computer Networks 3 Transport Layer TAMU Computer Networks 2 Application Layer TAMU Computer Networks 1 Introduction Overview in distributed systems and cloud computing 1 A well-optimized Union-Find implementation, in Java A heap implementation supporting deletion TAMU Advanced Algorithms 3, Maximum Bandwidth Path (Dijkstra, MST, Linear) TAMU Advanced Algorithms 2, B+ tree and Segment Intersection TAMU Advanced Algorithms 1, BST, 2-3 Tree and Heap TAMU AI, Searching problems Factorization Machine and Field-aware Factorization Machine for CTR prediction TAMU Neural Network 10 Information-Theoretic Models TAMU Neural Network 9 Principal Component Analysis TAMU Neural Network 8 Neurodynamics TAMU Neural Network 7 Self-Organizing Maps TAMU Neural Network 6 Deep Learning Overview TAMU Neural Network 5 Radial-Basis Function Networks TAMU Neural Network 4 Multi-Layer Perceptrons TAMU Neural Network 3 Single-Layer Perceptrons Princeton Algorithms P1W6 Hash Tables & Symbol Table Applications Stanford ML 11 Application Example Photo OCR Stanford ML 10 Large Scale Machine Learning Stanford ML 9 Anomaly Detection and Recommender Systems Stanford ML 8 Clustering & Principal Component Analysis Princeton Algorithms P1W5 Balanced Search Trees TAMU Neural Network 2 Learning Processes TAMU Neural Network 1 Introduction Stanford ML 7 Support Vector Machine Stanford ML 6 Evaluate Algorithms Princeton Algorithms P1W4 Priority Queues and Symbol Tables Stanford ML 5 Neural Networks Learning Princeton Algorithms P1W3 Mergesort and Quicksort Stanford ML 4 Neural Networks Basics Princeton Algorithms P1W2 Stack and Queue, Basic Sorts Stanford ML 3 Classification Problems Stanford ML 2 Multivariate Regression and Normal Equation Princeton Algorithms P1W1 Union and Find Stanford ML 1 Introduction and Parameter Learning

Multi-Threading with Cpp

2018-02-10

Multi-threading technique indeed improves the efficiency of a program dramatically. For example, a web crawler I recently created can crawl one million URLs (parsing URL, doing DNS, sending/receiving with Socket, parsing response, etc) in a minute. Here I will list some key contents in order to share or reuse in the future.

Shared data

There should be some shared data between threads. For example, a CRITICAL_SECTION object handling locking and unlocking (much faster than mutexes), data to be consumed (like a queue containing all URLs to be crawled). It’s recommended that we create a class to hold those shared data. Note InitializeCriticalSection() needs to be called in the constructor of this class in order to use it.

Create/close threads

Threads can be created/closed with a function like below. threadStat and threadCraw are user-defined functions for single threads with def UINT threadCrawl(LPVOID pParam).

HANDLE *handles = new HANDLE[numThreads + 1];
// start stat thread
handles[numThreads] = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)threadStat, &sharedData, 0, NULL);
// start N crawling threads
for (int i = 0; i < numThreads; i++)
{
    handles[i] = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)threadCrawl, &sharedData, 0, NULL);
}
// wait for N crawing threads to finish
// signal stats thread to quit, wait for it to terminate
for (int i = 0; i < numThreads + 1; i++)
{
    WaitForSingleObject(handles[i], INFINITE);
    CloseHandle(handles[i]);
}

Lock/unlock

With CRITICAL_SECTION, we can lock and unlock as follows. Note we need to unlock before break to avoid deadlock.

// lock
EnterCriticalSection(&(sharedData->cs));
if (sharedData->urlsQueue.empty())
{
    // unlock
    LeaveCriticalSection(&(sharedData->cs));
    break;
}
url = sharedData->urlsQueue.front();
sharedData->urlsQueue.pop();
InterlockedIncrement(&(sharedData->numExtractedURLs));
// unlock
LeaveCriticalSection(&(sharedData->cs));

Atomic operations

To update the stats, we can use locking/unlocking, but it is often faster to directly use interlocked operations, each mapping to a single CPU instruction. Two examples of such functions is as below.

// increment by 1
InterlockedIncrement(&(sharedData->numExtractedURLs));

// add a number
InterlockedAdd(&(sharedData->numActiveThreads), -1);

Creative Commons License
Melon blog is created by melonskin. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2016-2025. All rights reserved by melonskin. Powered by Jekyll.