Multi-Threading with Cpp | Melon blog

Multi-Threading with Cpp

2018-02-10

Multi-threading can improve the throughput of a program dramatically. For example, a web crawler I recently created can crawl one million URLs (parsing the URL, doing the DNS lookup, sending/receiving over a socket, parsing the response, etc.) in a minute. Here I list the key pieces so that I can share and reuse them in the future.

Shared data

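Threads usually need some shared data. Examples are a CRITICAL_SECTION object that handles locking and unlocking (within a single process it is much faster than a Win32 mutex object, because an uncontended CRITICAL_SECTION is acquired in user mode while a mutex is a kernel object) and the data to be consumed (such as a queue containing all URLs to be crawled). It is recommended to create a class that holds this shared data. Note that InitializeCriticalSection() must be called before the CRITICAL_SECTION is used, so the constructor of this class is a natural place for it, with the matching DeleteCriticalSection() in the destructor.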
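As a rough illustration of such a class, here is a minimal sketch in portable standard C++ rather than the Win32 API used in this post. std::mutex stands in for CRITICAL_SECTION and std::atomic for the interlocked counters. The class layout and the constructor argument are assumptions; only the member names are borrowed from the snippets in this post.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <queue>
#include <string>
#include <vector>

// Hypothetical holder for everything the threads share.
// std::mutex replaces CRITICAL_SECTION here; it needs no explicit
// Initialize/Delete call because its constructor and destructor do that work.
class SharedData {
public:
    std::mutex cs;                          // guards urlsQueue
    std::queue<std::string> urlsQueue;      // URLs waiting to be crawled
    std::atomic<long> numExtractedURLs{0};  // stats, updated without taking cs
    std::atomic<long> numActiveThreads{0};

    // Fill the queue before any worker thread is started,
    // so no locking is needed inside the constructor.
    explicit SharedData(const std::vector<std::string>& urls) {
        for (const std::string& u : urls) {
            urlsQueue.push(u);
        }
    }
};
```

One instance is created up front and a pointer or reference to it is handed to every thread, which mirrors how &sharedData is passed to the thread functions in this post.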

Create/close threads

Threads can be created and closed as shown below. threadStat and threadCrawl are user-defined thread functions. CreateThread() expects the signature DWORD WINAPI threadCrawl(LPVOID pParam); with that signature no cast to LPTHREAD_START_ROUTINE is needed.

HANDLE *handles = new HANDLE[numThreads + 1];
// start the stats thread (kept in the last slot)
handles[numThreads] = CreateThread(NULL, 0, threadStat, &sharedData, 0, NULL);
// start N crawling threads
for (int i = 0; i < numThreads; i++)
{
    handles[i] = CreateThread(NULL, 0, threadCrawl, &sharedData, 0, NULL);
}
// wait for the N crawling threads to finish
for (int i = 0; i < numThreads; i++)
{
    WaitForSingleObject(handles[i], INFINITE);
    CloseHandle(handles[i]);
}
// signal the stats thread to quit here (how is up to the program,
// e.g. a flag or event stored in sharedData), then wait for it to terminate
WaitForSingleObject(handles[numThreads], INFINITE);
CloseHandle(handles[numThreads]);
delete[] handles;
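The same start, wait, signal, wait sequence can be sketched with std::thread from the C++ standard library. This is not the Win32 code from this post. CrawlState, crawlWorker, statWorker and runCrawl are hypothetical names, and the crawling work is replaced by a simple counter so the example stays self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical shared state for this sketch.
struct CrawlState {
    std::atomic<bool> quit{false};    // set to true to stop the stats thread
    std::atomic<long> processed{0};   // work items handled so far
};

// Stand-in for a crawling thread: "processes" a fixed number of items.
void crawlWorker(CrawlState& state, int items) {
    for (int i = 0; i < items; ++i) {
        state.processed.fetch_add(1);
    }
}

// Stand-in for the stats thread: idles until it is told to quit.
void statWorker(CrawlState& state) {
    while (!state.quit.load()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
}

long runCrawl(int numThreads, int itemsPerThread) {
    CrawlState state;
    std::thread stat(statWorker, std::ref(state));       // start stats thread
    std::vector<std::thread> crawlers;
    for (int i = 0; i < numThreads; ++i) {                // start N crawlers
        crawlers.emplace_back(crawlWorker, std::ref(state), itemsPerThread);
    }
    for (std::thread& t : crawlers) {
        t.join();                                         // wait for the crawlers
    }
    state.quit.store(true);                               // signal stats thread
    stat.join();                                          // wait for it to end
    return state.processed.load();
}
```

Joining the crawlers first and only then setting the quit flag keeps the stats thread alive for the whole crawl, which is the order described in the comments of the Win32 snippet.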

Lock/unlock

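With CRITICAL_SECTION, we can lock and unlock as follows. Note that every exit path must unlock: if we break out of the loop while still holding the lock, the other threads block forever in EnterCriticalSection().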

// lock
EnterCriticalSection(&(sharedData->cs));
if (sharedData->urlsQueue.empty())
{
    // unlock
    LeaveCriticalSection(&(sharedData->cs));
    break;
}
url = sharedData->urlsQueue.front();
sharedData->urlsQueue.pop();
InterlockedIncrement(&(sharedData->numExtractedURLs));
// unlock
LeaveCriticalSection(&(sharedData->cs));
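One way to make the "unlock before break" rule impossible to forget is a scoped lock that releases itself on every exit path. The sketch below shows the idea with std::mutex and std::lock_guard from standard C++, not with CRITICAL_SECTION. UrlQueue and tryPopUrl are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <string>

// Hypothetical queue plus the lock that guards it.
struct UrlQueue {
    std::mutex cs;
    std::queue<std::string> urls;
};

// Pops the next URL into `url`. Returns false if the queue is empty,
// in which case the caller should leave its work loop.
bool tryPopUrl(UrlQueue& q, std::string& url) {
    std::lock_guard<std::mutex> lock(q.cs);  // lock taken here
    if (q.urls.empty()) {
        return false;                        // lock released by the destructor
    }
    url = q.urls.front();
    q.urls.pop();
    return true;                             // lock released by the destructor
}
```

A worker loop then becomes `while (tryPopUrl(q, url)) { /* crawl url */ }`, and the slow crawling work runs outside the lock.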

Atomic operations

To update the stats, we can use locking/unlocking, but it is often faster to use interlocked operations directly. Each one performs its read-modify-write atomically and typically compiles down to a single locked CPU instruction on x86. Two examples of such functions are shown below.

// increment by 1
InterlockedIncrement(&(sharedData->numExtractedURLs));

// add a number
InterlockedAdd(&(sharedData->numActiveThreads), -1);
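For comparison, the standard C++ counterpart of these interlocked calls is std::atomic. The sketch below is an assumption-laden illustration, not code from the crawler: CrawlStats, statsWorker and countWithThreads are hypothetical names, and each worker just bumps the counters in a loop.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stats block; counter names follow the snippets above.
struct CrawlStats {
    std::atomic<long> numExtractedURLs{0};
    std::atomic<long> numActiveThreads{0};
};

void statsWorker(CrawlStats& s, int n) {
    s.numActiveThreads.fetch_add(1);
    for (int i = 0; i < n; ++i) {
        // Atomic increment, comparable to InterlockedIncrement. Note that
        // fetch_add returns the previous value, while InterlockedIncrement
        // returns the new one.
        s.numExtractedURLs.fetch_add(1);
    }
    // Adding a negative number, comparable to InterlockedAdd(..., -1).
    s.numActiveThreads.fetch_add(-1);
}

// Runs numThreads workers and returns the final extracted count.
long countWithThreads(CrawlStats& stats, int numThreads, int perThread) {
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back(statsWorker, std::ref(stats), perThread);
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return stats.numExtractedURLs.load();
}
```

Because every update is atomic, no increments are lost even though no lock is taken, so the final count equals numThreads times perThread.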

Copyright claim: Multi-Threading with Cpp is created by melonskin on 2018/02/10. Its copyright belongs to the author. Commercial usage must be authorized by the author. The source should be included for non-commercial purposes.
Link to the article: https://amelon.org/2018/02/10/multi-thread-cpp.html

Melon blog is created by melonskin. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2016-2025. All rights reserved by melonskin. Powered by Jekyll.