
Regular expression usage with sample


Regular Expression

import re

The most common method is match = re.search(pat, str). When the pattern is not found, re.search() returns None, so an if statement is needed to handle that case.

str = 'an example word:cat!!'
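match = re.search(r'word:\w\w\w', str)
# re.search() returns None when nothing matches, so test the result first
if match:
    print 'found', match.group() ## 'found word:cat'
else:
    print 'did not find'
found word:cat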
# re.search() returns a match object, which carries details about the match
print type(match)
<type '_sre.SRE_Match'>
print match.string # source string
print match.start() # index of 'w', the first matched char
print match.end() # index just past 't', the last matched char
print match.endpos # index where the search stops, len(str) by default
print match.span() # (start, end)
an example word:cat!!
11
19
21
(11, 19)
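
Because end() is exclusive, slicing the source string with the span gives back the matched text. The sketch below reuses the sample string from above; the names text and m are only for illustration, and print() is called with a single argument so it behaves the same under Python 2 and Python 3.

```python
import re

text = 'an example word:cat!!'
m = re.search(r'word:\w\w\w', text)

# end() is one past the last matched character, so the slice
# text[start:end] is exactly the matched text.
start, end = m.span()
matched = text[start:end]
print(matched)                 # word:cat
print(matched == m.group())    # True

# endpos is where the search stops looking; by default it is
# len(text), not the position of any particular character.
print(m.endpos == len(text))   # True
```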


re.match() and re.search() are nearly the same. The only difference is that re.match() only matches at the very beginning of the string. You can think of re.match() as imposing a ^ restriction on re.search().

s = 'python tuts'
match = re.match(r'py',s)
if match:
    print match.group() # 'py'
s = 'python tuts'
match = re.search(r'^py',s)
if match:
    print match.group() # 'py'
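
The difference is easier to see with a pattern that does not sit at the start of the string. A minimal sketch, reusing the 'python tuts' sample; the variable names are illustrative only.

```python
import re

s = 'python tuts'

# 'tuts' is not at the start, so re.match() fails while re.search() succeeds.
at_start = re.match(r'tuts', s)
anywhere = re.search(r'tuts', s)
# Adding ^ to the search pattern makes it fail the same way re.match() does.
anchored = re.search(r'^tuts', s)

print(at_start is None)    # True
print(anywhere.group())    # tuts
print(anchored is None)    # True
```

One caveat: with the re.MULTILINE flag, ^ also matches right after each newline, whereas re.match() still only tries the very start of the string.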


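  • Chars like a, X, 9 match themselves; meta chars do not, because they have special meanings: . ^ $ * + ? { } [ ] \ | ( )
  • . (period) matches any single char except ‘\n’
  • \w matches a ‘word’ char, [a-zA-Z0-9_]
  • \W matches a non-‘word’ char
  • \b matches the boundary between ‘word’ and ‘non-word’
  • \s matches a single whitespace char, such as space, tab or newline
  • \S matches a single non-whitespace char
  • \t, \n, \r match tab, newline, return
  • \d matches a decimal digit [0-9]
  • ^ matches the start of the string
  • $ matches the end of the string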
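
The patterns in the list above can be tried out on one small string. This is only a sketch: the sample text and the variable names are made up for illustration.

```python
import re

text = 'Room 42, floor_3\tok'

digits = re.search(r'\d\d', text).group()            # two digits in a row
underscore = re.search(r'floor\w\d', text).group()   # \w also matches '_'
first_word = re.search(r'^R..m', text).group()       # ^ is the start, . is any char
last_word = re.search(r'\bok$', text).group()        # \b boundary, $ end of string
tab_at = re.search(r'\t', text).start()              # index of the tab char
dot_newline = re.search(r'a.b', 'a\nb')              # . does not match '\n'

print(digits)               # 42
print(underscore)           # floor_3
print(first_word)           # Room
print(last_word)            # ok
print(tab_at)               # 16
print(dot_newline is None)  # True
```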


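  • ‘+’: one or more occurrences of the pattern to its left
  • ‘*’: zero or more occurrences of the pattern to its left
  • ‘?’: zero or one occurrence of the pattern to its left
  • []: a set of chars, which works like ‘or’, so [abc] matches ‘a’ or ‘b’ or ‘c’.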
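string = 'purple alice-b@example.com monkey dishwasher' # example.com is a placeholder host
match = re.search(r'[\w.-]+@[\w.-]+',string)
if match:
    print match.group() # 'alice-b@example.com'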
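
A short sketch of how the repetition operators and [] behave on small inputs. The sample strings and variable names here are invented for illustration.

```python
import re

# '+' needs at least one 'i'; the match is as long as possible.
plus = re.search(r'pi+', 'piiig').group()
# '*' allows zero repetitions, so 'p' alone is enough.
star = re.search(r'pi*', 'pg').group()
# '?' makes the 'u' optional, so both spellings match.
optional = [re.search(r'colou?r', w).group() for w in ('color', 'colour')]
# [] is a set of allowed chars; inside it '.' is a literal dot.
dotted = re.search(r'[\w.-]+', 'v1.2-beta!').group()

print(plus)      # piii
print(star)      # p
print(optional)  # ['color', 'colour']
print(dotted)    # v1.2-beta
```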

Group Extraction with parentheses ()


string = 'purple alice-b@example.com monkey dishwasher' # example.com is a placeholder host
match = re.search(r'([\w\.-]+)@([\w\.-]+)',string)
if match:
    # Return subgroup(s) of the match by indices or names.
    print match.group(1) # or match.group(2) for the host
if match:
    # Return a tuple containing all the subgroups of the match, from 1.
    print match.groups()
alice-b
('alice-b', 'example.com')
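
Groups can also be fetched by name when the pattern uses the (?P<name>...) syntax. A minimal sketch, assuming a placeholder address at example.com; the group names user and host are chosen here for illustration.

```python
import re

text = 'purple alice-b@example.com monkey dishwasher'  # placeholder address
m = re.search(r'(?P<user>[\w.-]+)@(?P<host>[\w.-]+)', text)

print(m.group(0))        # alice-b@example.com  (the whole match)
print(m.group('user'))   # alice-b              (same as m.group(1))
print(m.group('host'))   # example.com          (same as m.group(2))
print(m.groups())        # ('alice-b', 'example.com')
```

Named groups keep the numbering as well, so existing code that uses indices continues to work.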

findall and groups

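Parentheses () can be combined with findall(). If the pattern contains exactly one group, findall() returns a list of the strings matched by that group; if it contains two or more groups, it returns a list of tuples.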

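str = 'purple alice@example.com, blah monkey bob@example.org blah dishwasher' # placeholder hosts
tuples = re.findall(r'([\w\.-]+)@([\w\.-]+)', str)
print tuples  # [('alice', 'example.com'), ('bob', 'example.org')]
for tuple in tuples:
    print tuple[0] # username
    print tuple[1] # host
[('alice', 'example.com'), ('bob', 'example.org')]
alice
example.com
bob
example.org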
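
What findall() returns depends on how many groups the pattern has. The sketch below runs three variants of the same pattern on one string; the addresses use placeholder hosts and the variable names are illustrative.

```python
import re

text = 'purple alice@example.com, blah monkey bob@example.org blah dishwasher'

# No group: a list of whole matches.
no_group = re.findall(r'[\w.-]+@[\w.-]+', text)
# One group: a list of the strings captured by that group.
one_group = re.findall(r'([\w.-]+)@[\w.-]+', text)
# Two groups: a list of (group1, group2) tuples.
two_groups = re.findall(r'([\w.-]+)@([\w.-]+)', text)

print(no_group)    # ['alice@example.com', 'bob@example.org']
print(one_group)   # ['alice', 'bob']
print(two_groups)  # [('alice', 'example.com'), ('bob', 'example.org')]
```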


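re.sub(pat, replacement, str) searches str for every substring that matches the pattern and replaces it with replacement. The replacement string can include \1 or \2 to refer to the text of the corresponding group, which makes it possible to replace only part of each match.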

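# replace hostname
str = 'alice@example.com, and bob@example.org' # placeholder hosts
# returns a new string with all replacements;
# \1 is group(1), \2 is group(2) in the replacement
print re.sub(r'([\w\.-]+)@([\w\.-]+)', r'\1@example.net', str)
alice@example.net, and bob@example.net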
Melon blog is created by melonskin. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
© 2016-2019. All rights reserved by melonskin. Powered by Jekyll.