itertools
This module contains lots of function on sequence-like objects. It has been never easier for iteration.
Iterator is lazy, which will not be generated until called. This is more memory efficient.
import itertools as itls
Combine: chain()
与izip()
chain()
takes n iterable objects and combine them together. Let me show you an example.
for i in itls.chain([1,2,3],['a','b','c']):
print i,
1 2 3 a b c
izip()
takes n iterable objects and combine them as tuples. Just like zip()
but return iterator instead of list.
for i, j in itls.izip([1,2,3],['a','b','c']):
print i, j
1 a
2 b
3 c
Slice: islice()
islice()
takes a iterator and return its slice. Just like the slice operation on list. There are three params start
, stop
and step
.
print "Stop at 5:"
for i in itls.islice(itls.count(),5):
print i,
Stop at 5:
0 1 2 3 4
print "Start at 5, Stop at 10:"
for i in itls.islice(itls.count(),5,10):
print i,
Start at 5, Stop at 10:
5 6 7 8 9
print "By tens to 100:"
for i in itls.islice(itls.count(),0,100,10):
print i,
By tens to 100:
0 10 20 30 40 50 60 70 80 90
Duplicate: tee()
Just like the tee in Unix. tee()
takes a iterator and return n same iterators.
r = itls.islice(itls.count(),4)
i1, i2, i3 = itls.tee(r,3) # i1 and i2, like a copy
for i, j, k in itls.izip(i1,i2,i3):
print i, j, k
0 0 0
1 1 1
2 2 2
3 3 3
One thing to notice is you should be careful when using the original iterator. Check this example to see why.
r = itls.islice(itls.count(),4)
i1, i2 = itls.tee(r)
for i in r:
print 'r:', i
if i > 0:
break
for i in i1:
print 'i1:', i
for i in i2:
print 'i2:', i
r: 0
r: 1
i1: 2
i1: 3
i2: 2
i2: 3
The original iterator consumed 0,1 and will not be generated in i1 and i2.
Map
imap()
function transform iterator. Just like the built in map()
. Let’s multiply xrange(5)
by 2.
print "Doubles:"
for i in itls.imap(lambda x: 2*x, xrange(5)):
print i,
Doubles:
0 2 4 6 8
imap()
can take more than one iterator and map it.
print "Multiples:"
for i in itls.imap(lambda x,y:(x, y, x*y), xrange(5),xrange(5,10)):
print '%d * %d = %d' % i
Multiples:
0 * 5 = 0
1 * 6 = 6
2 * 7 = 14
3 * 8 = 24
4 * 9 = 36
starmap()
is kind of similar to imap()
, but with small difference. starmap()
can parse several params from a tuple
, while imap()
get several params from multiple iterators.
values = [(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]
for i in itls.starmap(lambda x,y:(x,y,x*y), values):
print '%d * %d = %d' % i
0 * 5 = 0
1 * 6 = 6
2 * 7 = 14
3 * 8 = 24
4 * 9 = 36
Create new iterator
count()
,cycle()
and repeat()
for iterator generation.
count()
Continous integers, with lower bound 0 and no upper bound(upper bound with xrange()).
for i in itls.izip(itls.count(1),['a','b','c']):
print i
(1, 'a')
(2, 'b')
(3, 'c')
cycle()
Cycle iterable unlimitted times.
i = 0
for item in itls.cycle(['a','b','c']):
i += 1
if i == 7:
break
print (i, item)
(1, 'a')
(2, 'b')
(3, 'c')
(4, 'a')
(5, 'b')
(6, 'c')
repeat()
Repeat n times.
for i in itls.repeat('over-and-over',3):
print i
over-and-over
over-and-over
over-and-over
When we want to add a constant to a sequence, a repeat()
and imap()
combo is very powerful.
for i,s in itls.izip(itls.count(), itls.repeat('over-and-over',3)):
print i, s
0 over-and-over
1 over-and-over
2 over-and-over
for i in itls.imap(lambda x,y:(x,y,x*y),itls.repeat(2),xrange(5)):
print '%d * %d = %d' % i
2 * 0 = 0
2 * 1 = 2
2 * 2 = 4
2 * 3 = 6
2 * 4 = 8
Filter
Just like the built in filter()
function.
dropwhile()
Test the item, if True, drop it and continue; if False, stop dropping and take this element and all the rest.
def should_drop(x):
print 'Testing:', x
return x < 1
for i in itls.dropwhile(should_drop,[ -1, 0, 1, 2, 3, 1, -2 ]):
print 'Yielding:', i
Testing: -1
Testing: 0
Testing: 1
Yielding: 1
Yielding: 2
Yielding: 3
Yielding: 1
Yielding: -2
takewhile()
Different from dropwhile()
. Test the item, if True, take it and continue; if False, stop and do not take this one and the rest.
def should_take(x):
print 'Testing:', x
return x < 2
for i in itls.takewhile(should_take,[ -1, 0, 1, 2, 3, 4, 1, -2 ]):
print 'Yielding:', i
Testing: -1
Yielding: -1
Testing: 0
Yielding: 0
Testing: 1
Yielding: 1
Testing: 2
ifilter()
dropwhile()
and takewhile()
apply only on part of all elements. But ifilter()
applies to all elements. ifilterfalse
is the same but only take those with False returned.
def check_item(x):
print 'Testing:', x
return x < 1
for i in itls.ifilter(check_item, [ -1, 0, 1, 2, 3, -2 ]):
print 'Yielding:', i
Testing: -1
Yielding: -1
Testing: 0
Yielding: 0
Testing: 1
Testing: 2
Testing: 3
Testing: -2
Yielding: -2
Group iterators
groupby(iterable[, keyfunc])
Create an iterator which returns(key, sub-iterator) grouped by each value of key(value)
Group base on key
to sub-iterator
.
things = [("animal", "bear"), ("animal", "duck"),
("plant", "cactus"), ("vehicle", "speed boat"), ("vehicle", "school bus")]
groupby()
takes two params, one is the data to group
and another is the function to group it with
.
for key, group in itls.groupby(things, lambda x: x[0]):
print key, group
animal <itertools._grouper object at 0x10bce2150>
plant <itertools._grouper object at 0x10bce2190>
vehicle <itertools._grouper object at 0x10bce2150>
As can seen from the result, three sub-iterator
are returned and we can use another level loop on sub-iterator.
for key, group in itls.groupby(things, lambda x:x[0]):
for thing in group:
print "A %s is a %s." % (thing[1], key)
print ""
A bear is a animal.
A duck is a animal.
A cactus is a plant.
A speed boat is a vehicle.
A school bus is a vehicle.
But one thing to notice is that before group, make sure iterable is sorted base on key. Because new group will be created if different key encountered.
things = [("animal", "bear"), ("plant", "cactus"), ("animal", "duck")]
for key, group in itls.groupby(things, lambda x: x[0]):
print key, group
animal <itertools._grouper object at 0x10bce2410>
plant <itertools._grouper object at 0x10bce2490>
animal <itertools._grouper object at 0x10bce2410>
We expect two groups but got three because it is unsorted.
new_things = sorted(things,key=lambda x: x[0])
for key, group in itls.groupby(new_things, lambda x:x[0]):
print key, group
animal <itertools._grouper object at 0x10bce2350>
plant <itertools._grouper object at 0x10bce2410>
Now it seems like right.