CPython vs PyPy vs Cython

http://jaredforsyth.com/media/uploads/images/cy_py_py.png

One problem that I needed to solve while making CodeTalker was fast tokenization — so I benchmarked several implementations of matching the WHITE and ID tokens.

Here’s the python implementation, so you understand what’s going on:

def white(at, st, ln):
    i = at
    while i<ln and st[i] == ' ':
        i+=1
    return i - at

def id(at, st, ln):
    i = at
    if i < ln and ('a' <= st[i] <= 'z' or 'A' <= st[i] <= 'Z' or st[i] == '_'):
        i += 1
        while i < ln and  ('a' <= st[i] <= 'z' or 'A' <= st[i] <= 'Z' or st[i] == '_' or '0' <= st[i] <= '9'):
            i += 1
    return i - at

The pypy is of course no different, and for cython I only changed the first two lines of each. e.g.

cpdef white(int at, char* st, int ln):
    cdef int i = at

I also checked Regex to see if that would be any faster.

Anyway, the raw results are below, but here’s a summary:

cython wins by about a factor of 10 each time. pypy is about par w/ regex, which is 10x faster than straight python.

Note

I’m using most recent stable release of pypy: 1.3

pypy

2.82211303711e-05 at 0
4.7474861145e-05 at 2
3.19180488586e-05 at 500
5.06615638733e-06 at 1007
5.14602661133e-06 at 1014
for IDS
0.00018682718277 at 0
7.82380104065e-05 at 10
2.23803520203e-06 at 1025
2.27212905884e-06 at 1028

cpython

0.000272939920425 at 0
0.000285755872726 at 2
0.000156256914139 at 500
6.81185722351e-06 at 1007
2.60829925537e-07 at 1014
for IDS
0.00100137805939 at 0
0.000964962005615 at 10
8.36133956909e-07 at 1025
3.35931777954e-07 at 1028

cython

9.29117202759e-07 at 0
9.21964645386e-07 at 2
5.56945800781e-07 at 500
1.21116638184e-07 at 1007
1.23023986816e-07 at 1014
for IDS
1.86841487885e-05 at 0
8.15391540527e-06 at 10
1.25169754028e-07 at 1025
1.23023986816e-07 at 1028

cpython + regex

2.03487873077e-05 at 0
3.93490791321e-05 at 2
1.27689838409e-05 at 500
1.77097320557e-06 at 1007
1.10411643982e-06 at 1014
for IDS
1.30689144135e-05 at 0
1.26049518585e-05 at 10
1.98006629944e-06 at 1025
9.75847244263e-07 at 1028

pypy + regex

0.000184435129166 at 0
4.27179336548e-05 at 2
5.02660274506e-05 at 500
3.67810726166e-05 at 1007
1.60229206085e-05 at 1014
for IDS
5.36160469055e-05 at 0
7.2783946991e-05 at 10
1.04148387909e-05 at 1025
1.33678913116e-05 at 1028
blog comments powered by Disqus