There are Lies, Damn Lies, and Benchmarks

Pure Python

Hello again! Today I would like to share some remarkable benchmark findings that show how Python can be sped up by embedding Lua in Python code. It all started with a simple task that a friend asked me to carry out in order to compare Python to other languages. So, below is the benchmark we are going to test:

fig 1.0
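
As a rough sketch of what such a benchmark might look like (not the original listing), the snippet below builds two lists of random integers and then walks them in lockstep. The names `init_arrays`, `sum_and_update` and `run`, the value range, and the exact meaning of “sum and update” (here: store `a[i] + b[i]` back into `a[i]` when the two elements differ) are assumptions made for illustration.

```python
import random
import time

N = 5000000  # array size used in the post; lower it for a quick try


def init_arrays(n, seed=None):
    """Build two lists of n random integers (the value range is an assumption)."""
    rng = random.Random(seed)
    a = [rng.randint(0, n) for _ in range(n)]
    b = [rng.randint(0, n) for _ in range(n)]
    return a, b


def sum_and_update(a, b):
    """Where a[i] and b[i] differ, store their sum back into a[i] (assumed meaning)."""
    for i in range(len(a)):
        if a[i] != b[i]:
            a[i] = a[i] + b[i]
    return a


def run(n=N):
    """Time both phases and return (init_seconds, sum_seconds)."""
    t0 = time.perf_counter()
    a, b = init_arrays(n)
    t1 = time.perf_counter()
    sum_and_update(a, b)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

Calling `run()` and printing the two returned values gives the “init” and “sum” timings. Absolute numbers will of course depend on the machine and the Python version.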

We create two arrays of 5M random integers each and then compare them element by element. If the elements at the same index differ (which they almost always do), we perform a “sum and update” action. This gives us the following result:

Pure Python init 7.01951003075
Pure Python sum 0.525348901749

Thus, Python requires about 7 seconds to initialize the arrays and another 0.52 seconds to perform the “sum and update” action.
Can we get more out of Python? Yes, we certainly can! Let’s see below:

NumPy

As it turns out, NumPy can sum two arrays very efficiently. It is also much faster at array initialization.

Consider the following code:
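
What follows is only a minimal sketch of a NumPy version, not the original listing. It assumes the arrays are filled with `RandomState.randint` and that “sum and update” means adding `b` into `a` wherever the two differ. The function names `np_init_arrays` and `np_sum_and_update` are hypothetical.

```python
import time

import numpy as np

N = 5000000  # array size used in the post


def np_init_arrays(n, seed=None):
    """Create two integer arrays of n random values (the value range is an assumption)."""
    rng = np.random.RandomState(seed)
    a = rng.randint(0, n, size=n)
    b = rng.randint(0, n, size=n)
    return a, b


def np_sum_and_update(a, b):
    """Vectorized update: where a and b differ, add b into a in place."""
    mask = a != b          # boolean array, True where the elements differ
    a[mask] += b[mask]     # update only those positions
    return a


def np_run(n=N):
    """Time both phases and return (init_seconds, sum_seconds)."""
    t0 = time.perf_counter()
    a, b = np_init_arrays(n)
    t1 = time.perf_counter()
    np_sum_and_update(a, b)
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1
```

The speedup comes from replacing the Python-level loop with whole-array operations, so the per-element work happens inside NumPy's compiled code.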

The output:

Numpy init 0.211035966873
Numpy sum 0.0101628303528

We can see that NumPy requires just 0.2 seconds for array initialization versus 7 seconds for pure Python, which is about 33x faster. The same goes for the “sum and update” action: NumPy took 0.01 seconds versus 0.5, which is about 50x faster. Pretty impressive numbers!

PyPy

Running the pure Python code (fig 1.0) with PyPy gives us roughly the same results as NumPy:

Regular init 0.203986883163
Regular sum 0.0113749504089

C

Now let’s see what we are able to get out of C. My friend claimed he could write this function in C and that it would take him no more than 5 minutes. Well, he actually did it within 5 minutes and then spent another hour trying to figure out why the function was not working. Probably every experienced C programmer would immediately spot the problem: in C you cannot simply declare a local array of 5000000 elements, because it is too big to fit on the program’s stack. What you should do instead is allocate it on the heap with malloc.

While my friend was struggling with the C code, I wanted to figure out how to speed up the Python code without writing a custom module in C. And I found this simple solution:

Lua

Lua is an amazing language! It’s open source, and it’s easy to read and understand, which makes it easy to learn. You can find a detailed and authoritative introduction to all aspects of Lua programming in Programming in Lua, written by Lua’s chief architect Roberto Ierusalimschy. Despite being simple, it is a very powerful tool and a solid piece of programming art. Lua is one of the fastest dynamic languages today, so please do not underestimate it. An important feature for us is that Lua has a very small memory footprint: less than 200K for the whole interpreter. This allows us to embed it seamlessly into another environment, such as Python, in order to speed things up.

Here is our benchmark written in Lua but wrapped in Python:
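
The sketch below shows one way such a wrapper could look; it is not the original listing. It assumes the third-party `lupa` binding (the post does not say which library was used), and the names `LUA_BENCHMARK` and `run_lua_benchmark` are hypothetical. The whole benchmark runs inside Lua, and Python only compiles the Lua function and calls it.

```python
try:
    # Third-party Lua binding; an assumption, the post names no library.
    from lupa import LuaRuntime
except ImportError:
    LuaRuntime = None

# The benchmark itself is written in Lua. os.clock() reports CPU time in seconds.
LUA_BENCHMARK = """
function(n)
    local a, b = {}, {}
    local t0 = os.clock()
    for i = 1, n do
        a[i] = math.random(0, n)
        b[i] = math.random(0, n)
    end
    local t1 = os.clock()
    for i = 1, n do
        if a[i] ~= b[i] then
            a[i] = a[i] + b[i]
        end
    end
    local t2 = os.clock()
    return t1 - t0, t2 - t1
end
"""


def run_lua_benchmark(n=5000000):
    """Return (init_seconds, sum_seconds) measured in Lua, or None if lupa is missing."""
    if LuaRuntime is None:
        return None
    lua = LuaRuntime(unpack_returned_tuples=True)
    bench = lua.eval(LUA_BENCHMARK)   # compile the Lua function once
    init_s, sum_s = bench(n)          # call it like a regular Python function
    return init_s, sum_s
```

With `lupa` installed, `run_lua_benchmark()` returns the two timings, which can then be printed with something like `print("LUA init: %f" % init_s)`. Keeping the loop and the arrays entirely on the Lua side avoids converting 5M values between the two languages.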

Output:

LUA init: 0.080163
LUA sum: 0.007911

The result is amazing! We can see that Lua is about 1.3x to 2.6x faster than NumPy and PyPy, and roughly 65x to 90x faster than pure Python! Rounded generously, the ranking looks like this:

  1. Python   1x
  2. NumPy    50x
  3. PyPy     50x
  4. Lua      100x
  5. C        100x
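
The factors in this list are rounded. As a quick check, the per-phase speedups can be recomputed from the timings reported in the outputs above. The snippet below is a small sketch that does just that; the names `TIMINGS` and `speedups` are hypothetical, and C is left out because no measured C timing is reported.

```python
# Timings in seconds as (init, sum), copied from the outputs reported above.
# C is omitted because no measured C timing is reported.
TIMINGS = {
    "Python": (7.01951003075, 0.525348901749),
    "NumPy": (0.211035966873, 0.0101628303528),
    "PyPy": (0.203986883163, 0.0113749504089),
    "Lua": (0.080163, 0.007911),
}


def speedups(timings, baseline="Python"):
    """Return {name: (init_speedup, sum_speedup)} relative to the baseline entry."""
    base_init, base_sum = timings[baseline]
    return {
        name: (base_init / init_s, base_sum / sum_s)
        for name, (init_s, sum_s) in timings.items()
    }


for name, (init_x, sum_x) in speedups(TIMINGS).items():
    print("%-7s init %5.1fx   sum %5.1fx" % (name, init_x, sum_x))
```

Splitting the ratio by phase shows that the ranking holds for both initialization and the “sum and update” step, even though the exact factors differ between the two.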

As you can see, it was easy to embed Lua in Python, and the results are outstanding! Lua can definitely be a great help when you need to speed up some critical parts of your Python code. Many commercial products use Lua today, to name just a few: Adobe Photoshop Lightroom, World of Warcraft, Far Cry and SimCity 4.

I hope you enjoyed this post and that it will be useful to you.