Pypy : « Make Python faster ! »

I inaugurate this blog with an article on PyPy, an implementation of Python very interesting in terms of performances because, as we shall see in this article, it is possible to compete with statically compiled languages ​​such as C or C++. The article will break down as follows:

  1. What is Python ?
  2. What are the different implementations of the language ?
  3. What are the specificities of Pypy ?
  4. Just-in-time compilation.
  5. How do you compile and use Pypy ?
  6. Brief performances overview.

 

What is Python – Python is a modern programming language classified as a scripting language ​​(it is interpreted, at least in its official implementation, CPython). Being very high level, it offers developers to increase productivity by allowing them to forget about low levels that you need to deal with in other languages ​​such as C, making it a suitable language for learning programming. Indeed, it is easily accessible and quick to learn. Its syntax based on indentation (no braces or keywords to separate « blocks » of instructions) makes it more readable than most languages​​. Finally, here are some specifics of the language: Type Inference, adanced Object features (OOP), integer of arbitrary size, list comprehension for creating list in an elegent maniere, etc. ..

What are the different implementations of the language ? - The Python language is placed under a free license, there are many implementations, each with their specificities. Here are a few:

  • CPython : Official implementation (the best known) written in C.
  • Pypy : Implementation that provides a Just-in-Time compiler (JIT).
  • ShedSkin : Implementation that use a Python-to-C++ translator. This one provides good performance frone the execution time point of view.
  • Stackless : Implementation that does not use the call stack from C and replacing it with a Python-specific data structure . This implementation is largely based on the paradigm of concurrency programming, involving tasklets (a kind of micro-threads) to simulate a parallel execution of programs. The tasklets could be very numerous (hundreds or thousands), communicating one with each other easily, and be paused or awakened instantly, etc. ..
  • Jython : Implementation written in Java.
  • Iron Python : Implementation written in Java C#.

 

What are the specificities of Pypy ? - PyPy is an implementation of Python written in Python and featuring (among others), a JIT compiler (see next paragraph) that enable it to deliver performances comparables to most statically-compiled languages (C, C++, etc…). Pypy also offers a reduced memory consumption compared to CPython, and supports most of the libraries available with the official implementation.

It is also possible to compile PyPy with an integrated sandbox or Stackless Python. Other languages ​​are also supported by the PyPy project (Javascript, Scheme, etc. ..) since it uses a processing chain that can analyze programs written in RPython (a subset of the Python language), translate them into C code and then compile them into native machine code, which allows any language to be used with PyPy provided they can be translated into RPython. For more information, I invite you to visit the official website of PyPy (the  link is provided at the end of article).

Just-in-Time compilation – Since the Python language provides high-level tools (such as advanced object features), it is generally difficult and inefficient (compared to the low-level languages ​​such as C) to compile it statically, so if you search for honorable performances you shouldn’t use static compilation. Fortunately, there is a high performance alternative to the static compilation, which is the JIT compilation (Just-In-Time, or « on the fly »). It is this technology that is used in the PyPy project, offering performance of statically compiled languages ​​such as C or C++ to Python. Let’s see what it is.

The Just-In-Time compilation (or dynamic compilation) is a hybrid approach between the static compilation and interpretation, it aims to achieve performance equal or superior to the conventional static compilation. The JIT compilers convert bytecode into machine code on the fly (at runtime). But to limit the performance loss, several tweaks are used. First, it is possible to cache parts of a program to avoid having to retranslate it unnecessarily during subsequent runs if source code was not modified in the meantime. Second, thanks to a system able to detect hot spots (parts of a program being used very often during the execution), the compiler may decide to bring a part of the program into machine code if this is more interesting than simple interpretation in terms of execution time.

In addition, many optimizations are possible at runtime. A JIT compiler can use the execution context informations to improve program performances. In particular, it is possible to adapt the machine code produced according to the processor architecture or to perform the inlining of dynamic libraries, without losing the advantages of dynamic linkage.

JIT compilation is used by the Java Virtual Machine or the C #.

How do you compile and use Pypy ? - There are two ways to use PyPy. Either your favorite Linux distribution you propose directly from its deposits, in which case you can simply install it, or you can download the sources from the mercurial of PyPy. Beware, however, if you install Pypy from the repositories of your distribution, it may replace an existing version of Python on your computer.

Building from sources : We will see how to download the sources of PyPy, compile it and use the resulting binary. First and foremost, you must know that the compilation of PyPy is extremely resource-hangry (memory and CPU), if you do not have 4 GB of RAM on your computer, you should prefere to install it through you distribution depositories, otherwise you may enter in a endless swap loop ..

Dependencies : http://pypy.readthedocs.org/en/latest/getting-started-python.html#translating-the-pypy-python-interpreter

Once all the dependencies have been installed, download the source of PyPy by running the following command:

1
hg clone https://bitbucket.org/pypy/pypy

Then go to the goal directory:

1
cd pypy/pypy/translator/goal

Finally run the script with Python (or with PyPy itself if you have already installed it, which will save you time and memory):

1
python2 translate.py -Ojit

Enjoy the Mandelbrot’s fractal that will be drawn into your terminal during compilation.

After completing the long process, a binary called pypy-c is normally present in the directory. You can rename and / or move it to use as you wish. It replaces the traditional Python binary, you can use it as follows:

1
./pypy mon_script.py #Your Python script

Brief performances overview. - Finally, here is a brief overview of PyPy’s performances compared to other implementations / languages ​​available on the official website of PyPy: http://speed.pypy.org/

Bibliography -

Last world - All proposals and comments are welcome of course :)