Sieve of Eratosthene : « An algo to find them .. »

Some time ago I became interested in the Sieve of Eratosthenes for finding all prime numbers below a given limit. The naive application of the algorithm is not very fast, I tried to optimize it to achieve good performances. Here are ways to make this algorithm more efficient. Source files in C or Python will be provided throughout the article.

  1. What is a prime number ?
  2. Sieve of Eratosthene.
  3. Optimisations.
  4. Performances.
  5. To put it in a nutshell.

 

What is a prime number – A prime number is an integer which admits only two divisors, 1 and itself. Which excludes all other natural numbers (the sieve of Eratosthenes is based on this definition). There are an infinity of prime numbers.

Sieve of Eratosthene - The Sieve of Eratosthenes algorithm is very simple. Consider an array containing all the integers from 2 to n (if you want to compute all the prime numbers bellow n) which we assume all primes. For each element in the array, we will eliminate its multiples. The pseudo code could be as follows:

Sieve(integer n):
  array = integers from 2 to n
  for each i in array
    delete_multiples of i in array
  end for
end Sieve

This pseudo-code can be easily implemented in Python :

def delete_multiples(i, array):
  for x in array:
    if x > i and x % i == 0:
      array.remove(x)

def Sieve(n):
  array = [i for i in range(2, n)]
  for i in array:
    delete_multiples(i, array)
  return array

def main():
  array = Sieve(100)
  print array

main()

Note that the complexity of this implementation is not at all suitable for very large values of n (try with n > 10000, the result will take a long time to be calculated). Let’s see how to optimize it.

Optimisations – First point, if storing all numbers in an array and then removing them if they aren’t primes may seem more readable, it’s not a good method if you want performance. A first optimization is to declare an array of n boxes each of which can contain either 1 or 0 (true or false) depending on the primality (or non-primality) of the number corresponding to the index of the array. We will see that with this technique we can completely avoid comparisons and tests (except for the loops of course), which will greatly reduce the computation time.

Second, we note that if we want to find all the multiples of a number i in our array, it is not necessary to browse the entire table by testing each time the modulo. To do this simply, we could start from the index i and then go from i in i in the array, and we are certain to have all the multiples of i (including itself). This will greatly improve our algorithm.

Another remark that can be made ​​is that to test the primality of a number n, it is not necessary to test the division by all numbers from 2 to n, just go to the square root of n (a number beyond cannot be a divisor).

Finally, a small improvement related to the C implementation of the algorithm, since we will have to reset our array to 1, we could immediatly remove all the even numbers (since 2 is the only one which is prime). We can therefore initialize our array with two separate loops, one that will eliminate all the even numbers, and the other that will set to 1 the odd numbers. You can the the program in C by downloading the following file (sources: www.pythux.com/exemples/erato/crible_1.c). To compile the source file, you can use either gcc or clang (or other C compiler) like this:

1
2
<strong>clang</strong> -lm -O3 crible_1.c
<strong>gcc</strong> -lm -o3 crible_1.c

We could stop there, because performance is already very good (11 ms for a Sieve up to one million and one billion until 40-45 seconds), but we’ll see that it is still possible to reduce drastically the computation time. In fact, we know that apart from the number 2, no even number is prime. So we could find a way to store only the odd numbers in the array. We therefore divide the memory footprintby two. For a Sieve up to n = 30, we would have to store the following numbers:

1
2
Numbers   -&gt; <strong>[</strong>1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29<strong>]</strong>
Offsets   -&gt; <strong>[</strong>0, 1, 2, 3, 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14<strong>]</strong>

But we cannot simply use the index in the array to know the numbers involved. In fact, one can notice that to find the number associated with each cell of the array, we need to calculate: « step = index * 2 + 1″. Second, to find the multiples of a number i in the array, we need to go through an array using a step of i starting from the cell (index + i). Finally, we note that with this tweak, the algorithm is more compact and we can simply initialize the array to 1 (with the function memset from library <string> of the C language). The source code corresponding to this implementation is as follows (source: www.pythux.com/examples/erato/crible.c):

char *erato_opti(int n)
{
  char  *tab = malloc(n / 2);
  int   i = 1, j, borne = sqrt(n / 2), step;

  tab = memset(tab, 1, n / 2);

  for (; i <= borne; i++)
  {
    step = 2 * i + 1;
    for (j = i + step; j < n / 2; j += step)
      tab[j] = 0;
  }

  return (tab);
}

If we want to display the prime numbers produced by the function erato_opti we just use the trick of « number = index * 2 + 1″ and do not forget to display the number 2 (which is not in the array):

void print_erato_space(char *tab, int n)
{
  int i = 1;

  printf("2\n");
  for (; i < n / 2; i++)
    if (tab[i])
      printf("%i\n", i * 2 + 1);
}

You can compile the source file with one of the following command:

1
2
<strong>clang</strong> -lm -O3 crible.c
<strong>gcc</strong> -lm -O3 crible.c
1
 

Performances – Here is a summary of the performances obtained with the different implementations :

Value of N               1000   10.000   1.000.000   1.000.000.000
Pypy's implementation    30ms    370ms       /             /
CPython's implementation 42ms    710ms       /             /
Implementation in C      1ms      1ms      7ms           40s
Fastest implementation   1ms      1ms      6ms           32s

To put it in a nutshell - We saw some possible optimizations of the algorithm of the Sieve of Eratosthenes. This article is far from complete, you can find other optimizations more or less effective. However, most implementations that you can find on the internet and that are more efficient than this one are less readable and more complex. If you know other optimizations, please post a comment :)

Pypy : « Make Python faster ! »

I inaugurate this blog with an article on PyPy, an implementation of Python very interesting in terms of performances because, as we shall see in this article, it is possible to compete with statically compiled languages ​​such as C or C++. The article will break down as follows:

  1. What is Python ?
  2. What are the different implementations of the language ?
  3. What are the specificities of Pypy ?
  4. Just-in-time compilation.
  5. How do you compile and use Pypy ?
  6. Brief performances overview.

 

What is Python – Python is a modern programming language classified as a scripting language ​​(it is interpreted, at least in its official implementation, CPython). Being very high level, it offers developers to increase productivity by allowing them to forget about low levels that you need to deal with in other languages ​​such as C, making it a suitable language for learning programming. Indeed, it is easily accessible and quick to learn. Its syntax based on indentation (no braces or keywords to separate « blocks » of instructions) makes it more readable than most languages​​. Finally, here are some specifics of the language: Type Inference, adanced Object features (OOP), integer of arbitrary size, list comprehension for creating list in an elegent maniere, etc. ..

What are the different implementations of the language ? - The Python language is placed under a free license, there are many implementations, each with their specificities. Here are a few:

  • CPython : Official implementation (the best known) written in C.
  • Pypy : Implementation that provides a Just-in-Time compiler (JIT).
  • ShedSkin : Implementation that use a Python-to-C++ translator. This one provides good performance frone the execution time point of view.
  • Stackless : Implementation that does not use the call stack from C and replacing it with a Python-specific data structure . This implementation is largely based on the paradigm of concurrency programming, involving tasklets (a kind of micro-threads) to simulate a parallel execution of programs. The tasklets could be very numerous (hundreds or thousands), communicating one with each other easily, and be paused or awakened instantly, etc. ..
  • Jython : Implementation written in Java.
  • Iron Python : Implementation written in Java C#.

 

What are the specificities of Pypy ? - PyPy is an implementation of Python written in Python and featuring (among others), a JIT compiler (see next paragraph) that enable it to deliver performances comparables to most statically-compiled languages (C, C++, etc…). Pypy also offers a reduced memory consumption compared to CPython, and supports most of the libraries available with the official implementation.

It is also possible to compile PyPy with an integrated sandbox or Stackless Python. Other languages ​​are also supported by the PyPy project (Javascript, Scheme, etc. ..) since it uses a processing chain that can analyze programs written in RPython (a subset of the Python language), translate them into C code and then compile them into native machine code, which allows any language to be used with PyPy provided they can be translated into RPython. For more information, I invite you to visit the official website of PyPy (the  link is provided at the end of article).

Just-in-Time compilation – Since the Python language provides high-level tools (such as advanced object features), it is generally difficult and inefficient (compared to the low-level languages ​​such as C) to compile it statically, so if you search for honorable performances you shouldn’t use static compilation. Fortunately, there is a high performance alternative to the static compilation, which is the JIT compilation (Just-In-Time, or « on the fly »). It is this technology that is used in the PyPy project, offering performance of statically compiled languages ​​such as C or C++ to Python. Let’s see what it is.

The Just-In-Time compilation (or dynamic compilation) is a hybrid approach between the static compilation and interpretation, it aims to achieve performance equal or superior to the conventional static compilation. The JIT compilers convert bytecode into machine code on the fly (at runtime). But to limit the performance loss, several tweaks are used. First, it is possible to cache parts of a program to avoid having to retranslate it unnecessarily during subsequent runs if source code was not modified in the meantime. Second, thanks to a system able to detect hot spots (parts of a program being used very often during the execution), the compiler may decide to bring a part of the program into machine code if this is more interesting than simple interpretation in terms of execution time.

In addition, many optimizations are possible at runtime. A JIT compiler can use the execution context informations to improve program performances. In particular, it is possible to adapt the machine code produced according to the processor architecture or to perform the inlining of dynamic libraries, without losing the advantages of dynamic linkage.

JIT compilation is used by the Java Virtual Machine or the C #.

How do you compile and use Pypy ? - There are two ways to use PyPy. Either your favorite Linux distribution you propose directly from its deposits, in which case you can simply install it, or you can download the sources from the mercurial of PyPy. Beware, however, if you install Pypy from the repositories of your distribution, it may replace an existing version of Python on your computer.

Building from sources : We will see how to download the sources of PyPy, compile it and use the resulting binary. First and foremost, you must know that the compilation of PyPy is extremely resource-hangry (memory and CPU), if you do not have 4 GB of RAM on your computer, you should prefere to install it through you distribution depositories, otherwise you may enter in a endless swap loop ..

Dependencies : http://pypy.readthedocs.org/en/latest/getting-started-python.html#translating-the-pypy-python-interpreter

Once all the dependencies have been installed, download the source of PyPy by running the following command:

1
hg clone https://bitbucket.org/pypy/pypy

Then go to the goal directory:

1
cd pypy/pypy/translator/goal

Finally run the script with Python (or with PyPy itself if you have already installed it, which will save you time and memory):

1
python2 translate.py -Ojit

Enjoy the Mandelbrot’s fractal that will be drawn into your terminal during compilation.

After completing the long process, a binary called pypy-c is normally present in the directory. You can rename and / or move it to use as you wish. It replaces the traditional Python binary, you can use it as follows:

1
./pypy mon_script.py #Your Python script

Brief performances overview. - Finally, here is a brief overview of PyPy’s performances compared to other implementations / languages ​​available on the official website of PyPy: http://speed.pypy.org/

Bibliography -

Last world - All proposals and comments are welcome of course :)

Hello World!

Comme il est de coutume de le dire dans le vaste monde de l’informatique __ Hello World ! __ cette simple expression annonce le début d’un programme ou d’autre chose. Ici elle marque le début d’une aventure, d’une expérience, d’un blog. A quoi devez-vous vous attendre ?

Au travers de ce blog, mon but est de partager avec vous mes expériences et découvertes dans le monde de l’informatique et des sciences.

Cette envie est le fruit d’une simple observation. Au travers de mes recherches sur internet (ou d’autres sources d’informations), la compréhension d’un concept ou d’une idée est souvent le fruit d’une synthèses de plusieurs sources d’informations. Il est dès lors très probable que d’autres avant moi aient déjà entrepris ces recherches, et d’autres le feront certainement dans le futur. Je veux donc partager ici des articles traitant de sujets variés, fruits de mes recherches et découvertes, afin que d’autres puissent en tirer partie.

Vous êtes libres de vous inspirer ou de réutiliser le contenu disponible sur ce blog, mais n’oubliez pas de citer vos sources :)

Je vous souhaite donc une bonne lecture, en espérant répondre à certaines de vos intérogations.