Writing parallel applications in Python

Exercises

In computer science, the dining philosophers problem is an example problem often used in concurrent algorithm design to illustrate synchronization issues and techniques for resolving them. It was originally formulated in 1965 by Edsger Dijkstra as a student exam exercise, in terms of computers competing for access to tape drive peripherals. Soon after, Tony Hoare gave the problem its present formulation.

Problem Statement

Five silent philosophers sit at a table around a bowl of spaghetti. A fork is placed between each pair of adjacent philosophers.

Each philosopher must alternately think and eat. Eating is not limited by the amount of spaghetti left: assume an infinite supply. However, a philosopher can only eat while holding both the fork to the left and the fork to the right.

Each philosopher can pick up an adjacent fork, when available, and put it down, when holding it. These are separate actions: forks must be picked up and put down one by one.

The problem is how to design a discipline of behavior (a concurrent algorithm) such that each philosopher won't starve, i.e. can forever continue to alternate between eating and thinking.

2. Sum of primes (Processes)

In this task you will parallelize a computing intensive task in order to make it quicker.

Given a list of numbers, your program should calculate for each number the sum of all primes which are smaller than the given number. It should output the pairs [n, sum_primes(n)] sorted by n.

Example: The first number is n, the second number is the sum of all primes < n.

```  >> [[10, 17],
>>  [20, 77],
>>  [30, 129],
>>  [40, 197],
>>  [50, 328],
>>  [60, 440],
>>  [70, 568],
>>  [80, 791],
>>  [90, 963]]```

You can use the following methods to calculate the sum of primes below n:

```  import math

def isprime(n):
"""Test if n is a prime number or not."""
if n < 2:
return False
if n == 2:
return True
for i in range(2, int(math.ceil(math.sqrt(n)))+1):
if n % i == 0:
return False
return True

def sum_primes(n):
"""Returns the sum of all primes below n."""
return sum([x for x in range(2, n) if isprime(x)])```

```  [100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000,
1000000, 1100000, 1200000, 1300000, 1400000, 1500000, 1600000, 1700000,
1800000, 1900000, 2000000, 2100000, 2200000, 2300000, 2400000]```

(or range(100000, 2500000, 100000) for short). Please use multiprocessing.Process for the calculations and multiprocessing.Queue for the distribution of the tasks to the workers as well as for returning the results.

Please note that you cannot join() a process which has put items in a queue before all items of the queue are consumed. See:

The computation of the results will take very long with a single process. Please measure the speed of the computation with 1, 2, 4, 8 and 16 processes. To test that you have to run your program on a multicore system. You can use also use a remote machine for that:

• Later:
```  # copy your program to gnu

python prime.py```

This is a virtual machine on a host that has 8 cores and ideally you should see a 8 times speedup compared to the single core version of your program.

Parallel Cython

Lecture

In this lecture, I'll explain what the GIL is, how affects to your computations, and finally, how to get rid of it while staying in the Python ecosphere. Slides here.

Exercises

Fetch the excersises tarball from here.

Solutions

The solutions are available here.

Setting up SSH access: server for exercises

The exercises need to be performed on a machine with more power than a normal notebook.

Please save the details of the ssh connection to a config file:

`echo -e 'Host gnu\n\tHostName gnu.fuw.edu.pl\n\tPort 2005\n\tVisualHostkey yes' >>  .ssh/config`

`ssh <login>@gnu`