Writing parallel applications in Python

Lecture

Exercises

1. Dining philosophers (Deadlocks)

In computer science, the dining philosophers problem is an example problem often used in concurrent algorithm design to illustrate synchronization issues and techniques for resolving them. It was originally formulated in 1965 by Edsger Dijkstra as a student exam exercise, in terms of computers competing for access to tape drive peripherals. Soon after, Tony Hoare gave the problem its present formulation.

Problem Statement

Five silent philosophers sit at a table around a bowl of spaghetti. A fork is placed between each pair of adjacent philosophers.

Each philosopher must alternately think and eat. Eating is not limited by the amount of spaghetti left: assume an infinite supply. However, a philosopher can only eat while holding both the fork to the left and the fork to the right.

Each philosopher can pick up an adjacent fork, when available, and put it down, when holding it. These are separate actions: forks must be picked up and put down one by one.

The problem is how to design a discipline of behavior (a concurrent algorithm) such that each philosopher won't starve, i.e. can forever continue to alternate between eating and thinking.

Your task is to modify the philosophers.py which can be affected by a possible deadlock in a way that avoids it. Please make sure your philosophers obey the above rules in your solution!

2. Sum of primes (Processes)

In this task you will parallelize a computing intensive task in order to make it quicker.

Given a list of numbers, your program should calculate for each number the sum of all primes which are smaller than the given number. It should output the pairs [n, sum_primes(n)] sorted by n.

Example: The first number is n, the second number is the sum of all primes < n.

  >> [[10, 17],
  >>  [20, 77],
  >>  [30, 129],
  >>  [40, 197],
  >>  [50, 328],
  >>  [60, 440],
  >>  [70, 568],
  >>  [80, 791],
  >>  [90, 963]]

You can use the following methods to calculate the sum of primes below n:

  import math
  
  def isprime(n):
      """Test if n is a prime number or not."""
      if n < 2:
          return False
      if n == 2:
          return True
      for i in range(2, int(math.ceil(math.sqrt(n)))+1):
          if n % i == 0:
              return False
      return True
  
  
  def sum_primes(n):
      """Returns the sum of all primes below n."""
      return sum([x for x in range(2, n) if isprime(x)])

Your task is to calculate the sum of all primes for:

  [100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000,
   1000000, 1100000, 1200000, 1300000, 1400000, 1500000, 1600000, 1700000,
   1800000, 1900000, 2000000, 2100000, 2200000, 2300000, 2400000]

(or range(100000, 2500000, 100000) for short). Please use multiprocessing.Process for the calculations and multiprocessing.Queue for the distribution of the tasks to the workers as well as for returning the results.

Please note that you cannot join() a process which has put items in a queue before all items of the queue are consumed. See:

http://docs.python.org/library/multiprocessing.html#pipes-and-queues http://docs.python.org/library/multiprocessing.html#multiprocessing-programming

The computation of the results will take very long with a single process. Please measure the speed of the computation with 1, 2, 4, 8 and 16 processes. To test that you have to run your program on a multicore system. You can use also use a remote machine for that:

  • First setup your access to the remote machine. See below.
  • Later:
  # copy your program to gnu
  scp prime.py <login>@gnu:
  
  # login
  ssh <login>@gnu
  
  # run your program
  python prime.py

This is a virtual machine on a host that has 8 cores and ideally you should see a 8 times speedup compared to the single core version of your program.

Parallel Cython

Lecture

In this lecture, I'll explain what the GIL is, how affects to your computations, and finally, how to get rid of it while staying in the Python ecosphere. Slides here.

Exercises

Fetch the excersises tarball from here.

Solutions

The solutions are available here.

Setting up SSH access: server for exercises

The exercises need to be performed on a machine with more power than a normal notebook.

Please save the details of the ssh connection to a config file:

echo -e 'Host gnu\n\tHostName gnu.fuw.edu.pl\n\tPort 2005\n\tVisualHostkey yes' >>  .ssh/config

Please login:

ssh <login>@gnu

The machine thinks of itself as 'debian'.

The login is the same as the one used for git.

 
parallel.txt · Last modified: 2011/09/16 11:51 by francesc
Recent changes RSS feed Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki